PagerDuty • Span the WAN? Yes you can!
Most Cassandra usages take advantage of its exceptional performance and ability to handle massive data sets. At PagerDuty, we use Cassandra for entirely different reasons: to reliably manage mutable application states and to maintain durability requirements even in the face of full data center outages. We achieve this by deploying Cassandra clusters with hosts in multiple WAN-separated data centers, configured with per-data center replica placement requirements, and with significant application-level support to use Cassandra as a consistent datastore. Accumulating several years of experience with this approach, we've learned to accommodate the impact of WAN network latency on Cassandra queries, how to horizontally scale while maintaining our placement invariants, why asymmetric load is experienced by nodes in different data centers, and more. This talk will go over our workload and design goals, detail the resultant Cassandra system design, and explain a number of our unintuitive operational learnings about this novel Cassandra usage paradigm.