PACELC theorem

From Wikipedia, the free encyclopedia
The tradeoff between availability, consistency and latency, as described by the PACELC theorem.

In theoretical computer science, the PACELC theorem is an extension to the CAP theorem. It states that in case of network partitioning (P) in a distributed computer system, one has to choose between availability (A) and consistency (C) (as per the CAP theorem), but else (E), even when the system is running normally in the absence of partitions, one has to choose between latency (L) and loss of consistency (C).

Overview

PACELC builds on the CAP theorem. Both theorems describe the limitations and tradeoffs that distributed databases face regarding consistency, availability, and partition tolerance. PACELC goes further and states that an additional tradeoff exists between latency and loss of consistency, even in the absence of partitions, thus providing a more complete portrayal of the potential consistency tradeoffs for distributed systems.[1]

A high availability requirement implies that the system must replicate data. As soon as a distributed system replicates data, a trade-off between consistency and latency arises.
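
As a rough illustration of this trade-off, the following Python sketch models a leader that replicates a write to two replicas. The Replica class, the write and demo functions, and the delay values are all hypothetical, and the model ignores the return trip of acknowledgements; it is a toy under those assumptions, not a description of any real database.

```python
class Replica:
    """Toy replica: a value becomes visible delay_ms after it is sent."""

    def __init__(self, delay_ms):
        self.delay_ms = delay_ms          # assumed one-way replication delay
        self.log = [(0, "v0")]            # (time the value becomes visible, value)

    def deliver(self, sent_at_ms, value):
        self.log.append((sent_at_ms + self.delay_ms, value))

    def read(self, now_ms):
        # Return the newest value that has become visible by now_ms.
        visible = [value for visible_at, value in self.log if visible_at <= now_ms]
        return visible[-1]


def write(replicas, value, now_ms, synchronous):
    """Replicate a write and return the time the client is acknowledged."""
    for replica in replicas:
        replica.deliver(now_ms, value)
    if synchronous:
        # Consistency over latency: wait for the slowest replica.
        return now_ms + max(replica.delay_ms for replica in replicas)
    # Latency over consistency: acknowledge at once, replicate in the background.
    return now_ms


def demo(synchronous):
    replicas = [Replica(5), Replica(40)]
    ack_ms = write(replicas, "v1", now_ms=0, synchronous=synchronous)
    # Read from every replica at the moment the client is acknowledged.
    return ack_ms, [replica.read(ack_ms) for replica in replicas]


print(demo(synchronous=True))   # (40, ['v1', 'v1'])
print(demo(synchronous=False))  # (0, ['v0', 'v0'])
```

In this model the synchronous write is acknowledged only after the slowest replica has the new value, so every read that follows the acknowledgement is current. The asynchronous write is acknowledged immediately, but a read issued right afterwards can still return the old value.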

The PACELC theorem was first described by Daniel Abadi from Yale University in 2010 in a blog post,[2] which he later clarified in a paper in 2012.[1] The purpose of PACELC is to address his thesis that "Ignoring the consistency/latency trade-off of replicated systems is a major oversight [in CAP], as it is present at all times during system operation, whereas CAP is only relevant in the arguably rare case of a network partition." The PACELC theorem was proved formally in 2018 in a SIGACT News article.[3]

Database PACELC ratings

The original database PACELC ratings are from Abadi.[1][4] Subsequent updates were contributed by the Wikipedia community.

  • The default versions of Amazon's early (internal) Dynamo, Cassandra, Riak, and Cosmos DB are PA/EL systems: if a partition occurs, they give up consistency for availability, and under normal operation they give up consistency for lower latency.
  • Fully ACID systems such as VoltDB/H-Store, Megastore, MySQL Cluster, and PostgreSQL are PC/EC: they refuse to give up consistency, and will pay the availability and latency costs to achieve it. Bigtable and related systems such as HBase are also PC/EC.
  • Amazon DynamoDB (launched January 2012) is quite different from the early (Amazon internal) Dynamo that was considered for the PACELC paper.[4] DynamoDB follows a strong leader model, where every write is strictly serialized (and conditional writes carry no penalty), and it supports read-after-write consistency. This guarantee does not apply to "Global Tables"[5] across regions. The DynamoDB SDKs use eventually consistent reads by default (improved availability and throughput), but when a consistent read is requested the service will return either a current view of the item or an error.
  • Couchbase provides a range of consistency and availability options during a partition, and likewise a range of latency and consistency options when there is no partition. Unlike most other databases, Couchbase does not have a single API set, nor does it scale or replicate all data services homogeneously. For writes, Couchbase favors consistency over availability, making it formally CP. For reads there is more user-controlled variability, depending on index replication, the desired consistency level, and the type of access (single document lookup vs range scan vs full-text search, etc.). There is further variability with cross-datacenter replication (XDCR), which connects multiple CP clusters through asynchronous replication, and with Couchbase Lite, an embedded database that creates a fully multi-master (with revision tracking) distributed topology.
  • Cosmos DB supports five tunable consistency levels that allow for tradeoffs between C/A during P, and L/C during E. Cosmos DB never violates the specified consistency level, so it's formally CP.
  • MongoDB can be classified as a PA/EC system. In the baseline case, the system guarantees reads and writes to be consistent.
  • PNUTS is a PC/EL system.
  • Hazelcast IMDG, and indeed most in-memory data grids, are implementations of a PA/EC system; Hazelcast can be configured to be EL rather than EC.[6] Concurrency primitives (Lock, AtomicReference, CountDownLatch, etc.) can be either PC/EC or PA/EC.[7]
  • FaunaDB implements Calvin, a transaction protocol created by Dr. Daniel Abadi, the author[1] of the PACELC theorem, and offers users adjustable controls for the LC tradeoff. It is PC/EC for strictly serializable transactions, and EL for serializable reads.

DDBS                    P+A   P+C         E+L        E+C
Aerospike[8]            Yes   paid only   optional   Yes
Bigtable/HBase          -     Yes         -          Yes
Cassandra               Yes   -           Yes[a]     -
Cosmos DB               -     Yes         Yes[b]     -
Couchbase               -     Yes         Yes        Yes
Dynamo                  Yes   -           Yes[a]     -
DynamoDB                -     Yes         Yes        Yes
FaunaDB[10]             -     Yes         Yes        Yes
Hazelcast IMDG[6][7]    Yes   Yes         Yes        Yes
Megastore               -     Yes         -          Yes
MongoDB                 Yes   -           -          Yes
MySQL Cluster           -     Yes         -          Yes
PNUTS                   -     Yes         Yes        -
PostgreSQL              Yes   Yes         Yes        Yes
Riak                    Yes   -           Yes[a]     -
VoltDB/H-Store          -     Yes         -          Yes
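
The two-part labels used in the list above (PA/EL, PC/EC, and so on) can be read mechanically. The short Python sketch below expands a label into words; the describe function and the DEFAULT_RATINGS dictionary are hypothetical names introduced only for illustration, and the dictionary holds just a few of the single-label ratings stated in the list.

```python
# What the second letter of each half of a PACELC label stands for.
PARTITION_CHOICE = {"A": "availability", "C": "consistency"}
ELSE_CHOICE = {"L": "latency", "C": "consistency"}

# A few default ratings taken from the list above (illustrative subset).
DEFAULT_RATINGS = {
    "Dynamo": "PA/EL",
    "MongoDB": "PA/EC",
    "PNUTS": "PC/EL",
    "VoltDB/H-Store": "PC/EC",
}


def describe(rating):
    """Expand a label such as 'PA/EL' into a short description."""
    parts = rating.upper().split("/")
    if len(parts) != 2:
        raise ValueError(f"expected a label like 'PA/EL', got {rating!r}")
    partition_part, else_part = parts
    # Each half is a marker letter (P or E) followed by the choice made.
    if (len(partition_part) != 2 or partition_part[0] != "P"
            or partition_part[1] not in PARTITION_CHOICE):
        raise ValueError(f"bad partition half: {partition_part!r}")
    if (len(else_part) != 2 or else_part[0] != "E"
            or else_part[1] not in ELSE_CHOICE):
        raise ValueError(f"bad else half: {else_part!r}")
    return (f"if partitioned, favors {PARTITION_CHOICE[partition_part[1]]}; "
            f"else, favors {ELSE_CHOICE[else_part[1]]}")


for system, rating in DEFAULT_RATINGS.items():
    print(f"{system}: {describe(rating)}")
```

A single label cannot capture systems with tunable behavior, which is why several rows of the table mark more than one column per half.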

Notes

  1. ^ a b c Dynamo, Cassandra, and Riak have user-adjustable settings to control the LC tradeoff.[4]
  2. ^ Cosmos DB has five selectable consistency levels to control the LC tradeoff.[9]

References

  1. ^ a b c d Abadi, Daniel J. "Consistency Tradeoffs in Modern Distributed Database System Design" (PDF). Yale University.
  2. ^ Abadi, Daniel J. (2010-04-23). "DBMS Musings: Problems with CAP, and Yahoo's little known NoSQL system". Retrieved 2016-09-11.
  3. ^ Golab, Wojciech (2018). "Proving PACELC". ACM SIGACT News. 49 (1): 73–81. doi:10.1145/3197406.3197420. S2CID 3989621.
  4. ^ a b c Abadi, Daniel J.; Murdopo, Arinto (2012-04-17). "Consistency Tradeoffs in Modern Distributed Database System Design". Retrieved 2022-07-18.
  5. ^ "Global tables - multi-Region replication for DynamoDB". AWS Documentation. Retrieved 4 January 2023.
  6. ^ a b Abadi, Daniel (2017-10-08). "DBMS Musings: Hazelcast and the Mythical PA/EC System". DBMS Musings. Retrieved 2017-10-20.
  7. ^ a b "Hazelcast IMDG Reference Manual". docs.hazelcast.org. Retrieved 2020-09-17.
  8. ^ Porter, Kevin (29 March 2023). "Where does aerospike fall in PACELC?". Aerospike Community Forum. Retrieved 30 March 2023.
  9. ^ "Consistency Levels in Azure Cosmos DB". Retrieved 2021-06-21.
  10. ^ Abadi, Daniel (2018-09-21). "DBMS Musings: NewSQL database systems are failing to guarantee consistency, and I blame Spanner". DBMS Musings. Retrieved 2019-02-23.
