This article summarizes two terms, CAP and ACID of data storage, which are commonly mentioned in system design (see the full list of concepts in System Design Introduction For Interview).
ACID is the abbreviation of Atomicity, Consistency, Isolation and Durability. All these properties characterize a transaction, a logical unit grouping several reads and writes.
- Atomicity (sometimes called abortability): All operations in a transaction are either committed or aborted. The partial commit is not allowed.
- Consistency: Certain invariant statements about the data. It is more an application property.
- Isolation: Each transaction virtually views it as the only transaction running on the entire database. Globally, all transactions are viewed as serial given the context that multiple clients can concurrent access the same data.
- Durability: The data committed will not be forgotten regardless of hardware or software fault. This is usually implemented with non-volatile storage, write-ahead log, and replicated database.
Systems not meeting the ACID criteria are called BASE (Basically available, soft state and eventually consistency), which is a vague concept.
In practice, there is no one technique that can provide absolute guarantees. Not all Databases honor ACID properties. Most traditional relational DBs provide the transaction semantics, BEGIN TRANSACTION, …, COMMIT, but NoSQL DBs forfeit ACID for performance and scalability in late 2000s. Recently NoSQL DBs (MongoDB and DynamoDB) began to support transactions because the designer has noticed the pain to maintain two types of DBs for high-volume, zero-data-loss data use cases.
The complete isolation greatly hurt the performance. Writes blocking reads are common. There are two eaker isolation levels with better performance.
Read committed: the most basic level of transaction
- When reading from the database, you will only see data see data that has been committed.
- When writing to the database, you will only overwrite data that has been committed.
It can be implemented by requiring write and read operations to access a lock first. However, for read operations, a write lock may block multiple reads. A better design is to record both the old value and the value to be committed, and serve the read operations the old value when a write is undergoing.
The read committed still suffer from read skew. “Design Data-intensive Applications” Chapter 7 Figure 7–6 gives a good example of account invariant violation given read committed. Other common cases are read skews during longtime backups, analytic queries and integrity checks.
Snapshot isolation and repeatable read
Snapshot isolation and repeatable read solves the read skew problems. It built snapshots of all committed transactions at timepoints. Regarding implementation
- for write, a transaction needs to access a lock for write.
- for read, no lock is required.
- the database stores multiple committed versions at certain time points, identified by a transaction ID, and only serves the latest snapshot when there is no in-progress transaction before that time point.
CAP is the abbreviation of Consistency, Availability and Partition tolerance in the context of networked shared-data systems.
- Consistency: Every read receives the most recent write or an error
- Availability: Every request receives a (non-error) response, without the guarantee that it contains the most recent write
- Partition tolerance: The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network nodes
The theorem is any network shared-data systems can achieve at most 2 of 3 properties. As in the distributed network environment the partition tolerance is inevitable, designers have to choose between consistency and availability. The definition is simple, but there are many subtle issues behind the theorem.
The paper “CAP Twelve Years Later” thinks CAP theorem is misleading in practice, because in reality the trade-off of Consistency and Availability is in a continuous range between 0 to 100% instead of binary, and the trade-off can happen at very finer granularity such as operation level.
When to choose A and how to implement it?
The availability is favored when scalability of the data storage is required while the loss of data integrity is allowed. Most large-scale non-critical web services are in the case. For high availability systems, eventual consistency is usually guaranteed, which means the inconvergence of two nodes will be resolved at a certain future timepoint, without a deadline. This is a liveness but not a safety guarantee.
When to choose C and how to implement it?
Consistency is favored in critical situations like financial services. It is a severe event when the bank accounts do not align after a wired transfer because of inconsistency. There are three levels of consistency that are stronger than eventual consistency.
- Linearizability (atomic consistency, immediate consistency): the strongest level that guarantees the value read is the most recent. If concurrency writes and reads happen, it is guaranteed once a client’s read returns the new value, all subsequent reads will also return the new value. It can be implemented by Consensus Algorithms, e.g. ZooKeeper or etcd.
- Ordering Guarantees: guarantees the causality of the reads and writes. Compared to the total order linearizability, the ordering guarantee is a partial order.
- Sequence number ordering: a weaker guarantee that only guarantees the data storage is consistent with the timeline that a certain read happens after a certain write. It can be implemented by a sequence number or logical timestamp.
Consistency comes with cost. The stronger the consistency model is, the worse the performance is.
See more details in “Design Data-intensive Applications” Chapter 9.