CAP Theorem & NoSQL Databases
The CAP theorem is also known as Brewer's theorem, after computer scientist Eric Brewer. It was published as the CAP principle in 1999 and presented as a conjecture by Brewer at the Symposium on Principles of Distributed Computing (PODC) in 2000.
What is the CAP Theorem?
The CAP theorem states that it is impossible for a distributed data store to simultaneously provide more than two of the following three guarantees:
➊ Consistency
Every read receives the most recent data written to the data store.
➋ Availability
Every request receives a response, but with no guarantee that it contains the most recent write.
➌ Partition tolerance
The system continues to operate even when an arbitrary number of messages are delayed or dropped between the network nodes.
The CAP theorem is commonly used to group systems into three categories.
CP (Consistency and Partition tolerance)
Systems in this category are often misunderstood as being consistent and partition tolerant but never available. In fact, the category refers to systems that sacrifice availability only when a network partition occurs.
CA (Consistency and Availability)
Systems in this category are consistent and available in the absence of a network partition. Single-node database servers are often considered CA systems because they do not need to deal with partition tolerance.
AP (Availability and Partition tolerance)
Systems under this category are available and partition tolerant but cannot guarantee consistency.
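To make the difference between CP and AP concrete, here is a minimal sketch of a two-node store in Python. Everything in it (the `Node` class, the `"CP"`/`"AP"` mode labels, `set_partition`) is a hypothetical toy for illustration, not how any real database is implemented. While the network is healthy both modes behave the same; once the link is cut, the CP nodes refuse requests and the AP nodes keep answering, possibly with stale data.

```python
class Unavailable(Exception):
    """Raised when a node refuses a request to stay consistent."""


class Node:
    def __init__(self, name, mode):
        self.name = name
        self.mode = mode            # "CP" or "AP" (hypothetical labels)
        self.value = None
        self.peer = None
        self.partitioned = False    # True while the link to the peer is down

    def _check(self):
        # A CP node gives up availability, but only during a partition.
        if self.partitioned and self.mode == "CP":
            raise Unavailable(f"{self.name}: cannot reach peer")

    def write(self, value):
        self._check()
        self.value = value
        if not self.partitioned:
            # Replicate synchronously while the network is healthy.
            self.peer.value = value

    def read(self):
        self._check()
        return self.value


def make_cluster(mode):
    a, b = Node("a", mode), Node("b", mode)
    a.peer, b.peer = b, a
    return a, b


def set_partition(a, b, down):
    a.partitioned = b.partitioned = down


def attempt(action):
    """Run a request and report 'unavailable' instead of raising."""
    try:
        return action()
    except Unavailable:
        return "unavailable"


def demo(mode):
    a, b = make_cluster(mode)
    a.write("v1")
    before = b.read()                      # healthy network: replicated
    set_partition(a, b, True)
    attempt(lambda: a.write("v2"))         # write arrives during the partition
    during = (attempt(a.read), attempt(b.read))
    return before, during


print(demo("CP"))  # ('v1', ('unavailable', 'unavailable'))
print(demo("AP"))  # ('v1', ('v2', 'v1'))
```

In the AP run, node `b` still answers but returns the old value `v1`, which is exactly the consistency that was given up.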
What is NoSQL?
NoSQL refers to non-relational database management systems that don't need a fixed schema and are mainly centered on the concept of a distributed database. NoSQL databases are commonly described as databases that store data in a format other than relational tables.
Compared to SQL databases, many find modeling related data in NoSQL easier because related data doesn't have to be split across tables.
NoSQL data models allow related data to be nested within a single data structure.
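As a rough illustration of that difference, the sketch below uses plain Python lists and dictionaries to stand in for tables and documents. The names (`users`, `orders`, `user_doc`) and the sample data are made up for the example, and no real database is involved.

```python
# Relational style: related data is split across two "tables" (lists of rows)
# and linked by a foreign key (user_id).
users = [{"id": 1, "name": "Ada"}]
orders = [
    {"id": 10, "user_id": 1, "item": "keyboard"},
    {"id": 11, "user_id": 1, "item": "mouse"},
]


def orders_for_user_relational(user_id):
    """Emulate a join by scanning the orders table for matching foreign keys."""
    return [row["item"] for row in orders if row["user_id"] == user_id]


# Document style: the related orders are nested inside the user record itself.
user_doc = {
    "_id": 1,
    "name": "Ada",
    "orders": [{"item": "keyboard"}, {"item": "mouse"}],
}


def orders_for_user_document(doc):
    """No join needed: the related data travels with the document."""
    return [order["item"] for order in doc["orders"]]


print(orders_for_user_relational(1))       # ['keyboard', 'mouse']
print(orders_for_user_document(user_doc))  # ['keyboard', 'mouse']
```

Both lookups return the same items; the nested version just reads them from a single structure instead of combining two.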
Document Databases
examples: MongoDB, CouchDB, OrientDB, RavenDB
Graph Databases
examples: InfiniteGraph, FlockDB, InfoGrid, Neo4j
Column-Based Databases
examples: BigTable, Hypertable, HBase, Cassandra
Key-value Databases
examples: DynamoDB, Redis, Scalaris, Memcached
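The four families above differ mainly in the shape of the data they store. The following sketch mimics each shape with ordinary Python structures. It is only an approximation for intuition: the keys, values, and the `neighbors` helper are invented for the example and do not correspond to the API of any of the databases listed.

```python
# Document: a self-describing, nested record (JSON-like).
document = {"_id": "u1", "name": "Ada", "tags": ["admin", "dev"]}

# Graph: nodes connected by labelled edges (source, label, destination).
edges = [("u1", "FOLLOWS", "u2"), ("u2", "FOLLOWS", "u3")]


def neighbors(node, label):
    """Return the nodes reachable from `node` over edges with this label."""
    return [dst for src, lbl, dst in edges if src == node and lbl == label]


# Column-based: row key -> column family -> column -> value.
wide_column = {
    "u1": {"profile": {"name": "Ada"}, "stats": {"logins": 3}},
}

# Key-value: an opaque value looked up by a single key.
key_value = {"session:42": b"opaque-bytes"}

print(document["tags"])                     # ['admin', 'dev']
print(neighbors("u1", "FOLLOWS"))           # ['u2']
print(wide_column["u1"]["stats"]["logins"]) # 3
print(key_value["session:42"])              # b'opaque-bytes'
```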
NoSQL Databases in CAP Theorem
CP (Consistency and Partition tolerance)
examples: MongoDB, HBase, Redis, etc.
AP (Availability and partition tolerance)
examples: CouchDB, Cassandra, DynamoDB, Riak, etc.
The categorization above follows a very common diagram of databases under the CAP theorem, but this distribution is not entirely correct. For example, categorizing all RDBMS as consistent is problematic, because that assumes all reads and writes go to a single node/server. Other objections to the categorization have been raised as well.
Blockchain technology sacrifices consistency for availability and partition tolerance, but it manages to reach consistency over time through validation among the nodes, which is sometimes described as challenging the CAP theorem.
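One way to picture "consistency over time" is a toy chain where nodes diverge during a partition and later converge by validating each other's chains. The sketch below assumes a simple "adopt the longest valid chain" rule, loosely in the spirit of Bitcoin-style systems. Real blockchains add proof-of-work or another consensus mechanism, and every name here (`Block`, `append`, `is_valid`, `reconcile`) is hypothetical.

```python
import hashlib
from typing import NamedTuple

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block


class Block(NamedTuple):
    prev: str   # hash of the previous block
    data: str
    hash: str


def block_hash(prev, data):
    return hashlib.sha256(f"{prev}:{data}".encode()).hexdigest()


def append(chain, data):
    """Return a new chain with one more block linked to the last one."""
    prev = chain[-1].hash if chain else GENESIS_PREV
    return chain + [Block(prev, data, block_hash(prev, data))]


def is_valid(chain):
    """Check that every block links to its predecessor and hashes correctly."""
    prev = GENESIS_PREV
    for block in chain:
        if block.prev != prev or block.hash != block_hash(block.prev, block.data):
            return False
        prev = block.hash
    return True


def reconcile(local, remote):
    """Adopt the remote chain only if it is valid and strictly longer."""
    if is_valid(remote) and len(remote) > len(local):
        return remote
    return local


shared = append([], "tx0")
# During a partition the two nodes extend the chain independently.
node_a = append(shared, "tx-a1")
node_b = append(append(shared, "tx-b1"), "tx-b2")
print(node_a == node_b)  # False: the nodes are temporarily inconsistent

# After the partition heals, each node validates and compares the other's chain.
node_a = reconcile(node_a, node_b)
node_b = reconcile(node_b, node_a)
print(node_a == node_b)  # True: both converged on the longer valid chain
```

Note that the shorter branch (`tx-a1`) is discarded in this toy model, which is the price paid for converging.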
Also, the CAP theorem is often misunderstood as requiring one of the three guarantees to be abandoned at all times. The real choice is between consistency and availability, and it arises only when a network partition or failure happens. At other times, no trade-off has to be made.
Conclusion
In this blog post, we have looked at the CAP theorem and how NoSQL databases are categorized under it.
I hope you found this article useful!
Thank you,
See you soon 😊