A quick, five-minute introduction to NoSQL
You are here because you want to know about NoSQL, and you are in the right place.
This technical article will be easier to follow if you are familiar with basic database concepts.
Table of Contents:
- What is NoSQL?
- What led us to this NoSQL era?
- Who uses NoSQL?
- Why should anyone opt for NoSQL instead of RDBMS?
- Key advantages of NoSQL
- Categories of NoSQL
- How to Query NoSQL?
- Are there any Challenges in using NoSQL?
- So, what’s in store for NoSQL?
What is NoSQL?
NoSQL databases focus on scalability, fast response times, and availability. In exchange, they typically relax the strict atomicity and consistency guarantees that relational databases provide.
This tradeoff is formalized via the CAP Theorem.
Brewer’s CAP Theorem
CAP stands for Consistency, Availability and Partition tolerance.
These are the three core system requirements that have to be weighed against each other when designing and deploying applications in a distributed system.
Consistency: all users see the same data at the same time, so a set of operations appears to occur all at once.
Availability: every operation must receive a response within an acceptable time.
Partition tolerance: the system must keep operating even if a component or node is down, or the network link between nodes is broken.
Brewer’s CAP Theorem states that a distributed system can guarantee at most two of these three properties at the same time.
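To make the tradeoff concrete, here is a minimal sketch of a toy two-replica store. It is not how any real NoSQL engine is built; the class names, the `mode` values "CP" and "AP", and the `partitioned` flag are all assumptions made up for this illustration. In "CP" mode the store refuses writes while the replicas cannot talk to each other (giving up availability), and in "AP" mode it accepts them locally (giving up consistency).

```python
class Unavailable(Exception):
    """Raised when the CP store refuses a request during a partition."""


class TwoNodeStore:
    """Toy store with two replicas; mode is "CP" or "AP" (hypothetical names)."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = [{}, {}]   # one dict per replica
        self.partitioned = False   # True when the replicas cannot communicate

    def write(self, node, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # Stay consistent: refuse the write, sacrificing availability.
                raise Unavailable("cannot reach the other replica")
            # Stay available: accept the write locally, so replicas may diverge.
            self.replicas[node][key] = value
            return
        # No partition: copy the write to both replicas.
        for replica in self.replicas:
            replica[key] = value

    def read(self, node, key):
        return self.replicas[node].get(key)


ap = TwoNodeStore("AP")
ap.write(0, "x", 1)
ap.partitioned = True
ap.write(0, "x", 2)                      # accepted by node 0 only
print(ap.read(0, "x"), ap.read(1, "x"))  # the two replicas now disagree
```

Running the same steps against `TwoNodeStore("CP")` raises `Unavailable` on the second write instead, which is the other side of the same tradeoff.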
The prime benefits of NoSQL are performance and scalability.
By moving the responsibility for access control, correlating related data, conflict resolution and maintaining integrity constraints to a programmatic layer separate from the database, NoSQL engines are able to achieve exceptional performance and scalability.
Some of the key features of NoSQL are horizontal scalability, a non-relational data model, distributed data, and redundant storage [hence the failure of a single server has little impact on the system].
Many NoSQL databases are also open source.
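What does it look like when correlating related data and maintaining integrity constraints move into a programmatic layer? The sketch below uses two plain Python dictionaries as stand-ins for key-value collections. The collection names and the functions `add_user`, `add_order` and `orders_with_user_names` are hypothetical, chosen only for this example; a real application would call its database client instead.

```python
# Two independent key-value "collections". The store itself knows nothing
# about how they relate; every name here is made up for illustration.
users = {}
orders = {}


def add_user(user_id, name):
    users[user_id] = {"name": name}


def add_order(order_id, user_id, total):
    # Integrity constraint enforced by the application, not the database:
    # an order must reference an existing user.
    if user_id not in users:
        raise KeyError(f"unknown user {user_id!r}")
    orders[order_id] = {"user_id": user_id, "total": total}


def orders_with_user_names():
    # Application-side "join": correlate each order with its user record.
    return [
        {"order_id": oid, "name": users[o["user_id"]]["name"], "total": o["total"]}
        for oid, o in sorted(orders.items())
    ]


add_user("u1", "Ada")
add_order("o1", "u1", 30)
print(orders_with_user_names())
```

The database stays simple and fast because it only stores and fetches by key. The price is that checks an RDBMS would do for you with foreign keys now live in your own code.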
What led us to this NoSQL era?
There are many reasons for this. Here are a few.
- Explosion of social media sites (like Facebook, Twitter) with large data needs
- Rise of cloud-based solutions such as Amazon S3 (Simple Storage Service)
- A shift to dynamically typed data with frequent schema changes, much like the move to dynamically typed languages (Ruby/Groovy)
- Open-source community
Who uses NoSQL?
Here are a few big names.
- Amazon Dynamo
- Google BigTable
- Facebook Cassandra (50 TB for Inbox search)
- LinkedIn Voldemort
- Twitter FlockDB
Why should anyone opt for NoSQL instead of RDBMS?
Not all applications need relationships between data to be explicitly stored along with the data. When you don’t need this fundamental feature of an RDBMS, it is time to look for alternatives.
Over time, RDBMS-based, data-intensive applications have struggled with:
- Indexing a large number of documents
- Serving pages on high-traffic websites
- Delivering streaming media
This kind of performance degradation can often be avoided by using NoSQL.
Quick differences between RDBMS and NoSQL
RDBMS | NoSQL |
Typical RDBMS implementations are tuned for small but frequent read/write transactions and large batch transactions with rare write access. | NoSQL can service heavy read/write workloads |
Meant for structured data | Can handle structured and unstructured data |
Strong consistency | Eventual consistency |
Big datasets | Huge datasets |
Scaling is possible | Scaling is easy |
Good availability | Very high availability |
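The "eventual consistency" entry deserves a quick illustration. The sketch below assumes a last-write-wins rule with explicit logical timestamps, which is only one of several conflict-resolution strategies; the names `LwwReplica` and `sync` are invented for this example and do not come from any particular database.

```python
class LwwReplica:
    """Stores key -> (timestamp, value); illustrative only."""

    def __init__(self):
        self.store = {}

    def write(self, key, value, timestamp):
        current = self.store.get(key)
        # Last-write-wins: keep whichever entry has the newer timestamp.
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def read(self, key):
        entry = self.store.get(key)
        return None if entry is None else entry[1]


def sync(a, b):
    """Exchange entries in both directions so the replicas converge."""
    for key, (ts, value) in list(a.store.items()):
        b.write(key, value, ts)
    for key, (ts, value) in list(b.store.items()):
        a.write(key, value, ts)


a, b = LwwReplica(), LwwReplica()
a.write("cart", "book", timestamp=1)   # write lands on replica a
b.write("cart", "pen", timestamp=2)    # a later write lands on replica b
before = (a.read("cart"), b.read("cart"))  # replicas disagree for a while
sync(a, b)
after = (a.read("cart"), b.read("cart"))   # both now hold the newer value
print(before, after)
```

Between the two writes and the call to `sync`, a reader can see different answers depending on which replica it asks. That window is what "eventual" means, and it is the price paid for the higher availability in the last row of the table.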
Key advantages of NoSQL
- Improves performance by eliminating the overhead associated with SQL-based databases
- Huge data storage capacity (larger than the biggest RDBMSs can handle and store)
- Less effort needed to maintain the data
- Often much faster and cheaper than an RDBMS for these workloads
- Flexible data models
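As a small illustration of the last point, here is a sketch of a document-style collection in which every record has its own shape. The `products` list and the `find` helper are hypothetical stand-ins for a real document store and its query API.

```python
# One "collection" holding documents with different shapes.
# In an RDBMS each of these would need its own columns or tables.
products = [
    {"id": 1, "name": "Laptop", "specs": {"ram_gb": 16}},
    {"id": 2, "name": "T-shirt", "sizes": ["S", "M", "L"]},
    {"id": 3, "name": "E-book", "pages": 320},
]


def find(collection, field):
    """Return the documents that contain `field`, whatever else they hold."""
    return [doc for doc in collection if field in doc]


# Adding a new attribute needs no schema migration: just store it.
products[0]["warranty_years"] = 2

print([doc["name"] for doc in find(products, "sizes")])
print([doc["name"] for doc in find(products, "warranty_years")])
```

The flip side is that the application has to cope with fields that may be missing, which is why `find` checks for the field instead of assuming it is there.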