NoSQL Databases, in simple terms, are Non-Tabular Databases: they do not store data in related tables with a fixed schema. Every record in a NoSQL Database stands on its own (i.e. it does not have to follow any predefined structure). According to a research study by StudyGrid, NoSQL Databases are gradually gaining ground on SQL Databases. As of now, the market demand for SQL Databases is 20.96% higher than the demand for NoSQL Databases. However, this gap is slowly narrowing.
The 4 key types of NoSQL Databases are as follows:
- Key-Value Pair Databases: In Key-Value Pair Databases, every record has only 2 fields: a Key (which uniquely identifies the record) and a Value. For example, if you want to store product information, you can use the product's Barcode as the Key and the Product Details as the Value.
- Document-Oriented Databases: Document-Oriented Databases are quite similar to Key-Value Pair Databases. They store and retrieve data in the Key-Value format, but the Value is stored as a document. The document is generally in a format such as JSON or XML.
- Column Databases: A Column Database stores data in tables, but organizes it by column rather than by row. Rows are identified by a Key, while the columns are dynamic (they can differ from row to row), and the values of a column are stored contiguously.
- Graph Databases: As the name suggests, Graph Databases store data in a Graph format (i.e. with Nodes and Edges). Nodes generally store Values and Edges store the Relationships between the Nodes.
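To make the four models easier to compare, here is a minimal sketch that mimics each one with plain Python data structures. It is only an illustration and not how any particular database stores data internally; the sample products, the `BOUGHT` relationship, and the `neighbours` helper are made up for this example.

```python
# Key-Value: an opaque Value looked up by a unique Key (Barcode -> Product Details).
key_value_store = {"8901234567890": "Blue T-Shirt, size M"}

# Document: the Value is a structured document (shown here as a JSON-like dict).
document_store = {
    "8901234567890": {"name": "Blue T-Shirt", "size": "M", "tags": ["cotton", "summer"]},
}

# Column: every row key maps to its own set of columns, which may differ per row.
column_store = {
    "row-1": {"name": "Blue T-Shirt", "size": "M"},
    "row-2": {"name": "Coffee Mug", "capacity_ml": 350},
}

# Graph: Nodes hold values, Edges hold the relationships between Nodes.
graph_nodes = {"alice": {"type": "customer"}, "mug": {"type": "product"}}
graph_edges = [("alice", "BOUGHT", "mug")]


def neighbours(node, relation):
    """Return the Nodes reachable from `node` over Edges labelled `relation`."""
    return [dst for src, rel, dst in graph_edges if src == node and rel == relation]
```

In the graph model a relationship lookup is a walk over the Edges, whereas the other three models all start from a Key.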
Working of NoSQL Databases
Records stored in a NoSQL Database are not related to one another i.e. every record is independent. Changes made to a particular record do not affect the value of any other record. This makes NoSQL an alternative to SQL Databases, where data is stored in Tables with a fixed Database Schema. NoSQL Databases aren’t required to follow any predefined Relational Schema i.e. they are Schemaless. This means that the records in a NoSQL Database do not need to share the same structure. Each record can be completely different from the others.
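As a rough illustration of what Schemaless means in practice, the sketch below uses an in-memory Python dictionary as a stand-in for a NoSQL Database. The `put` helper, the `user:` keys, and the field names are hypothetical and chosen only for this example.

```python
# A plain dict stands in for the database: Key -> record.
records = {}


def put(key, record):
    """Store a record under its Key; no schema is checked or enforced."""
    records[key] = record


# Two records in the same "database" with different fields.
put("user:1", {"name": "Asha", "email": "asha@example.com"})
put("user:2", {"name": "Ben", "phones": ["555-0100"], "premium": True})

# Changing one record does not touch any other record.
records["user:1"]["email"] = "asha@new.example.com"

# Readers have to tolerate missing fields, since no structure is guaranteed.
premium_flags = {key: rec.get("premium", False) for key, rec in records.items()}
```

The flexibility comes at a price: because nothing guarantees that a field exists, the reading code has to supply defaults, as the last line does with `get`.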
Many NoSQL Databases build on the Key-Value Pair model to store their data. As mentioned earlier, a Key-Value Pair has 2 fields i.e. a Key and a Value. Depending on your requirements, a NoSQL Database can spread its data across multiple servers, each holding a portion of the data known as a Partition. Now the question arises, how can you pinpoint the location of a specific record in a NoSQL Database?
This is where the Key of a record comes into play. It determines which Partition a Value is stored on. NoSQL Databases use a Hash Function to convert each record’s Key into a number that falls into a fixed range known as the Keyspace, say from 0 to 100. This Hash value and the range are then used to determine where the record should be stored.
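The hashing step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Hash Function of any specific database: SHA-256 is simply a convenient deterministic choice, and `KEYSPACE_MAX` and `key_to_hash` are names invented for this example.

```python
import hashlib

# Hash values fall in 0..100, matching the example Keyspace in the text.
KEYSPACE_MAX = 100


def key_to_hash(key: str) -> int:
    """Map a record's Key to a number inside the Keyspace."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer, then fold it into the Keyspace.
    return int.from_bytes(digest[:8], "big") % (KEYSPACE_MAX + 1)
```

Python's built-in `hash()` is deliberately avoided here because, for strings, it is randomized per process by default, so the same Key could map to a different number after a restart.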
The range of the Keyspace and the number of servers depend on the size of your Database. If your Database is small and doesn’t receive many requests, you can store all your data on a single server, which is then responsible for the entire range. However, if that server becomes overloaded, you can add a second server, which means the range is split in half. Continuing the Keyspace example above, the Keyspace is now divided into 2 parts. The first server is responsible for all the records with a hash from 0 to 50, while the second server is responsible for all the records with a hash from 51 to 100.
This is how a NoSQL Database stores and retrieves data. It solves 2 problems for any individual/organization:
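The split described above can be sketched as a small lookup table from hash ranges to servers. This is a simplified illustration using the 0 to 50 and 51 to 100 ranges from the example; `RangeRouter` and the server names are hypothetical, and production systems may use more elaborate partitioning schemes.

```python
import bisect
import hashlib


def key_to_hash(key: str, keyspace_max: int = 100) -> int:
    """Map a Key to a number in 0..keyspace_max (SHA-256 is an assumed choice)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % (keyspace_max + 1)


class RangeRouter:
    """Routes a hash to the server that owns the range containing it."""

    def __init__(self, upper_bounds, servers):
        # upper_bounds[i] is the highest hash (inclusive) owned by servers[i];
        # the list must be sorted in ascending order.
        self.upper_bounds = upper_bounds
        self.servers = servers

    def server_for_hash(self, hash_value: int) -> str:
        # bisect_left finds the first range whose upper bound is >= hash_value.
        return self.servers[bisect.bisect_left(self.upper_bounds, hash_value)]

    def server_for_key(self, key: str) -> str:
        return self.server_for_hash(key_to_hash(key))


# One server owns the whole Keyspace.
single = RangeRouter([100], ["server-1"])

# After adding a second server, the Keyspace is split in half.
split = RangeRouter([50, 100], ["server-1", "server-2"])
```

With the `split` router, `split.server_for_hash(50)` resolves to `server-1` and `split.server_for_hash(51)` to `server-2`. The same lookup serves both writes and reads, since it depends only on the Key.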
- To determine the location where a new record needs to be stored.
- To determine the exact location where existing data is stored.
In both cases, you calculate the Hash value of the record’s Key and look up which server is responsible for that part of the Keyspace. Furthermore, each Partition in a NoSQL Database is mirrored across several servers. When you create a new record on one of the NoSQL Database servers, the Database creates a mirror image of that record and copies it to the other servers in the background. This automatically gives you a backup in case of Data Loss.
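The mirroring idea can be sketched with two in-memory copies and a queue of writes that still have to be copied. This is a toy model under simplifying assumptions (a single mirror, no network, no failure detection); `ReplicatedPartition` and its methods are hypothetical names, not the API of any real database.

```python
from collections import deque


class ReplicatedPartition:
    """A Partition with one primary copy and one mirror copy."""

    def __init__(self):
        self.primary = {}
        self.mirror = {}
        self.pending = deque()  # writes accepted but not yet copied to the mirror

    def put(self, key, value):
        # The write is accepted by the primary and queued for mirroring.
        self.primary[key] = value
        self.pending.append((key, value))

    def replicate_pending(self):
        # Stands in for the background copy to the mirror server.
        while self.pending:
            key, value = self.pending.popleft()
            self.mirror[key] = value

    def get(self, key):
        # Fall back to the mirror if the primary no longer has the record.
        if key in self.primary:
            return self.primary[key]
        return self.mirror[key]


partition = ReplicatedPartition()
partition.put("order:42", {"total": 19.99})
partition.replicate_pending()

partition.primary.clear()  # simulate losing the primary copy
recovered = partition.get("order:42")
```

Note that a record written but not yet copied would be lost if the primary failed before `replicate_pending` ran, which is the usual trade-off of copying in the background.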
NoSQL Databases are mostly preferred by organizations that want to focus on narrow operational goals while still maintaining a high degree of Data Consistency.
Advantages of NoSQL Databases
Listed below are some of the key advantages of NoSQL Databases over SQL Databases:
- Easily Scalable: NoSQL Databases can scale both Vertically (by adding resources to a server) and Horizontally (by adding more servers), whereas Relational Databases are typically scaled Vertically.
- Flexible Schema: NoSQL does not have any fixed schema which makes it highly adaptive to the data it is storing.
- No Complex Relationships: The records stored in NoSQL Databases are not related to one another. Every record in the database is independent i.e. modifications to one record will not affect another record. This allows you to flexibly scale NoSQL Databases as per the business requirements.
- Easily Handle Large Volume of Data: The flexibility in scaling up the servers in NoSQL Databases allows you to store colossal volumes of data.
- Data Backup and Recovery: Each Partition in a NoSQL Database is mirrored across several servers. The Database automatically creates a mirror image of the data that you store on one of its servers. This does not slow down processing, as the task only takes a few milliseconds. It keeps the copies of the data consistent with each other and allows you to retrieve these mirror images in case of Data Loss.
- Support Multiple Data Formats: As mentioned before, NoSQL Databases are Schemaless i.e. they do not follow any predefined Relational Schema to store and retrieve data. This makes a NoSQL Database capable of storing data in different forms such as Structured, Semi-Structured, and even Unstructured.
Disadvantages of NoSQL Databases
Listed below are some of the key disadvantages of using NoSQL Databases:
- Lack of Reporting Tools: NoSQL does not have enough reporting tools for Performance Testing and Business Intelligence.
- Inadequate for Complex Queries: As NoSQL Databases do not have any specific schema, it becomes difficult to perform complex queries on the data.
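- Limited Support for ACID Properties: Many NoSQL Databases do not fully support ACID (Atomicity, Consistency, Isolation, and Durability) guarantees, particularly for operations that span multiple records, although some (such as MongoDB and DynamoDB) do offer transactions.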
- Compatibility Issues with SQL Queries: NoSQL Databases generally use their own query languages or APIs, which makes them incompatible with SQL (Structured Query Language) queries. As a result, existing SQL queries and tools cannot be reused as-is and have to be rewritten for each Database.
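To illustrate the point about complex queries, the sketch below answers a question that SQL would typically express as a JOIN with a GROUP BY, using only Key lookups over two in-memory collections. The `customers` and `orders` data and the `total_spent_by_customer` function are invented for this example, and real NoSQL Databases differ in the query features they offer.

```python
# Two independent collections, each addressed only by Key.
customers = {"c1": {"name": "Asha"}, "c2": {"name": "Ben"}}
orders = {
    "o1": {"customer_id": "c1", "total": 30},
    "o2": {"customer_id": "c1", "total": 12},
    "o3": {"customer_id": "c2", "total": 7},
}


def total_spent_by_customer():
    """Combine both collections in application code instead of in the database."""
    totals = {}
    # Every order has to be scanned: there is no lookup by customer_id.
    for order in orders.values():
        name = customers[order["customer_id"]]["name"]
        totals[name] = totals.get(name, 0) + order["total"]
    return totals
```

The work that a relational query planner would do is pushed into the application here, which is one reason such queries become harder to write and maintain.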
Popular NoSQL Databases
Listed below are some of the most reliable and widely used NoSQL Databases:
- MongoDB: MongoDB is a Document-Oriented NoSQL Database. It is widely used by individuals/organizations that integrate hundreds of different data sources on a daily basis. Since it is Document-Oriented, MongoDB supports embedded documents and arrays, and can represent complex hierarchical relationships in a single record.
- Elasticsearch: Elasticsearch is a popular NoSQL Database built on Apache Lucene. It provides a real-time, Distributed Search and Analytics Engine that is mainly used for Log Analytics, Security Intelligence, Business Intelligence, and similar use-cases. It is actively used by companies like Udemy, Medium, StackOverflow, etc.
- DynamoDB: DynamoDB is a fully-managed Key-Value Pair NoSQL Database from Amazon Web Services (AWS). It is highly reliable for individuals/organizations working with Online Transaction Processing (OLTP) workloads.
This blog introduces you to NoSQL Databases and how they work. Furthermore, it covers the pros and cons of NoSQL Databases, which can help an organization decide whether to go for a NoSQL Database or a SQL Database.