NoSQL databases are nothing new. The concept of storing data in non-relational document or key-value formats has been around for decades. However, it is only recently that they have become big contenders in the database game, mainly thanks to the falling cost of storage devices.
You see, back in the day (the 1960s and 1970s), storage was prohibitively expensive. You couldn't just store a billion records of data on a small 1-inch drive like you can today. Developers had to come up with data storage mechanisms that were optimized for storage space and that could be used on a wide variety of servers for a wide variety of business needs.
This is why relational databases (such as SQL Server and MySQL) became a popular choice. Thanks to database normalization principles, developers were able to reduce duplicate data and build relationships between the remaining data in order to reduce file sizes and costs.
Jump to the 2000s, though, and things start to change.
What is NoSQL?
NoSQL does not necessarily imply the absence of SQL (Structured Query Language). It mainly implies being "different" from your typical SQL-based relational database. The main difference between the two paradigms is that NoSQL does not rely on a static, predetermined relational schema to model data. Instead, it relies on something else, such as JSON-style documents or key-value pairs, to store and retrieve data.
NoSQL databases use a variety of methods to store data, including documents, key-value pairs, graphs and wide-column tables. I will focus on JSON-based databases as a baseline for the examples moving forward.
Take a typical user database schema, for example. In a standard RDBMS, you would end up with a table that resembles the following.
fname | lname | email
bob | smith | mail@mail.com
Any other related data might or might not live in a separate table, in which case you will need to create relationships between this table and all of those tables. This includes defining primary keys, foreign keys and candidate keys.
That same data in a NoSQL database could be modeled in a JSON object like the following:
{"fname": "bob", "lname": "smith", "email": "mail@mail.com"}
You are essentially replacing the overhead that comes with maintaining a relational model (the table) with a simple, ready-to-use object.
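To make that contrast a little more concrete, here is a minimal Python sketch. It uses plain lists and dicts as stand-ins for tables and documents rather than a real database, and the `addresses` data, the `id`/`user_id` keys and the `load_user_relational` helper are all hypothetical additions for illustration.

```python
import json

# Relational style: related data lives in separate "tables" linked by keys.
# The address data and the id/user_id columns are hypothetical.
users = [{"id": 1, "fname": "bob", "lname": "smith", "email": "mail@mail.com"}]
addresses = [{"user_id": 1, "city": "Springfield"}]


def load_user_relational(user_id):
    """Rebuild a complete user by joining the two tables on the foreign key."""
    user = next(row for row in users if row["id"] == user_id)
    user_addresses = [
        {"city": a["city"]} for a in addresses if a["user_id"] == user_id
    ]
    return {
        "fname": user["fname"],
        "lname": user["lname"],
        "email": user["email"],
        "addresses": user_addresses,
    }


# Document style: the same data stored as one self-contained JSON document.
user_document = json.loads(
    '{"fname": "bob", "lname": "smith", "email": "mail@mail.com",'
    ' "addresses": [{"city": "Springfield"}]}'
)
```

Both paths end up with the same object. The relational path needs join logic to get there, while the document is usable as soon as it is parsed.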
JSON is only one style of NoSQL storage, mind you. There are currently four popular styles used by various vendors, and each has its own implementations and use cases.
4 types of data storage
The first type of NoSQL storage is the one that I discussed above: the document-based approach.
Document - Document databases store data in documents similar to JSON objects found in JavaScript. Values can be of a variety of types (strings, numbers, booleans, arrays, objects), like those found in most programming languages. MongoDB is a popular example of a document-based database.
Key-value - Key-value databases store data as a collection of key and value pairs in which each value can only be retrieved using its respective key. These types of databases are ideal for storing large amounts of data, particularly if you are not concerned with complex querying logic. A few example use cases are storing user preferences and caching data. Redis and DynamoDB are good examples of key-value databases.
Wide-column - Wide-column databases store data in tables, rows and dynamic columns. This means that each row is not required to have the same columns, as it would be in a relational database model. You can think of these as two-dimensional key-value databases. Use cases include storing user profile data as well as IoT data. A few examples of wide-column databases are Cassandra and HBase.
Graph - Graph databases store data in nodes and edges. Nodes store information about people, places and things, while edges store information about the relationships between these nodes. These types of databases are ideal when you need to traverse relationships to look for patterns, such as in social networks and recommendation feeds. A few examples include Neo4j and JanusGraph.
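The four styles above can be roughly sketched with ordinary Python data structures. This is only an illustration of the shape of the data in each model, not of how any of the named databases work internally, and every key, value and helper function below is made up for the example.

```python
# Document: one nested, self-describing record per entity.
document_store = {
    "user:1": {"fname": "bob", "lname": "smith", "tags": ["admin", "beta"]},
}

# Key-value: values are looked up only by their key.
key_value_store = {"session:abc123": "user:1", "theme:user:1": "dark"}

# Wide-column: each row key maps to its own set of columns.
wide_column_store = {
    "sensor-1": {"temp": 21.5, "humidity": 40},
    "sensor-2": {"temp": 19.0},  # this row simply has no humidity column
}

# Graph: nodes, plus edges that describe relationships between nodes.
nodes = {"bob": {}, "alice": {}, "carol": {}}
edges = [("bob", "follows", "alice"), ("alice", "follows", "carol")]


def followed_by(person):
    """Return everyone that `person` follows directly (one hop)."""
    return [dst for src, rel, dst in edges if src == person and rel == "follows"]


def friends_of_friends(person):
    """Two-hop traversal: who is followed by the people `person` follows?"""
    found = {fof for friend in followed_by(person) for fof in followed_by(friend)}
    return sorted(found - {person})
```

The two-hop traversal is the kind of relationship query that graph databases are designed around, while the other three structures are all variations on looking something up by a key.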
A few benefits
So when does it make sense to switch from a traditional relational model to a NoSQL model? Well, as it turns out, it might be more often than you think. The biggest benefit that NoSQL databases offer is their highly flexible and dynamic nature. With traditional tabular models, you are locked in to a schema early on in the development phase. This can either lock you into rigid coding practices or make changes very expensive later on in the software life cycle.
By loosely allowing structured, semi-structured and unstructured data to be collected, you can reduce development time while at the same time handing database control back to the developers as opposed to a database administrator.
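As a rough illustration of that flexibility, the sketch below uses a plain Python list as a stand-in for a document collection. The second user, the `phone` and `prefs` fields and the `phone_or_default` helper are hypothetical and only there to show records with different shapes living side by side.

```python
# A stand-in for a document collection: no schema is declared anywhere.
users = []

# The first record uses the original three fields.
users.append({"fname": "bob", "lname": "smith", "email": "mail@mail.com"})

# A later requirement adds a phone number and preferences. The new record
# simply carries the extra fields; existing records are left untouched.
users.append({
    "fname": "alice",
    "lname": "jones",
    "email": "alice@mail.com",
    "phone": "555-0100",
    "prefs": {"newsletter": True},
})


def phone_or_default(user, default="n/a"):
    """Read an optional field, falling back when a record lacks it."""
    return user.get("phone", default)


phones = [phone_or_default(u) for u in users]
```

The trade-off is that the code reading the data now has to cope with fields that may be missing, since no schema guarantees they are there.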
Because NoSQL databases are essentially just large collections of document objects, you also have the benefit of scaling outwards, that is, spreading the data across more servers, almost indefinitely. What does that mean exactly? Well, it means that if your database unexpectedly requires an increase in storage space, you can usually add capacity without causing downtime on your servers, which is harder to achieve with a traditional database running on a single machine.
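One common way to spread records across servers is to hash each record's key and use the result to pick a server. The sketch below shows that idea in its simplest form. It is an assumption about how scaling out could work, not a description of any particular database, and the `shard_for` and `distribute` helpers and the `user:N` keys are hypothetical.

```python
import hashlib


def shard_for(key, shard_count):
    """Pick a shard index for a key from a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


def distribute(keys, shard_count):
    """Group keys into one list per shard."""
    shards = [[] for _ in range(shard_count)]
    for key in keys:
        shards[shard_for(key, shard_count)].append(key)
    return shards


keys = [f"user:{i}" for i in range(1000)]
two_servers = distribute(keys, 2)
four_servers = distribute(keys, 4)
```

Because each record can be placed on its own, going from two servers to four is a matter of redistributing keys. Note that this naive modulo scheme moves many keys when the server count changes, so real systems typically use more elaborate partitioning.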
And lastly, because the storage format is already programmer friendly (JSON), a developer does not have to spend extra time building out data access layers to convert from a typical database schema to a usable object.
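Here is a small sketch of what that conversion step looks like in each case, using Python's standard `json` module. The `COLUMNS` tuple, the `row` value and the `row_to_user` helper are hypothetical stand-ins for what a relational driver and a hand-written data access layer might provide.

```python
import json

# Relational style: a row arrives as a bare tuple, so mapping code has to
# know the column order and attach names to the values.
COLUMNS = ("fname", "lname", "email")
row = ("bob", "smith", "mail@mail.com")


def row_to_user(row):
    """A tiny data access layer: turn a positional row into a named object."""
    return dict(zip(COLUMNS, row))


# Document style: the stored JSON text parses directly into a usable object.
stored = '{"fname": "bob", "lname": "smith", "email": "mail@mail.com"}'
user = json.loads(stored)
```

With a single flat table the mapping code is trivial, but it grows with every joined table, whereas the document path stays a single parse call.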
All in all, NoSQL databases these days are ideal for rapid, agile development environments with constantly changing data requirements that do not necessarily need complex querying logic.
Try it out
For anyone interested in playing around with a NoSQL database, I recommend starting out with MongoDB, as it offers a free tier to get started and initial setup takes only a few minutes.