"All data is relational, ergo, you should use a relational database"

Post by Jalmari Ikävalko

Almost weekly, I see an online discussion about the disadvantages of NoSQL in which the above argument gets thrown in at some point. The reasoning goes like this: almost all data eventually forms relationships with other data, so you should use a relational database, and since most of the hot NoSQL databases, like Mongo, aren't relational, you should never use them.

This argument is supported by widely circulated articles such as Why You Should Never Use MongoDB. That article assumes that Mongo is inept at modeling data relationships, such as a single actor appearing in several TV shows and movies.

And it's absolutely wrong.

Now, I'm not going to get into why the data relationship modeling scenario presented in that article is itself terrible, and I'm not going to bother defending Mongo either. There are many perfectly valid reasons to dislike Mongo, including an absolutely atrocious security record. If you aren't aware of it, suffice it to say that for years Mongo shipped accepting connections from anywhere on the internet, with no password set by default, so a default installation would literally allow anyone to connect to and read from your database. But this particular argument, that most data is inherently relational and thus unsuitable for non-relational databases, is terribly misguided. It rests on a fundamental misunderstanding of what it means for a database to be relational.

There's a very specific definition for the term "relational database". It does not refer to a database that is "good at modeling relationships between pieces of data". A relational database is strictly a database that uses the relational model. To make a huge oversimplification, a database using the relational model is a database that groups data into collections of fixed-order named tuples, i.e. tables!
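To make that oversimplification concrete, here is a minimal sketch in plain Python, not tied to any database product. The `Actor` type and the sample values are made up for illustration; the point is only that a "relation" is a set of tuples that all share the same named attributes.

```python
from typing import NamedTuple


class Actor(NamedTuple):
    # Every tuple in the relation has the same named attributes, in the same order.
    actor_id: int
    name: str


# A relation (table) is a set of such tuples: no duplicates, no inherent row order.
actors = {
    Actor(1, "Actor A"),
    Actor(2, "Actor B"),
    Actor(1, "Actor A"),  # identical tuple, so the set keeps only one copy
}

# A basic relational operation: projection onto a single attribute.
names = {a.name for a in actors}
```

Nothing in this structure says anything about how tuples relate to each other; that part is left entirely to whoever designs the schema.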

That is all. There's nothing else to a relational database. Would you say that graph databases such as Neo4j are not suitable for modeling data relationships simply because they are non-relational databases?
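As a rough illustration of relationships modeled without any tables, here is a sketch of the actor example as nodes and labeled edges in plain Python. This is not Neo4j's API or data format; the node ids, the `APPEARS_IN` label and the `appearances` helper are all hypothetical names chosen for the example.

```python
# Nodes keyed by id; all ids and titles are made up for illustration.
nodes = {
    "a1": {"kind": "actor", "name": "Actor A"},
    "s1": {"kind": "show", "title": "Show X"},
    "m1": {"kind": "movie", "title": "Movie Y"},
}

# Relationships are stored directly as (source, label, target) edges.
edges = [
    ("a1", "APPEARS_IN", "s1"),
    ("a1", "APPEARS_IN", "m1"),
]


def appearances(actor_id):
    """Return the sorted titles of everything the actor is linked to via APPEARS_IN."""
    return sorted(
        nodes[dst]["title"]
        for src, label, dst in edges
        if src == actor_id and label == "APPEARS_IN"
    )
```

Here the relationship is the primary piece of data rather than something reconstructed from matching key columns, and there is not a table in sight.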

Touting relational databases as the only solution to modeling data relationships makes no more sense than claiming that only linked lists should ever be used for data structures. Actually, I'd go so far as to claim that relational databases such as MySQL are not even particularly amazing at modeling data relationships. You don't get unique keys by default. You have to define all relationships manually, by first creating that key for your tables and then referring to it yourself in all the other tables. Referential integrity is only enforced where you declare the constraints yourself. Now, don't misunderstand this as me saying that MySQL is unsuitable for modeling relationships; it has many benefits that Mongo, for example, lacks, such as multi-transaction consistency, which is quite relevant in modeling data relationships.
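The manual work described above can be sketched as follows. This uses SQLite through Python's standard `sqlite3` module as a stand-in, purely because it runs without a server; it is not MySQL, and the details differ between products. The table and column names are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this is switched on for the connection.
conn.execute("PRAGMA foreign_keys = ON")

# The key has to be declared explicitly...
conn.execute("CREATE TABLE actor (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
# ...and every other table has to refer to it explicitly.
conn.execute(
    """
    CREATE TABLE appearance (
        actor_id INTEGER NOT NULL REFERENCES actor(id),
        show_title TEXT NOT NULL
    )
    """
)

conn.execute("INSERT INTO actor (id, name) VALUES (1, 'Actor A')")
conn.execute("INSERT INTO appearance (actor_id, show_title) VALUES (1, 'Show X')")

# With the constraint declared and enforcement on, a dangling reference is rejected.
try:
    conn.execute("INSERT INTO appearance (actor_id, show_title) VALUES (99, 'Show Y')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Leave out the `REFERENCES` clause or the pragma and the dangling row is accepted without complaint, which is the sense in which the relationship is something you build, not something the table format gives you.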

Generally speaking, the reason relational databases are pretty awesome lies mostly in the fixed structure of their data. The problems of querying and updating that data are very well understood, and the solutions are by now extremely optimized. Writing fast and complex JOINs and so on is a breeze (given that you're familiar with SQL and certain best practices), and you get great consistency and reliability guarantees.
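For a feel of what such a JOIN looks like, here is a small sketch of the actor example, again using SQLite via Python's `sqlite3` module as a convenient stand-in. The schema (`actor`, `tv_show`, `appearance`) and the sample rows are assumptions invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE actor (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE tv_show (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Many-to-many link table: one row per (actor, show) pair.
    CREATE TABLE appearance (
        actor_id INTEGER NOT NULL REFERENCES actor(id),
        show_id INTEGER NOT NULL REFERENCES tv_show(id),
        PRIMARY KEY (actor_id, show_id)
    );
    INSERT INTO actor VALUES (1, 'Actor A'), (2, 'Actor B');
    INSERT INTO tv_show VALUES (10, 'Show X'), (11, 'Show Y');
    INSERT INTO appearance VALUES (1, 10), (1, 11), (2, 10);
    """
)

# Resolve the link table back into readable (actor, show) pairs.
rows = conn.execute(
    """
    SELECT actor.name, tv_show.title
    FROM appearance
    JOIN actor ON actor.id = appearance.actor_id
    JOIN tv_show ON tv_show.id = appearance.show_id
    ORDER BY actor.name, tv_show.title
    """
).fetchall()
```

The query is short because every row has a known, fixed shape, which is exactly the strength described above.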

But this does not automatically translate to a massive advantage in modeling data relationships.

So, go ahead and dislike Mongo and dislike NoSQL databases all you want, but please stop using this brain-dead argument that data with relationships requires a relational database. It's not true, it rests on a terrible misunderstanding, and it's not helping anyone make better choices or tools in the future.