MySQL vs MongoDB

When designing a Drupal website one often chooses a relational database to store the data of a website. MySQL is a good example of this. But from the moment a website has a lot of data, with a large amount of fields, a relational database is no longer performant enough. In such a situation it is interesting to choose a NoSQL database, such as MongoDB. But what exactly are the big differences between a relational and a NoSQL database? We would like to explain this for you.

For the development of a Drupal website with lots of fields and data, MongoDB is the ideal choice.
Tom Hollevoet
 / 
Drupal expert at Calibrate
Tom
Database schema

Relational database

When installing a Drupal website, a relational database management system (RDBMS) will be chosen by default. Usually a MySQL database is chosen, but other relational databases available for Drupal are MariaDB, SQLite and PostgreSQL.

The data at one's disposal is stored in different tables. In these different tables the data is then divided over different columns.

Relationships in an RDBMS

In an RDBMS, a table may have a relationship with the second table. For example, a table 'cities' can have a reference to the table 'countries' by means of a foreign key, a key that refers to another table.

Drupal fields

Relational database applied to a Drupal website

When creating fields on a content type, different tables are created in the database. For example, for the content type 'Person', with fields 'field_person_name', 'field_person_email' and 'field_person_country', three tables will be found in the MySQL-database.

In Drupal, however, it is possible to create a relation to another content type, by means of an 'Entity Reference'-field. In the example above, 'Country' is a content type and 'field_person_country' is an entity reference field.

Relational database: Drupal problems

In the backend of Drupal, the data of a certain node will be retrieved by a SELECT query with different JOINs to the different fields and tables. The more JOINs need to be applied to a query, the more time it will take to execute this query. Since a RDBMS is vertically scaled, one will have to increase the resources (CPU/RAM) of the web server to increase the speed and feasibility of these queries.

If a certain node has 500 fields, then 500 JOINs must be executed in the back-end. This can lead to performance problems. The web server must have a larger capacity to execute this query, and the execution time of this query will take more time. In such a situation, a relational database is no longer a good option. A NoSQL database is then a better choice.

NoSQL-database

With a NoSQL database one no longer uses the typical JOINs that occur with an RDBMS. The data will no longer be stored in different tables, but with the NoSQL database the data will be stored in different structures such as documents and collections.

The big advantage of NoSQL is that it can be scaled horizontally. This means that the load can be distributed across the various resources of the web server.

MongoDB logo

MongoDB

MongoDB will store the data in the form of a document structure. A document can be compared to one large JSON-file, where the data is stored using field names and values. In this way, a field name can be changed very quickly and, as with an RDBMS, the table structure does not need to be changed. The data is in one document and is no longer divided over different tables, which is the case with an RDBMS.

Mongo_db_drupal

MongoDB in a Drupal-website

For Drupal 6, 7 and 8 there is a contrib module that offers the possibility to store Drupal data in MongoDB. The contrib module has several sub-modules to implement different functionalities of Drupal in a MongoDB database.

Because of these different submodules the functionality cache, field_storage, session, watchdog, block and/or queue can be stored in a MongoDB-database, and no longer in a RDBMS like MySQL.

They are one or more of the above functionalities that are stored in the MongoDB database. The other functionalities and (standard) tables of Drupal remain stored in the RDBMS (MySQL). This means that the RDBMS will still be used.

In this article we will take a closer look at the field_storage submodule.

Drupal fields 0

Drupal 7 Field Storage in MongoDB

In Drupal 7, the RDBMS (MySQL) contains a 'field_config' table where all field configurations (machine name, field type, cardinality...) are stored. This table also contains a storage_type/storage_module column. By default in a Drupal installation this is set to field_sql_storage, so that the data of these fields is stored in the RDBMS.

As soon as you choose to install the mongo_field_storage contrib submodule, new fields will be created with as storage_type/storage_module value mongodb_field_storage. This setting is installed by default when activating the mongo_field_storage module by equating the Drupal variable field_storage_default with mongodb_field_storage.

mongo db node object

hook_field_storage_X

In the backend of the mongodb_field_storage module there are several 'hook_field_storage_X' functions that are called up by the fields that have mongodb_field_storage as storage_module/storage_type.

The most important implementations are: 

Writing a value to the MongoDB database (mongodb_field_storage_write)
Retrieving a field from the MongoDB database (mongodb_field_storage_load)
Performing an Entity Field Query (mongodb_field_storage_query).

1 document per node

The most important change compared to MySQL is that a node will be stored as one large document and no longer spread over different tables.

The node is still stored with its Node ID and other parameters in the node table in the RDBMS. All data of the fields of the node will be stored in MongoDB. Per node, one document will be created.

The big advantage is that when retrieving a node, only one document has to be retrieved from the MongoDB database. With the standard RDBMS, five hundred different joins would effectively have to be executed for five hundred fields in order to retrieve the data from the different tables.

Revisions and other data

Both the revisions and the current data (at mongodb_field_storage) are stored in the MongoDB database. The different elements that could be found in the 'field_sql_storage' in the different columns of the table (such as language, delta...) will be stored as objects in the document.

MongoDB Field Storage module in praktijk

In practice, the MongoDB Field Storage module will be used in a situation with a lot of fields/data.
There are some remarks to use this sub module.

Problems with Views

The contrib module Views will create standard queries suitable for relational databases. There are some solutions for this:

Solution 1: Search with facets (Search API/Solr)
MongoDB is chosen for a large number of nodes/fields. In such a situation the Solr technology (Search API) is often chosen. This technology will index the nodes with the chosen fields, in order to search the data faster. Views can make use of the data indexed by Solr.

Solution 2: Not all content types 'convert' to MongoDB
As we told before, a choice can always be made not to 'convert' certain fields/content types to MongoDB, but to still use the default storage_type/storage_module as storage (field_sql_storage).

Solution 3: Using EFQ Views
Contribution module EFQ Views will use EntityFieldQuery as query back-end.

Paragraphs/Field Collections 
Since MongoDB is not yet widely used (about five hundred Drupal websites), there are still some bugs in this module.

Calibrate is maintainer of the MongoDB contrib module and we have made sure that MongoDB will work with Paragraphs and Field Collections.

Directly addressing MySQL database (db_query)
There is always a chance that certain (contrib) modules will directly use a MySQL query by means of db_query. A golden tip is to always test all new functionalities when using new contrib modules. This way, any problems can be solved faster.

If you choose to use MongoDB, it is best to install and configure it at the start of your project, so that people can notice the above problems faster.

Summary: MongoDB in Drupal

PRO

  • You can retrieve a node with its different fields in one go. 
  • Horizontal scalability.
  • It is suitable for large data volumes and large amounts of nodes/fields.

CON

  • It does not use relationships (JOINs).
  • It is not compatible with Views by default.
  • Certain (contrib) modules are not directly compatible, modifications are necessary.