MongoDB experts model the move from a relational database to MongoDB

By Andy Oram
April 8, 2010 | Comments: 4

A new O'Reilly book, MongoDB: The Definitive Guide, has just gone up in a pre-publication (RoughCut) version. Because I've been reporting on NoSQL events recently, I was invited to interview the book's authors, Michael Dirolf and Kristina Chodorow of 10gen, the company that makes the open source MongoDB.

Next week starts the annual MySQL conference, from which I'll be blogging, so I decided to spice up discussion a bit by asking Kristina and Mike about one of 10gen's pet topics, "from MySQL to MongoDB." This appeals to people who find MySQL too slow or a hassle to manage (even though it's fast and easy to manage compared to most relational databases)--people who want to move an existing project to MongoDB or just start a new one while shedding their old relational thinking.

First a bit of overview: MongoDB is a document store in the (not very hoary) tradition of CouchDB. Even among the category of projects loosely grouped together under the NoSQL umbrella, MongoDB is a fairly young entrant.

MongoDB is growing quickly in popularity because it offers a relatively rich range of features, while (according to its supporters) maintaining impressive speed. The features include built-in indexes (and secondary indexes), range queries, support for replication, and auto-sharding. A Map/Reduce function allows you to add to the aggregate functions natively supported and do large-scale jobs like nightly reports.

The main relational features missing from MongoDB are joins, foreign key constraints, and multi-row transactions.

Because of the particular combination of features supported by MongoDB, the advice in this blog might not apply to other NoSQL solutions.

Kristina and Mike said the migration of an existing project from a relational database goes through four overarching steps. Which do you think is the step that requires the most time and thinking?

  1. Get to know MongoDB. Download it, read the tutorials, try some toy projects.
  2. Think about how to represent your model in its document store.
  3. Migrate the data from the database to MongoDB, probably simply by writing a bunch of SELECT * FROM statements against the database and then loading the data into your MongoDB model using the language of your choice.
  4. Rewrite your application code to query MongoDB through statements such as insert() or find().

OK, so which step do you think takes the longest? And the answer is...step 2. Design is critical, and there are trade-offs that provide no simple answers but require a careful understanding of your application. Migrating the data and rewriting the application are straightforward by comparison.
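
To make step 3 concrete, here is a minimal sketch in Python of the "SELECT * FROM, then reshape" idea. Everything in it is an assumption made for illustration: an in-memory SQLite database stands in for the relational source, the users table and its columns are invented, and rows_to_documents is a hypothetical helper. The final load into MongoDB is left out because it needs a running server and a driver; each resulting dict is the kind of value you would hand to the driver's insert call.

```python
import sqlite3

# Stand-in relational source: an in-memory SQLite database with an
# invented "users" table (hypothetical schema, for illustration only).
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can be read by column name
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
    INSERT INTO users VALUES (1, 'alice', 'alice@example.com');
    INSERT INTO users VALUES (2, 'bob', 'bob@example.com');
""")


def rows_to_documents(connection, table):
    """Run SELECT * against one table and return each row as a plain dict."""
    # The table name is interpolated directly, so pass only trusted,
    # hard-coded names here.
    cursor = connection.execute(f"SELECT * FROM {table}")
    return [dict(row) for row in cursor]


user_docs = rows_to_documents(conn, "users")
for doc in user_docs:
    print(doc)
```

A straight one-row-to-one-document copy like this is only the starting point; the design decisions in step 2 determine how rows from several tables get combined.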

Although MongoDB supports arbitrarily large and complex data structures (basically JSON, but in a binary format called BSON), Kristina and Mike say you'd do best to create separate collections for different types of data, just as you'd put them in different tables if you were using a relational database. For instance, in a classic social networking application, you would probably put all information about your users in one collection and all information about their postings in another.

MongoDB documents aren't divided up quite as much as relational databases in third normal form. If you are likely to use a data item in conjunction with a more major item--not on its own--you should probably embed the minor item with the major one. For instance, a relational database for a social networking application would probably have a separate table of tags, which would be linked through foreign keys to the table of postings. But in MongoDB, you'd just embed an array of tags with each posting. Yes, that's redundant. Your budget can handle it.
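
As a rough sketch of that embedding, the Python below folds three normalized tables into posting documents that each carry their own array of tags. The table contents, the field names other than MongoDB's "_id" key, and the embed_tags helper are all hypothetical; plain lists and dicts stand in for the database rows and the BSON documents.

```python
# Relational shape: three invented tables, shown as lists of rows.
postings = [(1, "alice", "Hello MongoDB"), (2, "bob", "Sharding notes")]
tags = [(10, "nosql"), (11, "mongodb"), (12, "scaling")]
posting_tags = [(1, 10), (1, 11), (2, 11), (2, 12)]  # (posting_id, tag_id)


def embed_tags(postings, tags, posting_tags):
    """Build one document per posting, with its tag names embedded."""
    tag_names = dict(tags)  # tag_id -> tag name
    docs = []
    for posting_id, author, title in postings:
        # Resolve the foreign keys once, at migration time.
        doc_tags = [tag_names[tag_id]
                    for pid, tag_id in posting_tags if pid == posting_id]
        docs.append({"_id": posting_id, "author": author,
                     "title": title, "tags": doc_tags})
    return docs


posting_docs = embed_tags(postings, tags, posting_tags)
for doc in posting_docs:
    print(doc)
```

The tag "mongodb" now appears in both documents, which is the redundancy the paragraph above accepts in exchange for reading a posting and its tags in one lookup.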

And querying by tag is still quick and easy. MongoDB has multi-key indexes, so you can index an array of tags and quickly look for all postings containing a particular tag.
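
One way to picture a multi-key index is as a lookup table with one entry per array element rather than one per document. The Python below is only a toy model of that idea, not a description of how MongoDB stores its indexes; the sample documents and the build_tag_index helper are invented for illustration.

```python
from collections import defaultdict

# Hypothetical posting documents with embedded tag arrays.
posting_docs = [
    {"_id": 1, "title": "Hello MongoDB", "tags": ["nosql", "mongodb"]},
    {"_id": 2, "title": "Sharding notes", "tags": ["mongodb", "scaling"]},
]


def build_tag_index(docs):
    """Map each tag to the ids of the documents whose array contains it."""
    index = defaultdict(list)
    for doc in docs:
        for tag in doc["tags"]:
            # One index entry per array element, not per document.
            index[tag].append(doc["_id"])
    return index


tag_index = build_tag_index(posting_docs)
print(tag_index["mongodb"])  # ids of postings tagged "mongodb"
```

With a real driver, the equivalent lookup would presumably be a find() call that matches a single tag value against the tags field, with the index maintained by the server rather than by application code.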

Organizing documents by key concepts (user, posting) is relatively intuitive. It is not, however, quite like an object database. MongoDB users don't normally map documents tightly to objects in the application code.

So that's a little help from MongoDB experts for making a move from a relational database to MongoDB. Now I should talk to a MySQL or Drizzle expert about how to extract data from MongoDB into a relational database when you discover you need to do some heavy data mining using joins.

Why are there no Visual Basic 2010 examples?

There are "drivers" for major languages and frameworks that you can download and use with your favorite programming language. I know for a fact that a .NET driver is available in the form of a .NET library, and it is quite capable. I've been using it with C# on a couple of hobby projects.

Do you want to access MongoDB from Visual Basic, Swerner? I don't think that's a popular activity, but if you're motivated it shouldn't be hard to develop an interface to MongoDB in any reasonably modern language. The data structure and BSON storage make the data pretty transparent.
