CouchDB and MongoDB announce new products involving replication

By Andy Oram
August 10, 2010

In different ways, CouchDB and MongoDB are both using replication for some valuable enhancements to their open source products. I'll describe the new feature for each in this blog.

CouchDB on Android devices

To me (and probably to many other people) used to thinking of NoSQL databases as big data solutions, it's a surprise to hear that CouchDB now installs and runs on Android. The Android version lets you store a CouchDB database on your device and sync it with other systems through bidirectional replication.

CouchDB's built-in architecture is perfectly suited for this kind of use. Incremental updates minimize traffic, while automatic conflict detection and resolution allow people in different locations to update the same document (both versions are stored and returned to queries).
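To make the bidirectional sync concrete, here is a minimal sketch that assumes replication is requested by POSTing a JSON body to CouchDB's `_replicate` HTTP endpoint. It only builds the two request bodies and does not send anything. The database name and host are placeholders I made up, and the function name is hypothetical.

```python
import json

def bidirectional_replication_bodies(local_db, remote_db):
    """Build two one-way replication requests that together sync both sides.

    Assumption: each body would be POSTed to the device's /_replicate endpoint.
    """
    push = {"source": local_db, "target": remote_db}  # device -> server
    pull = {"source": remote_db, "target": local_db}  # server -> device
    return [push, pull]

# Placeholder names: a local "contacts" database and an invented remote URL.
bodies = bidirectional_replication_bodies(
    "contacts", "http://couch.example.com:5984/contacts"
)
for body in bodies:
    print(json.dumps(body))
```

Treating sync as two independent one-way replications means either direction can be retried on its own after a dropped connection.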

But why would you use CouchDB instead of a native SQLite database? First, because of its built-in sync, which ensures you have the most up-to-date data on each system. Second, CouchDB is designed for unreliable systems: a failure on a mobile device cannot leave its database in an inconsistent state.

Selective replication, which is already in CouchDB, lets you choose parts of a much larger, enterprise-sized database to store on your mobile device, such as business contacts pertinent to you. But small databases can easily fit entirely on a cell phone. (Just think of how much less space is occupied by the text of contact information than by a single MP3 or JPEG. CouchDB founder Damien Katz mentioned that a device's keyboard software could take up 11 megs of space on the SD card, about four times as much as a typical company contact database.)
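The idea behind selective replication can be modeled in a few lines. This is a toy in Python, not CouchDB's own mechanism (in CouchDB the filter is a JavaScript function stored in a design document and named in the replication request). The `owner` field, the sample documents, and the function name are all invented for illustration.

```python
def select_for_device(docs, owner):
    """Keep only the documents a given person needs on their device.

    Assumption: each document carries a hypothetical "owner" field.
    """
    return [doc for doc in docs if doc.get("owner") == owner]

# Invented sample data standing in for a large company contact database.
company_contacts = [
    {"_id": "c1", "name": "Ana", "owner": "alice"},
    {"_id": "c2", "name": "Bo", "owner": "bob"},
    {"_id": "c3", "name": "Cy", "owner": "alice"},
]

on_my_phone = select_for_device(company_contacts, "alice")
print([doc["_id"] for doc in on_my_phone])
```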

The Android version communicates over HTTP, allowing it to pass through the firewalls around most cell phones. HTML also makes it easy to display the data on the device. A native Android API is planned for the future.

Asynchronous replication with auto-sharding in MongoDB

MongoDB's new version 1.6 contains these two complementary features. I talked last week to Dwight Merriman, CEO of 10gen, the company that developed MongoDB under an open source license. They are holding a webinar on the features today. (You can also find out about chances to attend live conferences at their event site.)

Sharding had been in the product earlier but was not considered production-ready. Auto-sharding has been requested by users for some time, and has been added in 1.6. Each shard or partition can be placed on a set of nodes that contain a single master that handles writes along with one or more slaves.

Data can be partitioned by any key. For example, if you choose a "user" key as the shard key and configure three shards, the system will sort the data by the user field and split the data into three approximately equal parts.
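The sort-and-split behavior described above can be sketched as follows. This is a simplified model, not MongoDB's actual algorithm: it assumes every document has the shard key and that key values are distinct, and the function name and sample users are hypothetical.

```python
def split_by_shard_key(docs, key, num_shards):
    """Sort documents by the shard key, then cut them into contiguous ranges."""
    ordered = sorted(docs, key=lambda doc: doc[key])
    base, extra = divmod(len(ordered), num_shards)
    shards, start = [], 0
    for i in range(num_shards):
        # The first `extra` shards take one more document each,
        # so sizes differ by at most one.
        size = base + (1 if i < extra else 0)
        shards.append(ordered[start:start + size])
        start += size
    return shards

# Invented sample data: seven documents keyed by "user".
docs = [{"user": u} for u in ["mia", "bob", "zoe", "al", "kim", "eve", "sam"]]
shards = split_by_shard_key(docs, "user", 3)
for shard in shards:
    print([doc["user"] for doc in shard])
```

Because each shard holds a contiguous range of key values, a lookup for one user only needs to go to the shard whose range covers that user.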

Replication, load-balancing, and failover are all handled automatically in fairly familiar ways. Merriman said their partitioning is comparable to BigTable. The master streams data at very fast intervals to the slaves. The replication is asynchronous--in other words, the slaves do not have to acknowledge receipt of the data.
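Here is a toy model of what asynchronous means in this context: the master accepts a write without waiting for any slave, and each slave catches up later. The `Master` and `Slave` classes and their in-memory log are my own illustration and say nothing about how MongoDB implements replication internally.

```python
class Master:
    def __init__(self):
        self.log = []  # ordered record of every write accepted so far

    def write(self, op):
        """Accept a write immediately; no slave acknowledgment is awaited."""
        self.log.append(op)
        return len(self.log)


class Slave:
    def __init__(self):
        self.applied = []  # writes this slave has copied so far

    def catch_up(self, master):
        """Copy whatever the master has logged since the last catch-up."""
        new_ops = master.log[len(self.applied):]
        self.applied.extend(new_ops)
        return len(new_ops)


master, slave = Master(), Slave()
for op in ["insert a", "insert b", "update a"]:
    master.write(op)

lag_before = len(master.log) - len(slave.applied)  # slave is briefly behind
copied = slave.catch_up(master)
lag_after = len(master.log) - len(slave.applied)
print(lag_before, copied, lag_after)
```

The gap between `lag_before` and `lag_after` is the window during which a read from the slave could return stale data.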

Nevertheless, Merriman says that MongoDB falls more on the side of strong consistency than eventual consistency. On a local network, the slaves should be only a few milliseconds behind the master. Replication over a wide-area network has a slightly longer latency, of course, but Merriman says the asynchronous architecture is very effective for replication over the Internet.

Reads can be handled by any node in the set, master or slave. If a master fails, the slaves elect a new master, and if the old master comes up again it takes on a slave role.
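The role changes in a failover can be sketched as a small state transition. The election rule used here (promote the slave that has applied the most writes, breaking ties by name) is purely my assumption for illustration; I was not told how MongoDB actually chooses. All names are hypothetical.

```python
def fail_over(roles, applied_ops, failed_master):
    """Mark the master as down and promote one of the remaining slaves."""
    roles = dict(roles)  # work on a copy so the caller's dict is untouched
    roles[failed_master] = "down"
    slaves = [name for name, role in roles.items() if role == "slave"]
    # Assumed rule: the most up-to-date slave wins (raises if no slave is left).
    new_master = max(slaves, key=lambda name: (applied_ops[name], name))
    roles[new_master] = "master"
    return roles


def rejoin(roles, node):
    """A recovered node comes back as a slave, even if it used to be master."""
    roles = dict(roles)
    roles[node] = "slave"
    return roles


roles = {"a": "master", "b": "slave", "c": "slave"}
applied = {"a": 10, "b": 10, "c": 8}  # invented counts of applied writes

after_failure = fail_over(roles, applied, "a")
after_rejoin = rejoin(after_failure, "a")
print(after_failure)
print(after_rejoin)
```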

Overall, the new features should make MongoDB attractive to sites that use clusters. Transaction durability on a single server will be added to the 1.8 release.

