Avoid Gotchas While Starting up with MongoDB
When starting to work with a new technology in a development or sandbox environment, we tend to run things - as much as possible - using their default settings. This is completely understandable, as we're not familiar with what all of the options are. With MongoDB, there are some important issues you might face if you leave your setup in the default configuration. Here are some of our experiences getting started trying out MongoDB on virtual machines both locally and on some Amazon EC2 instances.
Versions
Ensure that the version of MongoDB you're running has the features you plan to use. For example, if you installed the "mongodb" package from the default Debian repositories as of this writing (v1.4.4), you'd find that some of the sharding commands are available even though sharding isn't actually supported in that version, so the incompatibility is masked. Get the latest release (2.0.2 as of this writing) by adding downloads-distro.mongodb.org/repo/ubuntu-upstart to your repos and installing the package named mongodb-10gen.
Check that your OS is compatible with what you're going to use it for. MongoDB recommends a 64-bit OS because of file-size limitations on 32-bit operating systems: a 32-bit build can only address roughly 2GB of data.
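The version check described above can be sketched in a few lines of Python. This is only an illustration: the 1.6 threshold is our assumption that sharding became production-ready in the 1.6 series, and the helper names are made up for this example. You can get the version string to feed it from `mongod --version` or from `db.version()` in the mongo shell.

```python
# Assumption: sharding is only supported from the 1.6 series onward.
MIN_SHARDING_VERSION = (1, 6, 0)


def parse_version(text):
    """Turn a dotted version such as '2.0.2' into a 3-tuple of ints."""
    parts = tuple(int(part) for part in text.strip().split("."))
    # Pad short versions such as '1.6' so tuple comparison behaves as expected.
    return parts + (0,) * (3 - len(parts))


def supports_sharding(version_text):
    """True when the given version is at or above the assumed sharding threshold."""
    return parse_version(version_text) >= MIN_SHARDING_VERSION


if __name__ == "__main__":
    for version in ("1.4.4", "2.0.2"):
        print(version, supports_sharding(version))
```

A check like this is most useful at the top of a setup script, so that an old packaged version fails loudly instead of half-working.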
Why is it taking so long to start up?
- Check to see if journaling is turned on.
If you're on a 64-bit install, version 1.9.2+, journaling is enabled by default. MongoDB may decide it needs to preallocate the journal files, and it waits until the files are allocated before it starts listening on the default port. This can take on the order of tens of minutes. You definitely don't need journaling if you're just kicking the tires and trying to see what MongoDB can do, so restart with --nojournal.
- What type of filesystem is your database directory mounted on?
A lot of the popular Linux AMIs available in Amazon's list have the ext3 filesystem mounted by default. It's recommended to use an ext4 or xfs filesystem for your database directory because files are allocated much faster. The difference is especially noticeable if MongoDB starts allocating journal files, as described above. If you're using an AWS instance, you'll avoid this problem if you set up a RAID10 volume for your data directory, as shown here.
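To find out which filesystem your database directory actually lives on, something like the following Python sketch can help. It assumes a Linux machine where /proc/mounts is available, and the /var/lib/mongodb path is only our guess at a typical packaged dbpath, so substitute your own. It does not handle mount points with escaped spaces in their names.

```python
import os

# Assumption: a typical packaged dbpath; replace it with your own directory.
DBPATH = "/var/lib/mongodb"


def filesystem_type(path, mounts_text):
    """Return the filesystem type of the mount point that contains `path`.

    `mounts_text` uses the /proc/mounts layout:
    device, mount point, filesystem type, options, ...
    """
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        # Keep the longest matching mount point; later entries win ties.
        if (path + "/").startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as handle:
        print(filesystem_type(os.path.realpath(DBPATH), handle.read()))
```

If this prints ext3, slow file preallocation is a likely suspect for a long startup.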
Disk Space
Another issue is disk space. If you leave settings at their defaults and you're on a VM or a machine with limited disk space, you could very well start hitting your limits soon. Even a config server will end up taking another 3GB if you're not careful. Our recommendation is to use the --smallfiles flag while you're starting, stopping, and configuring, until you figure out what you're doing. As an example, we followed this page to create a sharded database on a Debian VM with about 16GB of disk space, and it quickly ballooned to this:
moss@moss-debian:~/mongodb$ du -h .
3.1G ./a/journal
204K ./a/moveChunk/test.data
208K ./a/moveChunk
4.0K ./a/_tmp
3.3G ./a
3.1G ./b/journal
4.0K ./b/_tmp
3.3G ./b
3.1G ./config/journal
4.0K ./config/_tmp
3.3G ./config
9.7G .
Bottom line: use "--smallfiles" in your command line flags or in your /etc/mongodb.conf files until you are actually running in an environment that has the required disk space.
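As a rough illustration of that advice, here is a small Python sketch that estimates whether default-sized files will crowd the disk and builds a sandbox-friendly mongod command line. The 3.3 GiB footprint is simply read off the du output above and is not an official figure, the "half of free space" rule is an arbitrary threshold of ours, and the function names and ports are hypothetical.

```python
import shutil

GIB = 1024 ** 3
# Assumption: about 3.3 GiB per data directory with default settings,
# taken from the du output above.
DEFAULT_FOOTPRINT_BYTES = int(3.3 * GIB)


def needs_smallfiles(free_bytes, process_count, footprint=DEFAULT_FOOTPRINT_BYTES):
    """True when default-sized files would use more than half of the free space."""
    return process_count * footprint > free_bytes / 2


def sandbox_mongod_args(dbpath, port, smallfiles=True, journal=False):
    """Build an argument list for a throwaway mongod instance."""
    args = ["mongod", "--dbpath", dbpath, "--port", str(port)]
    if smallfiles:
        args.append("--smallfiles")
    if not journal:
        args.append("--nojournal")
    return args


if __name__ == "__main__":
    free = shutil.disk_usage(".").free
    # Three processes, as in the a / b / config layout shown above.
    print("use --smallfiles:", needs_smallfiles(free, process_count=3))
    print(" ".join(sandbox_mongod_args("./a", 10000)))
```

The argument list can be passed to subprocess.Popen once you are happy with it; it is only printed here so that nothing is started by accident.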
Splitting and Balancing
As Jeremy Zawodny outlines in his excellent blog post “MongoDB Pre-Splitting for Faster Data Loading and Importing”, it is important to understand how MongoDB manages shards. MongoDB groups documents into chunks (64MB by default in the 2.0 series; 200MB in the earliest sharding releases), maps those chunks to shards, and then moves them between shards as the balancer attempts to even out the load. If you’re doing a large data migration, however, this background splitting and moving can slow the import considerably. Check Jeremy’s post for some great advice on pre-splitting while maintaining acceptable performance levels.
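To make the idea of pre-splitting concrete, here is a minimal Python sketch that computes evenly spaced split points for an integer shard key and pairs each one with a shard, round-robin. It is not the method from Jeremy's post, just an illustration. It only builds documents in the shape of the split and moveChunk admin commands; sending them to a mongos is left out. The namespace, key name, and shard names are placeholders, and an even spread only makes sense if your key values really are evenly distributed.

```python
def split_points(low, high, chunks):
    """Return chunks - 1 evenly spaced integer boundaries between low and high."""
    if chunks < 2:
        return []
    step = (high - low) / chunks
    return [low + round(step * i) for i in range(1, chunks)]


def presplit_commands(namespace, key, points, shards):
    """Build split and moveChunk command documents, assigning shards round-robin."""
    commands = []
    for index, point in enumerate(points):
        commands.append({"split": namespace, "middle": {key: point}})
        commands.append({
            "moveChunk": namespace,
            "find": {key: point},
            "to": shards[index % len(shards)],
        })
    return commands


if __name__ == "__main__":
    points = split_points(0, 1000, 4)
    # "test.data", "_id", and the shard names are placeholders.
    for command in presplit_commands("test.data", "_id", points, ["shard0000", "shard0001"]):
        print(command)
```

Running the splits and moves before the import starts means the balancer has far less shuffling to do while data is pouring in.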
Spaces in Configuration Files
This one isn't in the documentation anywhere: another gotcha lurks in configuration files, on any line that lists multiple values for a single parameter, like this:
configdb = ip-10-172-31-109.us-west-1.compute.internal,ip-10-170-209-104.us-west-1.compute.internal,ip-10-172-169-38.us-west-1.compute.internal
If you have spaces before or after the ",", the setting will not parse. Just ensure it's a single string with no spaces as it is above.
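If you generate or hand-edit these files often, a small check can catch the problem before mongos does. The Python sketch below flags any "name = value" line with whitespace next to a comma and shows how to normalize it. The function names are our own and the hostnames in the usage example are placeholders.

```python
import re


def find_comma_spacing(config_text):
    """Return (line_number, line) pairs whose value has whitespace next to a comma."""
    problems = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        value = stripped.split("=", 1)[1].strip()
        if re.search(r"\s,|,\s", value):
            problems.append((number, line))
    return problems


def normalize_line(line):
    """Rewrite 'name = a, b ,c' as 'name = a,b,c'."""
    name, value = line.split("=", 1)
    joined = ",".join(part.strip() for part in value.split(","))
    return f"{name.strip()} = {joined}"


if __name__ == "__main__":
    sample = "configdb = host-a.internal, host-b.internal ,host-c.internal"
    print(find_comma_spacing(sample))
    print(normalize_line(sample))
```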