There are a number of built-in tools and commands that can be used to get important information from MongoDB, but because MongoDB is relatively new, it can be difficult to know what you need to do from an operational perspective to ensure that everything runs smoothly.
This is the 5th in a series of 6 posts about MongoDB monitoring.
mongostat
This tool is included with the standard MongoDB distribution and lets you view statistics about your database servers in real time.
If you are running mongod locally on its standard port, 27017, you can just start mongostat and it will connect automatically. Otherwise, you can specify one or more hostnames and ports to connect to. The screenshot above shows all the servers in our MongoDB cluster.
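As a rough illustration of pointing mongostat at several servers, here is a minimal Python sketch that only builds the command line, without running it. It assumes your mongostat version accepts a comma-separated server list via --host, a --rowcount limit and a positional sampling interval in seconds; the function name and the hostnames are hypothetical.

```python
from typing import List, Optional, Sequence


def build_mongostat_cmd(
    hosts: Sequence[str] = (),
    interval_s: int = 1,
    rows: Optional[int] = None,
) -> List[str]:
    """Build an argument list for mongostat (hypothetical helper).

    hosts      -- "hostname:port" strings; empty means the local default.
    interval_s -- seconds between samples (mongostat's positional argument).
    rows       -- stop after this many samples; None keeps running.
    """
    cmd = ["mongostat"]
    if hosts:
        # Assumption: --host takes a comma-separated list of servers.
        cmd += ["--host", ",".join(hosts)]
    if rows is not None:
        cmd += ["--rowcount", str(rows)]
    cmd.append(str(interval_s))
    return cmd


# Hypothetical hostnames, for illustration only.
print(build_mongostat_cmd(["db1.example.com:27017", "db2.example.com:27017"], rows=5))
```

The resulting list can be handed to something like subprocess.run once you have checked the flags against the mongostat version you have installed.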
Most of the columns are self-explanatory (e.g. insert = inserts per second) and are described in the documentation, but a few are worth mentioning specifically:
- flushes – this shows how many times data has been flushed to disk. MongoDB only physically writes data to disk every 60 seconds (by default). This increases performance but can decrease durability, because a hard crash in between flushes means that data is never written and is therefore lost. v1.8 addresses this with the option for journaling, but this stat shows how often mongod is flushing data to disk.
- faults – this column shows the number of Linux page faults per second. A page fault happens when Mongo accesses something that is mapped into the virtual address space but is not in physical memory, i.e. it results in a read from disk. High values here indicate that you may not have enough RAM to hold all the necessary data, and that disk access may start to become the bottleneck.
- locked % – shows the % of time spent in a global write lock. While the lock is held, no other queries will complete until it is given up or the lock owner yields. A high value is indicative of a large, global operation such as a remove() or dropping a collection, and can result in slow performance.
- % idx miss – this is like what we saw in the server status output, except that instead of an aggregate total you can see queries hitting (or missing) the index in real time. This is useful if you’re debugging specific queries in development or need to track down a server that is performing badly.
- qr|qw – when MongoDB gets more queries than it can handle in real time, it queues them up. mongostat shows this in the read and write queue columns. When these numbers start to increase, queries slow down because they have to wait their turn in the queue. You can alleviate this by holding back further queries until the queue has drained. Queues tend to spike if you’re doing a lot of write operations alongside other write-heavy ops, such as large ranged removes.
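To make these columns a little more concrete, here is a minimal Python sketch of how per-second figures like these could be derived from two server status samples taken a few seconds apart. It assumes the samples are plain dicts shaped like serverStatus output on a Linux server of this era, with opcounters.insert, extra_info.page_faults, backgroundFlushing.flushes and globalLock.currentQueue; the helper names and the sample numbers are made up for illustration, and this is not how mongostat itself is implemented.

```python
from typing import Any, Dict, Mapping, Tuple


def per_second(
    prev: Mapping[str, Any], curr: Mapping[str, Any], elapsed_s: float
) -> Dict[str, float]:
    """Turn two cumulative server status samples into per-second rates."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")

    def delta(*path: str) -> float:
        # Walk the same nested path in both samples, then diff the counters.
        a, b = prev, curr
        for key in path:
            a, b = a[key], b[key]
        return (b - a) / elapsed_s

    return {
        "insert": delta("opcounters", "insert"),
        "faults": delta("extra_info", "page_faults"),
        "flushes": delta("backgroundFlushing", "flushes"),
    }


def queue_lengths(status: Mapping[str, Any]) -> Tuple[int, int]:
    """Return (read queue, write queue), the values behind qr|qw."""
    queue = status["globalLock"]["currentQueue"]
    return queue["readers"], queue["writers"]


# Made-up samples taken 5 seconds apart.
prev = {
    "opcounters": {"insert": 1000},
    "extra_info": {"page_faults": 40},
    "backgroundFlushing": {"flushes": 10},
    "globalLock": {"currentQueue": {"readers": 0, "writers": 0}},
}
curr = {
    "opcounters": {"insert": 1500},
    "extra_info": {"page_faults": 60},
    "backgroundFlushing": {"flushes": 11},
    "globalLock": {"currentQueue": {"readers": 3, "writers": 1}},
}

print(per_second(prev, curr, elapsed_s=5))
print(queue_lengths(curr))
```

The counters are cumulative, so only the difference between two samples is meaningful. The queue lengths are point-in-time values and are read straight from the latest sample.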
mongostat is useful because it shows what is happening in your cluster right now. It is particularly handy for quickly finding out which member of each replica set is currently master – the final column shows this. If you start seeing slowdowns or suspect a problem with MongoDB, mongostat should be your first port of call to quickly locate where the problem is.
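If you want the same "who is master right now" answer in a script rather than by reading the final column, one option is to look at each member's reply to the isMaster command. The sketch below assumes you have already collected those replies into a dict keyed by host. The find_primary helper, the hostnames and the replies are hypothetical, and how you gather the replies depends on your driver.

```python
from typing import Any, Mapping, Optional


def find_primary(replies: Mapping[str, Mapping[str, Any]]) -> Optional[str]:
    """Return the host whose isMaster reply reports it as master, if any."""
    for host, reply in replies.items():
        if reply.get("ismaster") is True:
            return host
    return None


# Hypothetical replies keyed by host; real ones would come from running
# the isMaster command against each member with your driver of choice.
replies = {
    "db1.example.com:27017": {"ismaster": False, "secondary": True},
    "db2.example.com:27017": {"ismaster": True, "secondary": False},
    "db3.example.com:27017": {"ismaster": False, "secondary": True},
}
print(find_primary(replies))
```

A result of None means no member currently reports itself as master, which is worth alerting on in its own right.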