When it comes to databases, I’m a firm believer that no database is “the best”. Choosing one is all about your application and which database is the better fit for it today. I say “today” because as your application evolves and the databases of the world evolve, the database you choose today might not be the optimal choice two years from now. For this reason, I also believe you should build your application so that it is relatively easy to decouple from the database and swap in a different one.
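To make the decoupling idea concrete, here is a minimal sketch of one way to do it: the application talks to a small storage interface, and each database gets its own implementation behind it. The names `Doc`, `DocumentStore`, `InMemoryStore` and `renameDoc` are made up for illustration and are not from any real library or from my actual app.

```typescript
// Hypothetical document type: an id plus arbitrary JSON-style fields.
interface Doc {
  id: string;
  [field: string]: unknown;
}

// Hypothetical storage interface. Application code depends only on this,
// never on a specific database driver.
interface DocumentStore {
  get(id: string): Promise<Doc | undefined>;
  put(doc: Doc): Promise<void>;
  remove(id: string): Promise<boolean>;
}

// In-memory implementation, handy for tests. A class backed by a real
// database would implement the same three methods using that database's driver.
class InMemoryStore implements DocumentStore {
  private docs = new Map<string, Doc>();

  async get(id: string): Promise<Doc | undefined> {
    const doc = this.docs.get(id);
    // Return a deep copy so callers cannot mutate stored state by accident.
    return doc === undefined ? undefined : (JSON.parse(JSON.stringify(doc)) as Doc);
  }

  async put(doc: Doc): Promise<void> {
    this.docs.set(doc.id, JSON.parse(JSON.stringify(doc)) as Doc);
  }

  async remove(id: string): Promise<boolean> {
    return this.docs.delete(id);
  }
}

// Example application logic: it only sees the interface, so swapping the
// database means writing a new DocumentStore, not touching this function.
async function renameDoc(store: DocumentStore, id: string, title: string): Promise<boolean> {
  const doc = await store.get(id);
  if (doc === undefined) return false;
  await store.put({ ...doc, title });
  return true;
}
```

The trade-off is that the interface can only expose what every candidate database can do, so database-specific features end up either hidden behind it or reimplemented in the application.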
Let’s talk about my experiences when looking for a NoSQL database…
Now a quick note before I get into my experiences with NoSQL. These are my experiences; they relate to my problem and are based on what I saw. Use this information for what it’s worth, but don’t use it as the sole source in your decision-making process, because it’s highly informal.
So… to get to the comparison of these databases… for our application, we needed the following out of a database:
- Schema-less (or very rapid schema modifications)
- Fault tolerant
- Performs reasonably
- Can store JSON-style documents/rows (with multiple levels of nested arrays)
- Can update pieces of the JSON document, including nested arrays, without replacing the entire document.
- Can easily query based on parts of the document (arbitrary field indexing)
OK, so a decent list of requirements. I did some research on the various NoSQL options out there and found the following:
- Cassandra – Looked at this briefly but found too many reports online of problems like poor documentation and no real commercial support. I also found that it wouldn’t support all my requirements without putting a good chunk of effort into my application.
- Couchbase – Looked at this one. Thought it might be good, so I downloaded it. I couldn’t get it up and running quickly, so I gave up. I gave up quickly because I had another option (MongoDB), and having to fight with something at the start is usually indicative of your future experiences with it. It might be a fantastic product, but I just gave up early.
- MongoDB – Downloaded this, installed it, and found it was VERY quick to get up and running on my Windows laptop (for development). I did some tests and research and found that it was generally regarded online as a reasonably stable and reliable product. So, I decided to move forward with this one…
I built our app but eventually found that MongoDB could not update an individual element of an array nested two levels deep (an array inside an element of another array). I know that sounds like a pretty far-fetched requirement, but in the case of this app, it was necessary. There’s an open feature request for MongoDB to handle this, but it has been open for several years with no obvious plan to address it. So, I decided to try a different database engine to see if I could make this work better.
On the recommendation of a friend, I checked out Neo4j, which is a graph database. I looked into it and the concept of graph databases seems very cool. I could definitely see the use cases and thought maybe those could help with our app in the future. It meant that I would need to store the arrays in the object as multiple nodes with relationships linking them together. Based on what I read, the database is really fast at pulling multiple related nodes, so I didn’t think we’d have a performance problem. The thing I didn’t account for was creating those nodes and relationships. One query I used to create an object (150 nodes + 150 relationships) ended up taking 19 seconds to run. Ouch. That’s not going to work. I could create the same object in MongoDB in milliseconds. Seems like a great product, but I wouldn’t be able to get past the write performance.
I kept looking but couldn’t find a product which would work for me. So… I decided to go back to MongoDB but I added some more logic into my application by building a mid-layer using Node.js to take care of the features that were lacking in MongoDB.
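As an illustration of the kind of logic such a mid-layer can hold, here is a minimal sketch of one possible workaround, not necessarily the one I shipped: read the document, find the numeric positions of the target elements in application code, and build a `$set` update that addresses the element by explicit index in dot notation. The `Order`/`Group`/`Item` shape and the `buildNestedSet` helper are hypothetical.

```typescript
// Hypothetical document shape: arrays nested two levels deep.
interface Item { itemId: string; qty: number; }
interface Group { groupId: string; items: Item[]; }
interface Order { _id: string; groups: Group[]; }

// Shape of the update document this helper produces,
// e.g. { $set: { "groups.1.items.0.qty": 5 } }.
interface SetUpdate { $set: Record<string, unknown>; }

// Locate one item inside one group and build a $set that targets a single
// field of that item by explicit array indexes. Returns undefined when the
// group or the item is not present in the document that was read.
function buildNestedSet(
  order: Order,
  groupId: string,
  itemId: string,
  field: string,
  value: unknown
): SetUpdate | undefined {
  const g = order.groups.findIndex((group) => group.groupId === groupId);
  const group = order.groups[g]; // undefined when g is -1
  if (group === undefined) return undefined;

  const i = group.items.findIndex((item) => item.itemId === itemId);
  if (i === -1) return undefined;

  // Dot notation with numeric positions, outer array first, then inner array.
  const path = `groups.${g}.items.${i}.${field}`;
  return { $set: { [path]: value } };
}
```

The result can then be passed to the driver, for example `collection.updateOne({ _id: order._id }, update)` in the Node.js driver. One caveat with this approach: the indexes come from a document read earlier, so if another writer reorders or removes array elements in between, the update can hit the wrong element. A real mid-layer needs some guard against that, such as a version field checked in the update filter.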
Overall, I’d say that MongoDB is the most popular NoSQL database for a reason. It’s easy to use, easy to learn, has a ton of driver support, and generally offers a LOT of features.
As I said at the beginning, these are my experiences, not a formal comparison of the products… so do your own research and make sure to consider what you are using it for.