Commentary on the datastore benchmark


I’ve gotten some excellent feedback on the benchmark I posted a few days ago, titled “Hibernate is faster than I thought it was,” from Hibernate supporters who were a little upset, from GigaSpaces people who had some optimization tips, from people who actually prefer accurate benchmarks…

All valid, actually. So I’m rebuilding the speed tests for version three. Version two exists, and it has some illuminating data (it also speeds up almost all of the operations!), but it’s still lacking some crucial pieces, so instead of building on a flawed base, I’m re-architecting the whole dadgum thing.

The next version will have:

  • More datastores supported. JDBC (via Spring’s JdbcTemplate), MongoDB, internal and external GigaSpaces, Hibernate (embedded and external DB), Java Serialization (yes, Java Serialization, which was the first one I implemented to test out the architecture) and any others I can throw together; the new project structure makes this kinda easy.
  • Multithreaded access to the datastores.
  • Cache key misses (the key generation phase generates keys that should not exist).
  • DB-optimal models (I’m willing to contort the DAOs to optimize for the specific datastore, which was required for MongoDB.)
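
To make the multithreaded access and cache key miss items concrete, here is a minimal sketch of what such a harness could look like. It is not the benchmark's actual code: the `MessageDao` interface, the `InMemoryDao` stand-in, the key naming scheme and the `missEvery` parameter are all assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

public class DatastoreBenchmarkSketch {

    /** Hypothetical DAO contract; each datastore would get its own implementation. */
    public interface MessageDao {
        void write(String key, String payload);
        String read(String key); // returns null when the key does not exist
    }

    /** Stand-in datastore backed by a concurrent map, for illustration only. */
    public static class InMemoryDao implements MessageDao {
        private final Map<String, String> store = new ConcurrentHashMap<>();
        @Override public void write(String key, String payload) { store.put(key, payload); }
        @Override public String read(String key) { return store.get(key); }
    }

    /** Counters collected by one run. */
    public static class Result {
        public final long hits;
        public final long misses;
        public final long elapsedNanos;
        Result(long hits, long misses, long elapsedNanos) {
            this.hits = hits;
            this.misses = misses;
            this.elapsedNanos = elapsedNanos;
        }
    }

    public static Result run(MessageDao dao, int threads, int keysPerThread, int missEvery)
            throws Exception {
        // Populate the store up front so that only the reads are timed.
        for (int t = 0; t < threads; t++) {
            for (int i = 0; i < keysPerThread; i++) {
                dao.write("key-" + t + "-" + i, "payload-" + i);
            }
        }
        AtomicLong hits = new AtomicLong();
        AtomicLong misses = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<?>> futures = new ArrayList<>();
        long start = System.nanoTime();
        try {
            for (int t = 0; t < threads; t++) {
                final int threadId = t;
                futures.add(pool.submit(() -> {
                    for (int i = 0; i < keysPerThread; i++) {
                        // Every missEvery-th read asks for a key that was never written.
                        String key = (i % missEvery == 0)
                                ? "missing-" + threadId + "-" + i
                                : "key-" + threadId + "-" + i;
                        if (dao.read(key) == null) {
                            misses.incrementAndGet();
                        } else {
                            hits.incrementAndGet();
                        }
                    }
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for the worker and surface any exception it threw
            }
        } finally {
            pool.shutdown();
        }
        return new Result(hits.get(), misses.get(), System.nanoTime() - start);
    }

    public static void main(String[] args) throws Exception {
        Result r = run(new InMemoryDao(), 4, 1000, 10);
        System.out.println("hits=" + r.hits + " misses=" + r.misses
                + " elapsedMs=" + (r.elapsedNanos / 1_000_000));
    }
}
```

Swapping `InMemoryDao` for a JDBC, MongoDB or GigaSpaces implementation of the same interface is the part the new project structure is supposed to make kinda easy; the harness itself would not need to change.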

I can tell you from revision two of the benchmarks: Hibernate still does well, but MongoDB and GigaSpaces still fly past it, with embedded GigaSpaces being the fastest, and MongoDB showing some very good times. We’ll see how well that goes in the next round.

I’m still leaving transactions in there, and yes, this is a throughput test; one comment pointed out that I was testing Hibernate’s cache quite a bit (rather than Hibernate itself), to which I say: right on, brother. I’m not configuring the cache all that much; I’m turning on ehcache and leaving it at that.

Turning off the cache would be horribly unfair to Hibernate, and that violates the whole idea of seeing how it performs with sort-of-fake data.

I’m nowhere near done.

  • I have the Java Serialization done, and most of the JDBC DAO (still haven’t finished query by example, and I’m still testing the rest of it.)
  • The data model isn’t complete yet; I still haven’t added attributes to the model. I’m still debating whether to make them searchable, which would turn the model into an RDF-like container. Searchable attributes would also slaughter performance for things that manage relationships externally (e.g., Hibernate, JDBC). They also violate the spirit of the original purpose (which was to model external procedure calls as messages), so I’m okay with not making them searchable, but I’m pretty sure people would want it.
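
For a sense of what non-searchable attributes mean in the Java Serialization case, here is a rough sketch: the attributes ride along inside the serialized blob, so the only lookup is by id. The `Message` class, its fields and the map-backed store are hypothetical names for illustration, not the benchmark's real model.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class SerializationStoreSketch {

    /** Hypothetical message model; attributes are plain key/value pairs. */
    public static class Message implements Serializable {
        private static final long serialVersionUID = 1L;
        public final String id;
        public final String body;
        public final HashMap<String, String> attributes;

        public Message(String id, String body, Map<String, String> attributes) {
            this.id = id;
            this.body = body;
            this.attributes = new HashMap<>(attributes); // defensive copy
        }
    }

    // Each message is stored as an opaque byte array keyed by its id.
    private final Map<String, byte[]> blobs = new HashMap<>();

    public void write(Message message) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(message);
        }
        blobs.put(message.id, bytes.toByteArray());
    }

    /** Returns the stored message, or null if the id is unknown. */
    public Message read(String id) throws IOException, ClassNotFoundException {
        byte[] data = blobs.get(id);
        if (data == null) {
            return null;
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Message) in.readObject();
        }
    }
}
```

Because the attributes are buried in the blob, finding every message with a given attribute would mean deserializing all of them, which is the trade-off being weighed above.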

More as I finish more of the benchmarks.
