Hacker News

Even if your data doesn't all fit in the page cache with cdb, the main hash table's 8 bytes of overhead per entry (hash code and offset) absolutely will, so even when the data is too big for memory you still get a single seek per lookup.
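A back-of-the-envelope sketch of the claim above (illustrative numbers, not from any cdb library): at 8 bytes per hash-table slot, the index for even a very large store stays RAM-resident.

```python
# Each cdb hash-table slot holds a 32-bit hash code plus a 32-bit
# record offset, i.e. 8 bytes of index overhead per slot.
SLOT_BYTES = 8

def index_mib(num_entries):
    """Approximate index size in MiB at one slot per entry."""
    return num_entries * SLOT_BYTES / 2**20

# 100 million keys -> roughly 763 MiB of index: easily cached, leaving
# one disk seek per lookup for the record data itself.
print(round(index_mib(100_000_000)))  # 763
```

In practice cdb over-allocates slots relative to entries to keep collision chains short, so the real index is somewhat larger, but the order of magnitude holds.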

The only problem with it is that the 32-bit offsets cap the file at 4 GB. It's not too hard to fork it for 64-bit offsets, though.
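The arithmetic behind that ceiling (a sketch, not cdb code): offsets are stored as unsigned 32-bit integers, so no record can start past byte 2^32 - 1; widening the offset field to 64 bits removes the limit at the cost of larger slots.

```python
# Maximum addressable file size for an unsigned offset of a given width.
def max_addressable_bytes(offset_bits):
    return 2 ** offset_bits

print(max_addressable_bytes(32))           # 4294967296 -> the 4 GB cap
print(max_addressable_bytes(64) // 2**60)  # 16 -> 16 EiB with 64-bit offsets
```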



We ended up using sparkey as the first backend for hammerspace, but hammerspace was written to support multiple backends. We benchmarked both cdb and sparkey, and their performance was very similar for our use case. At under 100 MB of data, we weren't concerned about the 4 GB limitation or about the data not fitting in cache. I don't think sparkey has the 4 GB limitation, and someone has already forked cdb to support 64-bit offsets: https://github.com/pcarrier/cdb64



