Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Yeah, the title is misleading because really the problem described in the blog post is to store some unknown number of ints between 1 and 10,000. The better solution than a straight bitmap of 10,000 bits is to use all 4KB and devote three bits to each slot. Then you can remember up to 7 occurrences of each integer.

There's no need for bloom filters here because you can already cover all the possibilities using a bitmap.



Additionally you could decide to only store up to 6 occurrences of each integer in the bit array, use the value of 7 as an indicator that there are 7 or more occurrences and store the actual counts for those cases in a sorted array occupied by the remaining 346 bytes left in the 4KB page.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: