I doubt the hashmap would beat the XOR method from the article. Hashmaps means allocations, and it means hashing. Hashing is going to be at least as much work as XORing a couple values in the mask. Hashmaps also means chasing pointers all over memory and ruining your cache.
You can add an early return to the OPs XOR method by just checking if the char is already in the bit mask. This will be faster if the word size is large.
the domain/range is fixed, so you don't have to allocate, actually, because you can have a fixed sized table and perfect hash :)
That said, I agree you can make the bit munging faster in the end, i'm just saying i don't think the speed up is anywhere near the improvement from early-exiting.
Even if the hashmap library knew that the only valid keys were 'a' .. 'z', it wouldn't be magically faster. The best it could do is use basically the same code as a hand-rolled implementation.
Bit operations and shifts take a single clock cycle, and the mask can be stored in a register throughout the entire loop.
If "early exit" brings any improvement, I don't see why the best wouldn't be to combine the two solutions instead of choosing one or the other.
> Even if the hashmap library knew that the only valid keys were 'a' .. 'z', it wouldn't be magically faster. The best it could do is use basically the same code as a hand-rolled implementation.
This is known as a perfect hash[1]. knowing that you will never have collisions does allow for a faster implementation. The hash map can be backed by an array which will never need to be resized, and you don't have to fiddle with linked lists to chain collisions.
You're correct though, that this is something you will have to implement yourself. Library hashmaps are going to trade performance for general usefulness.
You can add an early return to the OPs XOR method by just checking if the char is already in the bit mask. This will be faster if the word size is large.