Very, very small.

The 'birthday attack'[0] article covers this pretty well, but if we take the output size of a SHA-1 hash as 160 bits, and assume its outputs are uniformly distributed[1], a brute-force approach (equivalent to a non-maliciously generated accidental collision across all addresses ever) needs:

    sqrt(2**160 * PI/2) ~= 1.5 x10**24
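hashes on average before the first collision occurs. A 50% probability of a collision is reached slightly earlier, at sqrt(2 * ln(2) * 2**160) ~= 1.4 x10**24 hashes. (if I understood/got the maths right)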

[0] https://en.wikipedia.org/wiki/Birthday_attack

[1] This is the intent of all hash functions, and I don't think there are any fundamental attributes of email addresses that would cause systematic bias in the output.
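For anyone who wants to check the numbers, here is a minimal Python sketch of the two standard birthday-bound approximations, assuming a perfectly uniform 160-bit hash. The function names are mine, purely for illustration, and the formulas are approximations that only hold for large output spaces.

```python
import math

def expected_hashes_until_collision(space_size):
    # Approximate expected number of uniformly random values drawn
    # before the first collision: sqrt(pi/2 * N).
    return math.sqrt(math.pi / 2 * space_size)

def hashes_for_collision_probability(space_size, p=0.5):
    # Approximate number of values needed for a collision to occur
    # with probability p: sqrt(2 * N * ln(1 / (1 - p))).
    return math.sqrt(2 * space_size * math.log(1 / (1 - p)))

sha1_space = 2 ** 160  # assumption: SHA-1 output treated as uniform over 160 bits

print(f"expected until first collision: {expected_hashes_until_collision(sha1_space):.2e}")
print(f"needed for 50% probability:     {hashes_for_collision_probability(sha1_space):.2e}")
```

As a sanity check, hashes_for_collision_probability(365) comes out at about 22.5, which is close to the familiar 23 people of the classic birthday problem.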



To put things into perspective:

Approximately, 10^3 = 1000 ~= 1024 = 2^10, 10^2 = 100 ~= 128 = 2^7.

Assume you have 1 billion (10^9) computers, each computer can do 1 billion hashing operations per second. That is 10^18 operations per second combined.

Rounding up generously, one day has 1 million seconds (10^6; the real figure is 86,400), and one year has 1000 (10^3) days. So, we have 10^27 ~= 2^90 operations per year.

100 million years is 10^8 ~= 2^27. So, you have 2^117 operations in 100 million years. Geologically, there has been an Extinction Event [1] about every 100 million years (e.g. 66, 200 and 251 million years ago). So, an (unintentional) collision in a hash of more than 128 bits (assuming a good hash function with uniformly distributed output) is less likely than an event happening within the next second that kills 50% of the Earth's species.

[1] http://en.wikipedia.org/wiki/Extinction_event
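The back-of-the-envelope arithmetic above can be reproduced with a few lines of Python. This is only a sketch using the same deliberately rounded-up figures from the comment; the variable names are made up for illustration.

```python
import math

# Rounded-up assumptions taken from the estimate above.
computers = 10 ** 9          # 1 billion machines
hashes_per_second = 10 ** 9  # 1 billion hashes per second each
seconds_per_day = 10 ** 6    # rounded up from 86,400
days_per_year = 10 ** 3      # rounded up from 365
years = 10 ** 8              # 100 million years

ops_per_year = computers * hashes_per_second * seconds_per_day * days_per_year
total_ops = ops_per_year * years

# Express both totals as powers of two for comparison with hash sizes.
bits_per_year = math.log2(ops_per_year)
bits_total = math.log2(total_ops)

print(f"operations per year:     2^{bits_per_year:.1f}")
print(f"operations in 10^8 years: 2^{bits_total:.1f}")
```

The exact exponents land just under the rounded 2^90 and 2^117 used in the text, so the rounding errs on the side of overestimating the attacker.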



