The 'birthday attack'[0] article covers this pretty well, but if we take the output size of a SHA-1 hash as 160 bits, and assume it's outputs are equally distributed[1], a brute-force approach (equivalent to a non-maliciously generated accidental collision across all addresses ever)
is:
sqrt(2**160 * PI/2) ~= 1.5 x10**24
for there to be a 50% probability of a collision occurring.
(if I understood/got the maths right)
[0] https://en.wikipedia.org/wiki/Birthday_attack
[1] This is the intent of all hash functions, and I don't think there are any fundamental attributes of email addresses that would cause systematic bias in the output
Assume you have 1 billion (10^9) computers, each computer can do 1 billion hashing operations per second. That is 10^18 operations per second combined.
Rounding up, one day has 1 million seconds (10^6), and one year has 1000 (10^3) days. So, we have 10^27 ~= 2^90 operations per year.
100 million years is 10^8 ~= 2^27. So, you have 2^117 operations in 100 million years. Geologically, there was an Extinction Event [1] about every 100 million years (e.g. 66, 200 and 251 million years ago). So, having an (unintentional) hash collision in more than 128 bits (assuming a good hash function that has uniformly distributed hash) is less likely than an event happening within the next second that kills 50% of the Earth's species.
The 'birthday attack'[0] article covers this pretty well, but if we take the output size of a SHA-1 hash as 160 bits, and assume it's outputs are equally distributed[1], a brute-force approach (equivalent to a non-maliciously generated accidental collision across all addresses ever) is:
for there to be a 50% probability of a collision occurring. (if I understood/got the maths right)[0] https://en.wikipedia.org/wiki/Birthday_attack [1] This is the intent of all hash functions, and I don't think there are any fundamental attributes of email addresses that would cause systematic bias in the output