While GoDaddy has a point about the opt-in component being important for deciding whether spamming took place, they certainly didn't need to release her personal information to the spammer. That's a terrible, serious breach of privacy.
A naive approach that might work without either party needing to divulge emails:
GoDaddy: "We have received complaints that you've been spamming. Give us a list of SHA-1 hashes of addresses of the people that opted in and show us how they opted in."
Customer: "Here's the list."
GoDaddy: "At least one complaint email we received does not match the SHA-1s on this list."
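To make the exchange above concrete, here is a minimal Python sketch of the matching step. The function names, the sample addresses, and the normalization (strip whitespace, lowercase) are assumptions for illustration; nothing here reflects what GoDaddy actually does.

```python
import hashlib


def email_digest(address: str) -> str:
    """SHA-1 hex digest of an address after an assumed normalization."""
    normalized = address.strip().lower()
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def unmatched_complaints(opt_in_digests, complainant_addresses):
    """Digests of complainants that are absent from the customer's opt-in list."""
    opted_in = set(opt_in_digests)
    return [d for d in map(email_digest, complainant_addresses) if d not in opted_in]


# Customer side: hash the (hypothetical) opt-in list before handing it over.
customer_list = ["alice@example.com", "bob@example.com"]
submitted = [email_digest(a) for a in customer_list]

# Registrar side: compare complaint addresses against the submitted digests.
complaints = ["Bob@Example.com ", "carol@example.com"]
missing = unmatched_complaints(submitted, complaints)
print(f"{len(missing)} complaint(s) not on the opt-in list")
```

Both sides have to agree on the normalization, otherwise the same address hashes to different values and every complaint looks unmatched.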
> While GoDaddy has a point about the opt-in component being important for deciding whether spamming took place
I don't think they do have a point. If someone is spamming, why the hell aren't they just going to lie to GoDaddy about who opted in?
Even if they're not outright lying, lots of businesses have a very sketchy idea of what "opting in" means. I had my email address posted on a website once as a public contact. You would be surprised how many people consider that "opting in." When I used that email address to make sales enquiries, plenty of sales departments considered that an "opt-in" too.
The spam filters we had were fine for the outright trash, but the flavor of spam that doesn't fall under the legal definition of spam was a nightmare.
I eventually had Constant Contact blacklist every single email alias I had at the organization because of how often I was "opting in" to email.
Whether someone is intentionally spamming or not, if they want to keep the registrar happy and stop getting complaints, they remove the email address of the complainant from the database if they receive a complaint. Unless there's a sufficiently large body of spam reports to automatically take action against the account of the alleged spammer - which may have happened after the OP's complaint - that's actually the best way to resolve the problem. 99.9999% of mailing list owners and blatant spammers alike are in it for the money rather than for trolling complainants.
I'm really struggling to see how GoDaddy could have a policy that fixes the issue of a person complaining about unwanted emails without disclosing the email of the person who doesn't want to receive the email any more.
"When I used that email address to make sales enquirers, plenty of sales departments considered that an "opt-in" too."
This seems like more of a gray area to me. Sending sales material to people who have actually sent queries to your sales department doesn't seem nearly as bad as spamming random people (as long as there's a clear, and working, way to turn the sales emails off if you don't want them).
> Sending sales material to people who have actually sent queries to your sales department doesn't seem nearly as bad as spamming random people
And it's all not nearly as bad as sending me spam for horse pron websites. You can rationalize it however you want. Still doesn't make any of it cool. Why didn't the person I'm already in an email conversation with ask if I wanted to be on the list? Because they know I'd say no (especially when the conversation turns to the fact their company can't do anything for us). That's what makes it opt-out bullshit.
It's not something that gets me hot under the collar; even at its worst it was a minor nuisance I dealt with over coffee. But after a year having a published address and 4 years of fallout afterwards, I've heard all the bad rationalizations for spam and they don't stand up. I have a polite and friendly "fuckoff" form letter for people without unsubscribe links. The second time I have to send it I CC the technical contacts in the domain's whois record. When someone gets upset or angry at me for doing so, I know damn well that they know they're lying when they try to justify their spam.
I did say "as long as there's a clear, and working, way to turn the sales emails off if you don't want them".
"It's not something that gets me hot under the collar"
Dude, you're emailing the domain's technical contacts, who likely have no say whatsoever in company sales policy. That sounds pretty hot under the collar to me.
> Dude, you're emailing the domain's technical contacts, who likely have no say whatsoever in company sales policy. That sounds pretty hot under the collar to me.
Eh, I never thought of it as that big of a deal, just another task at work where something needs to happen or stop happening, and I only have a handful of routes to take. If asking the sales contact to stop didn't work, it turns out that most people don't make public the contact info for the sales managers' boss.
I'm not going to play corporate politics somewhere where A) I don't work, B) I have no ability to contact anybody with control over any policy, and C) the people in control, even if they were publicly accessible, don't understand why spamming isn't cool. It's easy enough just to contact the dudes running the infrastructure used to spam me. And because they're techies and not salesmen, they're actually nice people and already know this kind of emailing is unacceptable. They might not have control over the policy, but they have something I don't: access to the people who can fix the policy or at least get me off the list.
It was actually the nice alternative to calling my netadmin. He was a very good admin, the emails would disappear from my inbox instantaneously, but when he checked the spam filter and marked true positives, his scripts made people wind up on email blackhole lists.
I read this as GoDaddy releasing her email address only. In theory, isn't an email address only personally identifiable if the address owner has done some action linking it to a real world identity? I assume that's the argument GoDaddy would make.
However, it should have been made abundantly clear to someone reporting spam that their email address may be disclosed to the accused party.
That may be true if your email address is pm@example.com, but less so if it is philip.mclelland@example.com. For example.
And no, reporting abuse should not carry the expectation of having anything about the reporter disclosed to the abuser. That would severely discourage the reporting of abuse.
Except of course if you deal in any way whatsoever with GoDaddy you should always expect the worst possible outcome.
So hypothetically speaking you'd be OK with receiving a message from a hosting provider you did business with accusing you of spamming an unspecified person at an unspecified e-mail address and threatening to terminate your account, leaving you with no way of knowing what actually happened?
The opt-in argument is useless since there is no way to verify that the user subscribed in the first place, whether you give them the address or not. All you do is provide value to the spammer, since they have now verified that the email address is indeed real and read by a person. When reporting abuse you can already forge any email out of nothing, and you cannot prove that the email was forged unless they have a trace of the email being sent by their server (logs), and if they have that trace they can easily see a pattern of mass distribution and start an investigation by contacting the other recipients on that list, or just wait for more reports to come in. Guess it's been a while since I worked at an ISP, but I have never heard of a spam abuse investigation strategy that involves forwarding the address to the suspected spammer.
If I am innocent, I will tell the ISP that first.last@example.com opted in, and I will be telling the truth. If I am a spammer, I will say the same, and I will be lying. So what difference does it make?
If GoDaddy released the email address, then all the person had to do was go Google that email address and most likely they would have found it. Or, they could find it using DomainTools whois lookup (if they didn't use whois privacy on ALL domains they own at all times), or use Gmail or Google Plus to find out who is associated with that email address.
Once the email address is given out, then it's just as if someone had all their personally identifiable details.
Yes, I agree, even if it is firstname dot name @ whatever, you still can just google that email address and even get more information or look it up online to get photos, address information, etc. etc.
> they certainly didn't need to release her personal information to the spammer
At that point in the process, your premise that they are a spammer is flawed. They are an accused spammer. Even though GoDaddy's customer service process isn't a courtroom, the principle of innocent until proven guilty should apply when penalties could be applied.
A small business, individual, big company, anybody should have the right to have full information to adequately defend themselves from false claims.
You don't think there are unscrupulous small businesses out there that file false spamming claims on their competitors? That does happen.
Or you don't think that people actually do opt into email lists, forget about it, and then accuse a company of spamming a few months later? It also happens.
Small businesses making false claims are pathological and completely identifiable cases by a hosting company themselves, they don't need to give the 'accused spammer' _anything_ to verify that sort of thing.
If you even want to go with the 'courtroom' analogy, accused only get the chance to 'confront' their accuser in court, they don't get a dossier on them outside of court so they can do whatever they want. You know why? Because this type of thing would happen.
This is nothing short of harassment and defamation/libel
> GoDaddy: "We have received complaints that you've been spamming. Give us a list of SHA-1 hashes of addresses of the people that opted in and show us how they opted in."
Considering that the spammer has the email addresses already, it would be as simple as forging a letter. Even a fake handwritten sign-up form should "prove" it. No one is going to do a handwriting check to make sure it's actually correct.
So, that process assumes that the end users are technically competent. Based on the way it was phrased (it sounds like this guy just CCd everyone), that does not seem to be the case.
That also assumes that a person looked at this email before it was forwarded on. With a hosting company the size of GoDaddy, that's unlikely.
OK, email me at f234567a360f54c1d31a70936f336bc679ba4f9f (sha1sum of an email address with no trailing carriage return or line feed[1]) and I'll believe you.
In general, the search space even for email addresses is probably too large for me to crack in a few days, but in the context above, where the author's email was already available online (on her website, in SPAM databases, in leaked credential datasets, ...), there is hardly any difference. In any case, if you consider my email address "personally identifiable information", I consider its checksum such information as well.
> In any case, if you consider my email address "personally identifiable information", I consider its checksum such information as well.
I wonder what the odds are on a hash collision from another email address (including abusing + addressing) that genuinely belongs to another person (rather than just exists) and therefore the resulting hash does not uniquely identify a single person.
The 'birthday attack'[0] article covers this pretty well, but if we take the output size of a SHA-1 hash as 160 bits, and assume its outputs are equally distributed[1], the number of addresses you would need to hash by brute force (equivalent to a non-maliciously generated accidental collision across all addresses ever) is:
sqrt(2**160 * PI/2) ~= 1.5 x10**24
on average before the first collision occurs; the 50% probability point is slightly lower, at sqrt(2**160 * 2*ln(2)) ~= 1.4 x10**24.
(if I understood/got the maths right)
[0] https://en.wikipedia.org/wiki/Birthday_attack
[1] This is the intent of all hash functions, and I don't think there are any fundamental attributes of email addresses that would cause systematic bias in the output
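For anyone who wants to check the birthday estimate above, here is a small Python sketch. It assumes an ideal 160-bit hash with uniformly distributed outputs. The second figure uses the standard sqrt(2 ln 2 * N) approximation for the 50% point, which is my addition and not part of the comment above.

```python
import math

HASH_BITS = 160          # SHA-1 output size
space = 2 ** HASH_BITS   # number of possible digests

# Expected number of random inputs hashed before the first collision.
expected_first_collision = math.sqrt(math.pi / 2 * space)

# Number of inputs at which the collision probability reaches about 50%.
half_probability = math.sqrt(2 * math.log(2) * space)

print(f"expected inputs before first collision: {expected_first_collision:.2e}")
print(f"inputs for ~50% collision probability:  {half_probability:.2e}")
```

Both values are on the order of 10**24, which is far more than the number of email addresses that will ever exist.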
Assume you have 1 billion (10^9) computers, each computer can do 1 billion hashing operations per second. That is 10^18 operations per second combined.
Rounding up, one day has 1 million seconds (10^6), and one year has 1000 (10^3) days. So, we have 10^27 ~= 2^90 operations per year.
100 million years is 10^8 ~= 2^27. So, you have 2^117 operations in 100 million years. Geologically, there was an Extinction Event [1] about every 100 million years (e.g. 66, 200 and 251 million years ago). So, having an (unintentional) hash collision in more than 128 bits (assuming a good hash function that has uniformly distributed hash) is less likely than an event happening within the next second that kills 50% of the Earth's species.
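The back-of-the-envelope arithmetic above is easy to reproduce. This sketch just multiplies out the same deliberately rounded-up figures from the comment (the variable names are mine) and converts them to powers of two.

```python
import math

computers = 10**9
hashes_per_second_each = 10**9
seconds_per_day = 10**6   # rounded up from 86,400, as in the comment
days_per_year = 10**3     # rounded up from 365, as in the comment
years = 10**8             # 100 million years

ops_per_year = computers * hashes_per_second_each * seconds_per_day * days_per_year
total_ops = ops_per_year * years

# Express both totals as exponents of two for comparison with hash sizes.
print(f"per year: 2^{math.log2(ops_per_year):.1f}")
print(f"in 100 million years: 2^{math.log2(total_ops):.1f}")
```

The exponents come out just under 90 and just under 117, which matches the 2^90 and 2^117 figures quoted above.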
I'm not willing to answer the challenge, but I definitely believe it could be done. If someone was willing to purchase a large list of harvested e-mail addresses and sha1sum them all, it is very likely a commonly used address would show up in it. Now, if the address you used above is actually some single-purpose address similar to what I use for all my online accounts, that would not work, but I believe that very few people use dynamic partial addresses in that way. Not even the simple ones that gmail provides.
> The document then says that in 2011 he sent an email to “hundreds of atheists” with a link to his website and that I had reported him for violating GoDaddy’s policies against spam.
Give it to me in a list along with "hundreds" of red-herrings (let's say < 10000), and sure, no problem.
If you have the original list of addresses, and you are given a shasum, you can easily determine to which address the sum belongs. The proposals above do not indicate that GoDaddy should provide the sum to the e-mail sender though.
Umm. Just leaving this here for anyone who doesn't know - the whole point of hashing things like emails or passwords is that reversing the hash is very difficult (read: near impossible). Indeed, once it becomes feasible to do, the hash is no longer considered useful (for this purpose).
So no, given a hash you can't get the email easily. If this were the case, there would be no point in hashing passwords - might as well store them as plain text.
Password hashing algorithms make it a bit harder to guess passwords by doing thousands of iterations ("rounds") of hashing, in addition to adding a random salt to prevent creating a dictionary for common passwords.
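As a rough illustration of the salt-plus-rounds idea, here is a sketch using PBKDF2 from Python's standard library. The function names, the 16-byte salt, and the 100,000 rounds are assumptions picked for the example, not a recommendation from anyone in this thread.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple


def hash_password(password: str, salt: Optional[bytes] = None,
                  rounds: int = 100_000) -> Tuple[bytes, bytes]:
    """Return (salt, digest) using PBKDF2-HMAC-SHA256 with many iterations."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, rounds)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes,
                    rounds: int = 100_000) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt, rounds)
    return hmac.compare_digest(digest, expected)


salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

The random salt means the same password hashes differently for each user, so a precomputed dictionary is useless, and the iteration count makes every guess that many times more expensive.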
However, e-mail addresses are generally short, human readable, and have a high probability of being at one of a handful of common domains. It would be easy to brute force your way through common e-mail address patterns at common domain names fairly quickly, if they were only protected by a single round of SHA1.
OpenSSL's benchmarking tool claims that one of my servers can do 30 million SHA1s per second given 64 bytes of input each. And we know from Bitcoin that GPUs and FPGAs can do many orders of magnitude faster than that.
How long would it take to get an arbitrary "firstname.lastname@gmail.com" given only its SHA1? The US Census reports that there are about 5,200 common first names and 89,000 common last names, for a total of around 460 million pairs or 15 seconds on my server to try all of them.
I suspect that with some heuristics to favor common e-mail address patterns, guessing at least half of a list of arbitrary e-mail addresses really wouldn't take that long.
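Here is a toy version of that dictionary attack in Python, to show how little code it takes. The tiny name lists and the target address are made up for the example. The time estimate at the end reuses the 30 million SHA1s per second figure quoted above.

```python
import hashlib
from itertools import product


def sha1_hex(text: str) -> str:
    """Unsalted, single-round SHA-1 of a string, as hex."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def crack(target_digest, first_names, last_names, domain="gmail.com"):
    """Try every first.last@domain candidate until one matches the digest."""
    for first, last in product(first_names, last_names):
        candidate = f"{first}.{last}@{domain}"
        if sha1_hex(candidate) == target_digest:
            return candidate
    return None


# Hypothetical miniature dictionaries standing in for the census name lists.
first_names = ["james", "mary", "john"]
last_names = ["smith", "johnson", "williams"]

target = sha1_hex("mary.williams@gmail.com")
print(crack(target, first_names, last_names))

# Scale estimate using the figures from the comment above.
pairs = 5_200 * 89_000
seconds = pairs / 30_000_000
print(f"{pairs:,} candidates, roughly {seconds:.0f} seconds at 30M hashes/s")
```

Swapping in the full name lists and a handful of common domains changes the running time, not the code.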
> A naive approach that might work without either party needing to divulge emails:
> GoDaddy: "We have received complaints that you've been spamming. Give us a list of SHA-1 hashes of addresses of the people that opted in and show us how they opted in."
> Customer: "Here's the list."
> GoDaddy: "At least one complaint email we received does not match the SHA-1s on this list."