Discovered: Botnet Costing Display Advertisers over $6,000,000 per Month (spider.io)
128 points by blahpro on March 19, 2013 | 87 comments


As much as the HN crowd might rail against walled gardens (most notably iOS) and managed platforms like ChromeOS, every time I read one of these posts I think that for the vast majority of people, it's the best thing for them.

Botnets typically don't spread in a sophisticated way. Most of the time it's spam emails or dodgy ads with "hey! install this random .exe file and you can have emoticons in Outlook!"

I think Chrome has shown us the advantages of an automatically updated browser. The future in personal computing I think lies squarely in an automatically updated (even managed) sandboxed environment.

This isn't to say that's right for everyone of course.

But how much fraud, extortion, DDoSing, identity theft, invasion of privacy (eg ratting), etc do people need to put up with before they demand a better way?

EDIT: to address two points:

1. Side-loading is orthogonal to the issue of a sandboxed managed environment. I agree users should be able to side-load. Most won't know how and won't care and that's a Good Thing [tm];

2. Sure the central server can get compromised but the thing is botnets rarely spread in a sophisticated fashion. It's all social. The Facebooks, Apples and Googles of the world have far more experience and a far better track record in dealing with these kinds of threats.


>The future in personal computing I think lies squarely in an automatically updated sandboxed environment

- When the future moves to sandboxed environments, (financial) risk will move there too: attacks will target the sandbox and will develop using the technologies allowed within it. Disallowing anything but ports 80/443 appeared to make for a secure environment 10 years ago, and now JavaScript, cookies, and web sites attract many more diverse types of attack than before, all within what was allowed.

- A managed, self-updating environment, or a walled garden, already gives the authority greater power over every activity. That often means a bigger cut (no botnet can dream of 30% of all transactions...) and more draconian power than sporadic illegal attacks on some businesses. This is a bigger risk overall, to diverse businesses, than this poor botnet poses.


I don't disagree with your conclusions about managed platforms being better for end-user security, but for what it's worth, the vast majority of botnet payloads are loaded by drive-by exploits. Email-based infections are generally rather targeted these days, because spray-and-pray email infections typically get caught in spam filters.

While the bulk of the drive-bys come from plugins that don't exist on your classic walled garden (iOS), WebKit is far from invulnerable. iOS (and others) are quite resistant, but so was ASLR at one point - malware rises to the occasion.


The case for walled gardens is the same old argument in new clothes.

Every so often, someone comes along and says that people don't know what is good for them, so it's better that someone decide for them rather than let "them" decide.

Sometimes it happens in politics, and we get dictatorships. Sometimes it happens in the market, and we get illegal monopolies. Sometimes it happens with the police force, and we get a police state. Sometimes it happens in crime, and we get mafias. Sometimes it happens in families, and we get forced marriages. And sometimes it happens with software, and we get walled gardens.

Deciding for people by removing all choices is not "a better way". We just tend to forget what the cost is of losing the ability to choose for ourselves. It's that, or we are saying that all the software around us is of such small importance that being allowed to decide your own fate is not an important thing.


Indeed, it's a matter of basic education really: most people don't know about spammers or how well organized they are, but the moment they learn about their tactics they catch on quick and stop falling for link bait blogs and comment spam.

Better that than not being able to install what you want because "the garden protects us!"


cletus: I agree, but also think that lack of diversity in user environments is a big contributing factor to this problem. With 90%+ of people using only three computing platforms (Windows, Android, iOS), criminals and other sociopaths have a powerful economic incentive to develop antisocial viruses and botnets. So long as only two or three user environments remain dominant, people will be exposed to this kind of 'Irish potato famine' risk.[1]

--

[1] http://en.wikipedia.org/wiki/Great_Famine_%28Ireland%29#Pota...


This is a great point and you should totally start a competitor.


Are you suggesting he start work on a new computing platform because the existing market lacks diversity?

While I agree with the point that a lack of diversity begets vulnerability, I don't think starting a competitor would be the most effective idea. Monocultures usually have strong forces that keep them in existence. With bananas, it's an inability to reproduce. With invasive plants like garlic mustard, it's allelopathy. With the industr(ies) we're talking about, it's the existence of massive inter-industrial coalitions between hardware manufacturers, software companies, infrastructure providers, and the like that instigate the lack of diversity.

If cs702 tried to take action by building a competitor, he would have about the same chance of success as a sapling in a field of kudzu. Looking at the factors that cause the monoculture and finding a way to disrupt them, I think, is a more promising idea.


Sandboxing is great for the typical user.

But we should never stop demanding that we can install our own certificate roots for signed binaries to verify against, and that users are allowed to make their own decisions about allowing access outside the box (if they think they are capable).


Honestly, this should be requirement #1 in every system. Give the user an easy (but not easily exploitable, of course) way to unlock the system.


Sandboxing only helps so much. You can run a browser extension in a sandbox that defrauds advertisers.


A fresh verified binary (including all extensions) can be loaded into the sandbox each time the user opens the browser, so the infection would be limited to the current session only.


The sandboxing problem is not infecting people with binaries they don't want. It's a social attack where users install Bonzi Buddy style add-ons that have both wanted and unwanted behavior, or where people just click yes to download that video viewer software, etc. In the end, all sandboxing does is buy you a popup which people are trained to click on.


What about a sandbox where all programs that came from the internet were locked in, except for verified binaries? Then make it difficult to add new certificates, e.g. require the user to enter a password (perhaps different from their login). Have Microsoft say no program should require this, and as long as they do a decent job of adding certificates themselves, most users will not even bother learning how to.


For those users, I guess we'd hope they don't install their own certificates and Microsoft/Apple/Google/Canonical can invalidate the Bonzi Buddy certificates for bad behavior.

You're right that it's a difficult problem and probably unsolvable but IMO that doesn't mean we can't reduce it to much less than it is currently.


Totally agree, and I think part of the popularity of this whole 'web application' swing is because browsers like Chrome et al are so well sandboxed also. It makes it almost entirely risk free to automatically download and run random code (javascript), something that would be entirely unthinkable if done natively in today's environments.


"The future in personal computing I think lies squarely in an automatically updated (even managed) sandboxed environment."

Until the day where the main server serving the automated updates gets compromised and instead of serving, say, an updated Chrome, it serves a version of Chrome which a) is compromised on the behalf of a botnet master and b) never ever accepts any other update automatically.

Because people have been trained never to update their browser themselves anymore they'll think everything is fine.

Because hundreds of millions --if not billions-- of people are running Chrome suddenly you have the biggest botnet ever out there.

I find it very interesting that you fear extortion, exploits, DDoS, identity theft, invasion of privacy and whatnots as an argument for putting something in place which potentially can be way more destructive.

But of course this shall never happen, right? Just as we haven't seen Facebook getting penetrated and just as we haven't seen any major bank getting hacked, right? And rogue employees also don't exist, right?

Be careful about what you wish in the future...


It is really just choosing between lesser of 2 evils.

1.) A million people using $BROWSER, never updating, continually downloading free_ipad.exe

2.) A million people being owned by an intelligent hacker because of a fault of Google's, one that is relatively quickly patched, and I'm sure someone will figure out something to disable those rogue Chrome installs.

The first scenario is much, much more likely to happen, and while the second could happen, I doubt Google would sit on their hands while it does.

I guess the only problem is with the second scenario: it's not your fault if you get screwed. It's your mom's, and just your mom's, problem if she downloads free_ipad.exe, but if Google is hacked, all of a sudden all your info is compromised and it wasn't your fault.

Considering these two options, however, IMO, I'd rather place my faith in the Google overlords keeping us all safe.


How is this different to the status quo of hundreds of millions running old exploitable versions of $BROWSER, infected with malware which hijacks your internet connection? All you've done is swap one delivery mechanism for another.

You don't fix this by switching out automatic updates for manual ones, because users will blindly install an update even if it's a badly-designed 'update dialog' popup on a shady website. You fix it by ensuring users can trust their software, and care enough to do so.


Did the "centralized anything == bad" camp show up or something? This is a surprisingly hyperbolic rant about browser security.


I don't think many home users are using Outlook.


Until she got an iPad my mother ran Outlook. It’s all she knew how to operate for a mail program, and she "wasn’t learning something new".

I don’t think she was alone - I’d bet a ridiculous number of $300 netbooks have a $100 Windows OS and $200 MS Office Suite installed on top.


and my mother has been using yahoo mail forever... I still doubt most home users actually use outlook (most home users wouldn't even know how to set it up).

It's very much an enterprise / business thing for the most part.


Strictly speaking, you are probably right. Outlook has often been bundled with the higher tiers of the Microsoft Office suite and omitted from the lower tiers. But Outlook Express and Windows Mail? Certainly there are many home users using them. They've been bundled with Windows or IE for years.

I don't think this distinction affects the grandparent comment's point.


It doesn't matter if it's bundled or not. How would any grandparent know how to set up a mail client?

Most computer illiterate people I know have always used hotmail or yahoo mail (slowly changing to gmail) rather than setting up pop3 or imap access to a mail server....


Seriously?


Well that leaves the real questions: what sites were the bots clicking ads on, who owns those sites and which ad networks were they using?


> spider.io has observed the Chameleon botnet targeting a cluster of at least 202 websites. 14 billion ad impressions are served across these 202 websites per month. The botnet accounts for at least 9 billion of these ad impressions.

The odd thing is that either these sites are already very big, or there are other ways they are getting 5 billion ad impressions.

A list of these 202 websites would be informative. I guess a number of them could be fake, to throw the scent off?


Perhaps a coincidence, but this AdWeek report[1] flags a number of "ghost sites" that offer huge volume of impressions for sale on the exchanges and have real advertisers but don't seem to have any actual human beings present. One site in particular they mention usbuildingdigest.com has a very large number of DSP and other data integrations[2]. Suggestive to say the least. A number of other sites are mentioned in the AdWeek report.

[1] http://www.adweek.com/news/technology/meet-most-suspect-publ... [2] http://imgur.com/WHxmmHo


This list of sites will change in 60 days. These sites are like pop-up store fronts for ads.


Very informative and, though it is not explicitly stated, we can infer that this evidence cuts to the point of how low some competitors will stoop to exploit pitfalls of web advertising.

The team at spider.io has found a great niche and has impressive results - I always enjoy seeing posts like these pop up from them. Keep up the great work!


I'm fairly confident all of the revenue from one of my sites comes from botnet ad clicks. I use CloudFlare and when I set it up at first I used the standard settings for blocking bots. My ad revenue flat lined. Took the botnet protections off and my ad revenue went right back to what it was before.


haha, I like this post.

Are you gonna leave it? I must admit, screwing advertisers doesn't feel so wrong, I'm certainly no fan, but you can't ignore that it's not very ethical.

Given that it's not technically your fault, you'd very likely never get blamed, and your actions will likely not change the world in any way, what will you do?


Before you blacklist the IPs listed in the article, it might be worthwhile to query your transactional history and verify real purchases are not occurring on those addresses.

When I did this, a few of the IPs had a significant number of orders. Interestingly, the IP with the most orders mapped to E! corporate headquarters.

http://www.networksolutions.com/whois/results.jsp?ip=208.78....
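
For anyone wanting to run the same check, here is a minimal sketch of the idea: split the suspect IPs into "no orders, probably safe to block" and "has orders, look closer". Everything in it is made up for illustration: the addresses come from the reserved documentation ranges, and `split_blocklist`, `max_orders` and the shape of the order rows are assumptions, not anything from the article.

```python
from collections import Counter

# Hypothetical data: IPs flagged as bots, and rows from an order history table.
suspect_ips = {"192.0.2.10", "198.51.100.7", "203.0.113.5"}
orders = [
    {"order_id": 1, "ip": "192.0.2.10"},
    {"order_id": 2, "ip": "192.0.2.10"},
    {"order_id": 3, "ip": "203.0.113.99"},
]

def split_blocklist(suspects, order_rows, max_orders=0):
    """Separate suspect IPs with real orders from those without any."""
    orders_per_ip = Counter(row["ip"] for row in order_rows)
    # A Counter returns 0 for IPs that never placed an order.
    safe_to_block = {ip for ip in suspects if orders_per_ip[ip] <= max_orders}
    needs_review = suspects - safe_to_block
    return safe_to_block, needs_review

safe_to_block, needs_review = split_blocklist(suspect_ips, orders)
print("block:", sorted(safe_to_block))
print("review first:", sorted(needs_review))
```

An IP that lands in the review set may be a corporate NAT or proxy with both bots and paying customers behind it, which is presumably what happened with the E! address.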


The only guaranteed antidote to this kind of fraud is performance advertising (pay per sale). I think pay per click and pay per impression, though arguably useful for brand advertising, will always be vulnerable to sophisticated scams like this.


There is no guarantee against this, which is part of what keeps web advertising prices low. Fraud is essentially priced into the product. The title is misleading, as it assumes all cost is borne by the advertisers. In truth, part of the cost is implicitly paid by legitimate website owners and the ad network, who must price impressions lower to offset this kind of fraud.


> The only guaranteed antidote to this kind of fraud is performance advertising (pay per sale).

It's guaranteed, but it probably won't work: "Thanks for doing business with us -- by the way, where did you hear about us?" "Online, but I forget where."

Advertisers need to be able to associate an advertisement with a result. Otherwise the effectiveness of advertising is a myth.


That's not how it works. Typically you have a referer ID in the URL which tracks the source.


Not when a customer walks in off the street. My point is that online tracking seems necessary to evaluate the effectiveness of advertising. If we remove tracking, which many people advocate, we're reduced to older methods to evaluate advertising.

And believe me, I'm not arguing in favor of tracking -- only presenting the most often heard argument.


That does not address what the post you're replying to stated: That PPS is a more trustworthy metric than PPC. The user is rarely, if ever, asked in this - it's all tracking.


Your argument assumes that sales follow from clicks (and can be tracked in the same way), so one need only follow the clicks to a sale. But this doesn't take into account that many sales take place long after a client has browsed online advertising. This is particularly true for big-ticket items.

> That PPS is a more trustworthy metric than PPC.

Yes, true, but only for sales that follow directly from an advertisement, with no intervening time or context changes.

I agree with the basic argument that PPS is more reliable and meaningful, but it would be worthwhile to know how many sales follow directly from an initial advertising exposure, as opposed to a more complex decision-making process.

Consider the diamond campaign waged by De Beers described in another HN thread today. The advertising costs were high, but the goal was to change consumer perceptions over a period of decades (and successfully). This is far removed from the model we're discussing, essentially an impulse purchase -- the De Beers campaign wasn't directly correlated with diamond sales at all.

My point is that not all advertising can be shoehorned into a directly trackable purchase, yet those other kinds of advertising might still be valuable.


PPA can be as misleading as CPC or CPM. If you ask for an email, it could be iframed with a more compelling offer. If you ask for a credit card purchase, it could be done with big blocks of stolen credit card numbers. This also does not include covariance between marketing channels aka the attribution model problem (ex. if you blast TV ads for freecreditreport.com then the CPA for online advertising magically drops).


All the scammers have to do is purchase, say, x% of clicks and then refund the money. It's very hard to track which ad click generated exactly which sale. For example, through the AdWords content network you don't know which site generated the click/purchase. You can know that content network ads get more refunds, but without access to Google's data you can't know the specifics.

So as good as CPA sounds, it's not foolproof either. You can just bypass it with either stolen credit card purchases or refunds.


Maybe it really is just Windows users running IE 9 on Windows 7... and maybe it just crashes on clicks sometimes because the tracking overloads it?

Do they have the bot code? I didn't see anything about where it came from...just an assumed analysis of effect. Just saying, it might not be a bot or malware at all.


The analysis of click and mouse traces location distribution vs humans at the end makes it pretty clear that these are not humans.


The infections by state chart would be more interesting if it was per capita.


A few things -

1. Why?

2. How widespread is this in general? How long before most web advertising is bot-fraud as users learn about ad-blockers?

3. Didn't realise my mouse traces were being recorded by advertisers in such detail.... I do not like this.


1.) The only "useful" application of this that I can immediately think of is to drive up the cost per impression on your competitors. But this seems like a very short term strategy as anyone doing meaningful online marketing is watching the cost per acquisition (a person who actually buys something) just as close or closer than impression cost, and your acquisition cost is going to go through the roof if you get a bot attack.

2.) I haven't been able to find meaningful info on this yet.

3.) I work in e-commerce tech. I had NO IDEA how bad it was before I got into this industry. Just had a sales call with a company who is tracking 400 unique data points per second on users. They track mouse clicks, mouse movements, what you highlight, how long on site, navigation pattern, buying behavior from other in-network sites, page position, etc... and then use that data in real time to dynamically generate promos and offers to "encourage buying behavior".


That there is quite scary! May have to investigate "NoScript" a bit more closely...


Ghostery should be enough and is less invasive. https://www.ghostery.com/download


Thanks for that, had heard of it before but have now investigated and I like it!

It does automatically what I'd been manually doing with Adblock i.e. block loading resources from facebook when not actually on facebook.com, and a whole lot more. Useful tool.


1. There's a link a comment up in the thread[1] that points out Ghost Sites: websites with high visitor traffic that obviously no one would ever really visit, like Babypowder.net or some such malarkey. If you're running a ghost site, you contract out to a botnet to deliver clicks and impressions at a rate lower than the return from the advertising.

2. The article shows a bunch of these ghost sites so I would suspect that it's common in the world of scamming/ghost sites to drive up their CPM (Cost per Thousand). Hell, if I was running a legit site, I'd outsource the bullshit ghost sites to a Russian company so that my own CPM gets higher. As an example: I bet the value of "Baby Powder" as a search term is probably twice what it should be.

3. Get "Live HTTP Headers" or run your traffic through a proxy and see how many HTTP requests you generate just using a basic ad. I worked in Rich-Media Advertising and the way we tracked was every interaction. Right when I left in 2008-2009 we started tracking more and more. Really depended what the client bought/paid for. But we'd always track opens, clicks, etc.. the basic stuff. We didn't do heatmaps or X,Y coordinate tracking, but we had the data.

It was impressive to see, we would host our units on Akamai and due to the sheer traffic were unable to do real-time SQL inserts, so we would crunch the HTTP logs in hour increments after the fact to generate the data. So a typical GET string would be ad.domain.tld/?id=adunitMD5&type=Click&x=0&y=0&action=MD5

Of course it wouldn't be that verbose, but the GET string would be quite long with all the data. This was all done through a combo of JS and ActionScript.
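
To give a feel for the after-the-fact log crunching, here is a rough sketch that tallies events per ad unit from request URLs shaped like the GET string above. It is not the code we ran: the sample URLs are invented, the `id` and `type` parameter names are just the ones from my example string, and `count_events` is a hypothetical helper.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Hypothetical request URLs, as they might be pulled out of an HTTP access log.
sample_requests = [
    "http://ad.domain.tld/?id=adunitMD5&type=Click&x=0&y=0&action=MD5",
    "http://ad.domain.tld/?id=adunitMD5&type=Open&x=12&y=40&action=MD5",
    "http://ad.domain.tld/?id=otherUnit&type=Click&x=3&y=7&action=MD5",
]

def count_events(urls):
    """Tally (ad unit, event type) pairs found in the query strings."""
    counts = Counter()
    for url in urls:
        params = parse_qs(urlsplit(url).query)
        # parse_qs maps each key to a list of values; take the first one.
        ad_id = params.get("id", ["unknown"])[0]
        event = params.get("type", ["unknown"])[0]
        counts[(ad_id, event)] += 1
    return counts

event_counts = count_events(sample_requests)
for (ad_id, event), n in sorted(event_counts.items()):
    print(ad_id, event, n)
```

Run hourly over the rotated logs, a tally like this avoids doing a database insert on every single tracking request.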

[1] http://www.adweek.com/news/technology/meet-most-suspect-publ...


They own the sites that are displaying those ads and get a share of everything google makes, so their adsense earnings go through the roof


I don't think they can fool Google. In the article there's patterns of clickthrough and mouseovers for both a real user and a bot. If anyone, I believe Google can detect what's likely to be a fake impression and click. It's their bread and butter.


>It's their bread and butter.

Don't they get more bread and butter because they charge advertisers for the fraudulent clicks too and profit from them? It's not like the advertisers can ignore AdSense and go to a competitor easily without losing a lot of traffic.


money, that's all.


Oh sure, I understand that motivator.

I was just wondering who it benefitted. I guess the site operators themselves, but it seems pretty underhand.


the operators sign up to be publishers with ad networks who pay them a cpm rate for all the impressions they generate.


Using the assumption that there is never just one cockroach, what is a good multiplier to arrive at total-fraudulent dollars-per-month?


It's interesting that the infection seems most common in the Southwest. Do botnets like this spread geographically based on email address or physical connection/proximity? Or are the targeted sites or infection points targeted at users in the southwest? Or are people in the north/northeast more likely to use anti-virus software or be savvy enough to avoid this?


They appear to have coloured states based on the number of infected hosts in such states. Since states in the northeast are generally smaller (i.e. less populous) than, say, California, we can assume that there are also fewer people getting infected there.

E.g., there are approx. 1e6 people in Maine and 4e7 people in California. If you assume 1e2 infected hosts in Maine (0-99) and 3e4 in California (>1e4, 1.2e5 in total), you get an infection rate in California of about 7.5 times that in Maine.
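
Spelled out, the back-of-the-envelope calculation looks like this. The populations are rounded and the infected-host counts are my guesses from the map's colour buckets, so treat the result as an order-of-magnitude estimate at best.

```python
# Rounded populations and guessed infected-host counts from the comment above.
population = {"Maine": 1e6, "California": 4e7}
infected_hosts = {"Maine": 1e2, "California": 3e4}

def infection_rate(state):
    """Infected hosts per resident for one state."""
    return infected_hosts[state] / population[state]

ratio = infection_rate("California") / infection_rate("Maine")
print(f"Maine:      {infection_rate('Maine'):.1e} infected hosts per person")
print(f"California: {infection_rate('California'):.1e} infected hosts per person")
print(f"California/Maine ratio: {ratio:.1f}")
```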

Given the very coarse graining in the data source, such a factor can either be dismissed as statistical fluctuation or you could try to explain it using, for example, an infection model that favours geographical proximity, such as one based on Facebook friends. Furthermore, it might well be that internet connectivity is better in California than it is in Maine and the bot prefers hosts with high uplink rates. I don’t know :)

Edit: We don’t know what websites were targeted, but maybe they ran ads that would prefer users from the southwest for some reason?


Ah, didn't realize they weren't weighted. Still, it'd be interesting to see if the apparent bias to the southwest is statistically significant.


Any idea what the list of 200 sites is?


Well, that's what you get for being in the click based ads game really. At this point, I would assume that these companies should just accept this as an occupational hazard. It's not like they can ever really beat the bots.


At Strata (O'Reilly's Big Data conference), there was an entire session track dedicated to fraud prevention and legal movement on this type of stuff. Although you will never be able to stop this (unless you stop people from being driven by money), the data community is taking this kind of detection very seriously. Every business, if it gets large enough, is going to care about this type of fraud, if only to pay it some amount of lip service.

I saw a presentation from a data scientist at bitly. She showed that spammer links have a distinctive traffic shape (constant over time) while real links have a totally different one (initial peak followed by logarithmic drop to zero). Similar patterns exist in advertising campaigns. $6MM/month seems like a lot of breakage for someone to be fine cutting that check.
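
To make the traffic-shape idea concrete, here is a toy heuristic, not what was shown in the talk: compare the busiest hour to the average hour. Flat, bot-like traffic gives a ratio near 1, while a link with an early spike that dies off gives a much larger one. The sample series, the `looks_flat` name and the threshold of 2.0 are all assumptions picked for illustration.

```python
def peak_to_mean(hourly_clicks):
    """Ratio of the busiest hour to the average hour."""
    total = sum(hourly_clicks)
    if total == 0:
        return 0.0  # no traffic at all, nothing to measure
    mean = total / len(hourly_clicks)
    return max(hourly_clicks) / mean

def looks_flat(hourly_clicks, threshold=2.0):
    """True when traffic is roughly constant over time (threshold is arbitrary)."""
    return peak_to_mean(hourly_clicks) < threshold

# Made-up hourly click counts for two links.
organic = [900, 400, 200, 100, 50, 25, 10, 5]    # early peak, then decay
steady = [100, 98, 103, 101, 99, 100, 102, 97]   # constant over time

print("organic flat?", looks_flat(organic))
print("steady flat?", looks_flat(steady))
```

A real detector would need far more than one ratio, and as soon as a rule like this is public a bot operator can shape their traffic to get around it.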


What's to stop a botnet operator from incorporating that data scientist's findings into their bot behavior? It's a chicken & egg problem, and it will hopefully lead to a better alternative to online advertising (perhaps something more user-centric).


> It's a chicken & egg problem

I think you mean it's an arms race. Arms races are good markets to be in as the hash-checking AV companies have proved.


There will always be an incentive to commit fraud as long as there's money to be made. Why doesn't a robber of houses move to a new city each time they commit a crime? There are costs to all crime, especially in the setup. If you raise the input cost to generate a reward, then you make it less attractive as an avenue to fraud. At least among the unskilled criminals.

I'll totally admit I'm skewed by operating on the data side and want to believe that my work has some lasting positive impact, and isn't a band-aid.


"Well, that's what you get for being in the click based ads game really."

Do you find something particularly sinister or unethical about click-based ads (more so than any advertising)?


Oh no, not at all. I realize that could have come across as malignant, but it was just supposed to be a neutral assessment.

I do freelance dev work for some guys that run ads; it's just another business to me.


I don't.

But this is analogous to TV advertisers complaining about TiVO and the ability to skip commercials. Can't stop it!


No matter what line of work you are in, there is a cost of doing business related to fraud - that does not mean it doesn't require some kind of attention even when you can't combat 100% of the fraud.


Is the comparison valid to say botnets are the bacteria of the Internet?


Botnets are inherently malicious. This is obviously not true of bacteria. But in a way they are similar, I suppose. I don't think it's a very fruitful analogy either way.


I am not sure if it can be called non-malicious, but there is definitely a new breed of botnets coming up

e.g. http://internetcensus2012.bitbucket.org/


More like the parasites of the internet.


Except parasites have no central control or a will to do anything beyond survival.


I'm wondering if this might be related to the twitter spam discussed at https://news.ycombinator.com/item?id=5373161


I find this fascinating. How does chameleon infect its victims? Anyone have further reading? Botnets seem incredibly interesting.


Botnets don't really infect their victims. A botnet is just a network of compromised computers (bots).

The malware that forces your computer to participate in the botnet can be delivered by any avenue imaginable. Drive-By Downloads, crapware, embedded into pirated software, etc. Not sure how chameleon specifically did it.


I wonder if ad platform companies like AdWords will see a drop in their revenues once the botnet is dismantled...


Or perhaps an increase because ads become worth more. Fake clicks make less sales, so if there are more sales, people can spend more on advertising. Of course it needs to be at a very large scale for it to influence the actual price per advertisement, but this botnet seems to be rather large-scale.


Wow, I didn't realize the arms race has reached such heights already. Looks like the bad guys are bound to win eventually.


There are some crazy clever schemes out there for click fraud. Check out this link - _WARNING NSFW IMAGES_ http://www.behind-the-enemy-lines.com/2011/03/uncovering-adv...

This was estimated to be making $500K a month before being found... and was a work of pure genius.


Interesting read.

I'm still not sure I understood the role of the "HGTV" sites and how the fraudster was getting money by showing the HGTV ads (even after reading the comments on the post explaining this). Weren't the ads on those parked domains enough to generate the revenue for the fraudster?


"Weren't the ads on those parked domains enough"?

Is greed limited?


It's not about that.

If the ads were only on their own domains, this could have gone undetected. The whole thing was discovered as a result of using those 'legit' websites, and as far as I can tell from the article, using those was an essential part of the scam, i.e. without it, it might not work... but I'm just wondering why.



