> Inside, are certain sites more worthwhile - and which ones (eg reddit, eBay, trade union websites, whatever)
Yes, absolutely. For many purposes, websites that sell their own data are less useful, because anyone can buy the same signal (less exclusivity). Which specific sources are most valuable depends on what the data is about.
> How about scraper brokers? Do they exist?
Yes. You're not getting access without an NDA in addition to paying quite a lot.
> Are there scammy scrapers? Make up BS and sell as scraped data?
That depends on how easy it is to verify the data. For most of what you'd term "alternative data" you'll know whether it's real within 2 to 12 weeks, so selling crap isn't sustainable.
But a lot of parties scrape dodgy financial time series data (ticks and quotes on equities or options) and sell it priced as though it were tick data, when it's barely accurate even as OHLC. They mostly sell this sort of data to amateurs who don't realize tick data is expensive for a reason.
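If you want a quick sanity check on a feed sold as "tick data", here is a rough sketch of one heuristic, not a definitive test. It assumes you have the record timestamps as timezone-aware `datetime` objects; the function name `looks_like_bars` and the thresholds `bar_seconds` and `max_per_bar` are made up for illustration. The idea is that a feed rebuilt from OHLC bars can carry at most four prices per bar, while genuine ticks on a liquid instrument tend to arrive in much larger, irregular bursts.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def looks_like_bars(timestamps, bar_seconds=60, max_per_bar=4):
    """Heuristic: True if no bar interval holds more than `max_per_bar` records.

    A feed rebuilt from OHLC bars has at most four prices per bar
    (open, high, low, close). Both thresholds are illustrative assumptions.
    """
    if not timestamps:
        raise ValueError("no timestamps given")
    # Bucket every record into the bar interval it falls in.
    per_bar = Counter(int(ts.timestamp()) // bar_seconds for ts in timestamps)
    return max(per_bar.values()) <= max_per_bar


# Synthetic demo data (made up, not real market data).
start = datetime(2024, 1, 2, 14, 30, tzinfo=timezone.utc)
# Four evenly spaced records per minute, as if unpacked from 1-minute bars.
suspicious = [start + timedelta(seconds=15 * i) for i in range(40)]
# Hundreds of records per minute, closer to what real ticks look like.
plausible = [start + timedelta(milliseconds=137 * i) for i in range(500)]

print(looks_like_bars(suspicious))
print(looks_like_bars(plausible))
```

A feed can pass this check and still be garbage, and an illiquid instrument can legitimately trade only a few times per minute, so treat a `True` result as a reason to look closer rather than as proof.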
> How big is this?
Very big. Most hedge funds ingest a lot of data, whether they curate it internally or source it from elsewhere.
Not that I'm aware of, no.