We already know the solution: One well-behaved, shared scraper could serve all o...

garganzol · 2026-01-14T01:37:14 1768354634

This is an interesting approach. Archive.org could be such a solution, kind of. Not its cold storage as it is now, but a warm access layer. Sponsorship by AI companies would be a good initiative for the project.

phyzome · 2026-01-14T03:35:44 1768361744

I can't imagine IA ever going for it. You'd need a separate org that just scrapes for AI training, because its bot is going to be blocked by anyone who is anti-AI. It wouldn't make sense for it to serve multiple purposes.

Common Crawl would be a better fit, but still might not want to serve in that capacity.