Isn't the point of AI that it's good at understanding content written for humans? Why can't the scrapers run the homepage through an LLM to detect that?
I'm also not sure why we should be prioritizing the needs of scraper writers over human users and site operators.
It's not; the crawler would use the LLM to read the contents of the first page and dynamically determine the best way to capture the data (e.g., the zip file from TFA).
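For concreteness, here is a minimal sketch of what that could look like, assuming the OpenAI Python client and requests; the model name, prompt wording, and URLs are all illustrative, not any particular crawler's implementation:

```python
# Sketch: fetch the landing page once, ask an LLM whether it advertises a
# bulk-download option, and prefer that over a page-by-page crawl.
import json
import requests
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def find_bulk_download(homepage_url: str) -> str | None:
    """Return a bulk-download URL if the LLM spots one, else None."""
    html = requests.get(homepage_url, timeout=30).text

    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any capable model would do
        messages=[{
            "role": "user",
            "content": (
                "Below is the HTML of a site's landing page. If it links to "
                "a bulk export (zip/tarball/dump) of the site's data, reply "
                'with JSON {"url": "<absolute url>"}; otherwise {"url": null}.\n\n'
                + html[:50_000]  # truncate to keep the prompt bounded
            ),
        }],
        response_format={"type": "json_object"},
    )
    return json.loads(resp.choices[0].message.content).get("url")

bulk = find_bulk_download("https://example.com")  # hypothetical target
if bulk:
    # One request for the whole dataset instead of hammering every page.
    with requests.get(bulk, stream=True, timeout=60) as r:
        with open("site-data.zip", "wb") as f:
            for chunk in r.iter_content(chunk_size=1 << 20):
                f.write(chunk)
else:
    ...  # fall back to a conventional crawl
```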