Hacker News

Isn't the point of AI that it's good at understanding content written for humans? Why can't the scrapers run the homepage through an LLM to detect that?

I'm also not sure why we should be prioritizing the needs of scraper writers over human users and site operators.

How is passing a site's homepage to an LLM supposed to make it develop a custom crawler?

It's not. The crawler would use the LLM to read the contents of the first page and dynamically determine the best way to capture the data (e.g. the zip file from TFA).
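A minimal sketch of that idea: feed the homepage HTML to an LLM and ask it to pick out the best bulk-download link. Everything here is hypothetical — `ask_llm` stands in for whatever chat-completion API the crawler uses, and is stubbed out below for illustration.

```python
from urllib.parse import urljoin


def extract_download_url(homepage_url: str, homepage_html: str, ask_llm) -> str:
    """Ask an LLM to pick the best bulk-download link from a homepage.

    `ask_llm` is a hypothetical callable wrapping any LLM API:
    it takes a prompt string and returns the model's reply as a string.
    """
    prompt = (
        "Below is the HTML of a site's homepage. Reply with only the URL "
        "of the best link for bulk-downloading the site's data "
        "(e.g. a zip or tarball), or NONE if there is no such link.\n\n"
        + homepage_html
    )
    answer = ask_llm(prompt).strip()
    if answer == "NONE":
        raise ValueError("no bulk-download link found")
    # Resolve relative links against the homepage URL.
    return urljoin(homepage_url, answer)


# Stub LLM for illustration: pretend the model spotted the archive link.
fake_llm = lambda prompt: "/exports/site-dump.zip"
url = extract_download_url(
    "https://example.com/",
    "<a href='/exports/site-dump.zip'>Full dump</a>",
    fake_llm,
)
# url == "https://example.com/exports/site-dump.zip"
```

The crawler would then download whatever URL the model returned, falling back to ordinary page-by-page scraping if it answered NONE.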


