
Google crawls the entire page, not just the subset of text that you, a human, recognize as the unchanged article.

It’s easy to change millions of pages once a week with on-load CMS features like content recommendations. Visit an old article and look at the widgets around the page: related articles, most read, read this next, etc. They’ll be showing current content, which changes frequently even if the old article text itself does not.



I'm pretty sure Google is smart enough to recognize the main content of a page, and ignore things like widgets and navigation. That's Search Engine 101.


Yes, of course, but that analysis happens after the content has been visited by the bot. It’s still a visit, and still hits the “crawl budget.”
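To illustrate the ordering being described, here is a rough Python sketch. It is not how Google actually works; the page template, the `fetch` stand-in, and the use of an `<article>` tag to mark the main content are all assumptions made up for the example. It shows that a crawler can only discover that the article text is unchanged after it has already fetched the page, so each check still costs one visit.

```python
import hashlib
from html.parser import HTMLParser


class ArticleText(HTMLParser):
    """Collect text that appears inside <article> ... </article>."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "article" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data.strip())


def digest(text):
    """SHA-256 hex digest of a string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def main_content(html):
    """Return only the text inside <article> (assumed to be the main content)."""
    parser = ArticleText()
    parser.feed(html)
    parser.close()
    return " ".join(c for c in parser.chunks if c)


# Hypothetical page: fixed article text plus a widget that changes weekly.
PAGE = (
    "<html><body><article><p>Old article text.</p></article>"
    "<aside>Most read: {widget}</aside></body></html>"
)

fetches = 0  # counts crawl visits


def fetch(week):
    """Hypothetical stand-in for an HTTP fetch; each call is one crawl visit."""
    global fetches
    fetches += 1
    return PAGE.format(widget=f"headline for week {week}")


week1 = fetch(1)
week2 = fetch(2)

# The full page differs week to week, but the extracted article does not.
page_changed = digest(week1) != digest(week2)
article_changed = digest(main_content(week1)) != digest(main_content(week2))

print(page_changed, article_changed, fetches)
```

The comparison of `main_content` digests happens only after both `fetch` calls, which is the point made above: the analysis comes after the visit.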


So they should stop doing this on pages that they are deleting now.



