
I've had the same problem. For a while I could get good recommendations and reviews by adding "reddit" to the query, since I could find good information there. However, I think the sites have caught on to that, and now I only get 1-2 results from reddit.com; the rest are other sites that mention Reddit just so they show up in the results.


At the risk of any SEO-blogspam people reading this and adjusting their tactics, you can filter by domain, e.g.:

product name review site:reddit.com
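
If you want to script this, here is a minimal sketch of building such a query. The helper names and the base URL default are my own choices for illustration, not anything official; the only part taken from the comment above is the `site:` operator itself.

```python
from urllib.parse import urlencode


def domain_filtered_query(terms: str, domain: str) -> str:
    """Append a site: operator so results are restricted to one domain."""
    return f"{terms} site:{domain}"


def search_url(terms: str, domain: str,
               base: str = "https://www.google.com/search") -> str:
    """Build a search URL for the domain-filtered query.

    `base` is an assumption; swap in whichever search engine you use,
    provided it accepts the query in a `q` parameter.
    """
    # urlencode turns spaces into '+' and percent-escapes the colon.
    return f"{base}?{urlencode({'q': domain_filtered_query(terms, domain)})}"


print(domain_filtered_query("product name review", "reddit.com"))
print(search_url("product name review", "reddit.com"))
```

The first line printed is the same query as above; the second is that query wrapped into a URL you can open directly.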


As long as only a few people use a successful white hat trick, that trick isn’t generally worthwhile for the darker hats to combat.

So the problem isn’t so much that blog spam people will read your comment, but that many ordinary readers will start using the trick, and thus make it worthwhile for the dark hats to address.

So the unfortunate side effect of kindness in information sharing is that it decreases the value of that information.

Therefore, I don’t think there’s a practical way out of endless arms races between $good and $evil.


The problem with using Reddit specifically is that you can't filter by date anymore. Reddit has poisoned their results to show old posts with new dates on Google.


Huh, I wonder, can I download Reddit? Like, all the text posts, ignoring images. I wonder how big a DB that is and how hard it would be to crawl it myself. It can't be more than a few GB of data. I mean, at this point there is a lot of information there that is just begging to be leveraged.


Pushshift has monthly comment[1] and submission data dumps that you can download. The June 2021 comment dump alone was 20+ GB compressed with Zstandard (.zst).

[1]- https://files.pushshift.io/reddit/comments/
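
For anyone who wants to poke at a dump locally, here is a rough sketch of filtering it. It assumes the file has already been decompressed into newline-delimited JSON (one comment object per line) and that each object carries `subreddit` and `body` fields; both are assumptions about the dump layout, so check a few lines of your own copy first. The function name is made up for this example.

```python
import json
from typing import Iterable, Iterator


def matching_comments(lines: Iterable[str], subreddit: str,
                      keyword: str) -> Iterator[dict]:
    """Yield comments from one subreddit whose body mentions a keyword.

    Works on any iterable of text lines, so a multi-GB file can be
    streamed line by line instead of being loaded into memory.
    """
    subreddit = subreddit.lower()
    keyword = keyword.lower()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            comment = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines rather than abort the scan
        if not isinstance(comment, dict):
            continue
        # Field names below are assumed, not confirmed by the thread.
        if str(comment.get("subreddit") or "").lower() != subreddit:
            continue
        if keyword in str(comment.get("body") or "").lower():
            yield comment


sample = [
    '{"subreddit": "BuyItForLife", "body": "This kettle lasted 10 years."}',
    '{"subreddit": "pics", "body": "Nice kettle photo."}',
    'not json at all',
    '{"subreddit": "buyitforlife", "body": "Avoid the cheap toaster."}',
]
for hit in matching_comments(sample, "BuyItForLife", "kettle"):
    print(hit["body"])
```

On a real file you would pass an open file handle instead of `sample`, e.g. `matching_comments(open(path, encoding="utf-8"), ...)`, where `path` points at your decompressed dump.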


Great tip, thanks. Funny how much I used to use Google dorks (which were introduced to me in a college course), but over time I completely forgot about them.

About the risk though: it's happened already. Remember the "to find any book free online, just search 'filetype:pdf book-name'" tips that were popular online a while ago? Now it's all just PDFs on public Google Drives with tons of book names and a single link leading to some sketchy site.



