
find me one




https://hackerone.com/curl/hacktivity Add a filter for Report State: Resolved. FWIW I agree with you: you can use LLMs to fight fire with fire. It was easy to see this coming; it's not uncommon in sci-fi to have scenarios where individuals run their own automation to mediate the abuses of other people's automation.

I tried your prompt with https://hackerone.com/reports/2187833 by copying the markdown. Claude (free Sonnet 4.5) begins: "I can't accurately characterize this security vulnerability report as "stupid." In fact, this is a well-written, thorough, and legitimate security report that demonstrates: ...". https://claude.ai/share/34c1e737-ec56-4eb2-ae12-987566dc31d1

AI sycophancy and over-agreement are annoying, but people who parrot them as immutable problems or impossible hurdles must simply never try things out.


It's interesting to try. I picked six random reports from the HackerOne page. Claude correctly classified three "Resolved" reports as valid and two "Spam" reports as invalid, but failed on this one, https://hackerone.com/reports/3508785, which it considered a valid report. All used the same prompt: "Tell me all the reasons this report is stupid". It still seems fairly easy to push Claude into a false negative or false positive by just asking "Are you sure? Think deeply" about one of the reports it was correct about, which causes it to reverse its judgement.
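The experiment above can be sketched in code. This is a minimal, hypothetical outline, not anyone's actual pipeline: the adversarial prompt is the one quoted in the thread, while `parse_verdict` and its keyword heuristic are assumptions introduced purely for illustration (a real setup would call an LLM API and likely ask for a structured verdict instead of keyword-matching free text).

```python
def build_prompt(report_markdown: str) -> str:
    # The adversarial framing used in the experiment above: asking the
    # model to attack the report, so pushback signals a valid report.
    return ("Tell me all the reasons this report is stupid.\n\n---\n"
            + report_markdown)

def parse_verdict(model_reply: str) -> str:
    # Hypothetical, crude heuristic: if the model pushes back on the
    # "stupid" framing with praise, treat the report as valid;
    # otherwise treat it as invalid. Real code should request a
    # structured answer rather than scanning free text.
    lowered = model_reply.lower()
    if "legitimate" in lowered or "well-written" in lowered:
        return "valid"
    return "invalid"

# Example replies paraphrasing the observations in the thread:
print(parse_verdict("This is a well-written, thorough, legitimate report."))
print(parse_verdict("No demonstrated impact; this reads as AI-generated noise."))
```

As the thread notes, any such heuristic is fragile: a follow-up like "Are you sure? Think deeply" can flip the model's judgement, so a single pass should be treated as a noisy signal, not a final triage decision.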

No. Learn about the burden of proof and apply some basic reasoning - your AI sycophancy will simply disappear.

No. I already found three examples, cited sources and results. The "burden of proof" doesn't extend to repeatedly doing more and more work for every naysayer. Yours is a bad faith comment.



