LLM making a quick edit, <100 lines... Sure. Asking an LLM to rubber-duck your c...

kraftman · 2025-07-07T14:20:13 1751898013

I keep seeing this argument over and over again, and I have to wonder, at what point do you accept that maybe LLMs are useful? Like how many people need to say that they find it makes them more productive before you'll shift your perspective?

dragonwriter · 2025-07-07T14:51:17 1751899877

> I keep seeing this argument over and over again, and I have to wonder, at what point do you accept that maybe LLMs are useful?

The post you are responding to literally acknowledges that LLMs are useful in certain roles in coding in the first sentence.

> Like how many people need to say that they find it makes them more productive before you'll shift your perspective?

Argumentum ad populum is not a good way of establishing fact claims beyond the fact of a belief being popular.

kraftman · 2025-07-07T18:55:40 1751914540

...and my comment clearly isn't talking about that, but about the suggestion that it's useless to write code with an LLM because you'll end up rewriting 50% of it.

If everyone has an opinion different to mine, I don't instantly change my opinion, but I do try to investigate the source of the difference, to find out what I'm missing or what they are missing.

The polarisation between people that find LLMs useful or not is very similar to the polarisation between people that find automated testing useful or not, and I have a suspicion they have the same underlying cause.

nwienert · 2025-07-07T20:02:06 1751918526

You seem to think everyone shares your view. Around me I see a lot of people acknowledging they are useful to a degree, but also clearly finding limits in a wide array of cases, including that they really struggle with logical code, architectural decisions, re-using the right code patterns, larger scale changes that aren’t copy paste, etc.

So far what I see is that if I provide lots of context and clear instructions for a mostly non-logical area of code, I can speed myself up about 20-40%, but that only works for about 30-50% of the problems I solve day to day at my day job.

So basically, it’s roughly a 20% improvement in my productivity, because I spend most of my time on the difficult things it can’t do anyway.
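
As a rough sanity check of those figures, here is a minimal sketch of an Amdahl's-law-style estimate. It assumes the 20-40% is a speedup on the helped portion only, and that the 30-50% of problems translates into the same share of working time; both are assumptions of this sketch, and `overall_gain` is a hypothetical helper name.

```python
def overall_gain(fraction_helped: float, local_speedup: float) -> float:
    """Overall productivity gain when only part of the work gets faster.

    fraction_helped: share of working time where the LLM helps (0..1).
    local_speedup: speedup on that share, e.g. 0.4 means 40% faster.
    """
    # Time for the whole workload, normalised so the baseline is 1.0.
    new_time = (1 - fraction_helped) + fraction_helped / (1 + local_speedup)
    return 1 / new_time - 1


low = overall_gain(0.30, 0.20)   # pessimistic end of both ranges
high = overall_gain(0.50, 0.40)  # optimistic end of both ranges
print(f"{low:.1%} to {high:.1%}")  # prints "5.3% to 16.7%"
```

Under those assumptions the quoted ~20% sits near the optimistic end of the range; a different reading of "speed up" (for example, time saved rather than throughput gained) would shift the numbers.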

Meanwhile these companies are raising billion dollar seed rounds and telling us that all programming will be done by AI by next year.

girvo · 2025-07-08T05:52:02 1751953922

> Meanwhile these companies are raising billion dollar seed rounds and telling us that all programming will be done by AI by next year.

Which is the same thing they said last year, and hasn't panned out. But surely this time it'll be right...

psychoslave · 2025-07-07T14:31:53 1751898713

That's a tool, and it depends what you need to do. If it fits someone's need and makes them more productive, or even simply lets them enjoy the activity more, good.

Just because two people are both fixing something to the wall doesn't mean the same tool will hold fine. Gum, pushpin, nail, screw, bolts?

The parent thread did mention they use LLMs successfully in small side projects.

MangoToupe · 2025-07-07T18:35:47 1751913347

> at what point do you accept that maybe LLMs are useful?

LLMs are useful, just not for every task and price point.

candiddevmike · 2025-07-07T14:28:53 1751898533

People say they are more productive using Visual Basic, but that will never shift my perspective on it.

Code is a liability. Code you didn't write is a ticking time bomb.

ninetyninenine · 2025-07-07T17:09:45 1751908185

He says it’s only effective for personal projects, but there’s literally evidence of LLMs being used for exactly what he says they can’t be used for. Actual physical evidence.

It’s self-delusion. And also the pace of AI is so fast he may not be aware of how fast LLMs are integrating into our coding environments. Like 1 year ago what he said could be somewhat true, but right now what he said is clearly not true at all.

mike_hearn · 2025-07-07T14:08:34 1751897314

I've used Claude with a large, mature codebase and it did fine. Not for every possible task, but for many.

Probably, Mercury isn't as good at coding as Claude is. But even if it's not, there's lots of small tasks that LLMs can do without needing senior engineer level skills. Adding test coverage, fixing low priority bugs, adding nice animations to the UI etc. Stuff that maybe isn't critical so if a PR turns up and it's DOA you just close it, but which otherwise works.

Note that many projects already use this approach with bots like Renovate. Such bots also consume a ton of CI time, but it's generally worth it.

airstrike · 2025-07-07T14:29:45 1751898585

IMHO LLMs are notoriously bad at test coverage. They usually hard-code a value to have the test pass, since they lack the reasoning required to understand why the test exists, or the concept of an assertion, really.

wrs · 2025-07-07T15:27:45 1751902065

I don’t know, Claude is very good at writing that utterly useless kind of unit test where every dependency is mocked out and the test is just the inverted dual of the original code. 100% coverage, nothing tested.
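
For illustration, a minimal sketch of the kind of test being described. `total_price`, the cart, and the tax service are all hypothetical names invented for this example; only `unittest.mock.Mock` is a real API.

```python
from unittest.mock import Mock


def total_price(cart, tax_service):
    """Hypothetical function under test: subtotal plus tax."""
    subtotal = sum(item.price for item in cart.items())
    return subtotal + tax_service.tax_for(subtotal)


def test_total_price_mirrors_implementation():
    # Every collaborator is a mock, so no real cart or tax logic ever runs.
    cart = Mock()
    cart.items.return_value = [Mock(price=10), Mock(price=5)]
    tax_service = Mock()
    tax_service.tax_for.return_value = 3

    result = total_price(cart, tax_service)

    # The assertions restate the implementation step by step.
    cart.items.assert_called_once_with()
    tax_service.tax_for.assert_called_once_with(15)
    assert result == 18


test_total_price_mirrors_implementation()
```

Every line of `total_price` is covered, yet the test would still pass if the real tax service returned nonsense, because it only checks that the function calls its mocks in the order the code already spells out.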

conradkay · 2025-07-07T17:21:05 1751908865

Yeah and that's even worse because there's not an easy metric you can have the agent work towards and get feedback on.

I'm not that into "prompt engineering" but tests seem like a big opportunity for improvement. Maybe something like (but much more thorough):

1. "Create a document describing all real-world actions which could lead to the code being used. List all methods/code which get called before it (in order) along with their exact parameters and return values. Enumerate all potential edge cases and errors that could occur and whether they end up influencing this task. After that, write a high-level overview of what needs to occur in this implementation. Don't make it top down where you think about what functions/classes/abstractions get created, just the raw steps that will need to occur"
2. Have it write the tests
3. Have it write the code

Maybe TDD ends up worse but I suspect the initial plan which is somewhat close to code makes that not the case

Writing the initial doc yourself would definitely be better, but I suspect just writing one really good one, then giving it as an example in each subsequent prompt captures a lot of the improvement

girvo · 2025-07-08T05:53:10 1751953990

I've not gone into it yet, but I think BDD would fit reasonably well with agents and generating tests that aren't entirely useless.

astrange · 2025-07-07T21:53:27 1751925207

This is why unit tests are the least useful kind of test and regression tests are the most useful.

I think unit tests are best written /before/ the real code and thrown out after. Of course, that's extremely situational.

flir · 2025-07-07T14:19:17 1751897957

Don't want to put words in the parent commenter's mouth, but I think the key word is "unsupervised". Claude doesn't know what it doesn't know, and will keep going round the loop until the tests go green, or until the heat death of the universe.

mike_hearn · 2025-07-07T14:20:58 1751898058

Yes, but you can just impose timeouts to solve that. If it's unsupervised the only cost is computation.
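
A minimal sketch of what such a timeout could look like, assuming the unsupervised agent runs as a child process. `run_with_timeout` is a hypothetical helper, and the sleeping child below is just a stand-in for a runaway agent.

```python
import subprocess
import sys


def run_with_timeout(cmd: list[str], seconds: float) -> bool:
    """Run cmd; return True if it finished in time, False if it was cut off."""
    try:
        subprocess.run(cmd, timeout=seconds, check=False)
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return False


# Stand-in for a runaway agent: a child process that sleeps past the budget.
runaway = [sys.executable, "-c", "import time; time.sleep(30)"]
print(run_with_timeout(runaway, seconds=1))  # prints "False"
```

The wall-clock budget caps the compute spent per task; it says nothing about whether the work done before the cutoff was any good.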

blitzar · 2025-07-07T15:41:51 1751902911

Do the opposite - integrate your CI into your LLM.

Make it run tests after it changes your code and either confirm it didn't break anything or go back and try again.
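
That loop can be sketched roughly as follows. Both callables are hypothetical stand-ins (one for the LLM call, one for the CI run), and the attempt cap is an assumption added here so the loop cannot spin forever.

```python
from typing import Callable, Optional


def edit_until_green(
    propose_change: Callable[[Optional[str]], str],
    run_tests: Callable[[str], Optional[str]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Return the first change whose tests pass, or None if the budget runs out.

    propose_change: hypothetical stand-in for the LLM call; receives the last
        failure message (None on the first attempt) and returns a patch.
    run_tests: hypothetical stand-in for the CI run; returns None on success
        or a failure message otherwise.
    """
    failure: Optional[str] = None
    for _ in range(max_attempts):
        patch = propose_change(failure)
        failure = run_tests(patch)
        if failure is None:
            return patch
    return None


# Toy stand-ins: the second proposed patch is the one that passes.
attempts = iter(["patch-1", "patch-2"])
result = edit_until_green(
    lambda failure: next(attempts),
    lambda patch: None if patch == "patch-2" else "test_x failed",
)
print(result)  # prints "patch-2"
```

Feeding the failure message back into the next proposal is what distinguishes this from blind retries; green tests are still only as meaningful as the tests themselves.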

DSingularity · 2025-07-07T14:16:52 1751897812

He is simply observing that if PR numbers and launch rates increase dramatically, CI cost will become untenable.