In the end, I think the dream underneath this dream is about being able to manifest things into reality without having to get into the details.
The details are what stops it from working in every form it's
been tried.
You cannot escape the details. You must engage with them and solve them directly, meticulously. It's messy, it's extremely complicated and it's just plain hard.
There is no level of abstraction that saves you from this, because the last level is simply things happening in the world in the way you want them to, and it's really really complicated to engineer that to happen.
I think this is evident by looking at the extreme case. There are plenty of companies with software engineers who truly can turn instructions articulated in plain language into software. But you see lots of these not being successful for the simple reason that those providing the instructions are not sufficiently engaged with the detail, or have the detail wrong. Conversely, for the most successful companies the opposite is true.
This rings true and reminds me of the classic blog post “Reality Has A Surprising Amount Of Detail”[0] that occasionally gets reposted here.
Going back and forth on the detail in requirements and mapping it to the details of technical implementation (and then dealing with the endless emergent details of actually running the thing in production on real hardware on the real internet with real messy users actually using it) is 90% of what’s hard about professional software engineering.
It’s also what separates professional engineering from things like the toy leetcode problems on a whiteboard that many of us love to hate. Those are hard in a different way, but LLMs can do them on their own better than humans now. Not so for the other stuff.
Every time we make progress, complexity increases and it becomes more difficult to make progress. I'm not sure why this is surprising to many. We always do things to "good enough", not to perfection. Not that perfection even exists... "Good enough" means we tabled some things and triaged, addressing the most important things. But now, to improve, those little things need to be addressed.
This repeats over and over. There are no big problems, there are only a bunch of little problems that accumulate. As engineers, scientists, researchers, etc our literal job is to break down problems into many smaller problems and then solve them one at a time. And again, we only solve them to the good enough level, as perfection doesn't exist. The problems we solve never were a single problem, but many many smaller ones.
I think the problem is we want to avoid depth. It's difficult! It's frustrating. It would be great if depth were never needed. But everything is simple until you actually have to deal with it.
> As engineers, scientists, researchers, etc our literal job is to break down problems into many smaller problems and then solve them one at a time.
Our literal job is also to look for and find patterns in these problems, so we can solve them as a more common problem, if possible, instead of solving them one at a time all the time.
Very true. But I didn't want to discuss elegance and abstraction, as people seem to misunderstand abstraction in programming. I mean, all programming is abstraction... abstraction isn't to be avoided, but things can become too abstract.
I think we're all coping a bit here. This time, it really is different.
The fact is, one developer with Claude code can now do the work of at least two developers. If that developer doesn't have ADHD, maybe that number is even higher.
I don't think the amount of work to do increases. I think the number of developers or the salary of developers decreases.
In any case, we'll see this in salaries over the next year or two.
The very best move here might be to start working for yourself and delete the dependency on your employer. These models might enable more startups.
Alternate take: what agents can spit out becomes table stakes for all software. Making it cohesive, focused on business needs, and stemming complexity are now requirements for all devs.
By the same token (couldn't resist), I also would argue we should be seeing the quality of average software products notch up by now with how long LLMs have been available. I'm not seeing it. I'm not sure it's a function of model quality, either. I suspect devs that didn't care as much about quality haven't really changed their tune.
How much new software do we really use? And how much can old software become qualitatively better without just becoming new software in a different time, with a much bigger and younger customer base?
I misunderstood two things for a very long time:
a) Standards are not lower or higher; people are happy that they can do stuff at all, or a little to a lot faster, using software. Standards then grow with the people, as does the software.
b) Of course software is always opinionated, there are always constraints, and devs can't get stuck in a recursive loop of optimization. But what's way more important: they don't have to, because of a).
Quality is, often enough, a matter of how much time you spend nitpicking even though you could absolutely call the job done. Software is part of a pipeline, a supply chain, and someone is somehow aware of why it should be "this" and not something better, or that other version the devs prepared knowing well enough it won't see the light of day.
Honestly, in many ways it feels like quality is decreasing.
I'm also not convinced it's a function of model quality. The model isn't going to do something if the prompter doesn't even know to ask for it. It does what the programmer asked.
I'll give a basic example. Most people suck at writing bash scripts, and bash scripting is also a commonly claimed use case for LLMs. Yet they never write functions unless I explicitly ask. Here, try this command:
curl -fsSL https://claude.ai/install.sh | less
(You don't need to pipe into less, but it helps for reading.) Can you spot the fatal error in the code, the one where running it via curl-pipe-bash might cause major issues? Funnily enough, I asked Claude and it asked me this:
Is this script currently in production? If so, I’d strongly recommend adding the function wrapper before anyone uses it via curl-pipe-bash.
The errors made here are quite common in curl-pipe-bash scripts. I'm pretty certain Claude would write a program with the same mistakes despite being able to tell you about the problems and their trivial corrections.
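For anyone who hasn't seen the fix Claude is alluding to: the usual defence against a partially downloaded script executing half-finished commands is to wrap all the work in a function and only call it on the very last line, so nothing runs unless the whole file arrived. A minimal sketch of that pattern (the paths and steps are made up for illustration, not taken from the real install.sh):

    #!/usr/bin/env bash
    # Sketch of the "function wrapper" pattern for curl-pipe-bash installers.
    set -euo pipefail

    main() {
      # All work happens inside this function. If the download is cut off
      # partway through, bash never reaches the closing brace or the final
      # `main "$@"` call, so no truncated command ever executes.
      local install_dir="${HOME}/.local/bin"   # hypothetical target
      mkdir -p "${install_dir}"
      echo "Installing to ${install_dir}..."
      # ...download the binary, verify its checksum, move it into place...
    }

    # Only runs if the entire script was received intact.
    main "$@"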
The problem with vibe coding is you get code that is close. But close only matters in horseshoes and hand grenades. You get a bunch of unknown unknowns. The classic problem of programming still exists: the computer does what you tell it to do, not what you want it to do. LLMs just might also do things you don't tell them to...
You sound bored. If we tripled head count overnight, we'd only slow the growth of our backlog, and only temporarily. Every problem we solve only opens up a larger group of harder problems to solve.
Why wouldn't we find new things to do with all that new productivity?
Anecdotally, this is what I see happening in the small in my own work - we say yes to more ideas, more projects, because we know we can unblock things more quickly now - and I don't see why that wouldn't extend.
I do expect to see smaller teams - maybe a lot more one-person "teams" - and perhaps smaller companies. But I expect to see more work being done, not less, or the same.
What new things would we do? I do contracting, so maybe I'm lowest-bidder-pilled, but I feel like drops in price from lean organizations are going to eat the lunch of shops trying to make higher-quality software in most software disciplines.
How much software is really required to be extensible?
There is tons of stuff to do. Lots of technologies out there that need to be invented and commercialized. Tons of inefficient processes in business, government, and academia to improve.
None of this means that it will be the kinds of professional specialized software development teams that we're used to doing any of this work, but I have some amount of optimism that this is actually going to be a golden age for "doing useful things with computers" work.
I dispute that this holds for technology in general. There are plenty of examples where industrialisation led to a drop in quality, a massive drop in price, and a displacement of workers.
It hasn't happened in software yet. I suppose this has to do with where software sits on the demand curve currently.
I'm imagining a few more shifts in productivity will make the demand vs price derivative shift in a meaningfully different way, but we can only speculate.
I think you are misunderstanding the point I'm making. I agree that "writing code" is likely to be commoditized by AI tools, much like past industrialization disruptions. But I think there is going to be more things to do in the space of "doing useful things with computers", analogous to how industrialization creates new work further up the value chain.
Of course it often isn't the same people whose jobs are disrupted who end up doing that new work.
If LLMs are good at writing software, then there's lots of good software around written by LLMs. Where is that software? I don't see it. Logical conclusion: LLMs aren't good at writing software.
Are you trying to make a distinction between writing software vs writing code? LLMs are pretty great at writing good code (a relative term of course) if you lay things out for them. I use Claude Code on both greenfield new projects and a giant corporate mono repo and it works pretty well in both. In the giant mono repo, I have the benefit of many of my coworkers developing really nice Claude.md files and skills, so that helps a lot.
It’s very similar to working with a college hire SWE: you need to break things down to give them a manageable chunk and do a bit of babysitting, but I’m much more productive than I was before. Particularly in the broad range of things where I know enough to know what needs to be done but I’m not super familiar with the framework to do it.
Presumably they are writing the same quality software faster, the market having decided what quality it will accept.
Once that trend maxes out it’s entirely plausible that the level of quality demanded will rise quickly. That’s basically what happened in the first dot com era.
I'm not convinced. Honestly it seems like we're in a market of lemons and I don't know how we escape the kind of environment that is ripe for lemons. To get out requires customers to be well informed at the time of purchase. This is always difficult with software as we usually need to try it first and frankly, the average person is woefully tech illiterate.
But these days? We are selling products based on promises, not actual capabilities. I can't think of a more fertile environment for a lemon market than that. No one can be informed and bigger and bigger promises need to be made every year.
"(...) maybe growing vegetables or using a Haskell package for the first time, and being frustrated by how many annoying snags there were." Haha this is funny. Interesting reading.
While this is absolutely true and I've read this before, I don't think you can make this an open and shut case. Here's my perspective as an old guy.
The first thing that comes to mind when I see this as a counterargument is that I've quite successfully built enormous amounts of completely functional digital products without ever mastering any of the details that I figured I would have to master when I started creating my first programs in the late 80s or early 90s.
When I first started, it was a lot about procedural thinking, like BASIC goto X, looping, if-then statements, and that kind of thing. That seemed like an abstraction compared to just assembly code, which, if you were into video games, was what real video game people were doing. At the time, we weren't that many layers away from the ones and zeros.
It's been a long march since then. What I do now is still sort of shockingly "easy" to me sometimes when I think about that context. I remember being in a band and spending a few weeks trying to build a website that sold CDs via credit card, and trying to unravel how cgi-bin worked using a 300 page book I had bought and all that. Today a problem like that is so trivial as to be a joke.
Reality hasn't gotten any less detailed. I just don't have to deal with it any more.
Of course, the standards have gone up. And that's likely what's gonna happen here. The standards are going to go way up. You used to be able to make a living just launching a website to sell something on the internet that people weren't selling on the internet yet. Around 1999 or so I remember a friend of mine built a website to sell stereo stuff. He would just go down to the store in New York, buy it, and mail it to whoever bought it. Made a killing for a while. It was ridiculously easy if you knew how to do it. But most people didn't know how to do it.
Now you can make a living pretty "easily" selling a SaaS service that connects one business process to another, or integrates some workflow. What's going to happen to those companies now is left as an exercise for the reader.
I don't think there's any question that there will still be people building software, making judgment calls, and grappling with all the complexity and detail. But the standards are going to be unrecognizable.
Is the surprising amount of detail an indicator that we do not live in a simulation, or is it instead that we have to be living inside a simulation, because a simulation wouldn't need all this detail for Reality, indicating an algorithmic function run amok?
I once wrote software that had to manage the traffic coming into a major shipping terminal: OCR, gate arms, signage, cameras for inspecting chassis and containers, SIP audio comms, RFID readers, all of which needed to be reasoned about in a state machine, none of which were reliable. It required a lot of on-the-ground testing, observation, and tweaking, along with human interventions when things went wrong. I'd guess LLMs would have been good at subsets of that project, but the entire thing would still require a team of humans to build again today.
Sir, your experience is unique, and thanks for sharing it.
That being said, someone took your point that LLMs might be good at subsets of projects to argue that we should actually use LLMs for those subsets as well.
But I digress (I provided more in-depth reasoning in another comment). If even a minute bug slips past the LLM and code review for one of those subsets, then with millions of cars travelling through those points, let's assume that one single bug somewhere increases the traffic fatality rate by 1 person per year. Firstly, it shouldn't be used because of the inherent value of human life itself, but it doesn't hold up even in a monetary sense, so there's really not much reason I can see for using it.
That alone, over a span of 10 years, would cost $75-130 million (the value of a statistical life in the US for a normal person ranges from $7.5 million to $13 million).
Sir, I just feel like if the point of LLMs is to employ fewer humans or pay them less, that is so short-sighted, because if I were the state (and I think everyone will agree after the cost analysis) I would much rather pay a few hundred thousand dollars, or even a few million, right now to save $75-130 million (on the smallest scale, mind you; it can get far more expensive).
I am not exactly sure how we can detect the rate of deaths due to LLM use itself (the 1 number) but I took the most conservative number.
There's also the fact that we won't know if an LLM might save a life, but I am 99.9% sure that won't be the case, and once again it wouldn't be verifiable either, so we are shooting in the dark.
And a human can do a much more careful job with better context (you know what you are working on, and you know how valuable it is, that it can save lives and everything), whereas no amount of words can convey that danger to LLMs.
To put it simply, the LLM at times might not know the difference between the code of this life-or-death machine and a sloppy website it created.
I just don't think it's worth it at all, especially in this context; even a single percent of LLM code might not be worth it here.
I had a friend who was in crisis while the rest of us were asleep. Talking with ChatGPT kept her alive. So we know the number is at least one. If you go to the Dr ChatGPT thread, you'll find multiple reports of people who figured out debilitating medical conditions via ChatGPT in conjunction with a licensed human doctor, so we can be sure the number's greater than zero. It doesn't make headlines the same way Adam's suicide does, and not just because OpenAI can't be the ones to say it.
Great for her, I hope she's doing okay now. (I do think we humans can take each other for granted)
If talking to ChatGPT helps anyone mentally, then sure, great. I can see why, but I am a bit concerned that if we remove a human from the loop, we can get deluded way too easily as well, which is what is happening.
These are still black boxes, and in the context of traffic-light code (even partially), it feels to me that the probability of it not saving a life significantly overwhelms the probability that it does.
ChatGPT psychosis also exists so it goes both ways, I just don't want the negative voices to drown out the positive ones (or vice versa).
As far as traffic lights go, this predates ChatGPT, but IBM's Watson, which is also rather much a black box where you stuff data in, and instructions come out; they've been doing traffic light optimization for years. IBM's got some patents on it, even. Of course that's machine learning, but as they say, ML is just AI that works.
I've had good luck when giving the AI its own feedback loop. On software projects, it's letting the AI take screenshots and read log files, so it can iterate on errors without human input. On hardware projects, it's a combination of solenoids, relays, a Pi and a Pi Zero W, and a webcam. I'm not claiming that an AI could do the above-mentioned project, just that (some) hardware projects can also get humans out of the loop.
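The software-side loop can be as simple as a shell script that runs the checks and hands any failure output back to the model. A rough sketch, assuming the claude CLI's non-interactive print mode (claude -p) and a hypothetical ./run_tests.sh; exact flags for letting it edit files vary, so treat this as an illustration rather than a drop-in script:

    #!/usr/bin/env bash
    # Crude feedback loop: run the tests, and if they fail, feed the log back
    # to the model and let it take another pass. ./run_tests.sh stands in for
    # whatever produces your errors and logs.
    set -uo pipefail

    max_attempts=5
    for attempt in $(seq 1 "${max_attempts}"); do
      if ./run_tests.sh > build.log 2>&1; then
        echo "attempt ${attempt}: tests pass"
        break
      fi
      echo "attempt ${attempt}: tests failed, asking the model to iterate"
      claude -p "The test suite failed. Here are the last 200 lines of output:
    $(tail -n 200 build.log)
    Please fix the code so the tests pass."
    done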
Don’t you understand? That’s why all these AI companies are praying for humanoid robots to /just work/ - so we can replace humans mentally and physically ASAP!
I'm sure those will help. But that doesn't solve the problem the parent stated. Those robots can't solve those real-world problems until they can reason, till they can hypothesize, till they can experiment, till they can abstract, all on their own. The problem is you can't replace the humans (unilaterally) until you can create AGI. But that has problems of its own, as you now have to contend with having created a slave class of artificial life forms.
No worries - you’ve added useful context for those who may be misguided by these greedy corporations looking to replace us all. Maybe it helps them reconsider their point of view!
But you admit that fewer humans would be needed as “LLMs would have been good at subsets of that project”, so some impact already and these AI tools only get better.
If that is the only thing you took out of that conversation, then I don't really believe that job would have been suitable for you in the first place.
Now, I don't know which language they used for the project (could be Python, could be C/C++, could be Rust), but it's like saying "Python would have been good at subsets of that project", so some impact already, and these Python tools only get better.
Did Python remove the jobs? No. Each project has its own use case, and in some, LLMs might be useful; in others, not.
In their project, LLMs might be useful for some parts, but the majority of the work was doing completely new things with a human in the feedback loop.
You are also forgetting the trust factor. Yes, let's have your traffic-light system written by an LLM, surely. Oops, the traffic lights glitched and all the Waymos (another AI) went berserk, and oops, accidents and crashes happened which might cost millions.
Personally, I wouldn't trust even a subset of LLM code here, and I would much rather have my country/state/city pay real developers who can be held accountable, with good quality-control checks, for such critical points; "no LLM" should be a hard rule in this context.
For context: suppose LLM use costs even 1 life every year. The value of 1 person is $7.5-13 million.
Over a period of 10 years, this really, really small LLM glitch ends up costing $75 million.
Yup, go ahead and save a few thousand dollars right now by not paying people properly and using an LLM instead, only to then lose $75 million (in the most conservative scenario).
I doubt you have a clue regarding my suitability for any project, so I'll ignore the passive-aggressive ad hominem.
Anyway, it seems you are walking back your statement regarding LLMs being useful for parts of your project, or ignoring the impact on personnel count. Not sure what you were trying to say, then.
I went back over it because of course I could have just pointed out one part, but I still wanted to give the whole picture.
My conclusion is rather that this is a very high-stakes project (emotionally, mentally, and economically), that AIs are still black boxes with a good chance of being much more error-prone (at least in this context), that the chance of one missing something and causing the $75 million loss and the deaths of many is higher, and that in such a high-stakes project LLMs shouldn't be used; having more engineers on the team might be worth it.
> I doubt you have a clue regarding my suitability for any project, so I'll ignore the passive-aggressive ad hominem.
Aside from the snark directed at me: I agree. And this is why you don't see me on such a high-stakes project, and neither should you see an LLM, at any cost, in this context. These should be reserved for the caliber of people who have both experience in the industry and are made of flesh.
Human beings are basically black boxes as far as the human brain is concerned. We don't blindly trust the code coming out of those black boxes; it would be just as illogical to blindly trust what comes out of LLMs.
Yes, but at the end of the day I can't understand this take, because what are we worried about (at least in this context): a few hundred thousand dollars more for a human to do the job instead of an LLM?
I don't see how it's logical to deploy an LLM in any case; the problem is that the chances of LLM code slipping up are much higher than for the code of people who can talk to each other, decide in meetings exactly how they want to write it, and have decades of experience to back it up.
If I were a state, there are so, so many ways of getting that money rather easily (hundreds of thousands of dollars might seem like a lot, but not for a state), and you are also forgetting that they went in manually and talked to real people.
Counterpoint: perhaps it's not about escaping all the details, just the irrelevant ones, and the need to have them figured out up front. Making the process more iterative, an exploration of medium under supervision or assistance of domain expert, turns it more into a journey of creation and discovery, in which you learn what you need (and learn what you need to learn) just-in-time.
I see no reason why this wouldn't be achievable. Having lived most of my life in the land of details, country of software development, I'm acutely aware 90% of effort goes into giving precise answers to irrelevant questions. In almost all problems I've worked on, whether at tactical or strategic scale, there's either a single family of answers, or a broad class of different ones. However, no programming language supports the notion of "just do the usual" or "I don't care, pick whatever, we can revisit the topic once the choice matters". Either way, I'm forced to pick and spell out a concrete answer myself, by hand. Fortunately, LLMs are slowly starting to help with that.
From my experience the issue really is, unfortunately, that it is impossible to tell if a particular detail is irrelevant until after you have analyzed and answered all of them.
In other words, it all looks easy in hindsight only.
I think the most coveted ability of a skilled senior developer is precisely this "uncanny" ability to predict beforehand whether some particular detail is important or irrelevant. This ability can only be obtained through years of experience and hubris.
While I know your comment was in sarcastic jest, the question folks are asking this month is "can't we just pay one person to prompt ten models to do that?"
Not just getting it wrong, but also learning from it and being able to internalize those mistakes. There are certainly people who make the same mistakes over and over again (whether they realize it or not) - it's not just a matter of experience.
> no programming language supports the notion of "just do the usual" or "I don't care, pick whatever, we can revisit the topic once the choice matters"
Programming languages already take lots of decisions implicitly and explicitly on one’s behalf. But there are way more details of course, which are then handled by frameworks, libraries, etc. Surely at some point, one has to take a decision? Your underlying point is about avoiding boilerplate, and LLMs definitely help with that already - to a larger extent than cookie cutter repos, but none of them can solve IRL details that are found through rigorous understanding of the problem and exploration via user interviews, business challenges, etc.
> perhaps it's not about escaping all the details, just the irrelevant ones
But that's the hard part. You have to explore the details to determine if they need to be included or not.
You can't just know right off the bat. Doing so contradicts the premise. You cannot determine whether a detail is important unless you get into the details. If you only care about a few grains of sand in a bucket, you still have to search through the whole bucket of sand for those few grains.
Right. But that's where a tight feedback loop comes into play. New AI developments enable that in at least two ways: offloading busywork and necessary but straightforward work (LLMs can already write and iterate orders of magnitude faster than people), and having a multi-domain expert on call to lean on.
The thing about important details is that what ultimately matters is getting them right eventually, not necessarily the first time around. The real cost limiting creative and engineering efforts isn't the one of making a bad choice, but that of undoing it. In software development, AI makes even large-scale rewrites orders of magnitude cheaper than they ever were before, which makes a lot more decisions easily undoable in practice, when before that used to be prohibitively costly. I see that as one major way towards enabling this kind of iterative, detail-light development.
> and having a multi-domain expert on call to lean on.
I don't feel like this is an accurate description. My experience is that LLMs have a very large knowledge base but that getting them to go in depth is much more difficult.
But we run into the same problem... how do you evaluate that which you are not qualified to evaluate? It is a grave mistake to conflate "domain expert" with "appears to know more than me". It doesn't matter whether it is a person or a machine; it is a mistake. It's how a lot of con artists work, and we've all seen people in high positions where we're all left wondering how in the world they got there.
> The real cost limiting creative and engineering efforts isn't the one of making a bad choice, but that of undoing it.
Weird reasoning... because I agree and this is the exact reason I find LLMs painful to work with. They dump code at you rather than tightening it up, making it clear and elegant. Code is just harder to rebase or simplify when there are more lines. Writing lines has never been and never will be the bottleneck because the old advice still holds true that if you're doing things over and over again, you're doing it wrong. One of the key things that makes programming so amazing is that you can abstract out repetitive tasks, even when there is variation. Repetition and replication only make code harder to debug and harder to "undo bad choices".
Also, in my experience it is difficult to get LLMs to simplify, even when explicitly instructing them to, pointing them to specific functions, and even giving strong hints of what exactly needs to be done. They promptly tell me how smart I am and then fail to do any of that actual abstraction. Code isn't useful when you have the same function written in 30 different places across 20 different files. That makes it way harder to back out of decisions. They're good at giving a rough sketch, but it still feels reckless to me to let them actually write into the codebase where they are creating this tech debt.
It's a cliché that the first 90% of a software project takes 90% of the time and the last 10% also takes 90% of the time, but it's a cliché because it's true. So we've managed to invent a giant plausibility engine that automates the 90% of the process people enjoy, leaving just the 90% that people universally hate.
>So we've managed to invent a giant plausibility engine that automates the 90% of the process people enjoy, leaving just the 90% that people universally hate.
OK, for me it is the last 10% that is of any interest whatsoever. And I think that has been the case with any developer I've ever worked with I consider to be a good developer.
OK the first 90% can have spots of enjoyment, like a nice gentle Sunday drive stopping off at Dairy Queen, but it's not normally what one would call "interesting".
Sorry, I don't buy it. I'm an ops guy, and devs who say they like the integration stage mean they like making ops play a guessing game and clean up the mess they left us.
I am an AI hater (at least for some of its current uses, precisely ones like this), and you have worded some things I like to say in a way I hadn't thought of. I agree with all you said and appreciate it, man!
Now, I do agree with you, and this is why I feel AI can be good for prototyping or for internal use cases. Want to try out some idea? Sure, use it. There's a website that sucks and I can quickly spin up an alternative for my personal use case? Go for it, maybe even publish it to the web as open source.
Take feedback from people if they give any and run with it. So in essence, prototyping's pretty cool.
But whenever I wish to monetize, or even entertain the idea of monetizing, I feel we should take the design ideas and experimentation and then just write the code ourselves. My ideology is simple: I don't want to pay for some service that was written as AI slop; at that point, just share the prompt with us.
So at that point, just rewrite the code and actually learn what you are talking about. To give an example: I recently prototyped a simple Firecracker SSH thing using the gliderlabs/ssh Go package. I don't know how the AI code works; I just built it for my own use case. But if I ever (someday) try to monetize it in any sense, rest assured I will try to learn how gliderlabs/ssh works to its core and build it all by hand.
TLDR: AI's good for prototyping, but once you've got the idea (or more ideas on top of it), try to rewrite it with your own understanding, because, as others have said, you won't understand the AI code, and you'll spend 99% of your time on the 1% the AI can't do. At that point, why not just rewrite?
Also, if you rewrite, I feel most people will be chill with buying it, even anti-AI people. Like, sure, use AI for prototypes, but give me code that I can verify and that you wrote and understand to its core, and can say so with 100% certainty.
If you are really into software projects for sustainability, you are gonna anger a crowd for no reason & have nothing beneficial come out of it.
So I think basically everybody knows this, but AI code still gets to production, because sustainability isn't the concern.
This is the cause: sustainability just straight up isn't the concern.
If you have VCs who want you to add hundreds of features, or want you to use AI or have AI integration or something (something I don't think every company or its creators should be interested in unless necessary), and those VCs are in it only for 3-5 years and might want to dump you or enshittify you in the short term for their own gains, I can see why sustainability stops being a concern and we get to where we are.
Another group that's most interested is the startup-entrepreneur hustle-culture crowd, who have a VC-like culture as well, where sustainability just doesn't matter.
I do hope I am not blanket-naming these groups, because sure, some might be exceptions, but I am just sharing how the incentives aren't aligned, how they would likely end up shipping 90% AI slop, and how that's what we end up seeing in evidence at most companies.
I do feel we need to boost more companies that are in it for the long run with sustainable practices, and the people/indie businesses that are in it because they are passionate about some project (usually because they faced the problem themselves, or out of curiosity), because we as consumers have an incentive stick as well. I hope some movement can spawn that captures this nuance, because I am not completely anti-AI, but not exactly pro either.
Yes! I love this framing and it's spot on. In the successful projects I've been involved in, either someone cared deeply and resolved the details in real time, or we figured out the details before we started. I've seen it outside software as well: someone says "I want a new kitchen", but unless you know exactly where you want your outlets, counter depths, size of fridge, type of cabinets, location of lighting, etc. ad infinitum, your project is going to balloon in time and cost and likely frustration.
Is your kitchen contractor an unthinking robot with no opinions or thoughts of their own that has never used a kitchen? Obviously if you want a specific cabinet to go in a specific place in the room, you're going to have to give the kitchen contractor specifics. But assuming your kitchen contractor isn't an utter moron, they can come up with something reasonable if they know it's supposed to be a kitchen. A sink, a stove, dishwasher, refrigerator. Plumbing and power for the above. Countertops, drawers, cabinets. If you're a control freak (which is your prerogative, it's your kitchen after all), that's not going to work for you. Same too for generated code. If you absolutely must touch every line of code, code generation isn't going to suit you. If you just want a login screen with parameters you define, there are so many login pages the AI can crib from that nondeterminism isn't even a problem.
At least in case of the kitchen contractor, you can trust all the electrical equipment, plumbing etc. is going to be connected in such a way that disasters won't happen. And if it is not, at least you can sue the contractor.
The problem with LLMs is that it is not only the "irrelevant details" that are hallucinated. It is also "very relevant details" which either make the whole system inconsistent or full of security vulnerabilities.
The login page example was actually perfect for illustrating this. Meshing polygons? Centering a div? Go ahead and turn the LLM loose. If you miss any bugs you can just fix them when they get reported.
But if it's security critical? You'd better be touching every single line of code and you'd better fully understand what each one does, what could go wrong in the wild, how the approach taken compares to best practices, and how an attacker might go about trying to exploit what you've authored. Anything less is negligence on your part.
Your kitchen contractor will never cook in your kitchen. If you leave the decisions to them, you'll get something that's quick and easy to build, but it for sure won't have all the details that make a great kitchen. It will be average.
Which seems like an apt analogy for software. I see people all the time who build systems and they don't care about the details. The results are always mediocre.
I think this is a major point people do not mention enough during these debates on "AI vs Developers": The business/stakeholder side is completely fine with average and mediocre solutions as long as those solutions are delivered quickly and priced competitively. They will gladly use a vibecoded solution if the solution kinda sorta mostly works. They don't care about security, performance or completeness... such things are to be handled when/if they reach the user/customer in significant numbers. So while we (the devs) are thinking back to all the instances we used gpt/grok/claude/.. and not seeing how the business could possibly arrive at our solutions just with AI and without us in the loop... the business doesn't know any of the details nor does it care. When it comes to anything IT related, your typical business doesn't know what it doesn't know, which makes it easy to fire employees/contractors for redundancy first (because we have AI now) and ask questions later (uhh... because we have AI now).
That still requires you to evaluate all the details in order to figure out which ones you care about. And if you haven't built a kitchen before, you won't know what the details even are ahead of time. Which means you need to be involved in the process, constantly evaluating what is currently happening and whether you need to care about it.
Maybe they have a kitchen without a dishwasher. So unless asked, they won't include one, or even make it possible to include one. Seems like a real possibility. Maybe eventually, after building many kitchens, they learn they should ask about that one.
> You cannot escape the details. You must engage with them and solve them directly, meticulously. It's messy, it's extremely complicated and it's just plain hard.
Of course you can. The way the manager ignores the details when they ask the developer to do something, the same way they can when they ask the machine to do it.
> the dream underneath this dream is about being able to manifest things into reality without having to get into the details.
Yes, it has nothing to do with dev specifically; dev "just" happens to be a way to do so while staying text-based, which is the medium of LLMs. What also "just" happens to be convenient is that dev is expensive, so if a new technology might help make something possible and/or make it inexpensive, it's potentially a market.
Now, pesky details like actual implementation, who's got time for that? It's just a few more trillions away.
> In the end, I think the dream underneath this dream is about being able to manifest things into reality without having to get into the details.
> The details are what stops it from working in every form it's been tried.
Since the author was speaking to business folk, I would argue that their dream is cheaper labor, or really just managing a line item in the summary budget. As evidenced by outsourcing efforts. I don't think they really care about how it happens - whether it is manifesting things into reality without having to get into the details, or just a cheaper human. It seems to me that the corporate fever around AI is simply the prospect of a "cheaper than human" opportunity.
Although, to your point, we must await AGI, or get very close to it, to be able to manifest things into reality without having to get into the details :-)
For me, what supports this are things outside of software. If a company or regime wants to build something, they can't just say what they want and get exactly what they envision. If human minds can't figure out what another human wants, how could a computer do it?
> Conversely, for the most successful companies the opposite is true.
While I agree with this, I think that it’s important to acknowledge that even if you did everything well and thought of everything in detail, you can still fail for reasons that are outside of your control. For example, a big company buying from your competitor who didn’t do a better job than you simply because they were mates with the people making the decision… that influences everyone else and they start, with good reason, to choose your competitor just because it’s now the “standard” solution, which itself has value and changes the picture for potential buyers.
In other words, being the best is no guarantee of success.
> The recurring dream of replacing developers
> In the end, I think the dream underneath this dream is about being able to manifest things into reality without having to get into the details.
It's basically this:
"I'm hungry. I want to eat."
"Ok. What do you want?"
"I don't know. Read my mind and give me the food I will love."
This matches the pattern that keeps repeating.
Tools change where the work happens, but they don’t remove the need for controlled decisions about inputs, edge cases, and outcomes.
When that workflow isn’t explicit, every new abstraction feels like noise.
My mantra as an engineer is "The devil is in the details".
For two almost identical problems with just a little difference between them, the solutions can be radically different in complexity, price, and time to deliver.
Well said. This dream probably belongs to someone who experienced the hardship, felt frustrated, and gave up, then saw others do it effortlessly, even having fun with it. The manifestation of the dream feels like revenge to them.
This framing neatly explains the hubris of the influencer-wannabes on social media who have time to post endlessly about how AI is changing software dev forever while also having never shipped anything themselves.
They want to be seen as competent without the pound of flesh that mastery entails. But AI doesn’t level one’s internal playing field.
Yeah this is a thought provoking framing. Maybe the way in which those of us who really enjoy programming are weird is that we relish meticulously figuring out those details.
It looks like there's a difference this time: copying the details of other people's work has become exceedingly easy and reliable, at least for commonly tried use cases. Say I want to vibe code a dashboard, and AI codes it out. It works. In fact, it works so much better than anything I could ever build, because the AI was trained on the best dashboard code out there. Yes, I can't think of all the details of a world-class dashboard, but hey, someone else did, and the AI correctly responds to my prompt with those details. Such "copying" used to be really hard among humans. Without AI, I would have to learn so much first, even if I could use open-source code as the starting point: the APIs of the libraries, the basic concepts of web programming, etc. Yet the AI doesn't care. It's just a gigantic Bayesian machine that emits working code with probability near 1 for common use cases.
So it is not that details don't matter, but that now people can easily transfer certain know-how from other great minds. Unfortunately (or fortunately?), most people's jobs are learning and replicating know-how from others.
But the dashboard is not important at all, because everyone can have the same dashboard the same way you have it. It's like generating a static website using Hugo and applying a theme provided for it. The end product is something built on an assembly line. No taste, no soul, no effort. (Of course, there is effort behind designing and building the assembly line, but not in the products it turns out.)
Now, if you want to use the dashboard to do something else really brilliant, it is good enough as a means. Just make sure the dashboard is not the end.
The dashboard is just an example. The gist is how much of the know-how we use in our work can be replaced by AI transforming other people's existing work. I think it hinges on how many new problems or new business demands show up. If we just work on small variations of existing business, then our know-how will quickly converge (e.g. building a dashboard or a vanilla version of a linear regression model), and AI will spew out such code for many of us.
I don't think anyone's job is copying "know-how". Knowing how goes a lot deeper than writing the code.
Especially in web, boilerplate/starters/generators that do exactly what you want with little to no code or familiarity have been the norm for at least a decade. This is the lifeblood of repos like npm.
What we have is better search for all this code and documentation that was already freely available and ready to go.
To put an economic spin on this (that no one asked for), this is also the capitalist nirvana. I don't have an immediate citation, but in my experience software engineer salaries are usually one of the biggest items on a P&L, which prevents the capitalist from approaching the singularity: limitless profit margin. Obviously this is unachievable, but one of the major obstacles to it is in the process of being destabilised and disrupted.
> What are those execs bringing to the table, beyond entitlement and self-belief?
The status quo, which always requires an order of magnitude more effort to overcome. There's also a substantial portion of the population that needs well-defined power hierarchies to feel psychologically secure.
The argument is empty because it relies on a trope rather than evidence. “We’ve seen this before and it didn’t happen” is not analysis. It’s selective pattern matching used when the conclusion feels safe. History is full of technologies that tried to replace human labor and failed, and just as full of technologies that failed repeatedly and then abruptly succeeded. The existence of earlier failures proves nothing in either direction.
Speech recognition was a joke for half a century until it wasn’t. Machine translation was mocked for decades until it quietly became infrastructure. Autopilot existed forever before it crossed the threshold where it actually mattered. Voice assistants were novelty toys until they weren’t. At the same time, some technologies still haven’t crossed the line. Full self driving. General robotics. Fusion. History does not point one way. It fans out.
That is why invoking history as a veto is lazy. It is a crutch people reach for when it’s convenient. “This happened before, therefore that’s what’s happening now,” while conveniently ignoring that the opposite also happened many times. Either outcome is possible. History alone does not privilege the comforting one.
If you want to argue seriously, you have to start with ground truth. What is happening now. What the trendlines look like. What follows if those trendlines continue. Output per developer is rising. Time from idea to implementation is collapsing. Junior and mid level work is disappearing first. Teams are shipping with fewer people. These are not hypotheticals. The slope matters more than anecdotes. The relevant question is not whether this resembles CASE tools. It’s what the world looks like if this curve runs for five more years. The conclusion is not subtle.
The reason this argument keeps reappearing has little to do with tools and everything to do with identity. People do not merely program. They are programmers. “Software engineer” is a marker of intelligence, competence, and earned status. It is modern social rank. When that rank is threatened, the debate stops being about productivity and becomes about self preservation.
Once identity is on the line, logic degrades fast. Humans are not wired to update beliefs when status is threatened. They are wired to defend narratives. Evidence is filtered. Uncertainty is inflated selectively. Weak counterexamples are treated as decisive. Strong signals are waved away as hype. Arguments that sound empirical are adopted because they function as armor. “This happened before” is appealing precisely because it avoids engaging with present reality.
This is how self delusion works. People do not say “this scares me.” They say “it’s impossible.” They do not say “this threatens my role.” They say “the hard part is still understanding requirements.” They do not say “I don’t want this to be true.” They say “history proves it won’t happen.” Rationality becomes a costume worn by fear. Evolution optimized us for social survival, not for calmly accepting trendlines that imply loss of status.
That psychology leaks straight into the title. Calling this a “recurring dream” is projection. For developers, this is not a dream. It is a nightmare. And nightmares are easier to cope with if you pretend they belong to someone else. Reframe the threat as another person’s delusion, then congratulate yourself for being clear eyed. But the delusion runs the other way. The people insisting nothing fundamental is changing are the ones trying to sleep through the alarm.
The uncomfortable truth is that many people do not stand to benefit from this transition. Pretending otherwise does not make it false. Dismissing it as a dream does not make it disappear. If you want to engage honestly, you stop citing the past and start following the numbers. You accept where the trendlines lead, even when the destination is not one you want to visit.
> “We’ve seen this before and it didn’t happen” is not analysis. It’s selective pattern matching used when the conclusion feels safe.
> If you want to argue seriously, you have to start with ground truth. What is happening now. What the trendlines look like. What follows if those trendlines continue.
Wait, so we can infer the future from "trendlines", but not from past events? Either past events are part of a macro trend, and are valuable data points, or the micro data points you choose to focus on are unreliable as well. Talk about selection bias...
I would argue that data points that are barely a few years old, and obscured by an unprecedented hype cycle and gold rush, are not reliable predictors of anything. The safe approach would be to wait for the market to settle, before placing any bets on the future.
> Time from idea to implementation is collapsing. Junior and mid level work is disappearing first. Teams are shipping with fewer people. These are not hypotheticals.
What is hypothetical is what will happen to all this software and the companies that produced it a few years down the line. How reliable is it? How maintainable is it? How many security issues does it have? What has the company lost because those issues were exploited? Will the same people who produced it using these new tools be able to troubleshoot and fix it? Will the tools get better to allow them to do that?
> The reason this argument keeps reappearing has little to do with tools and everything to do with identity.
Really? Everything? There is no chance that some people are simply pointing out the flaws of this technology, and that the marketing around it is making it out to be far more valuable than it actually is, so that a bunch of tech grifters can add more zeroes to their net worth?
I don't get how anyone can speak about trends and what's currently happening with any degree of confidence. Let alone dismiss the skeptics by making wild claims about their character. Do better.
>Wait, so we can infer the future from “trendlines”, but not from past events? Either past events are part of a macro trend, and are valuable data points, or the micro data points you choose to focus on are unreliable as well. Talk about selection bias…
If past events can be dismissed as “noise,” then so can selectively chosen counterexamples. Either historical outcomes are legitimate inputs into a broader signal, or no isolated datapoint deserves special treatment. You cannot appeal to trendlines while arbitrarily discarding the very history that defines them without committing selection bias.
When large numbers of analogous past events point in contradictory directions, individual anecdotes lose predictive power. Trendlines are not an oracle, but once the noise overwhelms the signal, they are the best approximation we have.
>What is hypothetical is what will happen to all this software and the companies that produced it a few years down the line. How reliable is it? How maintainable is it? How many security issues does it have? What has the company lost because those issues were exploited? Will the same people who produced it using these new tools be able to troubleshoot and fix it? Will the tools get better to allow them to do that?
These are legitimate questions, and they are all speculative. My expectation is that code quality will decline while simultaneously becoming less relevant. As LLMs ingest and reason over ever larger bodies of software, human oriented notions of cleanliness and maintainability matter less. LLMs are far less constrained by disorder than humans are.
>Really? Everything? There is no chance that some people are simply pointing out the flaws of this technology, and that the marketing around it is making it out to be far more valuable than it actually is, so that a bunch of tech grifters can add more zeroes to their net worth?
The flaws are obvious. So obvious that repeatedly pointing them out is like warning that airplanes can crash while ignoring that aviation safety has improved to the point where you are far more likely to die in a car than in a metal tube moving at 500 mph.
Everyone knows LLMs hallucinate. That is not contested. What matters is the direction of travel. The trendline is clear. Just as early aviation was dangerous but steadily improved, this technology is getting better month by month.
That is the real disagreement. Critics focus on present day limitations. Proponents focus on the trajectory. One side freezes the system in time; the other extrapolates forward.
>I don’t get how anyone can speak about trends and what’s currently happening with any degree of confidence. Let alone dismiss the skeptics by making wild claims about their character. Do better.
Because many skeptics are ignoring what is directly observable. You can watch AI generate ultra complex, domain specific systems that have never existed before, in real time, and still hear someone dismiss it entirely because it failed a prompt last Tuesday.
Repeating the limitations is not analysis. Everyone who is not a skeptic already understands them and has factored them in. What skeptics keep doing is reciting known flaws while refusing to reason about what is no longer a limitation.
At that point, the disagreement stops being about evidence and starts looking like bias.
Respectfully, you seem to love the sound of your writing so much you forget what you are arguing about. The topic (at least for the rest of the people in this thread) seems to be whether AI assistance can truly eliminate programmers.
There is one painfully obvious, undeniable historical trend: making programmer work easier increases the number of programmers. I would argue a modern developer is 1000x more effective than one working in the times of punch cards - yet we have roughly 1000x more software developers than back then.
I'm not an AI skeptic by any means, and use it everyday at my job where I am gainfully employed to develop production software used by paying customers. The overwhelming consensus among those similar to me (I've put down all of these qualifiers very intentionally) is that the currently existing modalities of AI tools are a massive productivity boost mostly for the "typing" part of software (yes, I use the latest SOTA tools, Claude Opus 4.5 thinking, blah, blah, so do most of my colleagues). But the "typing" part hasn't been the hard part for a while already.
You could argue that there is a "step change" coming in the capabilities of AI models which will entirely replace developers (so software can be "willed into existence", as elegantly put by OP), but we are no closer to that point now than we were in December 2022. All the success of AI tools in actual, real-world software has been in tools specifically designed to assist existing, working, competent developers (e.g. Cursor, Claude Code), and the tools which have positioned themselves to replace them have failed (Devin).
There is no respectful way of telling someone they like the sound of their own voice. Let’s be real, you were objectively and deliberately disrespectful. Own it if you are going to break the rules of conduct. I hate this sneaky shit. Also I’m not off topic, you’re just missing the point.
I responded to another person in this thread and it’s the same response I would throw at you. You can read that as well.
Your “historical trend” is just applying an analogy and thinking that an analogy can take the place of reasoning. There are about a thousand examples of careers where automation technology increased the need of human operators and thousands of examples where automation eliminated human operators. Take pilots for example. Automation didn’t lower the need for pilots. Take intellisense and autocomplete… That didn’t lower the demand for programmers.
But then take a look at Waymo. You have to be next level stupid to think: OK, cruise control raised automation in cars but didn't lower the demand for drivers… therefore all car-related businesses, including Waymo, will always need physical drivers.
As anyone is aware… this idea of using analogy as reasoning fails here. Waymo needs zero physical drivers thanks to automation. There is zero demand here and your methodology of reasoning fails.
Analogies are a form of manipulation. They only help you elucidate and understand things via some thread of connection. You understand A, therefore understanding A can help you understand B. But you can't use analogies as the basis for forecasting or reasoning because although A can be similar to B, A is not in actuality B.
For AI coders it’s the same thing. You just need to use your common sense rather than rely on some inaccurate crutch of analogies and hoping everything will play out in the same way.
If AI becomes as good and as intelligent as a human SWE, then your job is going out the fucking window, replaced by a single prompter. That's common sense.
Look at the actual trendline of the actual topic: AI taking over our jobs and not automation in other sectors of engineering or other types of automation in software. What happened with AI in the last decade? We went from zero to movies, music and coding.
What does your common sense tell you the next decade will bring?
If the improvement of AI from the last decade keeps going or keeps accelerating, the conclusion is obvious.
Sometimes the delusion a lot of swes have is jarring. Like literally, if AGI existed, thousands of jobs would be displaced. That's common sense, but you still see tons of people clinging to some irrelevant analogy as if that exact analogy will play out against common sense.
How ironic of you to call my argument an analogy while it isn't an analogy, yet all you have to offer is exactly that - analogies. Analogies to pilots, drivers, "a thousand examples of careers".
My argument isn't an analogy - it's an observation based on the trajectory of SWE employment specifically. It's you who's trying to reason about what's going to happen with software based on what happened to three-field crop rotation or whatever, not me.
I argued that a developer today is 1000x more effective than in the days of punch cards, yet we have 1000x more developers today. Not only that, this correlation tracked fairly linearly throughout the last many decades.
I would also argue that the productivity improvement between FORTRAN and C, or between C and Python was much, much more impactful than going from JavaScript to JavaScript with ChatGPT.
Software jobs will be redefined, they will require different skill sets, they may even be called something else - but they will still be there.
>How ironic of you to call my argument an analogy while it isn't an analogy, yet all you have to offer is exactly that
Bro I offered you analogies to show you how it's IRRELEVANT. The point was to show you that it's an ineffective form of reasoning by demonstrating its ineffectiveness FOR YOUR conclusion: using this reasoning can allow you to conclude the OPPOSITE. Assuming this type of reasoning is effective means BOTH what I say is true and what you say is true, which leads to a logical contradiction.
There is no irony, only misunderstanding from you.
>I argued that a developer today is 1000x more effective than in the days of punch cards, yet we have 1000x more developers today. Not only that, this correlation tracked fairly linearly throughout the last many decades.
See here, you're using an analogy and claiming it's effective. To which I would typically offer you another analogy that shows the opposite effect, but I feel it would only confuse you further.
>Software jobs will be redefined, they will require different skill sets, they may even be called something else - but they will still be there.
Again, you believe this because of analogies. I recommend you take a stab at my way of reasoning. Try to arrive at your own conclusion without using analogies.
Look at the past decade: zero AI to AI that codes and makes movies, still inferior when matched against humans.
What does common sense tell you the next decade will bring? Does the trendline predict flatlining, that LLMs or AI in general won't improve? Or will the trendline continue, as trendlines typically do? What is the most logical conclusion?
> You cannot appeal to trendlines while arbitrarily discarding the very history that defines them without committing selection bias.
> When large numbers of analogous past events point in contradictory directions, individual anecdotes lose predictive power. Trendlines are not an oracle, but once the noise overwhelms the signal, they are the best approximation we have.
I'm confused. So you're agreeing with me, up until the very last part of the last sentence...? If the "noise overwhelms the signal", why are "trendlines the best approximation we have"? We have reliable data of past outcomes in similar scenarios, yet the most recent noisy data is the most valuable? Huh?
(Honestly, your comments read suspiciously like they were LLM-generated, as others have mentioned. It's like you're jumping on specific keywords and producing the most probable tokens without any thought about what you're saying. I'll give you the benefit of the doubt for one more reply, though.)
To be fair, I think this new technology is fundamentally different from all previous attempts at abstracting software development. And I agree with you that past failures are not necessarily indicative that this one will fail as well. But it would be foolish to conclude anything about the value of this technology from the current state of the industry, when it should be obvious to anyone that we're in a bull market fueled by hype and speculation.
What you're doing is similar to speculative takes during the early days of the internet and WWW. How it would transform politics, end authoritarianism and disinformation, and bring the world together. When the dust settled after the dot-com crash, actual value of the technology became evident, and it turns out that none of the promises of social media became true. Quite the opposite, in fact. That early optimism vanished along the way.
The same thing happened with skepticism about the internet being a fad, that e-commerce would never work, and so on. Both groups were wrong.
> What skeptics keep doing is reciting known flaws while refusing to reason about what is no longer a limitation. At that point, the disagreement stops being about evidence and starts looking like bias.
Skepticism and belief are not binary states, but a spectrum. At extreme ends there are people who dismiss the technology altogether, and there are people who claim that the technology will cure diseases, end poverty, and bring world prosperity[1].
I think neither of these viewpoints are worth paying attention to. As usual, the truth is somewhere in the middle. I'm leaning towards the skeptic side simply because the believers are far louder, more obnoxious, and have more to gain from pushing their agenda. The only sane position at this point is to evaluate the technology based on personal use, discuss your experience with other rational individuals, and wait for the hype to die down.
>I'm confused. So you're agreeing with me, up until the very last part of the last sentence...? If the "noise overwhelms the signal", why are "trendlines the best approximation we have"? We have reliable data of past outcomes in similar scenarios, yet the most recent noisy data is the most valuable? Huh?
Let me help you untangle the confusion. Historical data on other phenomena is not a trendline for AI taking over your job. It's a typical logical mistake people make. It's reasoning via analogy: because this trend happened for A, and A fits B like an analogy, therefore what happened to A must happen to B.
Why is that stupid logic? Because there are thousands of things that fit B as an analogy. And out of those thousands of things that fit, some failed and some succeeded. What you're doing, and not realizing, is that you are SELECTIVELY picking the analogy you like to use as evidence.
When I speak of a trendline, it's dead simple. Literally look at AI as it is now, as it was in the past, and use that to project into the future. Look at exact data of the very thing you are measuring rather than trying to graft some analogous thing onto the current thing and make a claim from that.
>What you're doing is similar to speculative takes during the early days of the internet and WWW. How it would transform politics, end authoritarianism and disinformation, and bring the world together. When the dust settled after the dot-com crash, actual value of the technology became evident, and it turns out that none of the promises of social media became true. Quite the opposite, in fact. That early optimism vanished along the way.
Again same thing. The early days of the internet is not what's happening to AI currently. You need to look at what happened to AI and software from the beginning to now. Observe the trendline of the topic being examined.
>I think neither of these viewpoints are worth paying attention to. As usual, the truth is somewhere in the middle. I'm leaning towards the skeptic side simply because the believers are far louder, more obnoxious, and have more to gain from pushing their agenda. The only sane position at this point is to evaluate the technology based on personal use, discuss your experience with other rational individuals, and wait for the hype to die down.
Well if you look at the pace and progress of AI, the quantitative evidence points against your middle ground opinion here. It's fashionable to take the middle ground because moderates and grey areas seem more level headed and reasonable than extremism. But this isn't really applicable to reality is it? Extreme events that overload systems happen in nature all the time, taking the middle ground without evidence pointing to the middle ground is pure stupidity.
So all you need to look at is this: in the past decade, look at the progress we've made until now. A decade ago AI via ML was non-existent. Now AI generates movies, music and code, and unlike AI in music and movies, code is actually being used by engineers.
That's ZERO to coding in a decade. What do you think the next decade will bring? Coding to what? That is reality and the most logical analysis. Sure it's ok to be a skeptic, but to ignore the trendline is ignorance.
I think a really good takeaway is that we're bad at predicting the future. That is the most solid prediction of history. Before we thought speech recognition was impossible, we thought it would be easy. We thought a lot of problems would be easy, and it turned out a lot of them were not. We thought a lot of problems would be hard, and we use those technologies now.
Another lesson history has taught us though, is that people don't defend narratives, they defend status. Not always successfully. They might not update beliefs, but they act effectively, decisively and sometimes brutally to protect status. You're making an evolutionary biology argument (which is always shady!) but people see loss of status as an existential threat, and they react with anger, not just denial.
> What is happening now. What the trendlines look like. What follows if those trendlines continue. Output per developer is rising. Time from idea to implementation is collapsing. Junior and mid level work is disappearing first. Teams are shipping with fewer people. These are not hypotheticals.
My dude, I just want to point out that there is no evidence of any of this, and a lot of evidence of the opposite.
> If you want to engage honestly, you stop citing the past and start following the numbers. You accept where the trendlines lead, even
“There is no evidence” is not skepticism. It’s abdication. It’s what people say when they want the implications to go away without engaging with anything concrete. If there is “a lot of evidence of the opposite,” the minimum requirement is to name one metric, one study, or one observable trend. You didn’t. You just asserted it and moved on, which is not how serious disagreement works.
“You first, lol” isn’t a rebuttal either. It’s an evasion. The claim was not “the labor market has already flipped.” The claim was that AI-assisted coding has changed individual leverage, and that extrapolating that change leads somewhere uncomfortable. Demanding proof that the future has already happened is a category error, not a clever retort.
And yes, the self-delusion paragraph clearly hit, because instead of addressing it, you waved vaguely and disengaged. That’s a tell. When identity is involved, people stop arguing substance and start contesting whether evidence is allowed to count yet.
Now let’s talk about evidence, using sources who are not selling LLMs, not building them, and not financially dependent on hype.
Martin Fowler has explicitly written about AI-assisted development changing how code is produced, reviewed, and maintained, noting that large portions of what used to be hands-on programmer labor are being absorbed by tools. His framing is cautious, but clear: AI is collapsing layers of work, not merely speeding up typing. That is labor substitution at the task level.
Kent Beck, one of the most conservative voices in software engineering, has publicly stated that AI pair-programming fundamentally changes how much code a single developer can responsibly produce, and that this alters team dynamics and staffing assumptions. Beck is not bullish by temperament. When he says the workflow has changed, he means it.
Bjarne Stroustrup has explicitly acknowledged that AI-assisted code generation changes the economics of programming by automating work that previously required skilled human attention, while also warning about misuse. The warning matters, but the admission matters more: the work is being automated.
Microsoft Research, which is structurally separated from product marketing, has published peer-reviewed studies showing that developers using AI coding assistants complete tasks significantly faster and with lower cognitive load. These papers are not written by executives. They are written by researchers whose credibility depends on methodological restraint, not hype.
GitHub Copilot’s controlled studies, authored with external researchers, show measurable increases in task completion speed, reduced time-to-first-solution, and increased throughput. You can argue about long-term quality. You cannot argue “no evidence” without pretending these studies don’t exist.
Then there is plain, boring observation.
AI-assisted coding is directly eliminating discrete units of programmer labor: boilerplate, CRUD endpoints, test scaffolding, migrations, refactors, first drafts, glue code. These were not side chores. They were how junior and mid-level engineers justified headcount. That work is disappearing as a category, which is why junior hiring is down and why backfills quietly don’t happen.
You don’t need mass layoffs to identify a structural shift. Structural change shows up first in roles that stop being hired, positions that don’t get replaced, and how much one person can ship. Waiting for headline employment numbers before acknowledging the trend is mistaking lagging indicators for evidence.
If you want to argue that AI-assisted coding will not compress labor this time, that’s a valid position. But then you need to explain why higher individual leverage won’t reduce team size. Why faster idea-to-code cycles won’t eliminate roles. Why organizations will keep paying for surplus engineering labor when fewer people can deliver the same output.
But “there is no evidence” isn’t a counterargument. It’s denial wearing the aesthetic of rigor.
> If there is “a lot of evidence of the opposite,” the minimum requirement is to name one metric, one study, or one observable trend. You didn’t. You just asserted it and moved on, which is not how serious disagreement works.
I treated it with the amount of seriousness it deserves, and provided exactly as much evidence as you did lol. It's on you to prove your statement, not on me to disprove you.
Also, you still haven't provided the kind of evidence you say is necessary. None of the "evidence" you listed is actually evidence of mass change in engineering.
> AI-assisted coding is directly eliminating discrete units of programmer labor: boilerplate, CRUD endpoints, test scaffolding, migrations, refactors, first drafts, glue code.
You are not a professional engineer lol, because most of those things are already automated and have been for decades. What on earth do you think we do every day?
What you’re doing here is interesting: you’re flattening everything into “we already had tools” because admitting this is different forces you to ask which parts of your own day are actually irreplaceable. So instead of engaging the claim about leverage, you retreat to credential checks and nostalgia for scaffolding scripts from 2012.
Also, saying “this has been automated for decades” is only persuasive if those automations ever removed headcount. They didn’t. This does. Quietly. At the margin. That’s why you’re arguing semantics instead of metrics.
And the “you’re not a professional engineer” line is pure tell. People reach for status policing when the substance gets uncomfortable. If the work were as untouched as you imply, there’d be no need to defend it this hard.
lol. It was not only wrong. It was wildly wrong. Your tone reeks of pride and entitlement. You're one of those engineers who thinks he's so great and a step above everyone who is a non-swe engineer.
Personally I think being a swe is easy. It’s one of the easiest “engineering” skills you can learn hence why you have tons of people learning by themselves or from boot camps while other engineering fields require much more rigor and training to be successful. There’s no bootcamp to be a rocket engineer and that’s literally the difference.
The confidence you have here, and how completely off base your intuition is about who I am, is just evidence for how wrong you are everywhere else. You should take that to heart. Everything we are talking about is speculation, but your idiotic statements about me are on-the-ground evidence for how wrong you can be. Why would anyone trust your speculation about AI given how wildly wrong you are at "clocking" me?
> Do you think we sit around, artisinally crafting code the slow way, or something?
This statement is just dripping with raw arrogance. It’s insane, it just shows you think you’re better than everyone because you’re a swe. Let me get one thing straight, I’m a swe with tons of experience (likely more than you) and I’m proud of my technical knowledge and depth, but do I think that other “non swes” just look at us as if we are artisans? That’s some next level narcissism. It’s a fucking job role bro, we’re not “artisans” and nobody thinks of us that way, get off your high horse.
Also wtf do you mean by the “slow” way? Do you have communication issues? Not only will a non swe not understand you but even a swe doesn’t have a clue what the “slow” way means.
>We don't have test engineers or QA nearly as much as we used to, and a lot of IT work is automated, too.
Oh like automated testing or infra as code?? Ooooh such a great engineer you are for knowing these unknowable things that any idiot can learn. Thanks for letting me know a lot of IT work is “automated.” This area is one of the most mundane areas of software engineering, a bunch of rote procedures and best practices.
Also your “my dude” comments everywhere make you look not as smart as you probably think you look. Just some advice for you.
my blog with a few posts that have been on HN front page (eg [0]), but under my old domain davnicwil.com which unfortunately was poached after I accidentally let it lapse. Doh.
the waste isn't a win, of course, but it's the downside of a tradeoff that is massively weighted towards the upsides for society -- that is, (otherwise) completely clean, always-on, high-capacity energy production.
We understand very well how to safely handle nuclear waste and make it a very (very) low risk downside.
looking backwards in the supply chain for other externalities is a good point, but I'm not sure any energy production method is exempt from this?
Also, by the way, my perspective isn't about nuclear Vs X (wind turbines etc) - I like all the ones that are net clean and useful in different circumstances as part of a mix.
I'm just addressing the narrower point about whether nuclear per se is a net benefit for society, which I believe it is, massively.
much has been written about the deteriorating quality of iOS.
There's bluntly not strong external evidence that software quality is a driving priority at Apple in recent years, so it most probably follows that concerns about maintainability aren't either.
I would say that while LLMs do sometimes improve productivity, I flatly cannot believe a claim (at least without direct demonstration or evidence) that one person is doing the work of 20 with them, in December 2025 at least.
I mean from the off, people were claiming 10x probably mostly because it's a nice round number, but those claims quickly fell out of the mainstream as people realised it's just not that big a multiplier in practice in the real world.
I don't think we're seeing this in the market, anywhere. Something like 1 engineer doing the job of 20, what you're talking about is basically whole departments at mid sized companies compressing to one person. Think about that, that has implications for all the additional management staff on top of the 20 engineers too.
It'd either be a complete restructure and rethink of the way software orgs work, or we'd be seeing just incredible, crazy deltas in the output of software companies this year, of the type that couldn't be ignored; they'd be impossible not to notice.
This is just plainly not happening. Look, if it happens, it happens, 26, 27, 28 or 38. It'll be a cool and interesting new world if it does. But it's just... not happened or happening in 25.
I would say it varies from 0x to a modest 2x. It can help you write good code quickly, but, I only spent about 20-30% of my time writing code anyway before AI. It definitely makes debugging and research tasks much easier as well. I would confidently say my job as a senior dev has gotten a lot easier and less stressful as a result of these tools.
One other thing I have seen however is the 0x case, where you have given too much control to the LLM, it codes both you and itself into Pan's Labyrinth, and you end up having to take a weed whacker to the whole project or start from scratch.
That's why you don't use LLMs as a knowledge source without giving them tools.
"Agents use tools in a loop to achieve a goal."
If you don't give any tools, you get hallucinations and half-truths.
But you give one a tool to do, say, web searches and it's going to be a lot smarter. That's where 90% of the innovation with "AI" today is coming from. The raw models aren't getting that much smarter anymore, but the scaffolding and frameworks around them are.
Tools are the main reason Claude Code is as good as it is compared to the competition.
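To make that concrete, here's a minimal sketch of the "tools in a loop" idea in Python. call_model and web_search are placeholders I've made up, not any real vendor API; the point is just the shape of the loop: the model either asks for a tool call or gives a final answer, and the loop feeds tool results back until it's done.

    # Minimal sketch of "an agent uses tools in a loop to achieve a goal".
    # call_model() and web_search() are hypothetical stand-ins, not a real API.
    import json

    def web_search(query: str) -> str:
        """Placeholder tool: a real one would hit a search API."""
        return f"(pretend search results for: {query})"

    TOOLS = {"web_search": web_search}

    def call_model(messages: list[dict]) -> dict:
        """Placeholder LLM call. A real model would return either a tool
        request or a final answer; this stub just finishes immediately."""
        return {"type": "final", "content": "stub answer"}

    def run_agent(goal: str, max_steps: int = 10) -> str:
        messages = [{"role": "user", "content": goal}]
        for _ in range(max_steps):
            reply = call_model(messages)
            if reply["type"] == "final":      # model says it's done
                return reply["content"]
            # otherwise the model asked for a tool; run it, feed the result back
            tool = TOOLS[reply["tool_name"]]
            result = tool(**reply["arguments"])
            messages.append({"role": "tool", "content": json.dumps({"result": result})})
        return "gave up after max_steps"

    print(run_agent("What changed in Python 3.13?"))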
> The raw models aren't getting that much smarter anymore, but the scaffolding and frameworks around them are.
yes, that is my understanding as well, though it gets me thinking: if that is true, then what real value is the LLM on the server compared to doing that locally + tools?
You still can't beat an acre of specialized compute with any kind of home hardware. That's pretty much the power of cloud LLMs.
For a tool-use loop, local models are getting to "OK" levels; when they get to "pretty good", most of my own stuff can run locally, basically just coordinating tool calls.
Of course, step one is always to think critically and check for bad information. I think for research, I mainly use it for things that are testable/verifiable; for example, I used it for a tricky proxy chain setup. I did try to use it to learn a language a few months ago, which I think was counterproductive for the reasons you mentioned.
How can you critically assess something in a field you're not already an expert on?
That Python you just got might look good, but it could be rewritten from 50 lines to 5; it's written in 2010 style, it's not using modern libraries, it's not using modern syntax.
And it is 50 to 5. That is the scale we're talking about in a good 75% of AI-produced code unless you challenge it constantly. Not using modern syntax to reduce boilerplate, over-guarding against impossible state, ridiculous amounts of error handling. It is basically a junior dev on steroids.
Most of the time you have no idea that most of that code is totally unnecessary unless you're already an expert in that language AND libraries it's using. And you're rarely an expert in both or you wouldn't even be asking as it would have been quicker to write the code than even write the prompt for the AI.
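To make the 50-to-5 thing concrete, here's a made-up example (not actual model output, just the shape of it): the first version is the over-guarded, 2010-style code you tend to get back; the second is what someone fluent in modern Python would write.

    # Illustrative only: a made-up example of the over-defensive style,
    # followed by the idiomatic version a reviewer would prefer.

    # --- the verbose, over-guarded version ---
    def count_words_verbose(text):
        if text is None:
            return {}
        if not isinstance(text, str):
            raise TypeError("text must be a string")
        counts = {}
        words = text.split()
        for i in range(len(words)):
            word = words[i]
            if word is not None and word != "":
                word = word.lower()
                if word in counts:
                    counts[word] = counts[word] + 1
                else:
                    counts[word] = 1
        return counts

    # --- the few-line version, using the standard library ---
    from collections import Counter

    def count_words(text: str) -> Counter:
        return Counter(text.lower().split())

    assert count_words("a b a") == count_words_verbose("a b a")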
I use web search (DDG) and I don't think I ever try more than one query in the vast majority of cases. Why? Because I know where the answer is; I'm using the search engine as an index to where I can find it. Like "csv python" to find that page in the docs.
It's entirely dependent on the type of code being written. For verbose, straightforward code with clear-cut test scenarios, one agent running 24/7 can easily do the work of 20 FT engineers. This is a best case scenario.
Your productivity boost will depend entirely on a combination of how much you can remove yourself from the loop (basically, the cost of validation per turn) and how amenable the task/your code is to agents (which determines your P(success)).
Low P(success) isn't a problem if there's no engineer time cost to validation, the agent can just grind the problem out in the background, and obviously if P(success) is high the cost of validation isn't a big deal. The productivity killer is when P(success) is low and the cost of validation is high, these circumstances can push you into the red with agents very quickly.
Thus the key to agents being a force multiplier is to focus on reducing validation costs, increasing P(success) and developing intuition relating to when to back off on pulling the slot machine in favor of more research. This is assuming you're speccing out what you're building so the agent doesn't make poor architectural/algorithmic choices that hamstring you down the line.
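If it helps, this is the back-of-the-envelope model I have in mind. The formula and all the numbers are purely illustrative assumptions, not measurements: delegation pays off when the expected cost of prompting and validating across retries is lower than just doing the task yourself.

    # Toy expected-cost model for "is it worth delegating this task to an agent?"
    # The formula and the numbers are illustrative assumptions only.

    def expected_agent_cost(p_success: float,
                            validation_minutes: float,
                            prompt_minutes: float = 5.0) -> float:
        """Expected engineer-minutes per completed task, if each failed
        attempt still costs a prompt + a validation pass before retrying."""
        attempts = 1.0 / p_success          # expected number of tries (geometric)
        return attempts * (prompt_minutes + validation_minutes)

    def worth_delegating(p_success, validation_minutes, diy_minutes):
        return expected_agent_cost(p_success, validation_minutes) < diy_minutes

    # high P(success), cheap validation: easy win
    print(worth_delegating(0.8, validation_minutes=5, diy_minutes=60))   # True
    # low P(success) AND expensive validation: the "in the red" zone
    print(worth_delegating(0.2, validation_minutes=30, diy_minutes=60))  # False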
Respectfully, if I may offer constructive criticism, I'd hope this isn't how you communicate to software developers, customers, prospects, or fellow entrepreneurs.
To be direct, this reads like a fluff comment written by AI with an emphasis on probability and metrics. P(that) || that.
I've written software used by everything from a local real estate company to the Mars Perseverance rover. AI is a phenomenally useful tool. But be wary of preposterous claims.
I'll take you at your word regarding "respectfully". That was an off-the-cuff attempt to explain the real levers that control the viability of agents under particular circumstances. The target market wasn't your average business potato but someone who might care about a hand-waved "order approximate" estimator, kind of like big-O notation, which is equally hand-wavy.
Given that, if you want to revisit your comment in a constructive way rather than doing an empty drive by, I'll read your words with an open mind.
> It's entirely dependent on the type of code being written. For verbose, straightforward code with clear cut test scenarios, one agent can easily 24/7 the work of 20 FT engineers. This is a best case scenario.
So the "verbose, straightforward code with clear cut test scenarios" is already written by a human?
> I mean from the off, people were claiming 10x probably mostly because it's a nice round number,
Purely anecdotal, but I've seen that level of productivity from the vibe tools we have in my workplace.
The main issue is that 1 engineer needs to have the skills of those 20 engineers so they can see where the vibe coding has gone wrong. Without that it falls apart.
It's an interesting one. We'll have to discover where to draw that line in education and training.
It is an incredible accelerant in top-down 'theory driven' learning, which is objectively good, I think we can all agree. Like, it's a better world having that than not having it. But at the same time there's a tension between that and the sort of bottom-up practice-driven learning that's pretty inarguably required for mastery.
Perhaps the answer is as mundane as one must simply do both, and failing to do both will just result in... failure to learn properly. Kind of as it is today except today there's often no truly accessible / convenient top-down option at all therefore it's not a question anyone thinks about.
How I see it, LLMs aren't really much different than existing information sources. I can watch video tutorials and lectures all day, but if I don't sit down and practice applying what I see, very little of it will stick long term.
The biggest difference I see is, pre-LLM search, I spent a lot more time looking for a good source for what I was looking for, and I probably picked up some information along the way.
Definitely. We have to find ways to replicate this.
One thing I've noticed is that I've actually learned a lot more about things I didn't understand before, just because I built guardrails to make sure they are built exactly the way I like them to be built. And then I've watched my AI build it that way dozens of times now, start to finish. So now I've seen all the steps so many times that I understand a lot more than I did before.
This sort of thing is definitely possible, but we have to do it on purpose.
OP here, yeah, I think that's a really good point.
I feel like the way I'm building this in is a violent maintenance of two extremes.
On one hand, fully merged with AI and acting like we are one being, having it do tons of work for me.
And then on the other hand is like this analog gym where I'm stripped of all my augmentations and tools and connectivity, and I am being quizzed on how good I could do just by myself.
And based on how well I can do in the NAUG scenario, that's what determines what tweaks need to be made to regular AUG workflows to improve my NAUG performance.
Especially for those core identity things that I really care about. Like critical thinking, creating and countering arguments, identifying my own bias, etc.
I think as the tech gets better and better, we'll eventually have an assistant whose job is to make sure that our un-augmented performance is improving, vs. deteriorating. But until then, we have to find a way to work this into the system ourselves.
there could also be an almost chaos-monkey-like approach of cutting off the assistance at indeterminate intervals, so you've got to maintain a baseline of skill / muscle memory to be able to deal with this.
I'm not sure if people would subject themselves to this, but perhaps the market will just serve it to us as it currently does with internet and services sometimes going down :-)
I know for me when this happens, and also when I sometimes do a bit of offline coding in various situations, it feels good to exercise that skill of just writing code from scratch (erm, well, with intellisense) and kind of re-assert that I can do it now we're in tab-autocomplete land most of the time.
But I guess opting into such a scheme would be one-to-one with the type of self determined discipline required to learn anything in the first place anyway, so I could see it happening for those with at least equal motivation to learn X as exist today.
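For fun, a tiny sketch of what that chaos-monkey gate could look like. Everything here is hypothetical (the assist function is a stand-in); it's just the shape of the idea: wrap whatever assist call you use, and some fraction of the time refuse to help so you have to do it unaided.

    # Hypothetical "chaos monkey for AI assistance": randomly refuse to assist
    # so the human keeps their unaided skills warm. Purely a sketch.
    import random

    class AssistCutoff(Exception):
        pass

    def chaos_gate(assist_fn, outage_probability=0.1, seed=None):
        rng = random.Random(seed)
        def gated(*args, **kwargs):
            if rng.random() < outage_probability:
                raise AssistCutoff("assistant unavailable -- do it yourself today")
            return assist_fn(*args, **kwargs)
        return gated

    # usage with a stand-in assistant
    suggest = chaos_gate(lambda prompt: f"(suggested code for: {prompt})", 0.1, seed=42)
    try:
        print(suggest("write a binary search"))
    except AssistCutoff as e:
        print(e)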
> We'll have to discover where to draw that line in education and training.
I'm not sure we (meaning society as a whole) are going to have enough say to really draw those lines. Individuals will have more of a choice going forward, just like they did when education was democratized via many other technologies. The most that society will probably have a say in is what folks are allowed to pay for as far as credentials go.
What I worry about most is that AI seems like it's going to make the already large have/not divide grow even more.
that's actually what I mean by we. As in, different individuals will try different strategies with it, and we the collective will discover what works based on results.
because even supposing you have an interface for your thing under test (which you don't necessarily, nor do you necessarily want to have to) it lets you skip over having to do any fake implementations, have loads of variations of said fake implementations, have that code live somewhere, etc etc.
Instead your mocks are all just inline in the test code: ephemeral, basically declarative therefore readily readable & grokable without too much diversion, and easily changed.
A really good usecase for Java's 'Reflection' feature.
An anonymous inner class is also ephemeral, declarative, inline, capable of extending as well as implementing, and readily readable. What it isn't is terse.
Mocking's killer feature is the ability to partially implement/extend by having some default that makes some sense in a testing situation and is easily instantiable without calling a super constructor.
MagicMock in Python is the single best mocking library though; too many times have I really wanted Mockito to also default to returning a mock instead of null.
Yeah, it's funny, I'm often arguing in the corner of being verbose in the name of plain-ness and greater simplicity.
I realise it's subjective, but this is one of the rare cases where I think the opposite is true, and using the 'magic' thing that shortcuts language primitives in a sort-of DSL is actually the better choice.
It's dumb, it's one or two lines, it says what it does, there's almost zero diversion. Sure you can do it by other means but I think the (what I will claim is) 'truly' inline style code of Mockito is actually a material value add in readability & grokability if you're just trying to debug a failing test you haven't seen in ages, which is basically the usecase I have in mind whenever writing test code.
I cannot put my finger on it exactly either. I also often find the mocking DSL the better choice in tests.
But when there are many tests where I instantiate a test fixture and return it from a mock when the method is called, I start to think that an in-memory stub would have been less code duplication and boilerplate... When some code is refactored to use findByName instead of findById and a ton of tests fail because the mock knows too much implementation detail, then I know it should have been an in-memory stub implementation all along.
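A Python analogue of the tradeoff, since MagicMock came up (the User/repo names and the greeting function are invented for illustration): the inline mock is terse and declarative, but it encodes which method the implementation calls; the stub implements the whole invented interface in one place instead.

    # Illustrative only: inline mock vs in-memory stub, using invented names.
    from unittest.mock import MagicMock

    class User:
        def __init__(self, user_id, name):
            self.user_id, self.name = user_id, name

    def greeting(repo, user_id):
        return f"hello {repo.find_by_id(user_id).name}"

    # 1) Inline mock: short, lives entirely in the test...
    def test_greeting_with_mock():
        repo = MagicMock()
        repo.find_by_id.return_value = User(1, "ada")
        assert greeting(repo, 1) == "hello ada"
        # ...but it hard-codes that the implementation calls find_by_id, so a
        # refactor to look users up another way breaks this test even if the
        # observable behaviour stays the same.

    # 2) In-memory stub: a bit more code, defined once for all tests.
    class InMemoryUserRepo:
        """Implements the whole (invented) repo interface in one place, so
        tests don't encode which lookup method the code under test calls."""
        def __init__(self, users):
            self._users = list(users)
        def find_by_id(self, user_id):
            return next(u for u in self._users if u.user_id == user_id)
        def find_by_name(self, name):
            return next(u for u in self._users if u.name == name)

    def test_greeting_with_stub():
        repo = InMemoryUserRepo([User(1, "ada")])
        assert greeting(repo, 1) == "hello ada"

    test_greeting_with_mock()
    test_greeting_with_stub()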
the one I've noticed being the worst for lock ups is the camera / photos app, which is frankly very surprising given how central the photography usecase appears to be to iPhone sales and therefore Apple's bottom line.
I'm talking, I pull up the camera and try to take literally 4-5 shots quickly and by the 6th there's what feels like seconds of lag between the button press and the photo being taken.
It feels like I'm using an ancient camera phone, or a more modern phone but in extreme heat when the CPU is just throttling everything. But instead, this is a 2 year old iPhone at room temperature.
Interesting, likewise, the Camera app. And other camera apps, I use Halide too.
And Photos. Will it sync? Yes. When? Who the fuck knows? Doesn't matter whether you're on Ethernet or Wifi, gigabit internet. You can quit Photos on both devices, you can then keep Photos open in the foreground... so what? Photos will sync when it wants to, not when you want it to.
you're right! The photo syncing is comically bad, again given the alleged importance of photos in the Apple marketing material. That said I've rarely used it in the past and so wasn't sure if it was a newly degraded experience or had always been that poor.
these things are hard, maybe impossible to define.
For example I mostly agree with your calls/called definition but you also get self-described libraries like React giving you defined structure and hooks that call your code.
React is 100% a framework. They even bring their own DSL. It's absurd to call React a library.
A library is something that can be pulled out of a project and replaced with something else. There's no non-trivial project where replacing React with anything would be possible. Every React web app is built around React.
100% this. To this day the official website still describes itself as a library, and I'm convinced it's completely for marketing reasons, since 'framework' feels heavy and bloated, like Angular or Java Spring, while 'library' feels fast and lightweight, putting you in control.
A framework can be more or less modular: Angular or Ember chose to be 'batteries included', while React chose to be more modular, which is simply choosing the other end of the spectrum on the convenience-versus-flexibility tradeoff.
React ostensibly only cares about rendering, but in a way that forces you to structure your whole data flow and routing according to its rules (lifecycle events or the 'rules of hooks', avoiding mutating data structures); no matter what they say on the official website, that's 100% framework territory.
Lodash or Moment.js, those are actual bona fide libraries, and nobody ever asked whether to use Vue, Angular or Moment.js, or what version of moment-js-router they should use.
I think absurd is a bit strong. It'd be absurd to call something like rails a library.
I think you can probably see that distinction already, but to spell it out React is described as a library precisely because it does just one thing - the view - and leaves it to you to figure out the entirety of the rest of the stack / structure of your app.
Framework, at least to me, but I also believe commonly, means something that lets you build a full application end to end using it.
You can't do that with React unless your app is just something that lives in the browser either in-memory or with some localstorage backing or something. If that's your app, then probably I'd agree React is your framework per se, but that's hardly ever the case.
By the way, back to my original point, I still do think these things are impossible to define and in lots of ways these terms don't matter - if it's a framework for you, it's a framework - but I just had to defend my position since you described it as absurd :-)
React is a framework for blatantly obvious reasons.
It introduces JSX, which is technically speaking its own programming language independent of JavaScript.
It defines hooks like useState() and useContext() for state management, meaning it is not UI only, plus "function based" components that act as a pseudo DSL that is pretending to be functional JavaScript (which it isn't).
Most of the time you're expected to render the entire page body via react.
You couldn't get further away from the idea of a library.
Most people don't, but you absolutely can use React as a library. When React was very new, it was popular to use it as a view layer with backbone.js. In that usage, it's essentially a sophisticated templating library.
You can use Spring as a library if you really want to as well, but it's still a framework.
I'd maybe concede that frameworks are a subset of libraries. You can use most frameworks in a library fashion, but the opposite is not true (you'd need a mini framework).
Agree, for modern React with hooks. A React component looks like a normal function, only you can't call it. Only React can call it, in order to set up its hooks state.
That's because React started as a small, focused library and evolved into even more than a framework: a whole ecosystem, complete with its own best practices.
React's homepage says "The library for" and "Go full-stack with a framework. React is a library. It lets you put components together, but it doesn’t prescribe how to do routing and data fetching. To build an entire app with React, we recommend a full-stack React framework like Next.js or React Router." and "React is also an architecture. Frameworks that implement it let you..."
React's Wikipedia page says "React ... is a free and open-source front-end JavaScript library", and has no mention of Framework.
Why die on a hill that it "is" something it says it isn't?
> Why die on a hill that it "is" something it says it isn't?
Because I think they're wrong about that.
If you'd prefer a different metaphor, this is the windmill I will tilt at.
To provide a little more of a rationale: React code calls the code I write - the JSX and the handlers and suchlike.
It's also pretty uncommon to see React used at the same time as other non-React libraries that handle UI stuff.
Most importantly, the culture and ecosystem of React is one of a framework. You choose React at the start of a project and it then affects everything else you build afterwards.
It's super interesting that you have this definition given your authorship of django (I mean, actually interesting, not 'interesting' :-)
In another comment I used the example of rails as a kind of canonical 'framework' that can potentially do everything for you full stack, and django is in the same category, juxtaposed against something like React that cannot.
To that, I think your last paragraph is the one I agree with most closely. It's true, but only for the view part of the app, right? I think that's where I get stuck on stretching to calling it a framework.
I guess I can see it if you're defining your view/client as a separate logical entity from the rest of the stack. Which is totally reasonable. But I guess just not how I think about it.