Hacker News | madrox's comments

I encourage everyone to RTFA and not just respond to the headline. This really is a glimpse into where the future is going.

I've been saying "the last job to be automated will be QA" and it feels more true every day. It's one thing to be a product engineer in this era. It's another to be working at the level the author is, where code needs to be verifiable. However, once people stop vibing apps and start vibing kernels, it really does fundamentally change the game.

I also have another saying: "any sufficiently advanced agent is indistinguishable from a DSL." I hadn't considered Lean in this equation, but I put these two ideas together and I feel like we're approaching some world where Lean eats the entire agentic framework stack and the entire operating system disappears.

If you're thinking about building something today that will still be relevant in 10 years, this is insightful.


There are still no successful, useful vibe-coded apps. Kernels are pretty far away, I think.

This is a very strange statement. People don't always announce when they use AI for writing their software since it's a controversial topic. And it's a sliding scale. I'm pretty sure a large fraction of new software has some AI involved in its development.

> new software has some AI involved in its development.

A large part of it is probably just using it as a better search. Like "How do I define a new data type in go?".


I strongly agree with this. The only place where AI is uncontroversial is web search summaries.

The real blockers and time sinks were always bad/missing docs and examples. LLMs bridge that gap pretty well, and of course they do. That's what they're designed to be (language models), not an AGI!

I find it baffling how many workplaces are chasing perceived productivity gains that their customers will never notice instead of building out their next gen apps. Anyone who fails to modernize their UI/UX for the massive shift in accessibility about to happen with WebMCP will become irrelevant. Content presentation is so much higher value to the user. People expect things to be reliable and simple. Especially new users don't want your annoying onboarding flow and complicated menus and controls. They'll just find another app that gives them what they want faster.


Apps are a strange measure because there aren't really any new, groundbreaking ones. PCs and smartphones have mostly done what people have wanted them to do for a while.

There are plenty of groundbreaking apps, but they aren't making billions in advertising revenue, nor do they have large user numbers. I honestly think torrent applications (and most peer-to-peer stuff) are very cool and very useful for small-to-medium groups, but they'll never scale to a billion-user thing.

I do agree it's a weird metric to have, but I can't think of a better one outside of "business". Even that seems like a poor rubric, because the vast majority of people care about things that aren't businesses. And if this "life altering" technology basically amounts to creating digital slaves, then maybe we as a species shouldn't explore the stars.


I think this might miss the point. We put off upgrading to a new RMM at work because I was able to hack together some dashboards in a couple of days. It's not novel and does exactly what we need it to do, no more. We don't need to pay thousands of dollars a month for the bloated SolarWinds stack. We aren't saving lives, we're saving PDFs, so any arguments about five nines and maintainability are irrelevant.

LLMs are going to give us on-demand, one-off software. I think the SaaS market is terrified right now because for decades they've gouged customers for continual bloat and lock-in that we can now escape from. In a single day I was able to build an RMM that fits our needs exactly. We don't need to hire anyone to maintain it because it's simple, like most business applications should be, but SV needs to keep complicating its offerings with bloat to justify crazy monthly costs that should have been a one-time purchase from the start. SV shot itself in the face with AI.

Define "successful"?

Does it need to be HN-popular or a household name? Be in the news?

Or something that saves 50% of time by automating inane manual work from a team?


Name 3 apps that are

1. widely considered successful

2. made by humans from scratch in 2025

It looks like humans and AI are on par in this realm.


To be fair, Claude Code is vibe-coded. It's a terrible piece of software from an engineering (and often usability) standpoint, and the problems run deeper than just the choice of JavaScript. But it is good enough for people to get what they want out of it.

But also, based on what I have heard of their headcount, they are not necessarily saving any money by vibecoding it - it seems like their productivity per programmer is still well within the historical range.

That isn’t necessarily a hit against them - they make an LLM coding tool and they should absolutely be dogfooding it as hard as they can. They need to be the ones to figure out how to achieve this sought-after productivity boost. But so far it seems to me like AI coding is more similar to past trends in industry practice (OOP, Scrum, TDD, whatever) than it is different in the only way that’s ever been particularly noteworthy to me: it massively changes where people spend their time, without necessarily living up to the hype about how much gets done in that time.


> But it is good enough for people to get what they want out of it.

This is the ONLY point of software unless you’re doing it for fun.


> I encourage everyone to RTFA and not just respond to the headline.

This is an example of an article which 'buries the lede'†.

It should have started with the announcement of the new zlib autoformalization (!) https://leodemoura.github.io/blog/2026/02/28/when-ai-writes-... to get you excited.

Then it should have talked about the rest - instead of starting with rather graceless and ugly LLM-written generic prose about AI topics that to many readers is already tiresomely familiar and doubtless was tldr for even the readers who aren't repelled automatically by that.

† or in my terms, fails to 'make you care': https://gwern.net/blog/2026/make-me-care


I am as enthusiastic about formal methods as the next guy, but I very much doubt any LLM-based technique will make it economical to write a substantial fraction of application software in Lean. The LLM can play a powerful heuristic role in searching for proof-bearing code in areas where there is good training data. Unfortunately those areas are few and far between.

Moreover, humans will still need to read even rigorously proved code if only to suss out performance issues. And training people to read Lean will continue to be costly.

Though, as the OP says, this is a very exciting time for developing provably correct systems programming.


LLMs are writing non-trivial math proofs in Lean, and software proofs tend to be individually easier than proofs in math, just more tedious because there are so many more of them in any non-trivial development.

Some performance issues (asymptotics) can be addressed via proof, others are routinely verified by benchmarking.
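To make the "individually easier, just tedious" point concrete, here are two routine lemmas of the kind that dominate software verification. This is a minimal sketch assuming Lean 4 with its built-in `omega` linear-arithmetic tactic; the theorem names are made up for illustration:

```lean
-- Routine facts of the tedious-but-easy kind common in software
-- verification, each closed by the `omega` decision procedure.
theorem add_comm' (a b : Nat) : a + b = b + a := by
  omega

theorem no_overflow (x : Nat) (h : x < 100) : x + 1 ≤ 100 := by
  omega
```

Neither goal would be notable in a math paper, but a verified kernel needs thousands of obligations shaped like these, which is exactly the kind of volume work an LLM-guided proof search can grind through.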


This assumes everything about current capabilities stays static, and it wasn't long ago that LLMs couldn't do math at all. Many were predicting the genAI hype had peaked this time last year.

If you want it to be a question of economics, I think the answer is in whether this approach is more economical than the alternative, which is having people run this substrate. There's a lot of enthusiasm here and you can't deny there has been progress.

I wouldn't be so quick to doubt. It costs nothing to be optimistic.


> and it wasn't long ago before LLMs couldn't do math

They still can't do math.


Pro models won gold at the International Mathematical Olympiad?

[*] According to cloud LLM provider benchmarks.

They have trouble adding two numbers accurately though

Why are they expected to?

If you believe the "AGI is just around the corner" hype...

Going to be hard to convince normies it can do harder things than that if it can't do that

> "any sufficiently advanced agent is indistinguishable from a DSL."

I don't quite follow but I'd love to hear more about that.


If you give an agent a task, the typical agentic pattern is that it calls tools in some non-deterministic loop, feeding the tool output back into the LLM, until it deems the task complete. The LLM internalizes an algorithm.

Another way of doing it is the agent just writes an algorithm to perform the task and runs it. In this world, tools are just APIs and the agent has to think through its entire process end to end before it even begins and account for all cases.

Only the latter is Turing-complete, but the former approaches the latter as it improves.
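The first pattern can be sketched in a few lines. Everything here (the shape of the `llm` callback, the tool registry) is a hypothetical stand-in, not any real framework's API:

```python
def agentic_loop(task, llm, tools):
    """Pattern 1: call tools in a loop, feeding each tool's output
    back to the LLM until it deems the task complete."""
    history = [task]
    while True:
        action = llm(history)          # LLM picks the next tool call, or "done"
        if action["type"] == "done":
            return action["result"]
        result = tools[action["tool"]](*action["args"])
        history.append(result)         # tool output goes back into context

# A deterministic stand-in "LLM" that doubles the latest value twice, then stops.
def fake_llm(history):
    if len(history) < 3:
        return {"type": "tool", "tool": "double", "args": [history[-1]]}
    return {"type": "done", "result": history[-1]}

tools = {"double": lambda x: x * 2}
print(agentic_loop(3, fake_llm, tools))  # 3 -> 6 -> 12
```

The second pattern would collapse this loop into a single call that emits a whole program up front, with tools used as plain APIs inside it; as models improve, the looping version increasingly behaves like that one-shot version.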



No, I get the Clarke reference. But how is an agent a DSL?

Maybe not an agent exactly, but I can see how an agentic application is kind of like a DSL: users have a set of queries and commands they want the computer to act on, but they describe those queries and commands in English rather than with normal programming function calls.

My read was roughly that agents require constraining scaffolding (CLAUDE.md) and careful phrasing (prompt engineering) which together is vaguely like working in a DSL?

If the LLM is able to code it, there is enough training data that you might be better off in a different language that removes the boilerplate.

> RTFA

Sigh. Is there any LLM solution for HN readers to filter out all top-level commenters who haven't RTFA? I don't need the (micro-)shitstorms these people spawn, even if the general HN algo scores them as "interesting".


Every job in engineering is changing right now. Managers aren't immune. I've been an EM for almost 20 years in some flavor or another, and I've been thinking a lot about how I want to adapt to this era.

This is the first time I've seriously considered swapping out of management. Not for any of the reasons the author says, but because:

- I don't feel as confident mentoring others through this period given how much the work is changing

- I find myself enjoying the work more

- EMs tend to have more difficulty justifying their existence at the best of times, let alone in a period of change like this

The AI world will still need EMs. It's just unclear what those EMs will be doing every day and how it will work.


I've been thinking about this as well, and I'm glad the author is talking about it. However, I don't think he took it far enough.

It is correct to say there's near-infinite demand for AI, and supply is limited. It stands to reason that wealthier people will pay more, and therefore get more, out of AI.

However, this has always been true; historically it's just been workers instead of AI. The economics of labor haven't changed. So it will, as always, be a game of how you deploy the workers you hire. Are you generating useless morning briefs, or are you actually generating value for yourself and others with the AI you buy? If you generate more value than the tokens you burn, you'll get ahead.

This will be true in academia as well, the area of interest to the author. He writes as if, before AI, grad-student-level intelligence came for free.

Ok, wait, sorry, bad example...


I got my first tech job in 2001. I've been doing this a while and ridden all the waves.

There are two kinds of waves. The ones that don't require collective belief in them to succeed, and those that do.

The latter are kinds like crypto and social media. The former is mobile...and AI.

If no one else in the world had access to AI except me, I would appear superhuman to everyone in the world. People would see my level of output and be utterly shocked at how I can do so much so quickly. It doesn't matter if others don't use AI for me to appreciate AI. In fact, the more other people don't use AI, the better it works out for me.

I'm sympathetic to people who feel like they are against it on principle because scummy influencers are talking about it, but I don't think they're doing themselves any favors.


> If no one else in the world had access to AI except me, I would appear superhuman to everyone in the world.

You really wouldn't. AI simply isn't that useful because it is so unreliable.


I have found that to be utterly untrue

I think it is a reasonable moral stance to acknowledge such things are possible, yet not wanting to be a part of it. Regarding making it technically impossible to do...I think that is what Anthropic means when they say they want to develop guardrails.

Are the guardrails not part of their core? Isn't that the whole premise of their existence?

If you read the statement, they explicitly state these guardrails don't exist today, and they want to develop them.

Though I have a feeling we're talking about different things. In Claude Code terms, it might want to rm -rf my codebase. You sound like you might want it to never run rm -rf. Anthropic probably wants to catch dangerous commands and send them to humans to approve, like it does today.
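That "catch dangerous commands and ask a human" pattern is easy to sketch. The deny list and `approve` hook below are made up for illustration, not Anthropic's actual implementation:

```python
import re

# Hypothetical guardrail sketch: screen shell commands against a deny
# list and route any match to a human for approval before running it.
DANGEROUS = [
    r"\brm\b.*-[a-z]*r[a-z]*f",    # rm -rf and variants
    r"\bgit\s+push\s+--force\b",   # history rewrites on shared branches
    r"\bdd\b.*\bof=/dev/",         # raw writes to block devices
]

def needs_approval(command: str) -> bool:
    return any(re.search(p, command) for p in DANGEROUS)

def run(command: str, approve=input) -> str:
    if needs_approval(command):
        if approve(f"Allow `{command}`? [y/N] ").strip().lower() != "y":
            return "blocked"
    return f"ran: {command}"
```

The hard part isn't this screening step; it's that a deny list can never enumerate every dangerous command, which is presumably why the statement frames real guardrails as something still to be developed.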


That's my point. They formed Anthropic under the sole mandate of "guardrails first," and now they seemingly don't have them at all. So they're just another AI company with different marketing, not the purely altruistic outfit they want everyone to believe they are.

The ability of some people to never be happy, and to find a way to twist a good situation into bad, will always impress me.

Here we have a company doing something unprecedented but it is STILL not enough for people like you. The DoD could destroy them over this statement, and have indicated an intent to do so, but it's still not enough for you that they stand up to this.

I wonder what life is like being so puritanical and unwilling to accept the good, for it is not perfect! This mindset is the road to a life of bitterness.


It's more that I'm allergic to hypocrisy.

I have noticed that, if I hit my session quota before it resets, Claude gets "sleepy" for a day or so afterward. It's demonstrably worse at tasks...especially complex ones. My cofounder and I have both noticed this.

Our theory is that Claude gets limited if you meet some threshold of power usage.


If only we could look into the future to see who is right and which future is better so we could stop wasting our time on pointless doomerism debate. Though I guess that would come with its own problems.

Hey, wait...


It doesn't work...yet. I agree my stomach churns a little at this sentence. However, paying customers care about reliability and performance. Code review helps with that today, but it's only a matter of time before it becomes more performative than useful in serving those goals at the cost of velocity.


The (multi-)billion-dollar question is when that will happen, I think. Case in point:

the OP is a kid in his 20s describing the history of the last 3 years or so of small-scale AI development (https://www.linkedin.com/in/silen-naihin/details/experience/)

How does that compare to those of us with 15-50 years of software engineering experience working on giant codebases that have years of domain rules, customers, use cases, etc.?

When will AI be ready? Microsoft tried to push AI into big enterprise; Anthropic is doing a better job, but it's all still in its infancy.

Personally for me I hope it won't be ready for another 10 years so I can retire before it takes over :)

I remember when folks on HN all called this AI stuff made up


As a guy in his mid-forties, I sympathize with that sentiment.

I do think you're missing how this will likely go down in practice, though. Those giant codebases with years of domain rules are all legacy now. The question is how quickly a new AI codebase could catch up to that code base and overtake it, with all the AI-compatibility best practices baked in. Once that happens, there is no value in that legacy code.

Any prognostication is a fool's errand, but I wouldn't go long on those giant codebases.


Yeah, agreed - it all depends on how quickly AI (or more aptly, AI-driven work done by humans hoping to make a buck) starts replacing real chunks of production workflows.

“Prediction is hard, especially about the future.” - Yogi Berra

As a hedge, I have personally dived deep into AI coding - actually have been for 3 years now. I've even launched 2 AI startups and am working on a third - but it's all so unpredictable and hardly lucrative yet.

As an over-50-year-old, I'm a clear target for replacement by AI.


That's the problem: the most "noise" regarding AI is made by juniors who are wowed by the ability to vibe-code some fun "side project" React CRUD apps, like compound interest calculators or PDF converters.

No mention of the results when targeting bigger, more complex projects, that require maintainability, sound architectural decisions, etc… which is actually the bread and butter of SW engineering and where the big bucks get made.


>>like compound interest calculators or PDF converters.

Caught you! You must have been on HN very actively these last few days, because those were exactly the projects in the "Show HN: .." category, and you wouldn't be able to name them if you hadn't spent your whole time here :-D

Ha! :-D


This is what people were saying about Rails 20 years ago: it wows the kids who use it to set up a CRUD website quickly but fails at anything larger-scale. They were kind of right in the sense that engineering a large complex system with Rails doesn't end up being particularly easier than with Plone or Mason or what have you. Maybe this will just be Yet Another Framework.


Ruby on Rails is an interesting hype counterpoint.

A substantial number of the breathless LLM hype results come, in my estimation, quicker and better as 15 min RoR tutorials. [Fire up a calculator (from a library), a pretty visualization (from a js library), add some persistence (baked in DB, webhost), customize navigation … presto! You actually built a personal application.]

Fundamental complexity, engineering, scaling gotchas, accessibility needs, and customer insanity aren't addressed. RoR optimizes for some things, and like any other optimization, that's not always meaningful.

LLMs have undeniable utility, natural interaction is amazing, and hunting in Reddit, stackoverflow, and MSDN forums ‘manually’ isn’t a virtue… But when the VC subsidies stop and the psychoses get proper names and the right kind of egg hits the right kind of face over unreviewed code, who knows, maybe we can make a fun hype cycle called “Actual Engineering” (AE®).


> hunting in Reddit, stackoverflow, and MSDN forums ‘manually’ isn’t a virtue

Agreed, but: being able to read and apply the 1st-party documentation is a virtue


I'm currently in a strange position: I am that developer with 15+ years of industry experience, managing a project that's been taken over by a young AI/vibe-code team (against my advice) that plans to do a complete rewrite in a low-code service.

The project was started in the late '00s, so it has a substantial amount of business logic, rules, and decisions. Maybe I'm being an old man shouting at clouds, but I assume (or hope?) the rewrite will fail to deliver whatever they promised the CEO.

So, I guess I'll see the result of this shift soon enough - hopefully at a different company by the time AI-people are done.


The problem is, feedback cycles for projects are long. Like 1-10 years depending on the nature and environment. As the saying goes, the market can remain irrational longer than you can remain solvent.

Maybe the deed is done here, and I'd agree it's not particularly fun, but you could still think about what you can bring to the table in situations like this. Can you work on shortening these pesky feedback cycles? Can you help the team (if they even accept it) with _some_ degree of engineering? It might not be the last time this happens.

I think right now we're seeing some weird stuff going on, but I think it hasn't even properly started yet. Remember when pretty much every company went "agile"? In most cases I've seen they didn't, just wasting time chasing miracles with principles and methodologies few people understand deeply enough to apply. Yet this went on for, what, 10 years?


> How does that compare to those of us with 15-50 years of software engineering experience working on giant codebases that have years of domain rules, customers and use cases etc.

At most of the companies I've worked at, the development team is more like a cluster of individuals who all happen to be contributing to a shared codebase than anything resembling an actual team collaborating on a shared goal. AI-assisted engineering would have helped massively there, because the AI would look beyond the myopic view of any developer focused only on their tiny domain within the bigger whole.

Admittedly though, on a genuinely good team it'll be less useful for a long time.


It's still new, but it's useful now. I'm on the Claude Pro plan personally. I had Claude write a Chrome extension for me this morning. It built something working, close to an MVP; then I hit the Claude Pro limit.

I have access to Claude Code at work. I integrated it with IntelliJ and let it rip on a legacy codebase that uses two different programming languages, plus one of the smaller SCADA platforms, plus hardware logic in a proprietary format used by a vendor tool. It was mostly right, probably 80-90%, with a couple of misunderstandings. No documentation, and I didn't really give it much help; it just kind of...figured it out.

It will be very helpful for refactoring the codebase in the direction we were planning on going, both from the design and maybe implementation perspectives. It's not going to replace anybody, because the product requires having a deep understanding across many disciplines and other external products, and we need technical people to work outside the team with the larger org.

My thinking changes every week. I think it's a mistake to blindly trust the output of the tool. I think it's a mistake to not at least try incorporating it ASAP, just to try it out and take advantage of the tools that everybody else will be adopting or has adopted.

I'm more curious about the impacts on the web: where is the content going to come from? We've seen the downward StackOverflow trend, will people still ask/answer questions there? If not, how will the LLMs learn? I think the adoption of LLMs will eventually drive the adoption of digital IDs. It will just take time.


You're describing genetic algorithms: https://en.wikipedia.org/wiki/Genetic_algorithm


Exactly. As compute increases these algorithms will only get more compelling. You can test and evaluate so many more ideas than any human inventors can generate on their own.


I suppose you could generate prompts from "genes" somehow.
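The loop being described can be sketched minimally. The fitness function, rates, and sizes below are illustrative toys, not a tuned implementation:

```python
import random

# Minimal genetic-algorithm sketch: evolve a bit string toward all ones.
random.seed(0)
TARGET_LEN = 12

def fitness(genome):
    return sum(genome)  # count of 1 bits; maximum is TARGET_LEN

def mutate(genome, rate=0.1):
    return [1 - g if random.random() < rate else g for g in genome]

def crossover(a, b):
    cut = random.randrange(1, TARGET_LEN)  # single-point crossover
    return a[:cut] + b[cut:]

pop = [[random.randint(0, 1) for _ in range(TARGET_LEN)] for _ in range(30)]
for gen in range(100):
    pop.sort(key=fitness, reverse=True)
    if fitness(pop[0]) == TARGET_LEN:
        break
    parents = pop[:10]  # selection with elitism: keep the fittest third
    pop = parents + [mutate(crossover(random.choice(parents),
                                      random.choice(parents)))
                     for _ in range(20)]

print(fitness(pop[0]))
```

Swap the bit string for a prompt and the fitness function for an eval harness and you get the "genes generate prompts" idea from the comment above; the expensive part is always evaluating fitness, which is where the extra compute goes.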


I worry a lot about fads in engineering management. Any time you prescribe process over outcomes, you create performative behavior and bad incentives in any discipline. In my observation, this tends to happen in engineering because senior leaders have no idea how to evaluate EMs in a non-performative way, or as a knee-jerk to some broader cultural behavior. I think this is why you see many successful, seasoned EMs become political animals over time.

My suspicion about why this is the case is rooted in the responsibilities engineering shares with product and design at the management level. In an environment where very little unilateral decision making can be made by an EM, it is difficult to know if an outcome is because the EM is doing well or because of the people around them. I could be wrong, but once you look high enough in the org chart to no longer see trios, this problem recedes.

The author really got me thinking about the timeless aspects of the role underlying fads. I have certainly noticed shifts in management practice at companies over my career, but I choose to believe the underlying philosophy is timeless, like the relationship between day to day software engineering and computer science.

I worry about the future of the EM discipline. Every decade or so, it seems like there is a push to eliminate the function altogether, and no one can agree on the skillset. And yet like junior engineers, this should be the function that grows future leadership. I don't understand why there is so much disdain for it.


"Process over outcome" is something I think it would be easy for anyone to ascribe to a process they didn't like.

In my younger years, I was very cavalier about my approach to programming even at a larger company. I didn't particularly want to understand why I had to jump through so many hoops to access a production database to fix a problem or why there were so many steps to deploy to production.

Now that I'm more experienced, I fully understand all of those guardrails, and as a manager my focus is on streamlining them as much as possible to get maximum benefit with minimum negative impact on the team solving problems.

But this involves a lot of process automation and tooling.


The problem, IMO, tends not to be that there are guardrails in place. It's that they are often built by people who only care about the guardrail part and completely forget that it's supposed to be the last barrier, and that there are other things you can do before people hit a guardrail.


I like your thinking about this problem.

What if teams were integrated groups of engineers, designers, and product people, managed by polymaths with at least some skill in all of these areas. In this case, do you think it would be easier to evaluate the team’s (and thus the manager’s) performance and then higher levels of management would care less about processes and management philosophy?


You're describing the GM (general manager) model, sometimes called the single threaded leader. This does work well in large scale organizations...especially ones where teams are built around projects and outcomes but exist for a finite time. Video game development tends to have this model.

I tend to believe in this model because when I've seen it in action, bad GMs are quickly identified and replaced for the betterment of the project.

It can be challenging to implement for a few reasons.

- It is difficult for a GM to performance manage across all disciplines. This model works best when you aren't interested in talent development.

- It's bad for functional consistency. GMs are focused on their own outcomes and can make the "ship your org chart" problem worse. It requires strong functional gatekeepers as a second-order discipline.


> I don't understand why there is so much disdain for it.

I do. It’s often done by people that become tyrants over their little fiefdom.


That's usually a consequence of bad incentives. Either leadership is selecting for that kind of behavior in managers, or they don't know how to properly select against it.

If a bunch of crap code gets shipped, it isn't always because the engineers are bad. Often it's because they were given a bad deadline. Same with EMs.

