
Only thing? Just off the top of my head: That the LLM doesn't learn incrementally from previous encounters. That we appear to have run out of training data. That we seem to have hit a scaling wall (reflected in the performance of GPT-5).

I predict we'll get a few research breakthroughs in the next few years that will make articles like this seem ridiculous.



Re online learning - If I freeze 40 yo Einstein and make it so he can't form new memories beyond 5 minutes, that's still an incredibly useful, generally intelligent thing. Doesn't seem like a problem that needs to be solved on the critical path to AGI.

Re training data - We have synthetic data, and we probably haven't hit a wall. GPT-5 came only 3.5 months after o3. People are reading too much into the tea leaves here. We don't have visibility into the cost of GPT-5 relative to o3. If it's 20% cheaper, that's the opposite of a wall; that's exponential-like improvement. We don't have visibility into the IMO/IOI medal-winning models. All I see are people curve-fitting onto very limited information.


> If I freeze 40 yo Einstein and make it so he can't form new memories beyond 5 minutes, that's still an incredibly useful, generally intelligent thing

A "frozen mind" feels like something not unlike a book - useful, but only with a smart enough "human user", and even so it becomes progressively less useful as time passes.

>Doesn't seem like a problem that needs to be solved on the critical path to AGI.

It definitely is one. I know we are running into definitions, but being able to form novel behavior patterns based on experience is pretty much the essence of what intelligence is. That doesn't necessarily mean that a "frozen mind" would be useless, but it would certainly not qualify as AGI.

>We don't have visibility into the IMO/IOI medal winning models.

There are lies, damned lies, and LLM benchmarks. IMO/IOI performance is not necessarily indicative of performance on any useful task.


>If I freeze 40 yo Einstein and make it so he can't form new memories beyond 5 minutes, that's still an incredibly useful, generally intelligent thing.

But every time you tried to get him to do something, you'd have to teach him from first principles. Good luck getting ChatStein to interact with the internet, to write code, or to design a modern airplane. Even in physics he'd be using antiquated methods and assumptions, and this gets worse as time progresses (as the sibling comment, I believe, was alluding to).

And don't even get me started on the language barrier.

I recently read this short story[1] on the topic so it's fresh on my mind.

[1]https://qntm.org/mmacevedo


Never before have we had a combination of well and poison where polluting the well was both this instantaneous and this easy.

I've yet to see a convincing article for artificial training data.


It does seem like it helps with math, but in a way that demonstrates the futility of the enterprise: "after training the LLM on 10,000,000 examples of K-8 arithmetic it is now superhuman up to 12 digits, after which it falls off a cliff. Also it demonstrably doesn't understand what 'four' means conceptually and it still fails on many trivial counting problems."


Yeah, like another commenter said, if you can get synthetic data with some sort of easily verifiable grounding (math, games, code), models can do very well. This is one of the underpinnings of the reinforcement learning that has driven some advancements in the past year or so (AFAIK).
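A hypothetical sketch of why "verifiable grounding" matters: when the ground truth is mechanical, you can mint unlimited (prompt, answer) pairs and score model outputs with no human in the loop. All names here are made up for illustration, not any lab's actual pipeline.

```python
import random

def make_problem(rng: random.Random, max_digits: int = 4) -> tuple[str, str]:
    # Mint one synthetic arithmetic problem; the ground truth costs
    # nothing to compute, so the dataset is effectively unlimited.
    a = rng.randrange(1, 10 ** max_digits)
    b = rng.randrange(1, 10 ** max_digits)
    op = rng.choice(["+", "-", "*"])
    answer = {"+": a + b, "-": a - b, "*": a * b}[op]
    return f"What is {a} {op} {b}?", str(answer)

def reward(model_output: str, answer: str) -> float:
    # Binary, mechanically verifiable reward - the kind of signal
    # RL-style post-training can optimize against without labelers.
    return 1.0 if model_output.strip() == answer else 0.0
```

The same pattern extends to code (run the tests) and games (check who won); the hard part is that most useful tasks don't have a verifier this cheap.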


> LLM doesn't learn incrementally from previous encounters

This. Lack of any way to incorporate previous experience seems like the main problem. Humans are often confidently wrong as well - and avoiding being confidently wrong is actually something one must learn rather than an innate capability - but humans wouldn't repeat the same mistake indefinitely.


You can gather feedback from inference and funnel that back into model training. It's just very, very hard to do that without shooting yourself in the foot.

The feedback you get is incredibly entangled, and disentangling it to get at the signals that would be beneficial for training is nowhere near a solved task.

Even OpenAI has managed to fuck up there - by accidentally training 4o to be a fully bootlickmaxxed synthetic sycophant. Then they struggled to fix that for a while, and only made good progress at that with GPT-5.


The problem is the kind of "data" users will feed it. It's basically an impossible task to put a continuously learning model online and not have it devolve into the optimal mix of Stalin and Hitler.


An incrementally learning model is pretty hard. That's actually something I am working on right now, and it's completely different from developing/implementing LLMs.


I think that's what it's going to take. Eventually put the learning model in a robot body and send it out into the real world where there's no shortage of training data.


Yea it'll learn real quick what falling in a ravine is like


Cool, got any previous work to share?


Having run out of training data isn't something holding back LLMs in this sense.

But I agree that being confidently wrong is not the only thing they can't do. Programming: great. Maths: apparently great nowadays, since Google and OpenAI have something that can solve most problems on the IMO (even if the models we get to see probably can't). But LLMs produce crazy output when asked to produce stories, they produce crazy output when given long, confusing contexts, and they have some other problems of that sort.

I think much of it is solvable. I certainly have ideas about how it can be done.


> That we appear to have run out of training data.

I think the next iteration of LLMs is going to be "interesting", now that all the websites they used to scrape freely have been increasingly putting up walls.


Also that no companies involved seem to be making a profit, have a reasonable vision to make a profit, or even revenue in the same ballpark as costs.

Except nvidia perhaps


But that's not so unusual these days, no? I'm not terribly knowledgeable in the field, but don't Amazon, Netflix, Uber and such work on this kind of funding structure?


Uh no, Netflix runs a real business with real fundamentals and does things like invest in its content creation pipeline to give its customers a better reason to stay; its price to consumers is set by its costs and desired profit margin, and it has run a pretty straightforward business since the very beginning.

Uber successfully turned a war chest into a partial monopoly of all ride hailing for significant chunks of the world. That was the clear plan from the start, and was always meant to own the market so they could extract whatever rent they want.

Amazon reinvested heavily while competition floundered in order to literally own the market, and has spent every second since squeezing the market, their partners, everyone in the chain to extract ever more money from it.

None of those are even close to buying absurdly overpriced hardware from a monopoly, reselling access to that hardware for less than it costs to run, and doing huge PR sweeps about how what you are building might kill everyone, so we should obviously give them trillions in government dollars, because if an American company isn't the one to kill everyone then we have failed.

"Not terribly knowledgeable in the field"?


Author here.

You’re right in that it’s obviously not the only problem.

But without solving this, it seems like no matter how good the models get, it'll never be enough.

Or, yes, the biggest research breakthrough we need is reliable calibrated confidence. And that’ll allow existing models as they are to become spectacularly more useful.


The biggest breakthrough that we need is something resembling actual intelligence in AI (human or machine, I’ll let you decide where we need it more ;) )


You might be getting downvoted because you editorialized your own title. If it’s obviously not the only thing then don’t add that to the title :)


> Only thing? Just off the top of my head: That the LLM doesn't learn incrementally from previous encounters. That we appear to have run out of training data.

Ha, that almost seems like an oxymoron. The previous encounters can be the new training data!


The old training data was human responses to human questions. From this the bot learned to mimic human responses.

What would be the point of training an LLM on bot answers to human questions? This is only useful if you want an LLM that behaves like an already existing LLM.


Queries are questions, in the sense that they are not the original facts. I don't think they are useful as training data.


In terms of adoption, I think the user is right. That is the only thing stopping adoption of existing models in the real world.


Unclear limits on how much context can be reliably provided and effectively used without degrading the result.


It does. We keep a section of the context window for memory; the LLM, however, is the one deciding what is remembered. Technically, via the system prompt, we can have it remember every prompt if needed.

But memory is a minor thing. Talking to a knowledgeable librarian or professor you never met is the level we essentially need to get it to for this stuff to take off.
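A minimal sketch of the kind of context-window memory section described above. The function name, section labels, and character budget are all assumptions for illustration, not the commenter's actual system:

```python
def build_prompt(system: str, memory: list[str], user_msg: str,
                 budget_chars: int = 2000) -> str:
    # Reserve a fixed slice of the context window for memory notes.
    # When the notes exceed the budget, drop the oldest ones first.
    kept: list[str] = []
    used = 0
    for note in reversed(memory):  # walk newest-first
        if used + len(note) + 1 > budget_chars:
            break
        kept.append(note)
        used += len(note) + 1
    memory_block = "\n".join(reversed(kept))  # restore chronological order
    return f"{system}\n\n[MEMORY]\n{memory_block}\n\n[USER]\n{user_msg}"
```

The key design choice is that memory competes with everything else for context space, so something (here, a crude recency rule; in practice, often the LLM itself) has to decide what survives.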


> That we appear to have run out of training data

And now, in some cases for a while, it is training on its own slop.


The article is the peak of confidently wrong itself, for solid irony points.




