
The best way to understand a car is to build a car. Hardly anyone is going to do that, but we all still use them quite well in our daily lives, in large part because the companies that build them spend time and effort improving them and taking away friction and complexity.

If you want to be an F1 driver it's probably useful to understand almost every part of a car. If you're a delivery driver, it probably isn't, even if you use one 40+ hours a week.



Your example/analogy is useful in the sense that it's usually worthwhile to establish the thought experiment at the boundary conditions.

But in between someone commuting in a Toyota and an F1 driver are many, many people. The best example between the extremes is probably a car mechanic, and even there, there's the oil-change place with the flat fee painted in the window, and there's the Koenigsegg dealership that orders the part from Europe. The guy who tunes those up can afford one himself.

In the use-case segment where just about anyone can do it with a few hours' training, yeah, maybe that investment is zero instead of a week now.

But I'm much more interested in the one where F1 cars break the sound barrier now.


It might make sense to split the car analogy into different users:

1. For the majority of regular users the best way to understand the car is to read the manual and use the car.

2. For F1 drivers the best way to understand the car is to consult with engineers and use the car.

3. For a mechanic / engineer the best way to understand the car is to build and use the car.


Yes, except intelligence isn't like a car: there's no way to break the complicated emergent behaviors of these models into simple abstractions. You can understand an LLM by training one about as well as you can understand a brain by dissecting it.


I think making one would help you understand that they're not intelligent.


Your reply is enough of a zinger that I'll chuckle and not pile on, but there is a very real and very important point here, which is that it is strictly bad to get mystical about this.

There are interesting emergent behaviors in computationally feasible scale regimes, but it is not magic. The people who work at OpenAI and Anthropic worked at Google and Meta and Jump before, they didn't draw a pentagram and light candles during onboarding.

And LLMs aren't even the "magic, got it" ones anymore; the zero-shot robotics JEPA stuff is still "wtf"-level, but LLM scaling is back to looking like a sigmoid and a zillion special cases. Half of the magic factor in a modern frontier company's web chat product these days is an uncorrupted search index.


OK, I, like the other commenter, also feel stupid replying to zingers--but here goes.

First of all, I think a lot of the issue here is the baggage around the word "intelligence"--I guess because believing machines can be intelligent goes against the core belief people have that humans are special. This isn't meant as a personal attack--I just think it clouds thinking.

Intelligence of an agent is a spectrum; it's not a yes/no. I suspect most people would not balk at me saying that ants and bees exhibit intelligent behavior when they look for food and communicate with one another. We infer this from the complexity of their route planning, survival strategies, and ability to adapt to new situations. Now, I assert that those same strategies can not only be learned by modern ML but are often even hard-codable! Since I view intelligence as a measure of an agent's behaviors in a system, such a measure should not distinguish the bee from my hard-wired agent. For me this means hard-coded things can be intelligent, since they can mimic bees (and, with enough code, humans).
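To make the "hard-codable" point concrete, here's a toy sketch of my own (the grid, the scent function, and the forager rule are all made up for illustration): an ant-like agent that reaches food with a few lines of if-this-then-that and no learning at all. A purely behavioral measure of intelligence can't tell it apart from a learned policy that does the same thing.

    # Toy, hand-coded "ant": follow the scent gradient toward food. No learning.
    import numpy as np

    food = np.array([7, 3])

    def scent(pos):
        return -np.linalg.norm(pos - food)      # stronger (less negative) nearer the food

    def hard_coded_forager(pos):
        # Try the four moves and pick whichever most increases the scent.
        moves = [np.array(m) for m in ((1, 0), (-1, 0), (0, 1), (0, -1))]
        return max(moves, key=lambda m: scent(pos + m))

    pos = np.array([0, 0])
    for _ in range(20):
        pos = pos + hard_coded_forager(pos)
    print("ended at", pos, "food at", food)     # reaches the food, then hovers around it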

However, the distribution of behaviors which humans inhabit is prohibitively difficult to code by hand. So we rely on data-driven techniques to search for such distributions in a space rich enough to support complexity at the level of the human brain. As such, I certainly have no reason to believe, just because I can train one, that it must be less intelligent than humans. On the contrary, I believe that in every verifiable domain RL must drive the agent to be the most intelligent (relative to the RL reward) it can be under the constraints--and often it must become more intelligent than humans in that environment.
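As a minimal sketch of the mechanical core of that claim (the 4-state chain environment, rewards, and hyperparameters below are all made up), tabular Q-learning on a tiny verifiable task drives the greedy policy to the reward-optimal one; the argument downthread is about whether this carries over beyond toy settings.

    # Toy sketch: tabular Q-learning on a made-up 4-state chain with a
    # verifiable reward at the right end. The greedy policy converges to the
    # reward-optimal "always go right".
    import numpy as np

    rng = np.random.default_rng(0)
    n_states, n_actions = 4, 2                  # actions: 0 = left, 1 = right
    gamma, alpha, eps = 0.9, 0.1, 0.2
    Q = np.zeros((n_states, n_actions))

    def step(s, a):
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == n_states - 1 else 0.0   # reward only at the right end
        return s_next, r

    for episode in range(3000):
        s = int(rng.integers(n_states))         # random start so every state is visited
        for _ in range(10):
            a = int(rng.integers(n_actions)) if rng.random() < eps else int(Q[s].argmax())
            s_next, r = step(s, a)
            Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])   # Q-learning update
            s = s_next

    print("greedy policy (1 = go right):", Q.argmax(axis=1))   # expect [1 1 1 1]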


So according to your extremely broad definition of intelligence, a Casio calculator is also intelligent?

Sure, if we define anything as intelligent, AI is intelligent.

Is this definition somehow helpful though?


It's not binary...


Eh... kinda. The RL in RLHF is a very different animal from the RL in a Waymo car training pipeline, which is sort of obvious when you see that the former can be done by anyone with some clusters and some talent, while the latter is so hard that even Waymo has a marked preference for operating in July in Chandler, AZ. Everyone else is in the process of explaining why they didn't really want Level 5 per se anyway: all brakes, no gas, if you will.

The Q-value sums that are estimated/approximated by deep networks are famously unstable/ill-behaved under gradient descent in the general case, and it's not at all obvious that "point RL at it" is going to work at all. You get stability and convergence issues, you get stuck in local minima; it's hard and not a mastered art yet, with a lot of "midway between alchemy and chemistry" vibes.
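For a concrete picture of the moving-target issue, here's a minimal sketch of my own (the random features, transitions, and hyperparameters are invented, not anyone's production pipeline): semi-gradient Q-learning with linear function approximation, where the bootstrapped target depends on the very weights being updated, and a periodically synced frozen copy is the usual stabilizer.

    # Toy sketch: semi-gradient Q-learning with linear function approximation.
    # The bootstrapped target max_a' Q(s', a'; w) depends on the same weights w
    # being updated, so the objective moves under the optimizer; the frozen
    # w_target copy, synced periodically, is the standard stabilizer.
    import numpy as np

    rng = np.random.default_rng(0)
    n_states, n_actions, dim = 5, 2, 4
    phi = rng.normal(size=(n_states, n_actions, dim))   # fixed random features
    w = np.zeros(dim)                                   # Q(s, a) ~ phi[s, a] @ w
    w_target = w.copy()                                 # frozen copy for stable targets
    alpha, gamma = 0.1, 0.95

    def q(weights, s):
        return phi[s] @ weights                         # vector of action values

    for t in range(2000):
        s = int(rng.integers(n_states))
        a = int(rng.integers(n_actions))                # uniform behavior policy (off-policy)
        r = float(s == 0)                               # toy reward: +1 for acting in state 0
        s_next = int(rng.integers(n_states))            # toy random transitions

        # Swap w_target for w here and the target chases the weights it trains.
        target = r + gamma * q(w_target, s_next).max()
        td_error = target - q(w, s)[a]
        w += alpha * td_error * phi[s, a]               # semi-gradient TD update

        if t % 100 == 0:
            w_target = w.copy()                         # periodic target-network sync

    print("learned weights:", w)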

The RL in RLHF is more like Learning to Rank in a newsfeed-optimization setting: it's (often) ranked choice over human-rated preferences, with extremely stable outcomes across humans. This phrasing is a little cheeky but gives the flavor: it's Instagram where the reward is "call it professional and useful" instead of "keep clicking".
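To give the learning-to-rank flavor concretely, here's a minimal sketch of the pairwise-preference (Bradley-Terry style) loss commonly used to train reward models; the tensors below are stand-ins for the scalar scores a reward model would assign to chosen/rejected responses.

    # Toy sketch: the pairwise "chosen vs rejected" objective a reward model is
    # typically trained on -- a logistic ranking loss,
    # -log sigmoid(r_chosen - r_rejected), averaged over preference pairs.
    import torch
    import torch.nn.functional as F

    def preference_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
        return -F.logsigmoid(r_chosen - r_rejected).mean()

    # Stand-in scalar scores for a batch of 3 preference pairs.
    r_chosen = torch.tensor([1.2, 0.3, 2.0], requires_grad=True)
    r_rejected = torch.tensor([0.7, 0.9, -0.5], requires_grad=True)

    loss = preference_loss(r_chosen, r_rejected)
    loss.backward()                 # gradients push chosen scores up, rejected down
    print(float(loss))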

When the Bitter Lesson essay was published, it was contrarian, important, and above all aimed at an audience of expert practitioners. The Bitter Bitter Lesson in 2025 is that if it looks like you're in the middle of an exponential process, wait a year or two and the sigmoid will become clear--and we're already there with the LLM stuff. Opus 4 is taking 30 seconds on the biggest cluster billions can buy, and they've stripped off something like 90% of the correctspeak alignment to get that capability lift; we're hitting the wall.

Now, this isn't to say that AI progress is over--new stuff is coming out all the time--but "log scale and a ruler" math is marketing at this point; this was a sigmoid.
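A toy illustration of the "log scale and a ruler" trap (numbers entirely made up): fit a straight line to the log of the early portion of a logistic curve and extrapolate, and you overshoot the true ceiling by orders of magnitude.

    # Toy illustration: early samples from a logistic curve look exponential on
    # a log plot, and the straight-line extrapolation blows past the true
    # ceiling (L = 100 here).
    import numpy as np

    def logistic(t, L=100.0, k=0.8, t0=10.0):
        return L / (1.0 + np.exp(-k * (t - t0)))

    t_early = np.arange(0, 6)                            # only the "exponential-looking" part
    y_early = logistic(t_early)

    b, log_a = np.polyfit(t_early, np.log(y_early), 1)   # ruler on a log scale

    def extrapolate(t):
        return np.exp(log_a + b * t)                     # straight line in log space

    print("extrapolated at t=20:", extrapolate(20.0))    # ~3e5
    print("actual value at t=20:", logistic(20.0))       # ~100, the ceiling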

Edit: don't take my word for it. This is LeCun (who, I will remind everyone, has the Turing Award) giving the Gibbs Lecture, the 10,000-foot mathematical view: https://www.youtube.com/watch?v=ETZfkkv6V7Y


I'm in agreement--RLHF won't lead to beings massively more intelligent than humans. But I said RL, not RLHF.


Well what you said is:

"On the contrary, I believe in every verifiable domain RL must drive the agent to be the most intelligent (relative to RL award) it can be under the constraints--and often it must become more intelligent than humans in that environment."

And I said it's not that simple, in no way demonstrated, unlikely with current technology, and basically, nope.


Ah, you're worried about convergence issues? My (bad) understanding was that the self-driving car stuff is more about the inadequacies of the models with which you simulate training and data collection than about convergence of the algorithms, but I could be wrong. I mean, that statement was just saying that I think you can get RL to converge to close to the optimum--which I agree is a bit of a stretch, as RL is famously finicky. But I don't see why one shouldn't expect this to happen as we tune the algorithms.



