> what is the LLM doing differently when it generates tokens that are "wrong" compared to when the tokens are "right"?
When they don't recall correctly, it is hallucination. When they recall perfectly, it is regurgitation/copyright infringement. We find fault either way.
May I remind you that we also hallucinate; memory plays tricks on us. We often google things just to be sure. Hallucination is not the real difference between humans and LLMs.
> Why do humans produce speech?
We produce language to solve bodily, social, and environmental problems. LLMs don't have bodies, but they do have environments, such as a chat room, where the user is the environment for the model. In fact, chat rooms produce trillions of tokens' worth of interaction and immediate feedback per month.
If you look at what happens with those trillions of tokens, they go into the heads of hundreds of millions of people, who use the LLM's assistance to solve their problems and, of course, produce real-world effects. Those effects then show up in the next training set, creating a second feedback loop between the LLM and its environment.
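To make the two loops concrete, here is a toy sketch (purely my own illustration, with made-up function names, not how any real pipeline works): loop 1 is the immediate exchange inside a chat session, loop 2 is the model's output landing back in the next training corpus.

```python
# Toy sketch of the two feedback loops (illustrative only; all names are invented).
corpus = ["human-written text"]  # stand-in for the pretraining data

def train(corpus):
    # "Training" here is just remembering what it has seen.
    return {"seen": list(corpus)}

def chat(model, prompt):
    # Loop 1: immediate feedback inside a single chat session.
    return f"answer to {prompt!r}, drawing on {len(model['seen'])} documents"

for generation in range(3):
    model = train(corpus)
    reply = chat(model, "help me solve a problem")
    # Loop 2: the user acts on the reply, the result ends up on the web,
    # and the next generation of the model trains on it.
    corpus.append(reply)
    print(f"generation {generation}: corpus size {len(corpus)}")
```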
By the way, humans don't produce speech on their own, left in isolation without the rest of humanity as support. We only learn speech when we get together; language is social. The human brain is not so smart on its own, but language accumulates experience across generations. We rely on language for intelligence to a larger degree than we like to admit.
Isn't it a mystery how LLMs learned so many language skills purely from imitating us, without their own experience? It shows just how powerful language is on its own. And it shows it can be independent of substrate.