
The 5-year-old counts with an algorithm: they remember the current number (working memory, roughly analogous to context), scan the page, and move their finger to the next letter. They were taught this.

It's not much different from ChatGPT being trained to write a Python script.
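The child's procedure is essentially this loop, a rough sketch where the running count plays the role of working memory:

    def count_letter(word, target):
        count = 0                  # the "remembered" running total (working memory)
        for letter in word:        # the finger scanning letter by letter
            if letter == target:
                count += 1
        return count

    count_letter("strawberry", "r")  # -> 3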

A notable difference is that it's much more efficient to teach something new to a 5-year-old than to fine-tune or retrain an LLM.



A theory behind LLM intelligence is that the layer structure forms some sort of world model with much higher fidelity than simple pattern-matching over text. In specific cases, like where the language is a DSL that maps perfectly to a representation of an Othello gameboard, this appears to actually be the case. But basic operations like returning the number of times the letter r appears in 'strawberry' form a useful counterexample: the LLM has ingested many hundreds of books explaining how letters spell out words and how to count (pretty simple concepts, very easily stored in small amounts of computer memory), and yet its layers apparently couldn't model that from all the input. The failure seems to be an inability to connect the token 'strawberry' with its constituent letters... not exactly high-level reasoning.
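To make the tokenization point concrete, here is a rough sketch; the subword split and IDs below are made up for illustration, not taken from any real vocabulary:

    # Hypothetical BPE-style split of "strawberry" -- real vocabularies differ,
    # but the shape is the same: the model receives opaque token IDs, not letters.
    tokens = ["str", "awberry"]      # illustrative subword pieces
    token_ids = [496, 675]           # illustrative vocabulary IDs

    # Trivial once you can see the characters; opaque if all you ever saw were the IDs:
    print(sum(piece.count("r") for piece in tokens))  # -> 3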

It appears LLMs got RLHF'd into generating suitable Python scripts after the issue was exposed, which is an efficient way of getting better answers, but it feels rather like handing a child who is really struggling with their arithmetic a calculator...
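The "calculator" amounts to something like this one-liner (a guess at the shape of such a generated script, not any particular model's actual output):

    # Counting deferred to the interpreter instead of done in the model's weights:
    print("strawberry".count("r"))  # -> 3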



