
It doesn't necessarily have to see approximately all the English text ever. Real people don't learn English like that, for example.

It's just that, given what we know about neural networks, it's often easier, simpler, and more effective to increase the amount of training data than to change anything else.



An LLM has nothing in common with a human; it works in ways that have nothing in common with the human brain.


Yes, LLMs and human brains share at most some faint similarities.

Nevertheless, human feats can act as an existence proof of what is possible, including what might be possible for a neural network.

(I'm not sure whether a large language model necessarily needs to be a neural network in the sense of a bunch of linear transformations interleaved with simple non-linear activation functions. But for the sake of strengthening your argument, let's adopt this restrictive definition of LLM.)
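
For concreteness, here's a minimal sketch of that restrictive definition: a plain MLP in NumPy. The layer sizes and the choice of ReLU are arbitrary assumptions for illustration, not anything specific to production LLMs:

    import numpy as np

    def relu(x):
        # a simple non-linear activation, applied elementwise
        return np.maximum(0.0, x)

    def mlp(x, weights, biases):
        # alternate linear transformations with the non-linearity;
        # this is the "restrictive definition" from the comment above
        for W, b in zip(weights[:-1], biases[:-1]):
            x = relu(W @ x + b)
        return weights[-1] @ x + biases[-1]  # last layer stays linear

    rng = np.random.default_rng(0)
    sizes = [8, 16, 16, 4]  # arbitrary layer widths
    weights = [rng.normal(size=(m, n)) for n, m in zip(sizes, sizes[1:])]
    biases = [np.zeros(m) for m in sizes[1:]]
    print(mlp(rng.normal(size=8), weights, biases))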


Real people might not speak every dialect of English. They may not be well versed in local grammatical oddities.


Doesn't seem to be much of a problem in practice?

If someone knew all the math and science in Wikipedia, for example, I think they'd probably be forgiven for not knowing every regionalism.


Unfortunately, models aren't always good at knowing what they don't know (the "out of distribution" data problem), so leaving something out can lead to confidently wrong answers.

And if you want it to be superhuman, then you're by definition not capable of knowing what's important, I guess.
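
A toy illustration of that out-of-distribution failure mode (the data here is made up, and scikit-learn's LogisticRegression just stands in for any probabilistic classifier):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    # train on two tight clusters
    X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(4, 1, (100, 2))])
    y = np.array([0] * 100 + [1] * 100)
    clf = LogisticRegression().fit(X, y)

    # a point unlike anything in the training data
    far_away = np.array([[100.0, 100.0]])
    print(clf.predict_proba(far_away))
    # typically ~[0, 1]: near-certainty, with no signal that the
    # input is far outside the training distribution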


Btw, models like GPT-4 can express that they are not confident.

But that looks like a 'smeared-out' probability distribution over the next token, not like the text produced by an unsure human.
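
One way to make "smeared out" concrete: the model's uncertainty lives in the entropy of the next-token softmax rather than in hedging words. A sketch with made-up logits and a toy 4-token vocabulary (NumPy):

    import numpy as np

    def next_token_entropy(logits):
        # softmax over the vocabulary, then Shannon entropy in bits;
        # low entropy = peaked ("confident"), high = smeared out
        p = np.exp(logits - logits.max())
        p /= p.sum()
        return -(p * np.log2(p + 1e-12)).sum()

    peaked = np.array([10.0, 0.0, 0.0, 0.0])  # made-up logits
    smeared = np.array([1.0, 0.9, 1.1, 1.0])
    print(next_token_entropy(peaked))   # close to 0 bits
    print(next_token_entropy(smeared))  # close to 2 bits (near-uniform over 4)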



