Hacker News

Yeah. Monkeys. Monkeys that write useful C and Python code that needs a bit less revision every time there's a model update.

Can we just give the "stochastic parrot" and "monkeys with typewriters" schtick a rest? It made for novel commentary three or four years ago, but at this point, these posts themselves read like the work of parrots. They are no longer interesting, insightful, or (for that matter) true.



If you think about it, humans necessarily use abstractions, from the edge detectors in the retina to concepts like democracy. But do we really understand? All abstractions leak, and nobody knows the whole stack. Given all the poorly grasped abstractions we rely on, we too are just parroting. How many times do we do things because "that is how they are done", never wondering why?

Take ML itself: people say it's little more than alchemy ("stir the pile"). Are we just parroting approaches that have worked in practice, without real understanding? Is it possible to have centralized understanding, even in principle, or is all understanding distributed among us? My conclusion is that we have a patchwork of partial understanding, stitched together functionally by abstractions. When I go to the doctor, I don't study medicine first, I trust the doctor. Trust takes the place of genuine understanding.

So humans, like AI, use distributed and functional understanding, we don't have genuine understanding as meant by philosophers like Searle in the Chinese Room. No single neuron in the brain understands anything, but together they do. Similarly, no single human understands genuinely, but society together manages to function. There is no homunculus, no centralized understander anywhere. We humans are also stochastic parrots of abstractions we don't really grok to the full extent.


> My conclusion

Are you saying you understood something? Was it genuine? Do you think LLM feels the same thing?


Haha, "I doubt therefore I am^W don't understand"


do llms feel?


seems like that would be the implication, if yes


Great points. We're pattern-matching shortcut machines, without a doubt. In most contexts, not even good ones.

> When I go to the doctor, I don't study medicine first, I trust the doctor. Trust takes the place of genuine understanding.

The ultimate abstraction! Trust is highly irrational by definition. But we do it all day every day, lest we be classified as psychologically unfit for society. Which is to say, mental health is predicated on a not-insignificant amount of rationalizations and self-deceptions. Hallucinations, even.


Every time I read "stochastic parrot," my always-deterministic human brain surfaces this quote:

> “Most people are other people. Their thoughts are someone else's opinions, their lives a mimicry, their passions a quotation.”

- Oscar Wilde, a great ape with a pen


Reading this quote makes me wonder why I should believe that I am somehow special or different, and not just another "other".


That's just it. We're not unique. We've always been animals running on instinct in reaction to our environment. Our instincts are more complex than other animals but they are not special and they are replicable.


The infinite monkey post was in response to this claim, which, like the universal approximation theorem, is useless in practice:

"We have mathematically proven that transformers can solve any problem, provided they are allowed to generate as many intermediate reasoning tokens as needed. Remarkably, constant depth is sufficient."

Like an LLM, you omit the context and browbeat people with the "truth" you want to propagate. Together with the many politically forbidden terms since 2020, let us now also ban "stochastic parrot" in order to have a goodbellyfeel newspeak.


There is also the problem of "stochastic parrot" being constantly used in a pejorative sense, rather than as a neutral term to keep us grounded and skeptical.

Of course, it is an overly broad stroke that doesn't quite capture all the nuance of the model, but the alternative of "come on guys, just admit the model is thinking" is much worse and has much less to do with reality.


> novel commentary three or four years ago,

ChatGPT was released in November 2022. That's one year and ten months ago. Their marketing started in the summer of the same year, still far off from 3-4 years.


But ChatGPT wasn't the first; OpenAI had a coding playground with GPT-2, and you could already code with these models even before that, around 2020, so I'd say it has been 3-4 years.


GPT-3 paper announcement got 200 comments on HN back in 2020.

It doesn't matter when marketing started, people were already discussing it in 2019-2020.

Stochastic parrot: The term was coined by Emily M. Bender[2][3] in the 2021 artificial intelligence research paper "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?" by Bender, Timnit Gebru, Angelina McMillan-Major, and Margaret Mitchell.[4]


Confusing to read your comment. So the term was coined three years ago, but it's been out of date for four years? Seems legit

It could be that the term no longer applies, but there is no way you could honestly make that claim pre-GPT-4, and that's not 3-4 years ago


Text generated by GPT-3 usually makes more sense than your comments.


Aww, did I hurt your feelings by pointing out how nonsensical you were?

Poor boy <3


AI news article comments bingo card:

* Tired ClosedAI joke

* Claiming it's a predictive-text engine that isn't useful for anything

* Safety regulations are either good or bad, depending on who's proposing them

* Fear mongering about climate impact

* Bringing up Elon for no reason

* AI will never be able to [some pretty achievable task]

* Tired arguments from pro-IP / copyright sympathizers


> Tired ClosedAI joke

> Tired arguments from pro-IP / copyright sympathizers

You forgot "Tired ClosedAI joke from anti-IP / copyleft sympathizers".

Remember that the training data debate is orthogonal to the broader debate over copyright ownership and scope. The first people to start complaining about stolen training data were the Free Software people, who wanted a legal hook to compel OpenAI and GitHub to publish model weights sourced from GPL code. Freelance artists took that complaint and ran with it. And while this is technically an argument that rests on copyright for legitimacy, the people who actually own most of the copyrights - publishers - are strangely interested in these machines that steal vast amounts of their work.


Interestingly, there is one missing from the card that seems well appropriate, unless everyone here is a super-smart, math-professor-level genius:

These papers become increasingly difficult to properly comprehend.

…and thus perhaps the plethora of arguably nonsensical follow-ups.


> These papers become increasingly difficult to properly comprehend.

Feed it to ChatGPT and ask for an explanation suited to your current level of understanding (5-year-old, high-school, undergrad, comp-sci grad student, and so on.)

No, really. Try it.


No, really, I've tried it, and it's okay for a crow's-flight view over these papers, but I'd never put my trust in random() to fetch me precisely what I'm looking for.

My daily usage of ChatGPT, Claude, etc. for nearly two years now shows one and the same thing: unless I provide enough of the right context for it to get the job done, the job is never done right. Ever. Accidentally, maybe, but never reliably. And this becomes particularly evident with larger documents.

The pure RAG-based approach is a no-go: you cannot be sure important stuff is not omitted. The "feed the document into context" approach still, by definition, will not work correctly, thanks to all the bias accumulated in the LLM's layers.

So it is a way to approach papers if you already really know what they contain and know the surrounding terminology. But it is really a no-go if you are reading about, say, complex analysis and know nothing about fifth-degree algebra. Sorry, this is not going to work, and it will probably take longer in total time/energy on behalf of the reader.


>* Claiming it's a predictive-text engine that isn't useful for anything

This one is very common on HN, and it's baffling. Even if it's predictive text, who the hell cares if it achieves its goals? If an LLM is actually a bunch of dolphins typing on a keyboard made for dolphins, I couldn't care less, so long as it does what I need it to do. For people who continue to repeat this on HN: why? I just want to know, out of curiosity.

>* AI will never be able to [some pretty achievable task]

Also very common on HN.

You forgot the "AI will never be able to do what a human can do in the exact way a human does it so AI will never achieve x".


> Even if it's predictive text, who the hell cares if it achieves its goals?

Haha ... well in the literal sense it does achieve "its" goals, since it only had one goal which was to minimize its training loss. Mission accomplished!

OTOH, if you mean achieving the user's goals, then it rather depends on what those goals are. If the goal is to save you typing when coding, even if you need to check it all yourself anyway, then I guess mission accomplished there too!

Whoopee! AGI done! Thank you, Dolphins!


I think it's less about what it is than about what it claims to be. "Artificial Intelligence"... It's not. Dolphin keyboard squad (DKS)? Then sure.

The "just fancy autocomplete" line is made in response to that claim, as a criticism.


What's wrong with the phrase "artificial intelligence"? To me, it doesn't imply that it's human-like. It's just human created intelligence to me.


Partly because "artificial intelligence" is a loaded phrase which brings implications of AGI along for the ride, partly because "intelligence" is not a well defined term, so an artificial version of it could be argued to be almost anything, and partly because even if you lean on the colloquial understanding of what "intelligence" is, ChatGPT (and its friends) still isn't it. It's a Chinese Room - or a stochastic parrot.


> It's a Chinese Room - or a stochastic parrot.

Show me a resident of a Chinese Room who can do this: https://chatgpt.com/share/66e83ff0-76b4-800b-b33b-910d267a75...

The Chinese Room metaphor was always beneath Searle's intellectual level of play, and it hasn't exactly gotten more insightful with age.


I understand and agree that ChatGPT achieves impressive results, but your appeal to incredulity doesn't make it anything more than it is, I'm afraid.


It's not incredulity, just pointing out the obvious. Searle placed very specific limitations on the operator of the Room. He rests his whole argument on the premise that the operator is illiterate in Chinese, or at least has no access to the semantics of the material stored in the Room. That's plainly not the case with ChatGPT, or it couldn't review its previous answers to find and fix its mistakes.

And you certainly would not get a different response, much less a better one, from the operator of a Chinese Room simply by adding "Think carefully step by step" to the request you hand him.

It's just a vacuous argument from square one, and it annoys me to an entirely-unreasonable extent every time someone brings it up. Add it to my "Stochastic Parrot" and "Infinite Monkeys" trigger phrases, I guess.


> ... He rests his whole argument on the premise that the operator is illiterate in Chinese, or at least has no access to the semantics of the material stored in the Room.

...and yet outputs semantically correct responses.

> That's plainly not the case with ChatGPT, or it couldn't review its previous answers to find and fix its mistakes.

Which is another way of saying, ChatGPT couldn't produce semantically correct output without understanding the input. Disagreeing with which is the whole point of the Chinese Room argument.

Why cannot the semantic understanding be implicitly encoded in the model? That is, why cannot the program I (as the Chinese Room automaton) am following be of sufficient complexity that my output appears to be that of an intelligent being with semantic understanding and the ability to review my answers? That, in my understanding, is where the genius of ChatGPT lies - it's a masterpiece of preprocessing and information encoding. I don't think it needs to be anything else to achieve the results it achieves.

A different example of this is the work of Yusuke Endoh, whom you may know for his famous quines. https://esoteric.codes/blog/the-128-language-quine-relay is to me one of the most astonishing feats of software engineering I've ever seen, and little short of magic - but at its heart it's 'just' very clever encoding. Each program in the relay understands nothing, and yet encodes every subsequent program, including itself. Another example is DNA: how on Earth does a dumb molecule create a body plan? I'm sure there are lots of examples of systems that exhibit such apparently intelligent and subtly discriminative behaviour entirely automatically. Ant colonies!
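For a flavour of how far "just clever encoding" can go, here is the degenerate one-language case of the relay: a minimal Python quine (a standard textbook sketch, not Endoh's code), a program whose output is its own source with no "understanding" anywhere in the loop:

```python
# Minimal Python quine: the two lines below print exactly their own
# source. The string s encodes the whole program, including itself;
# %r inserts s's own repr, and %% becomes a literal %.
s = 's = %r\nprint(s %% s)'
print(s % s)
```

Run it and the output is those two lines verbatim; the quine relay plays the same trick through 128 languages in a ring.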

> And you certainly would not get a different response, much less a better one, from the operator of a Chinese Room simply by adding "Think carefully step by step" to the request you hand him.

Again, why not? It has access to everything that has gone before; the next token is f(all the previous ones). As for asking it to "think carefully", would you feel differently if the magic phrase was "octopus lemon wheat door handle"? Because it doesn't matter what the words mean to a human - it's just responding to the symbols it's been fed; the fact that you type something meaningful to you just obscures that fact and lends subconscious credence to the idea that it understands you.
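The "next token is f(all the previous ones)" point can be made concrete with a toy parrot. This is a hypothetical bigram-context sketch, nothing like a transformer in scale or mechanism, but structurally it is a purely symbolic next-token function: counts in, stochastic token out, no semantics anywhere:

```python
import random
from collections import Counter, defaultdict

def train(corpus, n=2):
    """Count which token follows each n-token context."""
    counts = defaultdict(Counter)
    tokens = corpus.split()
    for i in range(len(tokens) - n):
        counts[tuple(tokens[i:i + n])][tokens[i + n]] += 1
    return counts

def next_token(counts, context, rng):
    """The next token is purely a function of the previous ones."""
    dist = counts.get(tuple(context))
    if not dist:
        return None  # unseen context: the parrot has nothing to say
    tokens, weights = zip(*dist.items())
    return rng.choices(tokens, weights=weights)[0]  # the "stochastic" part

counts = train("the cat sat on the mat the cat ate the fish")
rng = random.Random(0)
print(next_token(counts, ["the", "cat"], rng))  # "sat" or "ate", per the counts
```

Whatever symbols you feed it, meaningful to you or not ("octopus lemon wheat door handle" included), it responds by the same mechanical rule.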

> It's just a vacuous argument from square one, and it annoys me to an entirely-unreasonable extent every time someone brings it up. Add it to my "Stochastic Parrot" and "Infinite Monkeys" trigger phrases, I guess.

With no intent to annoy, I hope you at least understand where I'm coming from, and why I think those labels are not just apt, but useful ways to dispel the magical thinking that some (not you specifically) exhibit when discussing these things. We're engineers and scientists and although it's fine to dream, I think it's also fine to continue trying to shoot down the balloons that we send up, so we're not blinded by the miracle of flight.


> Why cannot the semantic understanding be implicitly encoded in the model?

That just turns the question into "OK, so what distinguishes the model from a machine capable of genuine understanding and reasoning, then?"

At some point you (and Searle) must explain what the difference is in engineering terms, not through analogy or by appeals to ensoulment or by redecorating the Chinese Room with furnishings it wasn't originally equipped with. Having moved the goalpost back to the far corner of the parking garage already, what's your next move?

It's easy to dismiss a "stochastic parrot" by saying that "The next token is a function of all of the previous ones," but welcome to our deterministic universe, I guess... deterministic, that is, apart from the randomness imparted by SGD or thermal noise or what-have-you. Again, how is this different from what human brains do? Von Neumann himself naturally assumed that stored-program machines would be modeled on networks of neuron-like structures (a factoid I just ran across while reading about McCulloch and Pitts), so it's not that surprising that we're finally catching up to his way of looking at it.

At the end of the day we're all just bags of meat trying to minimize our own loss functions. There's nothing special about what we're doing. The magical thinking you're referring to is being done by those who claim "AI isn't doing X" or "AI will never do X" without bothering to define X clearly.

> I don't think it needs to be anything else to achieve the results it achieves.

Exactly, and that's earth-shaking because of the potential it has to illuminate the connection between brains and minds. It's sad that the discussion inevitably devolves into analogies to monkeys and parrots.


> That just turns the question into "OK, so what distinguishes the model from a machine capable of genuine understanding and reasoning, then?"

And that's a great question which is not far away from asking for definitions of intelligence and consciousness, which of course I don't have, however I could venture some suggestions about what we have that LLMs don't, in no particular order:

- Self-direction: we are goal-oriented creatures that will think and act without any specific outside stimulus

- Intentionality: related to the above - we can set specific goals and then orient our efforts to achieve them, sometimes across decades

- Introspection: without guidance, we can choose to reconsider our thoughts and actions, and update our own 'models' by deliberately learning new facts and skills - we can recognise or be given to understand when we're wrong about something, and can take steps to fix that (or choose to double down on it)

- Long term episodic memory: we can recall specific facts and events with varying levels of precision, and correlate those memories with our current experiences to inform our actions

- Physicality: we are not just brains in skulls, but flooded with all manner of chemicals that we synthesise to drive our biological functions, and which affect our decision-making processes; we are also embedded in the real physical world and receiving huge amounts of sensory data almost constantly

> At some point you (and Searle) must explain what the difference is in engineering terms, not through analogy or by appeals to ensoulment or by redecorating the Chinese Room with furnishings it wasn't originally equipped with. Having moved the goalpost back to the far corner of the parking garage already, what's your next move?

While I think that's a fair comment, I have to push back a bit and say that if I could give you a satisfying answer to that, then I may well be defining intelligence or consciousness and as far as I know there are no accepted definitions for those things. One theory I like is Douglas Hofstadter's strange loop - the idea of a mind thinking about thinking about thinking about itself, thus making introspection a primary pillar of 'higher mental functions'. I don't see any evidence of LLMs doing that, nor any need to invoke it.

> It's easy to dismiss a "stochastic parrot" by saying that "The next token is a function of all of the previous ones," but welcome to our deterministic universe, I guess... deterministic, that is, apart from the randomness imparted by SGD or thermal noise or what-have-you. Again, how is this different from what human brains do?

...and now we're onto the existence or not of free will... Perhaps it's the difference between automatic actions and conscious choices? My feeling is that LLMs deliberately or accidentally model a key component of our minds, the faculty of pattern matching and recall, and I can well imagine that in some future time we will integrate an LLM into a wider framework that includes other abilities that I listed above, such as long term memory, and then we may yet see AGI. Side note that I'm very happy to accept the idea that each of us encodes our own parrot.

> Von Neumann himself naturally assumed that stored-program machines would be modeled on networks of neuron-like structures (a factoid I just ran across while reading about McCullough and Pitts), so it's not that surprising that we're finally catching up to his way of looking at it.

Well OK but very smart people in the past thought all kinds of things that didn't pan out, so I'm not really sure that helps us much.

> At the end of the day we're all just bags of meat trying to minimize our own loss functions. There's nothing special about what we're doing. The magical thinking you're referring to is being done by those who claim "AI isn't doing X" or "AI will never do X" without bothering to define X clearly.

I don't see how that's magical thinking, it's more like... hard-nosed determinism? I'm interested in the bare minimum necessary to explain the phenomena on display, and expressing those phenomena in straightforward terms to keep the discussion grounded. "AI isn't doing X" is a response to those saying that AI is doing X, so it's as much on those people to define what X is; in any case I rather prefer "AI is only doing Y", where Y is a more boring and easily definable thing that nonetheless explains what we're seeing.

> Exactly, and that's earth-shaking because of the potential it has to illuminate the connection between brains and minds.

Ah! Now there we agree entirely. Actually I think a far more consequential question than "what do LLMs have that makes them so good?" is "what don't we have that we thought we did?".... but perhaps that's because I'm an introspecting meat bag and therefore selfishly fascinated by how and why meat bags introspect.


Do people really associate AI with AGI?

Because we've been using "AI" to describe things for many years before AGI became mainstream. Companies used to use "AI" to describe basic ML algorithms.

When I see "AI", I just think it's some sort of NLP or ML. I never think it's AGI. AGI is AGI.



