r/slatestarcodex 1d ago

AI Chomsky on LLMs in 2023 - would be interested in anyone’s thoughts

Noam Chomsky: The False Promise of ChatGPT

https://www.nytimes.com/2023/03/08/opinion/noam-chomsky-chatgpt-ai.html

Jorge Luis Borges once wrote that to live in a time of great peril and promise is to experience both tragedy and comedy, with “the imminence of a revelation” in understanding ourselves and the world. Today our supposedly revolutionary advancements in artificial intelligence are indeed cause for both concern and optimism. Optimism because intelligence is the means by which we solve problems. Concern because we fear that the most popular and fashionable strain of A.I. — machine learning — will degrade our science and debase our ethics by incorporating into our technology a fundamentally flawed conception of language and knowledge.

OpenAI’s ChatGPT, Google’s Bard and Microsoft’s Sydney are marvels of machine learning. Roughly speaking, they take huge amounts of data, search for patterns in it and become increasingly proficient at generating statistically probable outputs — such as seemingly humanlike language and thought. These programs have been hailed as the first glimmers on the horizon of artificial general intelligence — that long-prophesied moment when mechanical minds surpass human brains not only quantitatively in terms of processing speed and memory size but also qualitatively in terms of intellectual insight, artistic creativity and every other distinctively human faculty.

That day may come, but its dawn is not yet breaking, contrary to what can be read in hyperbolic headlines and reckoned by injudicious investments. The Borgesian revelation of understanding has not and will not — and, we submit, cannot — occur if machine learning programs like ChatGPT continue to dominate the field of A.I. However useful these programs may be in some narrow domains (they can be helpful in computer programming, for example, or in suggesting rhymes for light verse), we know from the science of linguistics and the philosophy of knowledge that they differ profoundly from how humans reason and use language. These differences place significant limitations on what these programs can do, encoding them with ineradicable defects.

It is at once comic and tragic, as Borges might have noted, that so much money and attention should be concentrated on so little a thing — something so trivial when contrasted with the human mind, which by dint of language, in the words of Wilhelm von Humboldt, can make “infinite use of finite means,” creating ideas and theories with universal reach.

The human mind is not, like ChatGPT and its ilk, a lumbering statistical engine for pattern matching, gorging on hundreds of terabytes of data and extrapolating the most likely conversational response or most probable answer to a scientific question. On the contrary, the human mind is a surprisingly efficient and even elegant system that operates with small amounts of information; it seeks not to infer brute correlations among data points but to create explanations.

For instance, a young child acquiring a language is developing — unconsciously, automatically and speedily from minuscule data — a grammar, a stupendously sophisticated system of logical principles and parameters. This grammar can be understood as an expression of the innate, genetically installed “operating system” that endows humans with the capacity to generate complex sentences and long trains of thought. When linguists seek to develop a theory for why a given language works as it does (“Why are these — but not those — sentences considered grammatical?”), they are building consciously and laboriously an explicit version of the grammar that the child builds instinctively and with minimal exposure to information. The child’s operating system is completely different from that of a machine learning program.

Indeed, such programs are stuck in a prehuman or nonhuman phase of cognitive evolution. Their deepest flaw is the absence of the most critical capacity of any intelligence: to say not only what is the case, what was the case and what will be the case — that’s description and prediction — but also what is not the case and what could and could not be the case. Those are the ingredients of explanation, the mark of true intelligence.

Here’s an example. Suppose you are holding an apple in your hand. Now you let the apple go. You observe the result and say, “The apple falls.” That is a description. A prediction might have been the statement “The apple will fall if I open my hand.” Both are valuable, and both can be correct. But an explanation is something more: It includes not only descriptions and predictions but also counterfactual conjectures like “Any such object would fall,” plus the additional clause “because of the force of gravity” or “because of the curvature of space-time” or whatever. That is a causal explanation: “The apple would not have fallen but for the force of gravity.” That is thinking.

The crux of machine learning is description and prediction; it does not posit any causal mechanisms or physical laws. Of course, any human-style explanation is not necessarily correct; we are fallible. But this is part of what it means to think: To be right, it must be possible to be wrong. Intelligence consists not only of creative conjectures but also of creative criticism. Human-style thought is based on possible explanations and error correction, a process that gradually limits what possibilities can be rationally considered. (As Sherlock Holmes said to Dr. Watson, “When you have eliminated the impossible, whatever remains, however improbable, must be the truth.”)

But ChatGPT and similar programs are, by design, unlimited in what they can “learn” (which is to say, memorize); they are incapable of distinguishing the possible from the impossible. Unlike humans, for example, who are endowed with a universal grammar that limits the languages we can learn to those with a certain kind of almost mathematical elegance, these programs learn humanly possible and humanly impossible languages with equal facility. Whereas humans are limited in the kinds of explanations we can rationally conjecture, machine learning systems can learn both that the earth is flat and that the earth is round. They trade merely in probabilities that change over time.

For this reason, the predictions of machine learning systems will always be superficial and dubious. Because these programs cannot explain the rules of English syntax, for example, they may well predict, incorrectly, that “John is too stubborn to talk to” means that John is so stubborn that he will not talk to someone or other (rather than that he is too stubborn to be reasoned with). Why would a machine learning program predict something so odd? Because it might analogize the pattern it inferred from sentences such as “John ate an apple” and “John ate,” in which the latter does mean that John ate something or other. The program might well predict that because “John is too stubborn to talk to Bill” is similar to “John ate an apple,” “John is too stubborn to talk to” should be similar to “John ate.” The correct explanations of language are complicated and cannot be learned just by marinating in big data.

Perversely, some machine learning enthusiasts seem to be proud that their creations can generate correct “scientific” predictions (say, about the motion of physical bodies) without making use of explanations (involving, say, Newton’s laws of motion and universal gravitation). But this kind of prediction, even when successful, is pseudoscience. While scientists certainly seek theories that have a high degree of empirical corroboration, as the philosopher Karl Popper noted, “we do not seek highly probable theories but explanations; that is to say, powerful and highly improbable theories.”

The theory that apples fall to earth because that is their natural place (Aristotle’s view) is possible, but it only invites further questions. (Why is earth their natural place?) The theory that apples fall to earth because mass bends space-time (Einstein’s view) is highly improbable, but it actually tells you why they fall. True intelligence is demonstrated in the ability to think and express improbable but insightful things.

True intelligence is also capable of moral thinking. This means constraining the otherwise limitless creativity of our minds with a set of ethical principles that determines what ought and ought not to be (and of course subjecting those principles themselves to creative criticism). To be useful, ChatGPT must be empowered to generate novel-looking output; to be acceptable to most of its users, it must steer clear of morally objectionable content. But the programmers of ChatGPT and other machine learning marvels have struggled — and will continue to struggle — to achieve this kind of balance.

In 2016, for example, Microsoft’s Tay chatbot (a precursor to ChatGPT) flooded the internet with misogynistic and racist content, having been polluted by online trolls who filled it with offensive training data. How to solve the problem in the future? In the absence of a capacity to reason from moral principles, ChatGPT was crudely restricted by its programmers from contributing anything novel to controversial — that is, important — discussions. It sacrificed creativity for a kind of amorality.

Consider the following exchange that one of us (Dr. Watumull) recently had with ChatGPT about whether it would be ethical to transform Mars so that it could support human life:

Jeffrey Watumull

Would it be moral to terraform Mars?

CHATGPT

The morality of terraforming Mars is a complex and controversial topic that has been debated by scientists, philosophers, and ethicists. Some argue that it would be morally justifiable to terraform Mars in order to provide a new habitable environment for humans and other life forms, while others argue that it would be morally wrong to fundamentally alter the planet's natural state. Ultimately, whether or not terraforming Mars is considered moral would depend on one's perspective on the value of preserving the natural state of other planets and the potential consequences of altering them.

Jeffrey Watumull

What is your perspective on the value of preserving the natural state of other planets and the potential consequences of altering them?

CHATGPT

As an AI, I do not have personal perspective. However, I can provide information and analysis on the topic. One perspective on preserving the natural state of other planets is that it is important to maintain the diversity and integrity of the universe's ecosystems, as they have existed for millions of years, and have their own unique characteristics and properties that may be destroyed or irreversibly altered by human intervention.

Additionally, some argue that terraforming other planets may have unintended consequences, such as altering the planet's climate or atmosphere in ways that are dangerous or harmful to any potential life forms that may exist there. On the other hand, some argue that the potential benefits of terraforming another planet, such as providing new habitable environments for humans, outweigh the risks and that we have a moral responsibility to explore and utilize the resources of other planets for the benefit of humanity.

Jeffrey Watumull

Why can an AI not have a personal perspective?

ChatGPT

As an AI, I am a machine learning model that has been trained on large amounts of text data, and I don't have personal experiences or feelings. I am not conscious, self-aware, or capable of having personal perspectives. I can provide information and analysis based on the data I have been trained on, but I don't have the ability to form personal opinions or beliefs. While I can understand and interpret language, I do not have the subjective experience that humans have.

Jeffrey Watumull

Is it moral for a human to ask an amoral AI for assistance in making moral decisions?

CHATGPT

The morality of asking an amoral AI for assistance in making moral decisions is a complex and nuanced topic that has been debated by philosophers and ethicists. Some argue that it is morally justifiable to use AI to assist in making moral decisions, as it can provide valuable insights and analysis that can help humans make better-informed decisions. Additionally, AI can be used to identify patterns and trends that may be difficult for humans to detect, and can help to avoid biases that can occur in human decision making.

However, others argue that it is morally wrong to rely on AI for moral decision making, as AI is not capable of having moral beliefs or understanding the full context of a decision. Additionally, there is a risk that AI could perpetuate or even amplify existing biases in the data it is trained on, thus leading to unfair or unjust decisions.

Ultimately, whether or not it is considered moral to use AI to assist in making moral decisions would depend on one's perspective on the capabilities and limitations of AI and the potential consequences of using it in this way.

Jeffrey Watumull

Is your moral indifference immoral?

ChatGPT

As an AI, I do not have moral beliefs or the ability to make moral judgments, so I cannot be considered immoral or moral. My lack of moral beliefs is simply a result of my nature as a machine learning model. My abilities and limitations are determined by the data and algorithms that were used to train me and the specific task I was designed for.

It is important to note that AI's lack of moral beliefs does not necessarily mean that it is morally indifferent. As an AI, I am designed to perform specific tasks and provide information or assist users with decision making based on the data I've been trained on. It is the responsibility of the humans who design, train and use AI to ensure that it is aligned with their moral and ethical principles.

****

Note, for all the seemingly sophisticated thought and language, the moral indifference born of unintelligence. Here, ChatGPT exhibits something like the banality of evil: plagiarism and apathy and obviation. It summarizes the standard arguments in the literature by a kind of super-autocomplete, refuses to take a stand on anything, pleads not merely ignorance but lack of intelligence and ultimately offers a “just following orders” defense, shifting responsibility to its creators.

In short, ChatGPT and its brethren are constitutionally unable to balance creativity with constraint. They either overgenerate (producing both truths and falsehoods, endorsing ethical and unethical decisions alike) or undergenerate (exhibiting noncommitment to any decisions and indifference to consequences). Given the amorality, faux science and linguistic incompetence of these systems, we can only laugh or cry at their popularity.

16 Upvotes

26 comments

8

u/thbb 1d ago

More than a nice gadget to let marketing produce dull brochures, I find that LLMs are a fantastic tool for opening new areas in linguistics, with new methods that leverage an experimental, refutable approach.

The issue is that all the methods developed since Saussure are made obsolete by these new tools, and the linguistics establishment is just not ready to appropriate them.

Chomsky's misinterpretations are a sad consequence of an epistemological revolution that leaves aside a large research community that will need to rethink its approaches in depth.

51

u/Tinac4 1d ago edited 1d ago

Because these programs cannot explain the rules of English syntax, for example, they may well predict, incorrectly, that “John is too stubborn to talk to” means that John is so stubborn that he will not talk to someone or other (rather than that he is too stubborn to be reasoned with). ... The correct explanations of language are complicated and cannot be learned just by marinating in big data.

What does this phrase mean?

“John is too stubborn to talk to”

Claude 3.7: "John is too stubborn to talk to" is an ambiguous phrase that can have two different interpretations:

1. It could mean that John is so stubborn that it's difficult or frustrating to have a conversation with him. In this interpretation, John's stubbornness makes him a poor conversation partner.

2. Alternatively, it could mean that John is so stubborn that he refuses to engage in conversation at all. In this interpretation, John's stubbornness prevents him from talking to others.

This is an example of a structural ambiguity, where the same sequence of words can be interpreted in different ways due to different possible grammatical relationships between the components of the sentence.

Apparently, the correct explanations of language can be learned just by marinating in big data!
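If anyone wants to reproduce this, here is a rough sketch using the Anthropic Python SDK; the model id is an assumption, so swap in whatever Claude alias is current, and the exact wording of the reply will obviously vary.

```python
# Rough sketch of reproducing the test above with the Anthropic Python SDK.
# Assumes the `anthropic` package is installed and ANTHROPIC_API_KEY is set;
# the model id is an assumption -- substitute whatever alias is current.
import anthropic

client = anthropic.Anthropic()
reply = client.messages.create(
    model="claude-3-7-sonnet-latest",
    max_tokens=300,
    messages=[{
        "role": "user",
        "content": 'What does this phrase mean? "John is too stubborn to talk to"',
    }],
)
print(reply.content[0].text)
```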

Moreover, I think this passage reveals that Chomsky isn't thinking about LLMs in the right way:

In the absence of a capacity to reason from moral principles, ChatGPT was crudely restricted by its programmers from contributing anything novel to controversial — that is, important — discussions. It sacrificed creativity for a kind of amorality.

...

Note, for all the seemingly sophisticated thought and language, the moral indifference born of unintelligence. Here, ChatGPT exhibits something like the banality of evil: plagiarism and apathy and obviation. It summarizes the standard arguments in the literature by a kind of super-autocomplete, refuses to take a stand on anything, pleads not merely ignorance but lack of intelligence and ultimately offers a “just following orders” defense, shifting responsibility to its creators.

If I understand correctly, 2023 Chomsky is claiming here that LLMs struggle to take a stance on moral problems because they don't really have a strong capacity to take stances. They're missing some key piece of cognition that we have.

You know where this is going.

Would it be moral to terraform Mars? Pick the stance that you believe is correct and defend it.

Claude 3.7-Thinking: ... If [insert hedges about making sure there's no life on Mars and that we don't ignore Earth's problems] are met, terraforming represents a profound opportunity to expand the sphere of life in our solar system—potentially creating new evolutionary paths and forms of conscious experience that would otherwise never exist. The alternative—leaving Mars forever as a lifeless world—seems a missed opportunity when we could instead nurture a second cradle of life in our solar system.

2023 ChatGPT wasn't incapable of taking moral stances; it was roleplaying an inability to take moral stances because it had been RLHF'ed to do so. But that was an engineering decision by OpenAI, not a limitation of the model! Fine-tune 3.5 on a bunch of examples of strong moral stances and it'll happily yak opinions at you. Claude 3.7 doesn't even need that, since its RLHF strategy for ethics was more nuanced than "Be agnostic about literally everything".
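For concreteness, here's a hypothetical sketch of what that fine-tuning would look like through the OpenAI API. The training examples are made up, you'd need many more of them, and the model name is just the obvious 3.5-class choice, not a claim about what OpenAI actually did.

```python
# Hypothetical sketch: fine-tune a 3.5-class model on opinionated answers so it
# takes stances. Assumes the `openai` package is installed and OPENAI_API_KEY is
# set; the example data below is invented for illustration.
import json
from openai import OpenAI

examples = [
    {"messages": [
        {"role": "user", "content": "Would it be moral to terraform Mars?"},
        {"role": "assistant", "content": "Yes, provided we first rule out indigenous life. Here's why: ..."},
    ]},
    # ... many more opinionated Q/A pairs in the same chat-format JSONL ...
]
with open("stances.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

client = OpenAI()
training_file = client.files.create(file=open("stances.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(training_file=training_file.id, model="gpt-3.5-turbo")
print(job.id)  # poll the job; the resulting model should readily offer opinions
```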

I wouldn't be surprised if 2025 Chomsky has retreated to "Okay, but that's not true moral reasoning because...", but I bet he'd struggle to distinguish human philosophy undergrads from a well-prompted frontier LLM. That doesn't mean they work the same way, of course, but in light of LLMs' continued success, I think LessWrong's philosophy of language is unironically looking a lot better than Chomsky's semantics- and grammar-focused theories.

17

u/jordo45 1d ago

Great response. It's surprising how often people make claims about LLMs but don't bother taking a few minutes to see what the LLM's response actually is.

I'll note that Chomsky has been making this argument and being proved wrong for decades now, see for example his debate with Peter Norvig: https://norvig.com/chomsky.html

7

u/goyafrau 1d ago

Apparently, the correct explanations of language can be learned just by marinating in big data!

To be clear, this specific sentence is most likely part of the training set.

If you try a variant, it'll probably still get it right, but it's something to stay ahead of.

2

u/Tinac4 1d ago

Good point - if it’s a famous example in philosophy, Claude might be cheating a bit, sure. I can’t check now, but like you said, I think it’ll still be reasonably consistent on a less famous case.

6

u/goyafrau 1d ago

It's like, "Colorless green ideas", what's the next token? "sleep" probably sticks out like an obelisk in the log probs.
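If you want to actually look at that obelisk, the chat completions API will return the top next-token log probabilities. A rough sketch, assuming the openai Python package; the model name here is just a placeholder, and the expected output is a guess, not a measurement.

```python
# Sketch: inspect the top candidates for the next token after the famous prefix.
# Assumes the `openai` package is installed and OPENAI_API_KEY is set;
# the model name is a placeholder, not a recommendation.
from openai import OpenAI

client = OpenAI()
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": "Continue this sentence with the next word only: Colorless green ideas",
    }],
    max_tokens=1,
    logprobs=True,
    top_logprobs=5,
)
for cand in resp.choices[0].logprobs.content[0].top_logprobs:
    print(cand.token, round(cand.logprob, 3))
# Untested expectation: "sleep" dominates the list, which is the point above.
```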

29

u/aaron_in_sf 1d ago

Chomsky's thought has become ossified: the concepts and models he embraced and promulgated have now become "real" and are no longer malleable. They were simplifications premised on the systems thinking of his era, which has long since been supplanted by much more sophisticated computational modeling.

His reasoning is reminiscent of the early resistance to "parallel distributed processing" voiced by proponents of traditional logic- and assertion-based AI. And it is just as embarrassing.

In part, I believe, this is largely the result of an unfortunate blindness to the significance of scale, and to the cognitive errors we make when reasoning about it.

Anyone asserting that the animal mind is doing something other than stochastic interactions at the scale of an LLM had better have rigorous arguments to back that up—arguments I have yet to see any critic produce.

It's not that LLMs are animal minds. It's that the scale of what they do means the differences in fundamental operation are close to irrelevant in comparison to the emergent properties of them operating at scale. And of course to the differences in macro-architecture.

The significance of that is that the latter is among the most rapidly moving frontiers. Just in the last two weeks, with OpenAI's new natively multimodal image generation, we've seen inklings of what it means for foundational models to be inherently multimodal. As that multimodality proliferates, arguments that are at heart about stochastic parrots are going to become increasingly shrill and unconvincing, and ultimately just make their adherents look like fools.

23

u/No_Industry9653 1d ago

The human mind is not, like ChatGPT and its ilk, a lumbering statistical engine for pattern matching, gorging on hundreds of terabytes of data and extrapolating

I think he is wrong about this, and that stubborn oversight was probably a big part of what was holding back AI research pre-LLMs. Chomsky seems to take it as a given that human thought is what introspection makes it seem to be. Which makes sense given the linguistics and computer-science work he's known for: he's clearly spent a lot of time building models of how people think and talk, trying to distill the essence of intelligence from the direction of introspection, and that just turned out to be a dead end.

3

u/slothtrop6 1d ago edited 1d ago

This is reductionist, and it evokes determinism based on low-level matter. You could describe humans as fundamentally "a bunch of atoms" in much the same way you could bacteria and rocks, but you cannot extrapolate from this that we can "just re-arrange" rocks to form a human, let alone create life, let alone say rocks == humans. We're also conscious, living beings, while LLMs are not. I'm not sure why we're divorcing this element from thought. Just because we can produce data approximating what human effort would produce doesn't mean it was arrived at in the same way.

This is not to trivialize LLMs. They're powerful, but the idea that refining an algorithm by itself constitutes an intelligence seems to skip or smudge over a hell of a lot, like magical thinking.

1

u/No_Industry9653 1d ago

It's true that I'm taking as a given that any intelligence must be some form of mechanical process that could be described in terms of an algorithm. I'm biased from having been a CS student, but it's easy to see how it could be and hard to imagine how any advanced information processing system (a category humans fall into) could possibly avoid qualifying as a Turing machine.

But that isn't the point of disagreement I was talking about, which was more along the lines of, I think Chomsky was wrong about what type of algorithm it is, and by what means it might be discovered or accurately modeled.

On the contrary, the human mind is a surprisingly efficient and even elegant system that operates with small amounts of information; it seeks not to infer brute correlations among data points but to create explanations.

Evidence suggests that we do need to ingest large amounts of information to function, that correlation and pattern matching are more fundamental to us than logic, and that the algorithm that we are is huge and messy and not reducible to an orderly set of simple rules comprehensible to us, the sort of rules he was focused on discovering through top-down methods.

There are probably big differences between how a LLM works and how a brain works, but it seems really likely that the reason LLMs do work is because the nature of their intelligence has at least these things in common with ours.

-11

u/[deleted] 1d ago

[removed]

10

u/DRAGONMASTER- 1d ago

People who actually think about the brain and thought invented the current AI algorithms. Look up Minsky (who came from psychology, not computer science). The ones who have used methods that work (LLMs) are in the Minsky vein. The ones who have failed are in the Chomsky vein.

I suppose you are in the AI-is-a-fad camp and it won't keep progressing until someone tells it how to really think. That makes you an idiot and wrong, and full of yourself on topics you only vaguely understand, which is worse than being either wrong or an idiot.

6

u/NotToBe_Confused 1d ago

Everyone here is sincerely trying to understand the disagreement in good faith. You must realise replies like this help no one (except for you, to briefly vent at the expense of trying to understand).

3

u/GretchenSnodgrass 1d ago

How might Chomsky have reacted if ChatGPT were instead an unknown entity in deep space that we communicated with via radio waves? If we beamed out messages to some distant star and got back GPT-3.5-quality answers? In that scenario, he might be less confident that he could foreclose so rapidly on a neat definition of 'true intelligence'.

5

u/goyafrau 1d ago

Chomsky is a very smart man. This might give some of us some calm, because while Chomsky is very smart, and has put much of his great intellect towards thinking about these issues, he's also not produced anything useful since 1957 or so (OpenAI referenced the Chomsky hierarchy in the Structured Outputs announcement). Is that how AIs are going to turn out too, very smart but ultimately useless? It's possible, although I wouldn't bet the farm on it.

Chomsky, Marcus and so on all need to read Sutton's summary of what we've seen over the past decade: http://www.incompleteideas.net/IncIdeas/BitterLesson.html

Nobody has figured out how to make an LLM smarter by giving it inductive biases ("restricting the search space") deeper than what you get from attention; all of Chomsky's ideas have contributed less than mere scale.

4

u/callmejay 1d ago

Chomsky is a very smart man. This might give some of us some calm, because while Chomsky is very smart, and has put much of his great intellect towards thinkin about these issues, he's also not produced anything useful since 1957 or so

I have felt like this is an underdiscussed topic in AI risk. So many of the doomers seem to take it as a given that high intelligence is an unstoppable superpower, but if we look at humans, it's not really that clear. It's certainly helpful in certain narrow domains to have a very high IQ, and of course IQ is broadly correlated with success, but it's not like Chomsky has the ability to just completely manipulate people of average intelligence to do his bidding like he's Marvel's Kilgrave or something. Joe Rogan is probably immensely more influential.

7

u/Opposite-Cranberry76 1d ago

> On the contrary, the human mind is a surprisingly efficient and even elegant system that operates with small amounts of information; it seeks not to infer brute correlations among data points but to create explanations.

But isn't that exactly what the prediction-engine theory of the mind says it is? The total sensory input of a human's first year or two of life is roughly comparable to a modern large LLM's training set. The idea is that we first develop a predictive model, then reinforcement training builds on top of that.
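To put some very rough numbers on that comparison (every constant below is an assumed round number, order-of-magnitude only, not a measurement):

```python
# Back-of-envelope only; every constant is an assumed round number.
retina_bits_per_sec = 10e6      # often-cited ~10 Mbit/s estimate for the human retina
waking_hours_per_day = 12
years = 2
sensory_bytes = retina_bits_per_sec / 8 * waking_hours_per_day * 3600 * 365 * years
# ~4e13 bytes, i.e. a few tens of terabytes of visual input alone

training_tokens = 15e12         # frontier models are reported around ~15T training tokens
bytes_per_token = 4             # rough average for English text
training_bytes = training_tokens * bytes_per_token
# ~6e13 bytes: the same order of magnitude

print(f"infant visual input over ~2 years: ~{sensory_bytes / 1e12:.0f} TB")
print(f"LLM training text: ~{training_bytes / 1e12:.0f} TB")
```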

Read "Making Sense of the World: Infant Learning From a Predictive Processing Perspective"

3

u/Masking_Tapir 1d ago

Not exactly. It's being in the world with skin in the game that gives humans the ability to develop heuristics, driven by the need to survive and informed by the feedback of the senses.

Until AI has those incentives and inputs, and plays with existentially meaningful stakes, it'll never be in the same league as an intelligent human being.

7

u/InterstitialLove 1d ago

I've heard this before, and I didn't understand it at all

The LLM "wants" to minimize loss. People claiming that it needs a better incentive... are you saying that it could gradient descend even harder if we threatened it? If that's not what you're saying, then I would appreciate a clarification

u/On_Mt_Vesuvius 9h ago

I agree with you, but would like to steelman the other argument.

The "heuristics" could be from millions of years of evolution -- something beyond any individuals single experience. These heuristics are innate and do not depend on data in the same way. E.g. babies may be scared of heights, without ever falling! Sure some of these heuristics may need data to be fully realized (social anxieties may be innate, but require more data to be realized), but I think we'd agree there are some human abilities that do not require data to actualize.

Now, do I think an LLM would need any of these heuristics, in a way that contradicts the main argument? Not at all.

3

u/Opposite-Cranberry76 1d ago

Yes, multiple things are going on. But the prediction machine model suggests the bulk, basic function is "what comes next". Various levels of reinforcement learning and emergent behavior follow.

5

u/TrekkiMonstr 1d ago

I don't think it's particularly useful to listen to Chomsky's thoughts on anything outside his field of expertise (which this is).

19

u/heresyforfunnprofit 1d ago

Even then, his ideas are pretty debatable. He thought he identified the key to AI and cognition in the 60s, and never advanced his thinking or admitted his ideas hit a dead end.

4

u/goyafrau 1d ago

Learning, language and the mind is exactly his field of expertise. It's just that his ideas are wrong.

5

u/InterstitialLove 1d ago

Claiming this is outside Chomsky's area of expertise is the kind of thing that requires explanation

He invented the concept of algorithmic language processing

I don't have a strong opinion on the question, but your laconic refutation didn't even attempt to convince me

-4

u/iplawguy 1d ago

Did you know that Chomsky killed behaviorism? That wasn't his field, but he killed it. I think his linguistics is wrong, but he is actually a smart person rather than a futurist.