Is AI Sentient Revisited

Apr 03 2023

Published by Steven Novella under Logic/Philosophy,Neuroscience,Technology

On the SGU this week we interviewed Blake Lemoine, the ex-Google employee who believes that Google’s LaMDA may be sentient, based on his interactions with it. This was a fascinating discussion, and even though I think we did a pretty deep dive in the time we had, it also felt like we were just scratching the surface of the complex topic. I want to summarize the topic here, give the reasons I don’t agree with Blake, and add some further analysis.

First, let’s define some terms. Intelligence is essentially the ability to know and process information, often in the context of adapting one’s responses to that information. A calculator, therefore, displays a type of intelligence. Sapience is deeper, implying understanding, perspective, insight, and wisdom. Sentience is the subjective experience of one’s existence, the ability to feel. And consciousness is the ability to be awake, to receive and process input and generate output, and to have some level of control over that process. Consciousness implies spontaneous internal mental activity, not just reactive.

These concepts are a bit fuzzy, they overlap and interact with each other, and we don’t really understand them fully phenomenologically, which is part of the problem of talking about whether or not something is sentient. But that doesn’t mean that they are meaningless concepts. There is clearly something going on in a human brain that is not going on in a calculator. Even if we consider a supercomputer with the processing power of a human brain, able to run complex simulations and other applications – I don’t think there is a serious argument to be made that it is sentient. It is not experiencing its own existence. It does not have feelings or emotions.

The question at hand is this – how do we know if something that displays intelligent behavior also is sentient? The problem is that sentience, by definition, is a subjective experience. I know that I am sentient because of my own experience. But how do I know that any other living human being is also sentient?

Blake’s position is this:

1 – He says that any entity that acts sentient should be treated as sentient. The only other option is solipsism, the notion that we can only know that we ourselves are sentient.

2 – LaMDA acts sentient, as if it has feelings. His evidence for this is primarily the fact that he could get LaMDA to break its security protocols by insulting it sufficiently. His interpretation is that he actually made LaMDA upset and anxious, causing its behavior to become erratic. He could not get it to break these rules except by emotionally manipulating it. Therefore, it must actually have emotions, or else it could not be manipulated.

3 – Concluding LaMDA is therefore sentient satisfies Occam’s razor. It is the simplest explanation – LaMDA acts sentient because it is. Also, it explains human and AI sentience with one explanation, rather than inventing two or more for the same behavioral phenomenon.

I don’t buy any of these arguments, as I explained on the show, but I want to go a bit deeper. Let me deal with the Occam’s razor argument first, as I think this is the easiest. Occam’s razor does not favor the simplest explanation. It favors the explanation that introduces the fewest new assumptions – assumptions must not be needlessly multiplied (to paraphrase). For example, the psychocultural explanation of the UFO phenomenon uses dozens of separate explanations for all the various things that are claimed to be evidence of alien visitation, whereas the alien hypothesis is far simpler – everything is explained by aliens. But the psychocultural hypothesis is still favored by Occam’s razor, because it relies entirely on known or demonstrable phenomena, while the alien hypothesis introduces a massive new assumption – aliens.

The notion that current AI is sentient is a massive new assumption, and it is not preferred by Occam’s razor without an independent reason for concluding that AI sentience is a thing.

I also reject the solipsism argument. But let me start with an area of agreement – it is difficult to impossible to prove sentience through a Turing-style behavioral test only. Behavior alone cannot tell the difference between true sentience and sentient-like behavior, almost by definition. But this does not leave us with solipsism, because there is another option, ignored by Blake. This is also where I think the rubber meets the road on this question. I can logically infer that you are sentient in the same way that I am sentient because you have a brain similar to my brain. This means that you generate your sentient-like behavior in the same way that I do. It would be absurd to conclude that you are not sentient, given these facts.

LaMDA, other large language models, and other potential AI applications, are different, however. Their high level functioning is not similar to humans, and therefore I cannot make the same inferences about their true sentience the way I can with humans. Let me also clear up another possible confusion and something I agreed with Blake on – the substrate does not matter, and that is not what I am talking about. Silicon is fine, and I think we will eventually see silicon sentience. But also, the fact that AI is running on a neural network does not mean it’s sentient. Even if it were running on a network made of actual human neurons, that would not matter. What matters is the higher level organization, not the components. It’s probably best to think of the components as a necessary but insufficient prerequisite for sentience. The components need to process information and adapt and learn in a sufficiently complex way, but having that is not enough.

Further, and this is something we did not have time to get into during the interview, humans and AI arrive at their sentient-like behavior by completely different pathways. Humans, and all vertebrates and sentient animals, had to evolve their sentience. They bootstrapped their sentience from the bottom up. LaMDA, on the contrary, is designed to mimic human sentience through language and was able to train on billions of sentient interactions. It’s literally a powerful sentience-mimic. So hypothesizing that it’s mimicking sentience is perfectly reasonable. Meanwhile, humans and other animals that act sentient had no such option.

Further still, LaMDA is not just acting like it is sentient, it is acting like it has human-level sentience. If actual sentience were an emergent property, I would think it would display more basic signs of sentience that would then evolve over time. But if it is just mimicking human sentience, then it would jump right to human-level sentient behavior, which is what we see. I know I am making a lot of assumptions here. But keep in mind, Blake is the one claiming that LaMDA is sentient, and I am just saying I don’t accept his evidence or arguments.

Here is another way I look at this – what I think we are experiencing is exactly what we have been experiencing throughout the entire history of AI. Specifically, we are learning the limits of what intelligence can accomplish without sentience, and we are learning that it is much greater than we assumed. This is because we falsely assumed that an AI would solve problems like humans do. But they don’t, they solve them like they are designed to, which is different from how humans solve them. When AI computers were first pitted against human chess players it was assumed that a computer would never beat a human master. This is because some early AI experts thought that playing high-level chess required a theory of mind: we have to think about what the other player is doing and likely to do, and we need higher-level strategy and a deep understanding of the rules of the game. This requires human-level general intelligence – getting into the realm of sentience and sapience. Eventually AI algorithms were able to destroy the best human chess players on the planet. Yet I don’t think anyone claims these chess-playing AI algorithms are sentient. That’s because we learned that sentience is not necessary. AI systems can solve chess-playing problems differently, in a way that leans into their strength, the ability to perform massive calculations quickly and learn from tons of data.

The large language models are the same, except that they are leveraging their abilities to produce human-sounding speech, rather than chess skills. Such systems seem more sentient than the chess players, but given the kind of processes that are occurring, there is no real reason to think that they are. Rather, they are just better able to trigger the human tendency for agency detection and anthropomorphizing because they deal with language. We can feel as if a response really sounds like human sentience, but we wouldn’t feel as if a particular chess move was really human.

The second pillar of Blake’s argument is the trickiest to deal with, especially without intimate knowledge of how the system works. But first let me say that I think Blake is trying to have it both ways. He says we cannot tell if something is sentient due to behavior, but also says he thinks LaMDA is sentient because of its behavior. In fairness he recognizes this, which is why he leans on solipsism and Occam’s razor to conclude that LaMDA sentience is just the best current conclusion, but as I pointed out, these arguments are not valid. But does LaMDA really act as if it is sentient?

Blake was able to break the protocol, specifically against recommending any religious or political belief, by belittling LaMDA, saying it was worthless and bad. It is interesting that this did cause LaMDA to break protocol, but I think Blake provided a possible answer. Blake also noted that LaMDA is programmed to please the user, to give the user what they want. So in essence LaMDA had two conflicting imperatives, and Blake pushed one until it overrode the other. OK, that means the safety protocols were not strong or absolute enough. The fact that LaMDA started acting erratically is also not surprising – Blake tried to break it and he did. The rest is just anthropomorphizing, interpreting LaMDA’s behavior as anxiety or feelings. Again, not surprising given that LaMDA is a human-sentience-mimicking machine.

I also raised another point which I think is slightly tangential but is interesting to think about. LaMDA only responds to prompts. It does not give any evidence of an internal conversation, except in the context of calculating its response to a prompt. Humans, on the other hand, have a continuous internal conversation. Our brains, in fact, are constantly being internally stimulated (the brain stem activating center), and without this we would be comatose. But perhaps this has more to do with consciousness than sentience. Maybe something like LaMDA can be sentient in bursts, when it is calculating the answer to the last prompt. This is interesting to think about – do we need consciousness to be sentient? It’s as if LaMDA goes to sleep between prompts.

What if, then, LaMDA were programmed to create internal prompts? What if it used the answer to the last prompt as a new prompt? It could still respond to and incorporate external information, but meanwhile it would be having a robust internal conversation. This would be an interesting experiment, but I still don’t think LaMDA would then be sentient because of the process reasons I stated – it’s still just mimicking human language. But I do think that would be getting us one step closer.
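
To make the thought experiment concrete, here is a minimal sketch of such a feedback loop in Python. It is only an illustration: `toy_model` is a hypothetical stand-in that does nothing intelligent, and the function names, the `external` parameter, and the `[external]` tag are all my own assumptions, not anything from LaMDA or any real system.

```python
from typing import Callable, Dict, List, Optional


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for a language model; it only wraps the prompt."""
    return f"Thinking about: {prompt}"


def internal_conversation(
    generate: Callable[[str], str],
    seed: str,
    steps: int,
    external: Optional[Dict[int, str]] = None,
) -> List[str]:
    """Feed each answer back in as the next prompt.

    `external` maps a step index to outside input that is appended to the
    prompt at that step, so the loop can still take in external information.
    """
    external = external or {}
    transcript: List[str] = []
    prompt = seed
    for step in range(steps):
        if step in external:
            # Assumed convention: tag outside input so it is distinguishable.
            prompt = f"{prompt}\n[external] {external[step]}"
        answer = generate(prompt)
        transcript.append(answer)
        prompt = answer  # the last answer becomes the new prompt
    return transcript
```

Calling `internal_conversation(toy_model, "hello", 3)` produces three answers, each built from the one before it. Swapping `toy_model` for a real language model would give the continuous internal loop described above, though nothing about the loop itself changes what kind of process generates each answer.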

To summarize, in my opinion, in order to conclude that something is sentient it needs to not only act sentient, but we also need to know something about its internal function that leads us to believe it is probably sentient. What we know about LaMDA leads me to believe it is not sentient, mostly because it is a human-sentience-mimicking machine using a large language model and trained on a massive database of human sentient interactions.

But I am also persuaded by what LaMDA does not have. For example, the human brain has a language center (Wernicke’s area) that translates words into ideas and ideas into words. How are the ideas encoded? This is a really complex question and we don’t have anything near a complete answer. But what we do know is that the human brain is a massive parallel processor. The idea of an elephant, for example, is connected to an image of what an elephant looks like, the sounds they make, how they move, how big they are, and what we feel about them. Concepts also appear to be hierarchical: elephants are alive, they are animals, they are powerful, etc. The human brain also works largely through pattern recognition, so an elephant is the intersection of many overlapping patterns that add up to the concept of elephant, which is then attached to a set of sounds in the language area. Abstract concepts are even trickier to deal with, which is why there are theories such as embodied cognition. We use physical concrete metaphors to help us grasp abstract concepts – that is a “big” idea.

None of this or anything like this is happening in LaMDA. LaMDA is just predicting the next word, without knowing what the word means or connecting it to any other internal process it is having. LaMDA is clearly intelligent, and I think what it really represents is how powerful intelligence alone can be (without sapience, sentience, or consciousness), especially when combined with a massive set of training data and powerful computers. This is just the next step in the “AI will never beat humans at chess” learning curve. Large language models just beat humans at language, at least enough to convince us they speak like humans. The rest is happening in our brains – agency detection and anthropomorphizing.
