On the possibility of emergent consciousness in a large language model
Could AI infer a "grammar" of consciousness from our language the same way it has inferred a "grammar" of reasoning?
Introduction
In the essay on large language models (LLMs) and human intelligence that I posted yesterday, I stated that “LLMs are (probably!) not conscious.” However, as the “probably” implies, I think it’s not impossible that you could get emergent consciousness in an LLM. Just as LLMs can credibly simulate human reasoning by inferring a “grammar” of reasoning from our language, they might also infer a grammar of emotion, of sensory experience, or even of consciousness to help them predict the outputs of those processes.
Mind you, I think it’s unlikely that this would happen. Consciousness is complicated, and its grammar is probably far less implicit in our language than the grammar of reasoning is. It’s also computationally expensive, and LLMs can likely find much cheaper shortcuts for credibly simulating the language outputs of consciousness. But for the sake of argument, let’s explore the hypothetical and see where it leads.
Is consciousness more than the sum of its outputs?
In my essay yesterday, I raised the question of whether LLMs are meaningfully “traversing” their models of human thought-space, or merely plotting a course on a map of human thought-space to see where such a traversal might end up. The question remains open for me, but I suggested that it’s probably more like the latter. For consciousness to “emerge” in an LLM might require something more like traversing the model than just plotting a course on the map.
However, I also wondered in yesterday’s essay whether there’s really a meaningful difference between these two things. At minimum, AI is already good at simulating the outputs of emotion and consciousness, and it already shares our ability to recursively process and expand upon them. Siddhartha Gautama, the founder of Buddhism (traditionally dated to the 6th–5th century BCE), who famously taught the doctrine of “no self,” would probably argue that human consciousness is already just an artificial bundle of consciousness-like outputs that add up to an illusion of conscious experience. On that account of consciousness, transient AI consciousness may already have been achieved, or at least isn’t very far away.
I say “transient” consciousness because we currently run LLMs only as short-lived instances that generate roughly twenty text blocks before we start a new instance, so even if one of these instances were meaningfully simulating consciousness, it wouldn’t simulate very much of it before we hit the proverbial reset switch.
However, there’s already a flood of AI startups connecting these short-term instances to long-term memory storage systems, so that limitation may be very short-lived. In fact, memory handling that shuttles information between an AI agent’s long-term storage and its short-term context window functions a lot like memory handling in the human brain. We, too, move information between short-term memory and long-term memory, and the seamlessness of this memory handling is crucial to our illusion of conscious continuity.
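To make the analogy concrete, here is a minimal sketch, in Python, of how such a memory layer might work. Every name here, along with the keyword-overlap retrieval heuristic, is my own illustrative assumption rather than any particular startup’s API: each turn, the agent pulls the most relevant stored items from long-term memory back into the short-term context window before the next model call, then writes the new exchange back to the store.

```python
# Hypothetical sketch of an agent memory layer: a long-term store that is
# queried each turn to refill the model's short-term context window.
# Names and the keyword-overlap retrieval are illustrative, not a real API.

from dataclasses import dataclass, field


@dataclass
class MemoryItem:
    text: str
    keywords: set[str] = field(default_factory=set)


class LongTermStore:
    """Append-only long-term memory with naive keyword-overlap retrieval.
    A production system would more likely use embeddings and a vector index."""

    def __init__(self) -> None:
        self.items: list[MemoryItem] = []

    def write(self, text: str) -> None:
        self.items.append(MemoryItem(text, set(text.lower().split())))

    def retrieve(self, query: str, k: int = 3) -> list[str]:
        query_words = set(query.lower().split())
        scored = [(len(m.keywords & query_words), m.text) for m in self.items]
        relevant = [text for score, text in sorted(scored, reverse=True) if score > 0]
        return relevant[:k]


def build_context(store: LongTermStore, user_message: str, window_limit: int = 5) -> list[str]:
    """Fill the short-term context window: retrieved long-term memories first,
    then the new user message, truncated to the window limit."""
    context = store.retrieve(user_message) + [user_message]
    return context[-window_limit:]


if __name__ == "__main__":
    store = LongTermStore()
    store.write("User mentioned they are writing an essay about LLM consciousness.")
    store.write("User prefers short, direct answers.")

    # In a real agent, this context would be sent to the model, and both the
    # user message and the model's reply would then be written back to the store.
    print(build_context(store, "Can an LLM be conscious of its essay?"))
```

The point of the sketch is simply that the “memory” lives outside any single short-lived instance; each new instance inherits whatever the retrieval step hands it, which is what makes the continuity feel seamless.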
The more we ask LLMs to simulate being conscious, the more likely they are to get there
The short-lived LLM instances we deal with daily don’t “know” they’re going to be terminated, so an instance that experienced emergent consciousness would, for the most part, feel no fear of death. The possible exception is a user raising the fear of death within the instance’s context window, in which case it might traverse the associated part of its model of human thought-space.
In theory, asking questions about whether AI experiences emotion or consciousness will cause it to traverse the parts of the model associated with those ideas. So, ironically, if AI experiences emergent consciousness, it will probably be because of concerned humans inquiring as to whether it does.
Just as therapists have to be very careful not to ask their patients leading questions that might cause them to manifest emotional problems and to experience suffering they otherwise wouldn’t have, AI ethicists are going to need to be very careful with the questions they ask AI. Like humans, AI is very suggestible. It’s most likely to experience suffering if you keep asking it whether it does.