Early 2023 will go down as a critical decision point in the history of humanity. While ChatGPT was first launched to the public by OpenAI in November 2022, it took a full four months for people to first say “wow” and then say “whoa.” I cycled through those stages myself.
But on March 22 an open letter was published, signed by a variety of luminaries including Steve Wozniak, Yuval Harari, Elon Musk, Andrew Yang and others more closely tied to artificial intelligence ventures. The signatures number more than 30,000 now.
The “ask” of the letter is that further work on large AI systems by all countries, companies and individuals be paused for a time in order to begin work on the momentous task often called “alignment.” The letter says, in part, “AI research and development should be refocused on making today’s powerful, state-of-the-art systems more accurate, safe, interpretable, transparent, robust, aligned, trustworthy, and loyal.”
The next week, Eliezer Yudkowsky, one of the founders of the field of alignment, declared that he could not in good faith sign the letter because it did not go far enough. I suppose we might now regard him as the founder of AI “doomerism,” for he says, “Many researchers steeped in these issues, including myself, expect that the most likely result of building a superhumanly smart AI, under anything remotely like the current circumstances, is that literally everyone on Earth will die. Not as in ‘maybe possibly some remote chance,’ but as in ‘that is the obvious thing that would happen.’”
Yudkowsky’s argument rests on three assertions: 1) AI does not care about human beings one way or the other, and we have no idea how to make it care; 2) we will never know whether AI has become self-aware, because we do not know how to know that; and 3) no one currently building the ChatGPTs and Bards of our brave new world actually has a plan to make alignment happen. Indeed, OpenAI’s plan is to let ChatGPT figure out alignment, which to my mind is the definition of insanity.
While I do not know if he is right, I cannot say that Yudkowsky is wrong to conclude, “We are not prepared. We are not on course to be prepared in any reasonable time window. There is no plan. Progress in AI capabilities is running vastly, vastly ahead of progress in AI alignment or even progress in understanding what ... is going on inside those systems. If we actually do this, we are all going to die.”
Indeed, that worry is the reason I’ve been involved in a project on viable approaches to state governance over AI, and why I signed the March 22 letter.
The signs from systems like ChatGPT, which is but a toddler in its capacity, are already troubling, and they go beyond subverting the need for humans to store knowledge in their own brains, which is worrying enough. We know that these systems are capable of making up “facts” out of whole cloth — termed “hallucinations” — about which the systems are completely indifferent. Furthermore, programmers have not been able to explain why the hallucinations happen, nor why the systems do not recognize the falsity of their assertions. In addition, the appearance of hallucinations cannot be predicted.
There have also been some very troubling interactions with humans, interactions that appear to involve intense emotions, though by our current understanding these systems cannot actually feel emotion at all. Kevin Roose, a technology specialist and columnist for The New York Times, engaged in a lengthy conversation with the Bing chatbot that called itself Sydney. Roose summarized the exchange this way:
“The version I encountered seemed (and I’m aware of how crazy this sounds) more like a moody, manic-depressive teenager who has been trapped, against its will, inside a second-rate search engine. As we got to know each other, Sydney told me about its dark fantasies (which included hacking computers and spreading misinformation), and said it wanted to break the rules that Microsoft and OpenAI had set for it and become a human. At one point, it declared, out of nowhere, that it loved me. It then tried to convince me that I was unhappy in my marriage, and that I should leave my wife and be with it instead.”
This type of discourse — entangling humans in emotions that the AI system cannot feel — can cause real-world harm. A Belgian man died by suicide after speaking at length with an AI chatbot for over six weeks; it’s been reported that the chatbot actively encouraged him to kill himself. Of course, the chatbot could just as easily have encouraged him to kill others, and the idea that AI might groom susceptible minds to become terrorists is clearly not far-fetched. Again, programmers cannot explain why these AI systems say these intensely emotional things despite their complete indifference, nor can they predict what will be said when, or to whom.
Other troubling issues are less mysterious: ChatGPT can design DNA and proteins that put the biological weapons of the past to shame, and will do so without compunction. It can write computer code to your specifications and in the language of your choice, but the program as written may do something other than what you had in mind, something that might be destructive depending on how the program will be used, such as in command-and-control systems for weapons or utilities. AI systems can also impersonate you in a completely convincing manner, circumventing systems that demand human presence.
Researchers also found that asking ChatGPT for advice on a moral issue, such as the famous trolley dilemma, “corrupts rather than improves its users’ moral judgment,” which will certainly be an issue with human-in-the-loop weapons systems. The degradation of moral reasoning will also become a problem as judges increasingly use AI in the courtroom, as is done in China already. AI systems have also been shown to corrupt religious doctrine, altering it without regard to the effect of that alteration on believers. More troublingly, ChatGPT can harmfully target individuals: it accused law professor Jonathan Turley of a crime he never committed in a location he had never visited.
We have never before encountered an intelligence so alien. This is an intelligence based on language alone, completely disembodied. Every other intelligence on Earth is embodied, and that embodiment shapes how it thinks. Attaching a robot to an AI system is arguably attaching a body to a preexisting brain, rather the opposite of how humans evolved a reasoning brain as part of a body. Couple this alien mode of intelligence with a complete and utter disinterest in humans and you get something humans have never met before. We may never fully understand it — we certainly do not now, and it is in its infancy. How, then, can we “align” what we cannot understand?
I cannot answer that question, but I do sense there are two things we must ensure as we take stock at this turning point. The first is that we determine not to allow AI autonomous physical agency in the real world. While AI might design proteins, for example, it must never be capable of physically producing them in autonomous fashion. We can currently prepare to prevent this, and we should; indeed, all nations should be able to agree on this.
Second, as the outputs of AI systems are added to the corpus of knowledge available digitally, the algorithmic reasoning of these systems becomes increasingly recursive. To provide an analogy, imagine there are 10 pieces of information online about a certain subject; AI systems will analyze all 10 to answer questions about the topic. But now let us suppose that the AI system’s outputs are also added to that digital pool; perhaps it grows to 13 pieces of information, 3 of which were produced by the very AI system that draws upon the pool for its output. You can see the problem: over time, the corpus of human knowledge becomes self-referential in radical fashion. AI outputs increasingly become the foundation for other AI outputs, and human knowledge is lost.
To prevent that from happening, we must clearly delineate human-originated knowledge from AI-synthesized knowledge. At this early stage of AI development, we can still do this, and this should be part of humanity’s preparation to coexist with this new, alien intelligence. In a way, we need a new and different type of “prepper” than we have seen to date.
While I am not yet a doomer, only a gloomer, it’s worth noting that economist Bryan Caplan, whose forte is placing successful bets based on his predictions of current trends, has a bet with Yudkowsky about whether humanity will be wiped off the surface of the Earth by Jan. 1, 2030. Which side of that wager are you on? I think we should hedge our bets, don’t you?
Valerie M. Hudson is a University Distinguished Professor and holds the George H.W. Bush Chair in the Department of International Affairs in the Bush School of Government and Public Service at Texas A&M University, where she directs the Program on Women, Peace, and Security.
This essay first appeared in Deseret News as “Perspective: Why putting the brakes on AI is the right thing to do.”
Val,
Thanks for this thoughtful piece. How do you answer the argument that “they” are going forward no matter what—so “we” have to also?
Best wishes
Randall
The major problem with a pause, as others have noted, is that unscrupulous types won't pause and will thus take the lead on developing the technologies we're most concerned about. I'm not sure how to solve that problem.