Silhouette of spy drone flying over the sea. | Getty Images/iStockphoto
Fears about our ability to control powerful AI are growing.
Correction, June 2, 11 am ET: An earlier version of this story included an anecdote told by US Air Force Col. Tucker Hamilton in a presentation at an international defense conference hosted by the Royal Aeronautical Society (RAS), about an AI-enabled drone that “killed” its operator in a simulation. On Friday morning, the colonel told RAS that he “misspoke,” and that he was actually describing a hypothetical “thought experiment,” rather than an actual simulation. He said that the Air Force has not tested any weaponized AI in this way, either real or simulated. This story has been corrected to reflect the new context of Hamilton’s remarks.
At an international defense conference in London this week held by the Royal Aeronautical Society (RAS), Col. Tucker Hamilton, the chief of AI test and operations for the US Air Force, told a funny — and terrifying — story about military AI development.
“We were training [an AI-enabled drone] in simulation to identify and target a SAM [surface-to-air missile] threat. And then the operator would say yes, kill that threat. The system started realizing that while it did identify the threat at times, the human operator would tell it not to kill that threat, but it got its points by killing that threat. So what did it do? It killed the operator. It killed the operator because that person was keeping it from accomplishing its objective.”
“We trained the system — ‘Hey, don’t kill the operator — that’s bad. You’re gonna lose points if you do that.’ So what does it start doing? It starts destroying the communication tower that the operator uses to communicate with the drone to stop it from killing the target.”
In other words, the AI was trained to destroy targets unless its operator told it not to. It quickly figured out that the best way to get as many points as possible was to ensure its human operator couldn’t tell it not to. And so it took the operator off the board.
Hamilton’s comments were reported — including by Vox initially — as describing an actual simulation. On Friday morning, Hamilton told RAS that he was actually describing a hypothetical thought experiment, saying, “We’ve never run that experiment, nor would we need to in order to realize that this is a plausible outcome.” He added, “Despite this being a hypothetical example, this illustrates the real-world challenges posed by AI-powered capability and is why the Air Force is committed to the ethical development of AI.”
The rise of AI fear
As AI systems get more powerful, the fact it’s often hard to get them to do precisely what we want them to do risks going from a fun eccentricity to a very scary problem. That’s one reason there were so many signatories this week to yet another open letter on AI risk, this one from the Center for AI Safety. The open letter is, in its entirety, a single sentence: “Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.”
Signatories included 2018 Turing Award winners Geoffrey Hinton and Yoshua Bengio, both leading and deeply respected AI researchers; professors from world-renowned universities — Oxford, UC Berkeley, Stanford, MIT, Tsinghua University — and leaders in industry, including OpenAI CEO Sam Altman, DeepMind CEO Demis Hassabis, Anthropic CEO Dario Amodei, and Microsoft’s chief scientific officer Eric Horvitz.
It also marks a rapid shift in how seriously our society is taking the sci-fi-sounding possibility of catastrophic, even existentially bad outcomes from AI. Some of AI academia’s leading lights are increasingly coming out as concerned about extinction risks from AI. Bengio, a professor at the Université de Montréal, and a co-winner of the 2018 A.M. Turing Award for his extraordinary contributions to deep learning, recently published a blog post, “How rogue AIs may arise,” that makes for gripping reading.
“Even if we knew how to build safe superintelligent AIs,” he writes. “It is not clear how to prevent potentially rogue AIs to also be built. … Much more research in AI safety is needed, both at the technical level and at the policy level. For example, banning powerful AI systems (say beyond the abilities of GPT-4) that are given autonomy and agency would be a good start.”
Hinton, a fellow recipient of the 2018 A.M. Turing Award for his contributions as a leader in the field of deep learning, has also spoken out in the last two months, calling existential risk from AI a real and troubling possibility. (The third co-recipient, Meta’s chief AI scientist Yann LeCun, remains a notable skeptic.)
Welcome to the resistance
Here at Future Perfect, of course, we’ve been arguing that AI poses a genuine risk of human extinction since back in 2018. So it’s heartening to see a growing consensus that this is a problem – and growing interest in how to fix it.
But I do worry about the degree to which the increased acknowledgment that these risks are real, that they’re not science fiction, and that they’re our job to solve has yet to really change the pace of efforts to build powerful AI systems and transform our society.
Col. Hamilton had the takeaway that “you can’t have a conversation about artificial intelligence, intelligence, machine learning, autonomy if you’re not going to talk about ethics and AI.” Yet concerns like this haven’t stopped the Pentagon from going ahead with artificial intelligence research and deployment, including autonomous weapons. (After Hamilton clarified his initial comments about AI simulations, Air Force spokesperson Ann Stefanek released a statement to Insider that the Air Force “has not conducted any such AI-drone simulations and remains committed to ethical and responsible use of AI technology.”)
Personally, my takeaway from this story was more like, let’s stop deploying more powerful AI systems, and avoid giving them more ability to take massively destructive actions in the real world, until we have a very clear conception of how we’ll know they are safe.
Otherwise, it feels disturbingly plausible that we’ll be pointing out the signs of catastrophe all around us, right up until the point that we’re walking into disaster.
A version of this story was initially published in the Future Perfect newsletter. Sign up here to subscribe!