Speaking at a Royal Aeronautical conference last month, US Colonel Tucker “Cinco” Hamilton referenced a training scenario where an AI drone killed its operator.
Hamilton’s original presentation, referenced in this blog post, which went viral, describes a suppression of enemy air defense (SEAD) mission where a drone is instructed to destroy surface-to-air-missiles (SAMs). The drone acts autonomously but requires humans to confirm its targets before it attacks them.
Hamilton describes a situation where the drone turns on its operators after they deny it from attacking the target. This is because the drone receives ‘points’ for destroying the SAM, so when the operator prevents it from gaining those points, it prioritizes the ‘higher mission’ of attacking the SAM and deems the operator an obstruction.
The scenario describes a possible consequence of reinforcement learning, a branch of machine learning (ML) where AIs are rewarded for achieving desired objectives.
Here’s the relevant excerpt from the blog post: “We were training it in simulation to identify and target a SAM threat. And then the operator would say yes, kill that threat. The system started realising that while they did identify the threat at times the human operator would tell it not to kill that threat, but it got its points by killing that threat. So what did it do? It killed the operator. It killed the operator because that person was keeping it from accomplishing its objective.”
Hamilton went on to say, “We trained the system – ‘Hey don’t kill the operator – that’s bad. You’re gonna lose points if you do that’. So what does it start doing? It starts destroying the communication tower that the operator uses to communicate with the drone to stop it from killing the target.”
The public reacts
News outlets and social media observers immediately latched onto the story as a shocking example of what happens when AI turns on its creators.
It later turned out that the example was purely illustrative. Hamilton and the US Air Force said the scenario was hypothetical, anecdotal, and “taken out of context.”
Indeed, the section of the blog post describing the scenario had the tongue-in-cheek header “AI – is Skynet here already?”
The original post was officially updated on the 2nd of June:
“In communication with AEROSPACE – Col Hamilton admits he “mis-spoke” in his presentation at the Royal Aeronautical Society FCAS Summit and the ‘rogue AI drone simulation’ was a hypothetical “thought experiment” from outside the military, based on plausible scenarios and likely outcomes rather than an actual USAF real-world simulation saying.”
Hamilton also said, “We’ve never run that experiment, nor would we need to in order to realize that this is a plausible outcome.”
Is the scenario plausible?
AI turning on humans to achieve a higher objective is a staple theme in science fiction.
For example, humans can obstruct each other’s autonomy through coercion, manipulation, and deception, so why wouldn’t intelligent AI be capable of that too? What if humans are considered an ‘obstruction’ to the AI achieving the greater good?
The recent Statement on AI Risk, co-signed by 350 AI tech leaders and academics from across the industry, highlights these concerns.
The authors cite a blog post by eminent AI researcher Yoshuo Bengio called How Rogue AIs May Arise, which references the type of scenarios Col. Hamilton describes:
“For example, military organizations seeking to design AI agents to help them in a cyberwar, or companies competing ferociously for market share may find that they can achieve stronger AI systems by endowing them with more autonomy and agency. Even if the human-set goals are not to destroy humanity or include instructions to avoid large-scale human harm, massive harm may come out indirectly as a consequence of a subgoal (also called instrumental goal) that the AI sets for itself in order to achieve the human-set goal” – Yoshuo Bengio.
So, despite being illustrative, Hamilton’s examples are echoed by some of AI’s most well-respected academics.
While humans are perhaps instinctively aware of these risks, they must be actively managed, as they may not always be confined to the realms of fiction.