We Need an AI Limitation Treaty. Now.

By Shaun Tan

Founder, Editor-in-Chief, and Staff Writer

28/2/2021

For most, the threat of artificial intelligence seems like science fiction, the stuff of movies like I, Robot, The Matrix, and The Terminator.

But the threat it poses is real. Prominent computer scientists have warned of it for years, and recently some of the smartest people on the planet have taken up the call. Bill Gates considers AI more dangerous than a nuclear catastrophe, Elon Musk said it was probably humanity’s “biggest existential threat,” Steven Hawking said it could “spell the end of the human race.”

We should start by defining what’s meant by the term “AI.” AI, in a sense, is already here. It’s in online search engines, the computer opponents in video games, the spam filter in our emails, and the Siri assistant in our iPhones.

All of these are examples of artificial narrow intelligence (ANI) – AI that’s only capable of a few specific tasks. Well-designed ANIs can match or surpass humans at particular tasks, but, unlike humans, they can’t be applied to much else. Google’s AlphaGo may be able to beat any human at Go, but that’s all it can do. Such AIs are useful, and don’t seem to pose an existential threat.

It’s at the level of artificial general intelligence (AGI) when things get dangerous. An AGI would be as smart as a human across the board. Unlike an ANI, an AGI could be applied to anything. No one’s been able to develop one yet, but in theory, an AGI would be able to match a human at any task, and, naturally, would also be able to do things like perform complicated calculations effortlessly, make countless copies of itself in seconds, and transmit itself across the world instantaneously.

An artificial superintelligence (ASI) would be something else entirely. It would be smarter than humans across the board, and the extent to which it’s smarter may be beyond our reckoning.

***

In his great article “The AI Revolution: The Road to Superintelligence” in Wait But Why, Tim Urban explained why growth in AI cognitive power is likely to take us by surprise.

Humans tend to think that the difference in intelligence between the smartest human and the dumbest human is large, that is, to use Oxford philosopher Nick Bostrom’s example, that someone like Albert Einstein is much smarter than the village idiot. On the grand scale of intelligence including non-human animals, however, this difference is miniscule. The difference between the intelligence of a human and that of a chimpanzee is many, many times larger than the difference between the intelligence of Einstein and that of the village idiot. The difference between the intelligence of a chimpanzee and that of a mouse is larger still.

This means that whilst it may take years or decades to get an AI to chimpanzee-level intelligence, for example, once that level is reached the transition to general human-level intelligence (AGI) will be much faster, resulting in what some have termed an “intelligence explosion.”

Furthermore, we should factor-in recursive self-improvement, a popular idea amongst AI researchers for boosting intelligence. An AI capable of recursive self-improvement would be able to find ways to make itself smarter; once it’s done that, it’ll be able to find even more ways to make itself smarter still, thereby bootstrapping its own intelligence. Such an AI would independently and exponentially increase in cognitive power.

An AI capable of recursive self-improvement would be able to find ways to make itself smarter, bootstrapping its own intelligence.

An AI approaching general human-level intelligence, therefore, would pick up speed, and, far from stopping at Humanville Station, as Bostrom puts it, would whoosh past it. An AI capable of recursive self-improvement that had attained village idiot intelligence level in the morning might hit Einstein-level by the afternoon. By evening, it could have reached a level of intelligence far beyond any human. AI researchers, celebrating their success at creating an AGI, might find themselves faced with a superintelligence before they’d even finished the champagne.

A superintelligence could be smarter than humans in the same way that humans are smarter than chimpanzees. We wouldn’t even be able to comprehend an entity like that. We think of an IQ of 70 as dumb and an IQ of 130 as smart, but we have no idea what an IQ of 10,000 would be like, or what a being with that cognitive capacity would be capable of. Its power, for us anyway, would be incalculable: many things we deem impossible or fantastical would be child’s play for it. Curing all disease would be as easy for it as popping a pill, interstellar travel as easy as stepping from room to room, and extinguishing all life on earth as easy as snuffing out a candle.

Picture Credit: https://writix.co.uk

The only term we have that comes close to describing something like that is God, and, as Urban ominously puts it, the question we should ask then is: Will it be a nice God?

***

Some computer scientists seem confident that we can make an AGI or a superintelligence be “nice,” that taming the god we created is a matter of programming.

Programming an AI of human intelligence or above will likely be a daunting task. Who knows what it might do without being given specific goals or values, and, even if it is, its actions might still be unpredictable. Nick Bostrom, who is also the founding director of the Future of Humanity Institute at the University of Oxford, gives the example of an AI being tasked with the seemingly boring and innocuous goal of making as many paperclips as possible. At some point, it may decide that in order to maximize the number of paperclips it should prevent humans from reprogramming it or switching it off, upon which it kills all the humans so it can continue making endless amounts of paperclips unimpeded.

Nick Bostrom (Picture Credit: Future of Humanity Institute)

Note, of course, that in that scenario the AI wouldn’t exterminate humans because of any malice it had towards them (no more than we hate bacteria when we take antibiotics), but because they don’t matter to it. Likewise, when Google’s DeepMind AI program grew increasingly aggressive as it got smarter, and was more likely to attack opponents with lasers in simulated games, it wasn’t because of any malice towards those opponents; it was just because that strategy maximized its chances of winning.

In order to prevent something like that from happening, some have suggested programming AIs with goals specifically beneficial to humans. Such attempts, however, can also lead to unexpected results.

For example, an AI programmed to “make people happy” might realize that the most efficient way to do this is to capture humans, implant electrodes into their brains and stimulate their pleasure centers.

Likewise, an AI programmed with Isaac Asimov’s Three Laws of Robotics

1) A robot may not injure a human being or, through inaction, allow a human being to come to harm

2) A robot must obey the orders given it by human beings except where such orders would conflict with the First Law

3) A robot must protect its own existence as long as such protection does not conflict with the First or Second Laws

might decide that, since humans are constantly harming each other, the best way to obey these laws would be to gently imprison all of them.

Another suggestion is to upload a pre-existing set of values into an AI – utilitarianism, say, or liberal democracy. But even assuming people could agree on which philosophy to go with, it’s hard enough to imbue humans with human values as it is. There’s no telling how a superintelligence might interpret it, or the contradictions within it.

There’s no reliable way to ensure a superintelligence’s goals or values accord with our own. A single careless assumption or oversight or ambiguity could lead to results no one expected or intended.

***

Others have suggested building safeguards around the AGI or superintelligence. They’ve mooted measures of varying degrees of complexity, from denying it access to the internet, to restricting its contact with the outside world, to trapping it in a series of concentric virtual worlds. None of these safeguards inspire confidence.

First, as Roman V. Yampolskiy, Associate Professor of Computer Engineering and Computer Science at the University of Louisville, noted, every security measure ever invented has eventually been circumvented.

“Signatures have been faked, locks have been picked, supermax prisons had escapes, guarded leaders have been assassinated, bank vaults have been cleaned out, laws have been bypassed […] passwords have been brute-forced, networks have been penetrated, computers have been hacked, biometric systems have been spoofed, credit cards have been cloned, cryptocurrencies have been double spent […] CAPTCHAs have been cracked, cryptographic protocols have been broken,” he wrote. “Millennia long history of humanity contains millions of examples of attempts to develop technological and logistical solutions to increase safety and security, yet not a single example exists which has not eventually failed.”

Any safeguards would eventually be circumvented either by human hackers, or acts of nature (think the tsunami that caused the radiation leak at the Fukushima nuclear reactor). Whilst a certain failure rate may be acceptable in an enterprise where the stakes are lower, it’s unacceptable where a single leak might be all the AI needs to end humanity’s dominance.

Then, there’s the likelihood that any safeguards would be circumvented by the AI itself. Indeed, any security measures our best computer scientists could devise would be laughable to a superintelligence, which, by definition, would be many times smarter than any human.

Any security measures our best computer scientists could devise would be laughable to a superintelligence.

Imagine a human being held captive by chimpanzees. Suppose that these are unusually intelligent chimpanzees that use state-of-the-art monkey technology to keep the human prisoner – perhaps they manage to construct a rudimentary cage out of sticks. Is there any doubt that the human wouldn’t eventually escape in ways the chimpanzees couldn’t possibly think of? Perhaps he’d dig a hole under the cage, or fashion tools out of nearby objects to help him, or remove the bars of the cage and use them as weapons, or make a fire that burns down a portion of the cage. One way or another, it would only be a matter of time before he found a way free.

A superintelligence would be smarter than humans in a similar fashion. In his article “Leakproofing the Singularity: Artificial Intelligence Confinement Problem,” Yampolskiy suggested that a superintelligence could easily manipulate a human guard into letting it escape. It could target a guard’s weaknesses, offering him power or immortality, or promising a cure for a loved-one with a terminal disease.

It could also find a bug in the system and exploit it (something even human hackers do all the time). Or pretend to malfunction, and then escape when its jailors lower safeguards to investigate. Or it could escape in ways humans aren’t even aware are possible. Insulated from the outside world, Bostrom suggested, it might find a way to generate radio waves by shuffling the electrons in its circuitry in particular patterns. Of course, these are just the methods our puny human brains can imagine – an entity thousands of times smarter would be able to come up with a lot more. Effective safeguards are built around power – they’re not possible against a being that’s smarter, and therefore more powerful, than us. Thinking we could contain something like that would be hubris.

At a talk at MIT, Elon Musk compared developing AI to summoning a demon. “You know all the stories where there’s a guy with the pentagram and the holy water and he’s like, yeah, he’s sure he can control the demon? Doesn’t work out.”

How do you cage a god? The short answer to that question is “You can’t.”

***

The development of AGI and superintelligence may be approaching. The median realistic year leading computer scientists predict it to happen by is 2040. While this might seem far off, we need to start preparing for it now.

“If a superior alien civilization sent us a text message saying, “We’ll arrive in a few decades,” would we just reply, “Ok, call us when you get here – we’ll leave the lights on?”” asked Stephen Hawking in an article co-written with Stuart Russell of the University of Berkeley and Max Tegmark and Frank Wilczek of MIT. “Probably not – but this is more or less what is happening with AI.”

Stephen Hawking (Picture Credit: Doug Wheller)

AI is a technology no major power can afford to ignore if it wants to advance in the 21^st century. The US and China in particular are pouring vast resources into AI research in both the public and private sectors in hopes of achieving the next breakthrough.

At the same time however, AI presents a real existential threat to humanity. All other existential threats, from global warming to weapons of mass destruction, have some sort of treaty in place to manage the associated risks. It’s time we had one for AI too.

All other existential threats, from global warming to weapons of mass destruction, have some sort of treaty in place to manage the associated risks. It’s time we had one for AI too.

It’s vital we work on establishing an international framework now, in what are relatively early days, before the AI industry develops too far, before we become too used to its benefits, before associated vested interests and lobby groups gain too much power. The difficulties in addressing the global warming crisis show the tendency of humans to inertia, even when faced with existential threat. “[T]he human race might easily permit itself to drift into a position of such dependence on the machines that it would have no practical choice but to accept all of the machines’ decisions,” wrote Bill Joy, co-founder of Sun Microsystems, in his essay “Why the Future Doesn’t Need Us.” At that point, he warned, “People won’t be able to just turn the machines off, because they will be so dependent on them that turning them off would amount to suicide.”

When I put the idea of an AI limitation treaty to top computer scientists, many were skeptical, some even fatalistic.

“A machine that is ‘smarter than humans across the board’ would be worth something comparable to world GDP, approximately $100 trillion,” said Russell. “It’s not going to be easy to stop people building that.”

“[U]nlike [with] nuclear weapons,” said Steve Omohundro, formerly professor of computer science at the University of Illinois at Champaign-Urbana, and now President of Self-Aware Systems, a think tank promoting the safe uses of AI, “it is not easy to verify compliance with any [AI] agreement given today’s technologies.”

Yet an effort must be made. The growing field of AI offers vast potential, both for human flourishing, and its extinction. We have no excuse for not trying to stave off the latter.

There seem to be a few conclusions that can be drawn:

1) A superintelligence cannot be tamed or caged.

2) An AGI capable of recursive self-improvement would soon become a superintelligence.

3) Even without recursive self-improvement, an AGI might pose an existential threat simply because in addition to being able to perform any task at a human level, it would also be able to do things only computers can do.

The line, if one is to be drawn in an AI limitation treaty, then, should be at the AGI level – no one should be allowed to develop an AI that’s as smart as or smarter than a human across the board, nor one that could independently become so. Research into ANI – better versions of the AI we use today – can continue unimpeded. The important difference is domain specificity – an ANI cannot be used for problems beyond a narrow scope, whilst an AGI can be used for anything. “A system is domain specific if it cannot be switched to a different domain without significant redesigning effort,” explained Yampolskiy. “Deep Blue [IBM’s chess AI] cannot be used to sort mail. Watson [IBM’s Jeopardy! AI] cannot drive cars. An AGI (by definition) would be capable of switching domains.”

No one should be allowed to develop an AI that’s as smart as or smarter than a human across the board, nor one that could independently become so.

What might such a treaty based on this look like?

An international AI control framework could contain some of the same elements as control frameworks for weapons of mass destruction:

Commitments not to pursue that kind of technology, or to abet anyone in pursuing such technology, or to allow anyone to do so.
An information and technology-sharing channel between signatories who abide by the provisions.
An international organization to monitor developments.
An inspections regime to catch cheaters.
Recourse to the UN Security Council for punishment of anyone who breaches these rules.
A mechanism to remove and dispose of any forbidden material.

The commitments and information and technology sharing are self-explanatory enough. Suffice to say that states would have to commit not just to eschewing research that may result in AGI themselves, they will also have to commit to ensuring private entities within their borders do so.

This will obviously be difficult. The fruits of AGI research are likely lucrative, and corporations, in particular, have great incentives to pursue it, even illegally.

James Barrat, author of Our Final Invention: Artificial Intelligence and the End of the Human Era, points to many instances of irresponsible corporate behavior driven by greed.

“Corporations behave like psychopaths turned loose on society,” he told me. “I’m thinking of Union Carbide (Bhopal), Ford (the exploding Pinto), Enron (causing rolling blackouts in California). Facebook, Google, IBM, [and] Baidu are no more upright than these corporations. I don’t expect them […] to temper innovation with stewardship.”

States will have to commit to strict monitoring of AI research domestically, and to imposing penalties for any research that could lead to AGI that are harsh enough to outweigh any potential benefits.

When it comes to the monitoring of AI developments, this can be successfully done to an extent.

“Although several authors make the point that AGI is much easier to develop unnoticed than something like nuclear weapons,” wrote Yampolskiy and Kaj Sotala of the Machine Intelligence Research Institute, “cutting-edge high-tech research does tend to require major investments which might plausibly be detected even by less elaborate surveillance efforts.”

“[I]t would not be too difficult to identify capable individuals with a serious long-standing interest in artificial general intelligence research,” wrote Bostrom in Superintelligence: Paths, Dangers, Strategies. “Such individuals usually leave visible trails. They may have published academic papers, posted on internet forums, or earned degrees from leading computer science departments. They may also have had communications with other AI researchers, allowing them to be identified by mapping the social graph.”

Thus, researchers working on projects that may result in an AGI can be monitored. Perhaps an international agency can be established to promote safe AI practices and to carry out inspections, similar to what the International Atomic Energy Agency does for nuclear material.

The International Atomic Energy Agency’s headquarters in Vienna (Picture Credit: IAEA Imagebank)

The specifics would of course have to be decided by experts. As G. S. Wilson, Deputy Director of the Global Catastrophic Risk Institute, proposed, a body of experts could determine what constitutes a “reasonable level of concern” involving AGI or other possibly dangerous research.

Such a treaty would of course raise concerns that it’s stifling innovation. These concerns are justified. AI innovations would be significantly constrained by these measures, innovations that could improve knowledge, save lives, raise our standard of living to an unprecedented degree. Yet, the very real risk of human extinction makes it wiser to forfeit some of these benefits.

The shortcomings of such a treaty are obvious.

Will some clandestine AGI-related research elude even the most vigilant watchdogs? Yes, in the same way that a terrorist somewhere could probably build a dirty nuclear bomb without the authorities’ knowledge. But that doesn’t mean nuclear control treaties aren’t worthwhile.

Will some countries cheat? Certainly, and any treaty is only as good as its enforcement.

A loophole also lies in the thin distinction between AGI and ANI – an ANI can only perform a few tasks (How many are “a few”?), an ANI cannot be reconfigured to different tasks without significant redesigning (What counts as “significant?”).

Most of all, there’s the difficulty of getting states to sign on to such a treaty. But if the leaders in the AI race – America, China, Japan – push for it, others will follow.

During the Cold War, the world lived under an existential threat for decades. That threat however prompted leading powers to create treaties to minimize the risks WMDs posed. Notably, the US and the Soviet Union chose to end their biological weapons programs, because, unlike nuclear material and chemicals, and like AI, viruses and bacteria are extremely unpredictable, capable of growing and evolving into stronger and more virulent strains.

The world now faces a new existential risk in the form of AI. No framework can remove that risk entirely, but if it can significantly minimize it then that’s more than enough reason to forge one. The future is coming, and it waits for no one.

This story originally appeared in China-US Focus at the following link.

FICTION | BETA

We Need an AI Limitation Treaty. Now.

By Shaun Tan

Follow Us

This website uses cookies to give you the best experience.