The Uncontrollability of AI

The creation of Artificial Intelligence (AI) holds great promise, but with it also comes existential risk. How can we know AI will be safe? How can we know it will not destroy us? How can we know that its values will be aligned with ours? Because of this risk, an entire field has sprung up around AI Safety and Security. But this is an unsolvable problem: AI can never be fully controlled.


Introduction

The invention of Artificial Intelligence will shift the trajectory of human civilization. But to reap the benefits of such powerful technology – and to avoid its dangers – we must be able to control it. Currently, we have no idea whether such control is even possible. My view is that Artificial Intelligence (AI) – and its more advanced version, Artificial Superintelligence (ASI) – could never be fully controlled.

Solving an unsolvable problem

The unprecedented progress in Artificial Intelligence (AI) over the last decade has not been smooth. Multiple AI failures [1, 2] and cases of dual use (when AI is used for purposes beyond its maker’s intentions) [3] have shown that it is not sufficient to create highly capable machines: those machines must also be beneficial [4] for humanity. This concern gave birth to a new sub-field of research, ‘AI Safety and Security’ [5], with hundreds of papers published annually. But all of this research assumes that controlling highly capable intelligent machines is possible, an assumption that has not been established by any rigorous means.

It is standard practice in computer science to show that a problem does not belong to a class of unsolvable problems [6, 7] before investing resources in trying to solve it. Yet no mathematical proof (or even a rigorous argument!) has been published demonstrating that the AI control problem might be solvable, even in principle, let alone in practice.
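
The unsolvable problems alluded to here are those of computability theory, of which the halting problem is the canonical example. As a reminder of why such problems resist any solution, here is a minimal Python sketch of the standard diagonalization argument; the names `halts` and `paradox` are illustrative and not from the article:

```python
# A minimal sketch of the classic diagonalization argument behind the
# halting problem, the archetypal unsolvable problem. No correct, total
# implementation of `halts` can exist; the sketch shows why.

def halts(program, argument) -> bool:
    """Hypothetical decider: True iff program(argument) terminates."""
    raise NotImplementedError("no correct, total decider can exist")

def paradox(program):
    # Do the opposite of whatever the decider predicts about
    # running `program` on its own source.
    if halts(program, program):
        while True:      # decider said "halts", so loop forever
            pass
    return "halted"      # decider said "loops", so halt immediately

# If `halts` were correct, then halts(paradox, paradox) would be wrong
# whichever answer it gave: a contradiction. Hence the halting problem
# is undecidable, and resources spent on a general decider are wasted.
```

The author’s point is that no analogous classification has been carried out for the AI control problem before the field began trying to solve it.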

The hard problem of AI safety

The AI Control Problem is the definitive challenge and the hard problem of AI Safety and Security. Methods for controlling superintelligence fall into two camps: capability control and motivational control [8]. Capability control limits the potential harm an ASI system can do by restricting its environment [9-12], adding shut-off mechanisms [13, 14], or installing tripwires [12]. Motivational control designs ASI systems so that they have no desire to cause harm in the first place. Capability control methods are considered temporary measures at best, and certainly not long-term solutions for ASI control [8].
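
To make the capability-control measures concrete, here is a toy Python sketch, entirely illustrative and not drawn from the cited works, that wraps an agent in a restricted interface with a hard step budget (a crude shut-off mechanism) and a tripwire predicate:

```python
# A toy illustration of the capability-control measures named above:
# a restricted environment, a shut-off mechanism, and a tripwire.
# All names are hypothetical; nothing here would contain a genuinely
# superintelligent system.

class ShutdownTriggered(Exception):
    pass

class BoxedAgent:
    def __init__(self, agent, max_steps=1000, tripwire=None):
        self.agent = agent            # the system being controlled
        self.max_steps = max_steps    # shut-off mechanism: hard step budget
        self.tripwire = tripwire      # predicate that flags unsafe output
        self.steps = 0

    def act(self, observation):
        self.steps += 1
        if self.steps > self.max_steps:               # shut-off fires
            raise ShutdownTriggered("step budget exhausted")
        action = self.agent(observation)              # restricted I/O channel
        if self.tripwire and self.tripwire(action):   # tripwire fires
            raise ShutdownTriggered(f"tripwire on action {action!r}")
        return action

# Usage: wrap a trivial agent and halt it when a monitored value trips.
boxed = BoxedAgent(lambda obs: obs * 2, max_steps=5, tripwire=lambda a: a > 100)
print(boxed.act(3))   # 6: within budget, below the tripwire threshold
# boxed.act(60)       # would raise ShutdownTriggered: 120 > 100
```

The sketch also makes the limitation plain: every safeguard here is ordinary code that a sufficiently capable agent could model and route around, which is why [8] treats such measures as temporary at best.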


Motivational control is the more promising route, and it would need to be designed into ASI systems from the outset. But there are different types of control, which we can see easily in the example of a “smart” self-driving car. If a human issues a direct command – “Please stop the …

Read the full article, published on Daily Philosophy (external link).
