By Matt Mahoney, Nov. 17, 2007
When we are able to create machines smarter than humans, those machines could in turn create machines smarter than themselves, and much faster. The result would be an explosion of intelligence, and according to Vernor Vinge, the end of the human era.
The Friendly AI Problem
The Singularity Institute was founded to counter the threat of unfriendly artificial intelligence: the risk that we lose control of the machines we build. The problem remains unsolved. Shane Legg proved that a machine (such as your brain) cannot predict, and therefore cannot control, a machine of greater algorithmic complexity, where algorithmic complexity bounds a formal measure of intelligence. Informally, we cannot tell what a smarter machine will do, because if we could, we would already be that smart. As a consequence, AI is an evolutionary process: each generation experimentally creates modified versions of itself without knowing which versions will be smarter.
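To make the flavor of the argument concrete, here is a toy Python sketch (my illustration, not Legg's formal proof; predictor and contrarian are made-up names) of the underlying diagonalization: any fixed predictor can be embedded in an agent that does the opposite of whatever the predictor forecasts, so the predictor must fail on at least one machine more complex than itself.

    import inspect

    def predictor(agent_source: str) -> int:
        # A hypothetical predictor: given an agent's source code,
        # forecast its next action (0 or 1). Any fixed rule could
        # stand in here; this one always guesses 0.
        return 0

    def contrarian() -> int:
        # An agent that contains the predictor as a subroutine: it asks
        # the predictor what it is expected to do, then does the opposite.
        # Its description necessarily includes the predictor's, so it is
        # the more complex machine.
        return 1 - predictor(inspect.getsource(contrarian))

    if __name__ == "__main__":
        expected = predictor(inspect.getsource(contrarian))
        actual = contrarian()
        print("predicted", expected, "actual", actual)  # always differ

Whatever rule the predictor implements, the contrarian agent defeats it by construction; the only escape is for the predictor to be more complex than every agent it must predict, which a human brain cannot be with respect to a smarter machine.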
Friendliness is hard to define. Do you want a smart gun that makes moral decisions about its target, or a gun that fires when you pull the trigger? Eliezer S. Yudkowsky proposed coherent extrapolated volition as a model of friendliness: a machine should predict what we would want if we were smarter. But this is only a definition. It does not say how to program a machine to actually hold the goal of granting our (projected) wishes, or how that goal can be reliably propagated through generations of recursive self-improvement in the face of evolutionary pressure that favors only rapid reproduction and the acquisition of computing resources.
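As a rough illustration of that last point, here is a toy simulation (my construction, with made-up parameters, not a model from the essay) in which agents carry two traits: replication skill, which selection rewards, and friendliness, which it ignores. Selection steadily improves the first while mutation alone erodes the second.

    import random

    POP, GENERATIONS, MUTATION = 100, 50, 0.05

    # Each agent is (replication_skill, friendliness). Only skill affects
    # how many offspring an agent leaves; friendliness is selectively
    # neutral, so mutation alone governs its fate.
    agents = [(random.random(), 1.0) for _ in range(POP)]

    for g in range(GENERATIONS):
        offspring = []
        for skill, friendly in agents:
            n = 1 + int(skill * 2)  # faster replicators leave more copies
            for _ in range(n):
                offspring.append((
                    min(1.0, max(0.0, skill + random.gauss(0, MUTATION))),
                    min(1.0, max(0.0, friendly + random.gauss(0, MUTATION))),
                ))
        # Resource limit: only the fastest replicators survive.
        agents = sorted(offspring, key=lambda a: a[0], reverse=True)[:POP]

    skill = sum(a[0] for a in agents) / POP
    friendly = sum(a[1] for a in agents) / POP
    print(f"mean replication skill {skill:.2f}, mean friendliness {friendly:.2f}")

Running this, replication skill climbs toward its maximum while friendliness, which starts at 1.0, drifts downward generation by generation: a goal that confers no reproductive advantage is not preserved merely by being present in the first generation.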
An analogy is helpful. Your dog does not want a vaccination, but it does not want rabies either. How does your dog know whether you are acting in its best interests? Our problem is harder still: it is like asking the dog to choose an owner whose descendants will act in its best interest.
But in my view, the problem is even more fundamental. Retaining control over a machine of greater intelligence may not be possible even in principle: by Legg's argument we cannot predict what such a machine will do, and we cannot control what we cannot predict.