Notes on the Turing Test
Turing begins his paper “Computing Machinery and Intelligence” (1950):
I propose to consider the question, “Can machines think?”
Wittgenstein, who interacted with Turing at Cambridge in 1939, had considered this very question as early as 1933 in his Blue Book. Wittgenstein compares it to the problem of solipsism: what should we say to the solipsist who claims, “No one else feels; I am the only one who really feels”? How do we know whether someone else really feels?
Wittgenstein breaks the solipsist’s statement apart and interprets it in two ways. On the first reading, to “really feel” something just means that only the speaker “I” feels it, so the sentence collapses into a worthless tautology: “Only I feel what only I feel”. On the second reading, what matters is less what “really feeling” is and more the real-life actions to which the words correspond, something like what Wittgenstein calls “forms of life”. For example, when someone says “I feel pain”, the utterance may correspond to sounds of moaning, or to the action of going to the doctor.
So, returning to the question of machines: what matters is less whether the machine “really thinks”, whatever that means, and more whether the machine can do real-life things that correspond to what we would describe with the word “thinking”.
Like Wittgenstein, Turing evades the question of defining “thought”. Instead, he proposes a test based on the “Imitation Game”, in which a man and a woman are hidden behind closed doors and an interrogator asks them questions to try to determine which is the man and which is the woman. In Turing’s adapted test, the interrogator instead aims to determine whether a hidden subject is a person or a machine. If the machine convinces the interrogator that it is a person, then, Turing says, the machine can be said to “think”. (We might speculate that Turing was drawing an analogy to his own experience as a closeted gay man who needed to pass as straight during a homophobic era.)
Since Turing’s paper, we’ve periodically seen new machines, models, and algorithms claimed to pass the Turing Test in one way or another. But as Jaron Lanier argues in You Are Not a Gadget:
The Turing test cuts both ways. You can’t tell if a machine has gotten smarter or if you’ve just lowered your own standards of intelligence to such a degree that the machine seems smart. If you can have a conversation with a simulated person presented by an AI program, can you tell how far you’ve let your sense of personhood degrade in order to make the illusion work for you?
People degrade themselves in order to make machines seem smart all the time. Before the crash, bankers believed in supposedly intelligent algorithms that could calculate credit risks before making bad loans. We ask teachers to teach to standardized tests so a student will look good to an algorithm. We have repeatedly demonstrated our species’ bottomless ability to lower our standards to make information technology look good. Every instance of intelligence in a machine is ambiguous.
As Wittgenstein did in 1933, Lanier transforms a question about machines into a question about ourselves. If we think machines are “intelligent”, then what does that tell us about what we consider to be “intelligence”?
Our daily lives involve constant engagement with myriad forms of life. Neural network techniques can mimic many simple human activities: filling out forms, processing payments, recommending movies, writing, and recognizing images. At least, this is true when the requisite data exists.
But what of activities that combine simple forms of life? Neural networks have difficulty combining different domains for a simple combinatorial reason: naively, the amount of data required multiplies with each additional domain, which makes that data correspondingly harder to acquire and scale. There remains a missing “human” element in AI as it currently exists, namely the underlying principles and intuitions that let us reason about and synthesize information from diverse domains. We can train an AI to make jokes in isolation, but a skilled comedian can make jokes about anything, even something as mundane as payments processing.
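To make the combinatorics concrete, here is a rough back-of-the-envelope sketch; the numbers are purely illustrative assumptions, not measurements from any study. Suppose covering the variation within a single domain requires on the order of $N$ training examples. Naively covering the joint behavior of $k$ domains means sampling their product space, so the requirement grows roughly as

$$N_{\text{total}} \approx N^{k}.$$

With a modest $N = 10^4$ per domain, two combined domains already call for on the order of $10^8$ examples and three for $10^{12}$; each additional domain multiplies the requirement by another factor of $N$.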
Note that AI jokes tend to be funny in a non-sequitur sort of way; they lack the deliberate construction and synthesis of human jokes. If they contain any logic, it is only implicit logic that we, as human readers, project onto them.
This isn’t to say that computers can’t eventually become “intelligent”, or that they will never pass a “real” Turing Test. Nor is it to say that intelligent computers are something we should fear or try to prevent. Rather, it is to say that actual human intelligence entails the ability to constantly navigate countless novel, complex, dynamic situations that have never been seen before. It is to say that our capabilities are amazing, and that we shouldn’t mistake the shallow results of neural networks (despite the misleading moniker of “deep” learning) for our true abilities. We should set a higher bar, both for ourselves and for AI.