Hunting monsters with Turing tests


Halloween is upon us and monsters manifest in the mind. Whether it’s zombies, ghosts, killer clowns or demonically possessed babies that provide the horror, people sleep soundly at night due to one reassuring fact — these monsters are undoubtedly fictional. Another common horror trope is a malicious artificial intelligence (AI) intent on causing harm. Here, the distinction between fiction and reality begins to blur. Could you apply the label of fiction to 2001: A Space Odyssey’s HAL quite as assuredly as you can to Dracula?

Humans interact increasingly often with AIs in their daily lives. You may be woken by an Alexa alarm, and then decide to browse Amazon and be directed to recommended products by their algorithms. In our world, computers complete progressively more complicated tasks. When you read articles about an AI defeating the world champion at Go, it is easy to ask yourself — “Can computers actually think?”.

Alan Turing asked the same question in 1950. Computers were being used for ever more complicated tasks, which inevitably led to the question of intelligence. To focus the debate, Turing devised a test. The Turing test is simple to state: a human ‘interrogator’ writes questions and passes them to an unseen recipient. The recipient then provides a written response, and it is the task of the interrogator to decide whether the recipient is a human or a computer.

In a modern incarnation of the test, the questions and responses can be mediated through a screen, as if the interrogator were messaging a friend. Turing surmised that if a computer can deceive the interrogator and produce human-like responses, it should be regarded as having some form of intelligence. Of course, this test is not the useful benchmark for detecting intelligence that it purports to be, and Turing himself recognized this. He understood the distinction between the appearance of thought and the existence of thought. Indeed, this distinction was also discussed a century earlier by Ada Lovelace, who reflected upon ‘programmable machines’ that perform many seemingly intelligent tasks, but do not produce original thoughts themselves.

The problem of identifying intelligence is exemplified in Searle’s Chinese Room Argument. Imagine that you are in a room containing nothing but an English-to-Mandarin dictionary and a note with the instruction to ‘translate’. A note bearing an English phrase is passed into the room. You find the relevant symbols in the dictionary and produce the correct output.

Crucially, you know nothing of Mandarin, but the person outside the room who receives the translation is convinced that the room understands the language perfectly. This is a simplified version of the argument, but it gets at the fundamental idea.
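The room’s behaviour can be sketched as a pure lookup table. This is a hypothetical Python illustration, not Searle’s original formulation — the phrasebook and its entries are invented for the example:

```python
# A toy "Chinese Room": the operator follows rules mechanically,
# mapping input phrases to output symbols with zero understanding.
PHRASEBOOK = {
    "hello": "你好",
    "thank you": "谢谢",
    "goodbye": "再见",
}

def room(note: str) -> str:
    """Look up the phrase and return the listed symbols, or a blank stare."""
    return PHRASEBOOK.get(note.lower().strip(), "???")

print(room("Hello"))  # correct Mandarin output, but no comprehension anywhere
```

From outside, the room translates flawlessly; inside, there is only dictionary lookup.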

Whether a computer understands what it is doing cannot be determined by assessing its output alone. Yet output is all the Turing test examines, so it is insufficient for assessing intelligence.

Real-life applications of the Turing test have demonstrated this. The Loebner Prize invites talented computer programmers to enter software to face a Turing test.

A set of ten judges act as interrogators and assess whether responses come from genuine humans or charlatan chatbots. As it stands, computers consistently pass the test, with the reigning champion being a piece of software named Mitsuku, written by British developer Steve Worswick. The software consists of some 350,000 hand-written rules for producing natural-sounding conversation.

The list of rules can be seen as a complicated version of Searle’s dictionary and, despite consistently fooling the judges, cannot be said to demonstrate intelligence. Worswick himself states that a large part of the work goes into disguising obvious giveaways, such as immediate and correct responses to numerical questions. Instead of demonstrating intelligence, it showcases clever deceptions.
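A drastically simplified version of this rule-matching approach might look like the following. This is illustrative only — Mitsuku’s actual rule base is written in AIML and is vastly larger; the patterns and responses here are invented:

```python
import re

# Ordered (pattern, response) rules, ELIZA-style. A real system has
# hundreds of thousands of such categories, plus tricks to hide
# machine-like behaviour (delays, typos, feigned ignorance).
RULES = [
    (re.compile(r"\bmy name is (\w+)", re.I), "Nice to meet you, {0}."),
    (re.compile(r"\bhow are you\b", re.I),    "Not bad. How are you?"),
    (re.compile(r"\bweather\b", re.I),        "I hope it stays dry for Halloween."),
]

def reply(message: str) -> str:
    """Return the response for the first matching rule, else a filler."""
    for pattern, template in RULES:
        match = pattern.search(message)
        if match:
            return template.format(*match.groups())
    return "Tell me more."  # generic fallback keeps the conversation going

print(reply("my name is Ada"))  # Nice to meet you, Ada.
```

Nothing here models meaning; the program only shuffles text according to its rules, exactly as the room shuffles symbols.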

As we progress into the 21st century, a related problem looms. Machine learning algorithms may be able to analyse collections of human data and produce convincing replicas. Scammers could create more believable spam emails to defraud the vulnerable, or software could heighten already extreme political tensions by producing deepfake videos of politicians saying things they in fact never said.

So-called ‘reverse Turing tests’ are in development which could alleviate this issue. You have probably already encountered one such test in the form of CAPTCHAs. If you pass this test, you have successfully passed a reverse Turing test and convinced a computer that you are human. Congratulations!
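A reverse Turing test in its crudest form is just a challenge that is easy for a human but awkward to automate. Here is a toy text-based sketch of the idea — real CAPTCHAs use distorted images or behavioural signals, which this hypothetical example does not attempt:

```python
import random

def make_challenge() -> tuple[str, str]:
    """Spell out a sum in words: trivial for a human, mildly awkward
    for a naive bot that only parses digits. Returns (question, answer)."""
    words = ["zero", "one", "two", "three", "four", "five"]
    a, b = random.randint(0, 5), random.randint(0, 5)
    question = f"What is {words[a]} plus {words[b]}? Answer in digits."
    return question, str(a + b)

def verify(answer: str, expected: str) -> bool:
    """A correct reply convinces the gatekeeper the sender is (probably) human."""
    return answer.strip() == expected

question, expected = make_challenge()
print(question)
```

The logic is the Turing test inverted: a machine poses the question, and a human must prove their humanity to it.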

While apocalyptic visions of humanity being destroyed by intelligent machines may proliferate in cinema, hopefully the plucky humans of these stories come armed with an effective reverse Turing test. Even if they can’t assess the intelligence of their adversaries, they can stop them spamming their emails.

Image: Hans Braxmeier via Pixabay
