A pants-on-fire AI lie detector?

Artificial Intelligence could maybe solve the problem of lying and deceit in humans. We should try to find out.

In the dawn of its evolutionary development, probably around 100,000 years ago, Homo sapiens discovered lying. Individuals found that they could deceive and trick other members of the species by saying (or behaving in a way that suggested) that something that did not match reality was in fact true. That is a unique ability I have not been able to observe in other animals living on our planet — although interaction with a couple of tame mongooses and racoons seemed to show very rudimentary forms of conscious deception. In cats, dogs, ravens, crows, grey parrots and other sentient animals I have dealt with: not even a hint! They all seem incapable of doing it.

Lying has been a problem for humanity for tens of thousands of years: on a personal one-on-one level, but also because it can be used to deceive large numbers of people, in politics or religion. We have not been able to do much about it. Coercion and even torture have limited value. Listening to a child repeat “I didn’t do it” in the face of clear proof and pressure shows you how badly coercion works, and the false statements of terrorist prisoners prove that torture is not a reliable method either. But what if we had a sure-fire way of telling when someone is lying? That would change the world and the nature of society. And I have my thoughts on this…

But before we get to them, some background. I have mentioned before that I am involved in a neural network project that has created a chess playing program that basically generated itself. It learnt chess after being told only how the pieces move and what the goal of the game is (checkmate). It figured out the best strategy mainly by playing many millions of high-speed games against itself. We recently launched the program, which we have decided to call Fat Fritz. Initial testing tells us it could well be the strongest chess playing entity that has ever existed.

While we were developing Fat Fritz I was contacted, in January this year, by a multi-national company that has a department working on Artificial Intelligence. They had developed software that uses the webcam of any notebook to watch players (of chess or any other game) and track their emotional status. When I tried it on myself the system could accurately tell my pulse and breathing — we matched them to external sensors that traditionally give you these readings. And there are many other “emotional” factors the software can detect.

There is an obvious reason why the company contacted me. My company, ChessBase, has the software tools that all ambitious chess players use to study and train. We also have a chess server, Playchess, where you can play 24/7, against people all over the world. So the idea is to have an “emotion tracker” watching your face, and give our software the ability to display your mood in any position during the game. “Did I make the mistake because I was nervous, or over-confident, or too relaxed, or distracted?” These are the questions the emotion tracking AI will be able to answer.

Emotion detection in chess — this is the project we have been working on

Last month a senior developer from the research group visited our company in Hamburg and showed us all the progress they have made. Mood detection is down to one external sensor, and soon should run solely using the webcam of your notebook. I am convinced we should integrate it in our software. Top players and amateurs alike would profit enormously from it. So we should consider building it into Playchess.

But during the meeting in October I had a question for our guest: it is great, I said, to use AI to help improve the performance of ambitious chess players. Many thousands will be grateful. But why don’t you use the AI emotion detection software to change the world, the workings of society? Build something that would benefit billions of people.

The strategy I suggested is not at all elaborate: for a pilot project we recruit a hundred people and make them read sentences in front of the emotion detection webcam. Say I am one of the subjects. I read sentences like: “My name is Frederic Friedel” (T); “I live in Alaska” (F); “I have two wonderful grandchildren” (T); “There is a bird just outside the window” (F). The Emotion AI watches me speak, and gets the veracity information on each sentence, whether it is true (T) or false (F). We let it draw its own conclusions. After we have done this for a hundred subjects we run the AI on the same or a new group of test subjects, with new statements and without the veracity information. We check the ability of the AI to spot false statements, and in each case we tell it what the right answer was (in machine learning terms, supervised feedback). After running the experiment on a few hundred test subjects we can draw first conclusions:

It is possible we will find no evidence that the AI can detect lies any better than by chance. In that case, I told the project manager, you write me a letter saying “It doesn’t work, Frederic. It was a crazy idea.” Or the AI does detect lies better than by guessing. Then write: “It looks promising, we will continue testing.” Or the AI gets it 90%+ right. Then congratulations, and thanks for suggesting the project, would be appropriate.
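To make “better than by chance” concrete, here is a minimal sketch of how the three outcomes could be told apart. It is only an illustration, not the research group's method. It assumes true and false statements are equally frequent (so blind guessing is right half the time), uses an exact one-sided binomial test, and picks a conventional 5% significance threshold. The function names, the threshold and the example counts are my own assumptions; only the 90% mark comes from the proposal above.

```python
from math import exp, lgamma, log


def p_value_above_chance(correct: int, total: int, chance: float = 0.5) -> float:
    """Probability of getting at least `correct` of `total` right by pure guessing.

    Exact one-sided binomial tail, summed in log space so that large
    totals do not underflow. Requires 0 < chance < 1.
    """
    tail = 0.0
    for k in range(correct, total + 1):
        log_pmf = (
            lgamma(total + 1) - lgamma(k + 1) - lgamma(total - k + 1)
            + k * log(chance) + (total - k) * log(1.0 - chance)
        )
        tail += exp(log_pmf)
    return min(tail, 1.0)


def verdict(correct: int, total: int, alpha: float = 0.05, strong: float = 0.90) -> str:
    """Map a pilot result onto the three letters described in the text.

    `alpha` (significance threshold) is an assumed, conventional value;
    `strong` is the 90% mark mentioned in the proposal.
    """
    if p_value_above_chance(correct, total) >= alpha:
        return "It doesn't work"
    if correct / total >= strong:
        return "Congratulations"
    return "It looks promising"


if __name__ == "__main__":
    # Hypothetical pilot: 100 subjects reading 20 unseen statements each.
    total = 100 * 20
    for correct in (1020, 1300, 1850):
        print(correct, "of", total, "->", verdict(correct, total))
```

The point of the test is that a raw percentage alone is not enough: 51% correct on a couple of thousand statements is still compatible with guessing, while 65% is not. If the true and false statements are not balanced, the `chance` parameter would have to be set to the share of the more common answer.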

Polygraphs — lie detectors — are not new. The image (from the Agence de presse Meurisse) shows the American inventor of the lie detector, Leonarde Keeler, testing his machine. It measured physiological indicators such as blood pressure, respiration, pulse and skin conductivity in a person answering questions. The polygraph expert can use the recorded data to draw conclusions.

Nowadays polygraphs are used as interrogation tools on criminal suspects or defendants in lawsuits. In the US, law enforcement and federal government agencies such as the FBI, NSA, CIA and many police departments use polygraph examinations to interrogate suspects. The average cost to administer the test is over $700, and polygraphy is part of a $2 billion industry.

However, scientific and government bodies generally consider polygraphs unreliable and inaccurate. Despite claims of high validity by advocates, the National Research Council has found no evidence at all of effectiveness. The American Psychological Association states “Most psychologists agree that there is little evidence that polygraph tests can accurately detect lies.” You can read up on it, as I did, in this excellent Wikipedia article.

So why could my project succeed? Well, I have seen this kind of thing happen, in the development of our neural network chess engine. The program found out all by itself how chess works and what the best strategy is. Of course there is no way for us to know how it did so. All we can say is that it now plays vastly better than the (human) World Champion, and better than its predecessors, the brute force chess engines. AI can do that.

There is another thing that impressed me. Trained ophthalmologists cannot tell from a retina scan whether it was taken from a man or a woman (they get it right 50% of the time). Google’s UK-based company DeepMind (who incidentally designed the original AI chess algorithms) trained ocular software on around 15,000 scans. After that the AI was able to deliver referral suggestions for over 50 critical eye diseases with 94 percent accuracy, easily exceeding human experts’ performance. You can read about it in this paper, or in this one.

The astonishing part is that Google’s deep learning software was also capable of predicting age, systolic blood pressure, smoking status, past cardiac events, and gender, from the retinal scans, things that human ophthalmological experts cannot do. And I assume nobody knows how the AI does it. It’s like magic.

Now if a deep learning program, looking at thousands of retinal scans, can outperform experts, using methods that no one can fathom, why should it not in some mysterious way be able to reliably detect lies using just a webcam and emotion detection software? I am convinced it would do a far better job tracking, for instance, involuntary facial muscle activation, than human experts.

That’s it, but before I sign off I need to mention that my son Martin had three comments. First, he rightly pointed out that AI and deep learning typically need hundreds of thousands of instances to draw any conclusions (or tens of millions of chess games to reach the Fat Fritz level of play). My answer to that was that we would be running the truth/lie tests on 100 people, just to decide if it is promising. If it is, then you recruit thousands of test candidates and change the world. If not, just forget it.

The second objection Martin had was: people could use the AI polygraph to train themselves to lie without detection. They would find techniques and drugs that assist in the process. Here too I had an answer: that would lead to a war between liars and an AI that is attempting to keep up with new deception techniques. Need I reveal to you who I think would win?

Finally, he was not sure that, as I had written, no animals show any capability of lying. David Attenborough, he pointed out, had claimed white-faced capuchins of Costa Rica do indeed use false calls (“Snake!” in their simian dialect) to scare other members of their tribe away. They do this in order to have the food they have spotted all to themselves. Clearly that is a very simple form of lying. You may enjoy watching Attenborough’s three-minute video on the subject. In addition, I have seen convincing examples of conscious deception by ravens, and Attenborough tells us about this ability in the drongo bird.

In any case we should find out if AI lie detection is possible. The experiment I propose would cost just a fraction of what people are investing in unreliable lie detection every year. Just do it, change the world!

Frederic Alois Friedel, born in 1945, science journalist, co-founder of ChessBase, studied Philosophy and Linguistics at the universities of Hamburg and Oxford.