The adventure of chess programming (2)
How do computers play chess, how do they “think”? In our answer to this question we are going to encounter some big numbers — some very, very big numbers. You will be surprised.
But let us start with a module that all chess programs need: the move generator. In order to play the machine needs to know the rules of the game, it must know how the pieces move. Formulating these rules is a painstaking task — if you don’t get it right knights will jump over the edge of the board to the other side, or castling rules will be breached. I have told the amusing story of a young chess genius, twelve years old and already of grandmaster strength, who demonstrated the inability of chess players (and chess programmers) to correctly formulate the promotion rules, at least verbally.
In any case, once the machine has understood the rules it is able to generate a list of every legal move in any position. With that it can start a “look-ahead”. This is what all chess programs — and incidentally all humans — do: if I play this and he plays that, I can play this and if he plays that I can capture his knight. Sound familiar? Of course machines do it at blinding speed, and usually for every possible combination of moves. Which leads to the very large numbers I said we would be dealing with.
Many people know a very large number connected with chess. Legend has it that the game was invented by Sissa ibn Dahir, and his king, Shirham, was so grateful that he said his vizier could have anything he wanted as a reward. “Just some wheat,” Sissa said, “one grain on the first square of the board, two on the second, four on the third, and so on.” The king was surprised at the inexplicable modesty of this wish — until he had worked out the total number of grains Sissa would receive. Mathematicians tell us the exact number: 18,446,744,073,709,551,615. That is 18½ quintillion (10¹⁸), and today it represents the world production of wheat for many centuries. Listen to Stephen Fry describe it.
Sissa apparently understood geometrical progression — how numbers grow exponentially when multiplied by a common ratio (rough definition). This plays a role in the general consideration of chess: Will we ever be able to solve the game? Can we derive a perfect winning strategy by examining every possible sequence of moves? The answer is a resounding no!
Let us take a look: in a normal chess position a player will have, on average, 40 possible moves to choose from. To each of these the opponent can play one of 40 different replies. That comes to 1600 different sequences after just one move for either side. And the numbers grow exponentially: after two moves for either side 102 million different sequences are possible, after three moves it is 4.1 billion — more than the age of a human in seconds. Sissa’s number is reached after just six moves, and after seven moves we are approaching the number of stars in the universe. Let us assume the average chess game lasts 40 moves. At this point the number of sequences has reached 10¹²⁸. Compared to that the number of particles in the known universe, around 10⁸⁶, is completely insignificant. Really.
So the game will never be completely solved in this way — nothing will ever calculate every possible sequence of moves. But that is not actually necessary. If you can work out every continuation to a limited depth you already start playing a reasonable game. Which is exactly what computers do: execute sequences of moves to a given depth, evaluate the resulting position, vary the sequence systematically, and compare the evaluation of all positions at that depth. In the end they play the first move of the sequence that leads to the most favourable position. That is essentially what humans also do. But they do it at a far slower rate than computers. A chess grandmaster will work out one or two continuations per second in his mind; the computer can manage a few million in the same time. So how can humans compete?
Intelligence vs Brute Force
The difference between humans and computers is that experienced chess players will only consider continuations that are meaningful — they know instinctively that a large number of moves can be safely ignored. These moves lead to nothing. So instead of checking every possible continuation humans only look at plausible ones. The branching factor in the look-ahead is not 40, but just a few moves. Computers on the other hand must generally examine everything.
This “brute force” approach appeared doomed to failure. Indeed, in the early days, when computers were still doing shallow searches, they played like rank amateurs. So programmers tried to implement “intelligent pruning”, to cut off less meaningful branches so the program could navigate a slimmer search tree, and penetrate deeper into positions.
One such example was MacHack, a chess program developed at MIT by Richard Greenblatt (seated above with tie). He wrote a “plausible move generator” which restricted the computer’s branching to 15 on the first two moves, and 9 for the rest of the tree. So a five ply search resulted in 127,575 positions that needed to be evaluated, not the 102 million that pure brute force programs required.
MacHack achieved a strong amateur level of skill, and it played hundreds of games against humans, including three, it is rumoured, against Bobby Fischer. It was most famous for winning a game against Hubert Dreyfus, a philosopher and Artificial Intelligence critic who had predicted that computers would never be able to beat a ten-year-old at chess. MacHack proved him wrong.
I visited the MIT team around 1980, when MacHack had reached a playing strength of more than 1500 Elo points, and had in fact become an honorary member of the US Chess Federation. Greenblatt was having a bit of a crisis, because his program was still too error-prone. To correct this he had designed what he described to me as an “blunder blocker and brilliancy finder.” It was a hardware chess rack that ran a brute force search parallel to MacHack. Cheops, as he had named it, was meant to alert MacHack to tactics that the heuristic program would otherwise miss. I do not know how far he got, or how useful this combination of intelligence and brute force proved to be.
Another (more famous) effort at using the intelligent method in chess was undertaken by former World Champion (from 1948 to 1963, with two short breaks) Mikhail Botvinnik. He was by profession an electrical engineer and dreamed of managing the Soviet economy using artificial intelligence.
With a team of computer scientists Botvinnik designed a highly selective chess program, PIONEER, that used general principles of chess to drastically restrict the search. The project suffered from a lack of proper computer power in the Soviet Union, but was well publicised in the West and at least produced a seminal book on artificial intelligence.
I visited Botvinnik and his team in Moscow and met with him a number of times, in Germany, Holland and the US. It was clearly painful for him that a rival Soviet program, KAISSA, was far stronger than Pioneer, and had in fact become the first world computer chess champion in 1974. Kaissa was a pure brute force program.
The reason the non-selective programs won the day is that scientists came up with a number of “tricks” that allowed them to reduce the number of sequences that needed to be considered in the tree search without affecting the result. A prime example is Alpha–Beta pruning. Programmers realized that if you have a final score for one move, and start calculating another, as soon as you have found a line that is inferior to the first you can abandon that branch of the search tree. It is not necessary to waste time determining exactly how much worse the second move is. Then they discovered that if you ordered the sequence of moves the program searched based on a previous, shallower search, you got substantially more cut-offs. That is called iterative deepening.
Using this and a dozen other very clever algorithms, chess programs were able to search deeper and deeper, and become stronger and stronger. All attempts to generally define “meaningful” moves in chess programs were abandoned, the “intelligent” (also known as the “Shannon type B”) method of chess programming had failed. And the truly dramatic speed-up of computer hardware over the years contributed to the triumph of the brute force method. It allowed computer programs to soon rival and then completely outpace human chess abilities. This was described in my previous article (see link below).
End of the story? Not at all. The past year has brought a radically new development in chess programming — one that is relevant not just for the game but for humanity in general. That is the subject of the final article.
Previous article in this series
These articles are reproduced with kind permission of Spiegel Online, where they first appeared (in German language). The author was told to make the series personal, describe the development of chess programming not as an academic treatise but as a personal story of how he had experienced it.
The Adventure of Chess Programming (Part 1)
Did you know that the first chess program was written by Alan Turing a few years before the first computers were built. The first chess program to actually run on a machine was MANIAC, written in 1951 by the atomic bomb scientists in Los Alamos. Fifty years later computers were challenging world champions, and today it is pointless for any human to play against a digital opponent.