Recently, the world was told that a computer program named Alpha Zero had learned how to beat Stockfish in chess. Stockfish is generally accepted to be one of the world’s best chess computer programs (if not the best), and since I am a fan of the game, and of artificial intelligence (AI), this caught my attention.
The fact that Alpha Zero won by learning about chess in a completely different way than any other popular chess program made people sit up. But I must confess that not everybody around me shared my enthusiasm.
After all, since computers are getting faster all the time, it shouldn’t be a surprise that a new computer has beaten an older one. But first, this is about the software and how it works, rather than the hardware and how fast it runs.
And secondly – and this is important – whereas Stockfish needed human help to understand what a good chess move looks like, Alpha Zero learned it on its own.
Does that puzzle you? How does a machine learn, if it is humans who program it? Surely it learns whatever a human tells it?
Think about how you learn to play tic-tac-toe. In one case, an adult tells you, “Put the X in the centre, and if they put the O on the side, you can win”. In another case, you have to learn on your own by practising against yourself. In the end, after hundreds of trials, you realise it’s good to put the X in the centre.
This is the difference between Stockfish and Alpha Zero. Stockfish was programmed with a set of rules like “encourage knights to occupy the centre” and “penalise doubled, backwards and blocked pawns”. Alpha Zero, on the other hand, was programmed to learn.
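For readers who like to see the idea in code, here is a minimal Python sketch of the tic-tac-toe comparison. It is an illustration only, not how Alpha Zero or Stockfish actually work: the "learner" here simply plays random practice games and counts wins, which is far cruder than a real learning program. The function names (`learn_opening`, `told_opening`, `play_out`) and the number of practice games are made up for this example.

```python
import random

# The eight winning lines on a 3x3 board whose squares are numbered 0..8.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if a line is complete, otherwise None."""
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None


def play_out(first_move, rng):
    """X opens on first_move, then both sides move at random to the end."""
    board = [None] * 9
    board[first_move] = "X"
    player = "O"
    while winner(board) is None and None in board:
        empty = [i for i, s in enumerate(board) if s is None]
        board[rng.choice(empty)] = player
        player = "X" if player == "O" else "O"
    return winner(board)  # None means the game was a draw


def learn_opening(games_per_square=2000, seed=0):
    """The 'learn on your own' approach: estimate X's win rate for
    each opening square from many practice games (hypothetical helper)."""
    rng = random.Random(seed)
    win_rate = {}
    for square in range(9):
        wins = sum(play_out(square, rng) == "X"
                   for _ in range(games_per_square))
        win_rate[square] = wins / games_per_square
    return win_rate


def told_opening():
    """The 'adult tells you' approach: the answer is simply written in."""
    return 4  # the centre square


rates = learn_opening()
best = max(rates, key=rates.get)
print("learned opening:", best, "| told opening:", told_opening())
```

Nobody tells `learn_opening` that the centre is good. It only knows the rules and keeps score, yet with enough practice games the centre square should come out with the highest win rate, matching the advice that `told_opening` was simply handed.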
There was something else even more amazing than that. Alpha Zero’s predecessor, AlphaGo Zero, discovered some new openings when it was learning the board game Go. You could say that the program was being creative.
To be fair, when learning chess, it rediscovered a lot of common openings but did not innovate with a new one. (If you’re interested, it settled on the English Opening and the Queen’s Gambit as its “best” openings.)
And the way it plays is a little unusual. I noticed in the sample games that have been made public that Alpha Zero is quite aggressive with pawn advancement and positional sacrifices. On the other hand, it is also very good at constricting the opponent, limiting the moves available to them. As some have commented, it doesn’t play like a computer, nor does it play like a human. It’s something more alien-like.
Somebody asked me, does this mean it can learn to do anything? Like balance accounts, for example? Possibly.
Already, there are programs that write annual financial reports in English prose by analysing a company’s balance sheet. There are programs that look at sports scores and then come up with a short column for a newspaper. Google Translate translates between spoken languages in real time. IBM Watson searches through millions of patient records and analyses them to deliver a diagnosis (its accuracy rate of detecting lung cancer is 90%).
The truth is, AI has been around for a long time. All of the broad ideas that built Alpha Zero were there 20 years ago. It’s just that we’ve got better at it.
Is it worrying? Are we putting too much dependence on computers to our own detriment? Will we lose jobs? The politicians sort of understand this. At a recent event to encourage innovation among Malaysians, one was asked about the disruption such advances would cause. He said that even if robots take over jobs, there will be new jobs to learn, like how to fix and maintain robots.
There are two problems with this line of thinking. First, if fixing and maintaining robots is a more highly skilled job than most people’s current jobs, then not everybody will be able to take that step up. Second, it is estimated that for every four jobs that will be lost to automation, only one new one will be created in return.
It seems obvious to say, but what we need to do is to learn how to learn. Even as it is, people now change jobs every few years, not every few decades. You have to be able to pick up new skills quickly.
And the second thing is that we need to appreciate and nurture creativity. In an age when the “correct” answer is only one voice request away, what can you contribute by giving the “wrong” answer that’s right for the task?
In short, we want to encourage our schools to teach our kids the Alpha Zero way, not the Stockfish way. Pedagogy already has these concepts: constructivism, where an individual constructs his or her own understanding of the world around them; and self-directed learning, where a child gains knowledge by working on his or her own. These are not new ideas. It’s time for schools to get with the programme.
The computer programs we write may end up better learners and more creative than us. But it shouldn’t be because we give up.