Reviewing the group stage, how accurate were the AIs at the World Cup?
Updated 15:59, 03-Jul-2018
By Jiang Jiao, Guo Meiping
With the 2018 FIFA World Cup two weeks in and now at the knockout stage, it is time to review how reliably the artificial intelligence (AI) algorithms written by different groups of data researchers have predicted match results. Are science and technology doing better than Achilles the cat or Paul the octopus? Let’s find out.
Many companies, institutes and organizations around the world have applied cutting-edge calculation, simulation and machine learning technology to this “not so scientific” fortune-telling business. Below, we look at three of them, which may offer a glimpse into this science-superstition complex.

Goldman Sachs: “Final winner Brazil”

It is not the first time that the financial firm has run calculations to predict the winner, but its earlier attempts did not succeed. This year, to take as many variables as possible into account, it used four different machine learning models to process data on each team’s characteristics and every individual player’s performance in recent matches.
Then, they ran the models to predict how many goals a team will score when confronted with every possible opponent. And voila, a list of unrounded scores came out, showing the results.
2018 FIFA World Cup results predicted by Goldman Sachs. /Photo courtesy of Goldman Sachs Global Investment Research
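An unrounded score can be read as an expected number of goals. As a rough illustration of what a pair of such numbers implies, the sketch below converts two expected-goal figures, such as the 1.70 and 1.41 Goldman projected for a Brazil-Germany final, into win, draw and loss probabilities. It assumes each side's goals follow an independent Poisson distribution, which is a common textbook simplification and not something Goldman Sachs is reported to have used; the function names are hypothetical.

```python
from math import exp, factorial


def poisson_pmf(k, mean):
    """Probability of exactly k goals when the expected number is `mean`."""
    return exp(-mean) * mean ** k / factorial(k)


def outcome_probabilities(goals_a, goals_b, max_goals=15):
    """Win/draw/loss probabilities for team A, assuming independent Poisson goals.

    Scorelines above `max_goals` are ignored; for expected goals below 2
    the probability mass lost that way is negligible.
    """
    win = draw = loss = 0.0
    for i in range(max_goals + 1):
        for j in range(max_goals + 1):
            p = poisson_pmf(i, goals_a) * poisson_pmf(j, goals_b)
            if i > j:
                win += p
            elif i == j:
                draw += p
            else:
                loss += p
    return win, draw, loss


# Expected goals taken from the projected final quoted in the article.
win, draw, loss = outcome_probabilities(1.70, 1.41)
print(f"A wins {win:.2f}, draw {draw:.2f}, B wins {loss:.2f}")
```

Under these assumptions, a gap of about 0.3 expected goals leaves the weaker side a substantial chance of winning, which is one way to see why a single forecast scoreline is far from a certainty.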

One of Goldman Sachs’ key conclusions was that Brazil would win the final, defeating Germany 1.70 to 1.41. But after Germany’s shock 2-0 loss to South Korea, we all know that such a final is out of the picture. The firm also made other incorrect predictions, such as a Saudi Arabia victory over hosts Russia; again, Russia’s 5-0 start told another story.
Nevertheless, its analysis of the importance of teams versus players and of the strength of individual players does offer a statistical perspective on the correlation among players, teams and match results.
Graphs showing the importance of teams vs. players and the strength of individual players released by Goldman Sachs. /Photo courtesy of Goldman Sachs Global Investment Research

Technical University of Dortmund, Germany: Already failed to predict the winner

Unlike Goldman, Andreas Groll and his colleagues at the German university bet on Germany as the most likely winner. Clearly, they fared no better than the bank.
What is worth mentioning, though, is the method behind their forecast: a technique called random forest. According to the Massachusetts Institute of Technology (MIT), the random-forest technique excels at analyzing large data sets and can avoid “some of the pitfalls” of other data-mining methods.
The basic idea is that a future event can be determined by “a decision tree” where a result is calculated “at each branch”, an MIT review said. The tree-and-branch image is a vivid one, as the prediction graph by Groll and his team below shows.
Prediction of 2018 FIFA World Cup winners by Prof. Andreas Groll and his colleagues at the Technical University of Dortmund, Germany. /Photo courtesy of Technical University of Dortmund

Decision trees can, in their later stages, suffer from a problem known as overfitting, which occurs when decisions are shaped too much by one particular set of data and therefore cannot give a reliable prediction. The random-forest method counters this by selecting branches at random. After 100,000 simulations, Groll’s team averaged all the randomly constructed decisions to estimate how likely each team was to take the cup.
Similarly, the researchers collected information on participating countries: not only on their teams and players, but also on seemingly “unimportant” factors such as a country’s population and GDP and the coach’s nationality.
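The averaging idea can be shown with a toy example. The sketch below is not the Dortmund team's model: it is a stripped-down ensemble of one-split trees ("stumps"), each trained on a random resample of the data and a randomly chosen feature, with the final prediction taken as the average over all trees. That is the core mechanism a random forest relies on to reduce overfitting. The data set, the feature meanings and all function names are invented for illustration.

```python
import random


def fit_stump(rows, targets, feature):
    """Fit a one-split tree on a single feature by minimising squared error."""
    best = None
    values = sorted({r[feature] for r in rows})
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [t for r, t in zip(rows, targets) if r[feature] <= thr]
        right = [t for r, t in zip(rows, targets) if r[feature] > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
        if best is None or err < best[0]:
            best = (err, thr, lm, rm)
    if best is None:  # the resample has a single feature value: predict the mean
        mean = sum(targets) / len(targets)
        return (feature, float("inf"), mean, mean)
    return (feature, best[1], best[2], best[3])


def fit_forest(rows, targets, n_trees=200, seed=0):
    """Train stumps on bootstrap resamples, each using one random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feature = rng.randrange(len(rows[0]))           # random feature choice
        forest.append(
            fit_stump([rows[i] for i in idx], [targets[i] for i in idx], feature)
        )
    return forest


def predict(forest, row):
    """Average the predictions of all stumps."""
    preds = [lm if row[f] <= thr else rm for f, thr, lm, rm in forest]
    return sum(preds) / len(preds)


# Invented toy data: (team strength score, population in tens of millions)
# mapped to average goals per match. None of these numbers are real.
rows = [(90, 8.3), (85, 6.7), (60, 3.2), (55, 14.0), (40, 0.4), (35, 3.3)]
targets = [2.1, 1.9, 1.2, 1.1, 0.7, 0.6]

forest = fit_forest(rows, targets)
print(predict(forest, (95, 5.0)), predict(forest, (30, 5.0)))
```

No single stump is reliable on its own, but because each one sees a different resample and a different feature, their individual quirks tend to cancel out in the average.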

Gracenote: Two “potentially surprising” teams

Using a methodology based on the Elo rating system, which is used in zero-sum games such as chess, Gracenote ran its simulation one million times and produced forecasts for the different phases of the World Cup.
While accurate on most group-stage results, the Nielsen-owned company failed to predict the outcomes of Groups C, F and H.
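As a rough illustration of how Elo ratings can drive a repeated simulation, the sketch below uses the standard Elo expected-score formula to decide each match in a small knockout bracket and counts how often each team wins over many runs. Gracenote's actual model is not described here, so the ratings, team names, bracket and the choice to treat the expected score as a straight win probability are all assumptions.

```python
import random


def elo_expected(rating_a, rating_b):
    """Standard Elo expected score of A against B (a value between 0 and 1)."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))


def simulate_knockout(ratings, bracket, runs=10_000, seed=0):
    """Estimate each team's title chance by replaying a knockout bracket.

    `bracket` lists teams in draw order and must have a power-of-two length.
    Each tie is decided by a random draw against the Elo expected score,
    a simplification that ignores draws, extra time and penalties.
    """
    rng = random.Random(seed)
    titles = {team: 0 for team in bracket}
    for _ in range(runs):
        alive = list(bracket)
        while len(alive) > 1:
            next_round = []
            for a, b in zip(alive[::2], alive[1::2]):
                p_a = elo_expected(ratings[a], ratings[b])
                next_round.append(a if rng.random() < p_a else b)
            alive = next_round
        titles[alive[0]] += 1
    return {team: wins / runs for team, wins in titles.items()}


# Hypothetical ratings for four placeholder teams.
ratings = {"A": 2000, "B": 1800, "C": 1700, "D": 1600}
chances = simulate_knockout(ratings, ["A", "B", "C", "D"])
for team, chance in chances.items():
    print(f"{team}: {chance:.1%}")
```

More runs narrow the sampling error of each estimate, but they do not make the underlying ratings any more accurate.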
Gracenote Photo

According to the company’s calculation, Peru and France had the best chances of reaching the Round of 16 from Group C. In the end, however, Denmark made its way through and advanced with France.
In Group F, defending champion Germany, which had been given an 80 percent chance of reaching the knockout stage, will play no further part in the tournament after its shock elimination in the group stage.
Poland, the seeded team in Group H, which had the second-highest chance of advancing according to the calculation, finished last in its group.
Gracenote also listed two “potentially surprising” teams, Colombia and Peru, from outside the seven teams (Argentina, Brazil, France, Germany, Italy, Netherlands and Spain) that have reached the final of the last 12 World Cups, saying they had a good chance of performing better than expected.
Colombia, already one of the last 16, was given a 72 percent chance of progressing from its group, a 42 percent chance of reaching the quarter-finals, and a 20 percent chance of being one of the semi-finalists.
Meanwhile, Peru, which was given a 66 percent chance of reaching the last 16, a 38 percent chance of being a quarter-finalist, and a 21 percent chance of entering the semi-finals, has already been eliminated from the tournament.
As for the prediction of the winner, Brazil has the highest chance of bagging the FIFA World Cup Trophy according to Gracenote’s calculation.
Hiroki Sakai of Japan warms up before the 2018 FIFA World Cup Russia group H match between Japan and Poland at Volgograd Arena on June 28, 2018, in Volgograd, Russia. /VCG Photo

Although the forecasts are backed by massive data and cutting-edge technology, results remain hard to predict because of the uncertainties involved. "It is difficult to assess how much faith one should have in these predictions," said the Goldman Sachs report.
CGTN believes, too, that the excitement lies in the unpredictability.
(Top image via VCG)