The last few matches of 2019 are taking place today, tomorrow and Monday. We posted a half-way evaluation of RED’s Champions League forecasts against Jean-Louis Foulley’s the other day, and promised some more evaluations.

Since mid-August, RED has made forecasts for 200 Premier League matches, 281 Championship matches, 257 League One matches and 276 League Two matches (including matches taking place today, tomorrow and Monday). On top of that, it has made forecasts for about 240 French league matches, 290 German league matches, 170 Italian Serie A matches and about 300 Spanish league matches.

As with our Champions League evaluation, we’ll look at Brier scores, and at plots of how well our forecasts do.

First, Brier scores for results (home win, draw, away win):

           Home   Draw   Away
Bookies    0.23   0.19   0.19
RED        0.24   0.19   0.19

Over the almost 2000 matches, RED and the bookmakers come quite close to each other. As with the Champions League comparison, this looks pretty bad at first. The Brier score is a mean squared error, so taking the square root of a score of about 0.23 implies that for each match both RED and the bookmakers are about 0.5 out in their forecasts, and forecasts are only ever between 0 and 1 (hence the error is about half of the range of forecasts). But this is a function of the fact that forecasts only ever really lie between about 0.2 and 0.7, so even if an event forecast at 0.7 occurs, the error is still 0.3.
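
To make the arithmetic concrete, here is a minimal sketch of a Brier score for a single outcome (home win), assuming the usual definition as the mean squared difference between forecast probability and a 0/1 outcome. The probabilities and results are made up for illustration and are not RED's or the bookmakers' data, and the function name is hypothetical.

```python
import math


def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Hypothetical home-win probabilities and results (1 = home win), for illustration only.
home_probs = [0.7, 0.5, 0.3, 0.6]
home_wins = [1, 0, 0, 1]

score = brier_score(home_probs, home_wins)

# The square root turns the score back into a "typical" error per forecast.
typical_error = math.sqrt(score)

# The same step applied to the home-win score reported in the table above.
reported_error = math.sqrt(0.23)
```

Here `reported_error` comes out just under 0.5, which is where the "about 0.5 out" reading comes from.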

For scorelines, these numbers are 0.014 and 0.013 respectively. While these numbers are smaller, this mostly reflects the fact that many low probabilities are provided for scorelines, and all scorelines bar one in each match don’t happen, so most forecasts are compared to zero in the Brier score. Nonetheless, in both cases, RED and the bookmakers perform comparably.
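
A rough sketch of why scoreline Brier scores are small regardless of skill: even a forecast that spreads probability evenly over every scoreline scores close to zero, because nearly all of its small probabilities are compared to zero. The 6-by-6 grid of scorelines (0-0 to 5-5) and the averaging across scorelines are assumptions made for illustration; they are not necessarily how RED's scoreline scores are computed.

```python
def scoreline_brier(probs, actual):
    """Average squared error across all scorelines for one match.

    probs maps (home_goals, away_goals) to a probability;
    actual is the scoreline that was observed.
    """
    total = 0.0
    for scoreline, p in probs.items():
        occurred = 1.0 if scoreline == actual else 0.0
        total += (p - occurred) ** 2
    return total / len(probs)


# Hypothetical no-skill forecast: equal probability on each of 36 scorelines.
grid = [(h, a) for h in range(6) for a in range(6)]
uniform = {scoreline: 1 / len(grid) for scoreline in grid}

uniform_score = scoreline_brier(uniform, actual=(1, 0))
```

Under these assumptions `uniform_score` is roughly 0.027, so scores in the region of 0.014 should be judged against that kind of baseline rather than against the results scores.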

Sometimes it’s helpful to look at these things graphically. The first plot is for scorelines, and the second is for the home win outcome. In both plots bookmakers are in red and RED is in black. In both we want points to lie along the 45-degree line (dotted), as this says that (say) an event we put at 10% occurs 10% of the time. Points are closer to the line for scorelines, reflecting the greater number of observations, and more scattered for results. But overall, they are close to the 45-degree line.
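
The points in a plot of this kind can be produced by grouping forecasts into probability bins and comparing the average forecast in each bin with how often the event happened. The sketch below assumes ten equal-width bins and uses made-up data; the binning actually used for the plots is not stated here, and the function name is hypothetical.

```python
def calibration_points(forecasts, outcomes, n_bins=10):
    """Return (mean forecast, observed frequency, count) for each non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, o in zip(forecasts, outcomes):
        # A forecast of exactly 1.0 is placed in the top bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, o))

    points = []
    for members in bins:
        if members:
            mean_p = sum(p for p, _ in members) / len(members)
            freq = sum(o for _, o in members) / len(members)
            points.append((mean_p, freq, len(members)))
    return points


# Hypothetical forecasts and 0/1 outcomes, for illustration only.
probs = [0.12, 0.15, 0.18, 0.52, 0.55, 0.58, 0.51, 0.11, 0.14, 0.19]
results = [0, 0, 1, 1, 0, 1, 0, 0, 0, 0]

points = calibration_points(probs, results)
```

Plotting the first element of each tuple against the second gives points that sit on the 45-degree line when forecasts are well calibrated. Bins with few observations are noisier, which fits the pattern of more scatter for results than for scorelines.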