Scorecasting Economists

Unleashing the power of RED

RED has been producing picks for forthcoming football matches for over a year now. They’re interesting, and can sometimes be unexpected or even controversial.

But perhaps most importantly, up until the start of this season these picks have been a monstrous reduction in information. Monstrous in that such a reduction creates distortion upon distortion.

RED generates a probability, in principle, for every possible scoreline (even for the unlikeliest of scorelines, like the 5-5 draw last season between Nottingham Forest and Aston Villa). And RED isn’t bad, in general, at doing so.

In the plot below, we present how often the scorelines RED forecast actually occurred during last season’s Premier League. RED attaches a probability to each scoreline: for example, a 1-1 scoreline might have a probability of 10%, and a 4-4 scoreline might have a probability of 1%.

What we ask instead is “how often would an event forecast at 5% occur?” If RED’s forecasts are good, then such events should occur 5% of the time. Similarly, if RED forecasts a scoreline to occur at 10%, then such forecast events should occur 10% of the time.

We can check this easily, but it requires a significant number of forecasts so that we don’t see distortions due to small numbers of forecasts. We’ve taken forecasts from 370 of last season’s Premier League matches by RED, and categorised them according to the probability. We’ve then looked at how often these forecast events occurred. If RED is pretty good, then if we plot one against the other (forecasts and outcomes), the points should fall along the 45-degree line (dotted line).
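The check described above can be sketched in a few lines of Python. This is a minimal illustration, not RED's own code: the function name `calibration_table`, the 1% bracket width and the toy data are all assumptions made for the example.

```python
from collections import defaultdict


def calibration_table(forecasts, bin_width=0.01):
    """Group (probability, occurred) pairs into probability brackets.

    Bracket k covers forecasts in [k * bin_width, (k + 1) * bin_width).
    Returns {k: (mean_forecast, observed_frequency, count)}.
    """
    bins = defaultdict(list)
    for prob, occurred in forecasts:
        # Small epsilon guards against floating-point error at bracket edges.
        k = int(prob / bin_width + 1e-9)
        bins[k].append((prob, 1 if occurred else 0))

    table = {}
    for k in sorted(bins):
        rows = bins[k]
        n = len(rows)
        mean_forecast = sum(p for p, _ in rows) / n
        observed = sum(o for _, o in rows) / n
        table[k] = (mean_forecast, observed, n)
    return table


# Toy data, invented for illustration (not RED's forecasts):
# ten 10% forecasts of which one occurred, twenty 5% forecasts of which one occurred.
toy = ([(0.10, True)] + [(0.10, False)] * 9
       + [(0.05, True)] + [(0.05, False)] * 19)

table = calibration_table(toy)
for k, (mean_forecast, observed, n) in table.items():
    print(f"bracket {k}%: forecast {mean_forecast:.3f}, observed {observed:.3f}, n={n}")
```

Plotting observed frequency against mean forecast for each bracket gives the points that should sit on the 45-degree line if the forecasts are well calibrated.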

The vertical lines show the number of forecasts RED made in each probability bracket. As RED made 81 scoreline forecasts per match, a lot (22,117) are very close to zero, and so we provide this information too (log scale, on the right-hand side).

The dots are very close to the 45-degree line for forecasts up to about 12% – the forecasts for which we have the most observations. This is indicative of a good forecast performance, and suggests there is value in RED being unleashed to tell the world this information.

Once we get above about 12%, performance does become more erratic. This can be attributed more to a lack of observations, though, rather than any intrinsic performance. There are fewer than a hundred observations in each bracket above 13%, and fewer than 10 when we get above 20%.
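To give a rough sense of how much noise small brackets produce, the sketch below computes the binomial standard error of an observed frequency. It assumes the forecast events in a bracket are independent, which is only an approximation (scorelines within one match are mutually exclusive), and the function name is ours. The counts of 100 and 10 are the upper bounds quoted above.

```python
import math


def frequency_standard_error(p, n):
    """Standard error of an observed frequency when each of n independent
    events occurs with probability p (binomial approximation)."""
    return math.sqrt(p * (1 - p) / n)


# Upper bounds on bracket sizes quoted in the text: 100 at 13%, 10 at 20%.
for p, n in [(0.13, 100), (0.20, 10)]:
    se = frequency_standard_error(p, n)
    print(f"forecast {p:.0%} with n={n}: standard error about {se:.1%}")
```

With at most 10 observations at 20%, the standard error is roughly 13 percentage points, so dots well away from the 45-degree line in that region are not surprising.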

The more RED forecasts, the more such observations will be collected, and it may be that we get a better picture of forecast performance.

But the plot provided here is more than sufficient to give us confidence in unleashing the power of RED.

RED and Bookmakers

As may have been noticed, RED has added bookmaker information to its provision this week (as well as going into Europe). What is behind this?

Well, RED is providing detailed forecasts of forthcoming events, but is hardly the only source of forecasts. In this day and age, bookmakers provide forecasts, via odds, of much more than RED provides. See here for tonight’s match between Bayern and Hertha in Germany.

If RED is confident in its forecasts, why this need to provide other sources of information?

Well, different forecasters have different methods, and can have different strengths in different circumstances. Ideally we would choose the best forecasting method, but there is plenty of literature in forecasting on combining forecasts, helped along by the late, great Nobel-prizewinner Clive Granger.

It’s also interesting. Where does RED differ from bookmakers?

But most mundanely, what exactly is RED presenting in its tables? The answer is a measure that might be called the probability of each decisive result outcome occurring. That is, each team winning. But bookmakers provide odds, not probabilities – so how do we arrive at this number?

The answer is that we first take decimal odds (which are the fraction implied by traditional odds plus one), and take their reciprocal for each outcome (home win, draw, away win). Then we consider the overround: the amount by which these three numbers sum to more than one (they should sum to one, since they are the only three possible outcomes).

If we want to think about these numbers as some kind of forecast probability, we need them to sum to one. Different methods exist to do this, and the simplest is just to divide each number by their sum (one plus the overround).

So for Bayern vs Hertha, as I write bet365 have Bayern at 7/50, Hertha at 14/1 and the draw at 7/1. In decimal terms that’s Bayern at 1+7/50=1.14, Hertha at 15 and the draw at 8. The reciprocals are Bayern at 87.7%, Hertha at 6.7% and the draw at 12.5%. The sum of these is 0.877+0.067+0.125=1.069. That is, 106.9%, and an overround of 6.9%.

So we divide each number by 106.9%; for Bayern, that gives a corrected *probability* of 0.877/1.069=0.820, or 82%.
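The same arithmetic can be written as a short Python sketch. The function name `implied_probabilities` and the dictionary layout are assumptions for illustration; the odds are the bet365 figures quoted above, and the scaling is the simple divide-by-the-sum method described here, not any other overround correction.

```python
from fractions import Fraction


def implied_probabilities(fractional_odds):
    """Convert fractional odds to overround-corrected probabilities.

    fractional_odds maps outcome name -> odds string such as "7/50".
    Returns (probabilities, overround).
    """
    # Decimal odds are the fractional odds plus one.
    decimal = {k: 1 + Fraction(v) for k, v in fractional_odds.items()}
    # Raw implied probabilities are the reciprocals of the decimal odds.
    raw = {k: 1 / d for k, d in decimal.items()}
    total = sum(raw.values())
    # Simple scaling: divide each raw number by the sum so they add to one.
    probs = {k: float(r / total) for k, r in raw.items()}
    return probs, float(total - 1)


odds = {"Bayern": "7/50", "Draw": "7/1", "Hertha": "14/1"}
probs, overround = implied_probabilities(odds)
print(f"overround: {overround:.1%}")
for outcome, p in probs.items():
    print(f"{outcome}: {p:.1%}")
```

To mirror what is presented in the tables, the input would be the average decimal odds across bookmakers rather than one bookmaker's prices.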

We do that for the average decimal odds, so we aren’t presenting any information from any single bookmaker. We aren’t providing any kind of advice on whether it may be profitable to bet on, to be clear. We’re presenting one measure that might be called a probability of an outcome, and we’re doing so to provide an interesting contrast between RED, a statistical model, and whatever kinds of models bookmakers use. Use at your own risk!

Premier League, R2 (17-19 August) — City to beat Spurs 2-0 (most likely)

RED with its Bi-P-VAR-X model returns for R2 of the Premier League.

Looking at the results of last weekend’s matches, one notable departure from the predictions was Man U’s thrashing of Chelsea. Despite this, the Model thinks the (former) giants of the Premier League era are nonetheless underdogs on their travels to Wolves, with just a 27% chance of a win, compared with 48% for Wolves – the game is still likely to be tight, and the most likely scoreline is 1-1 (12% chance).

Manchester City were so dominant this weekend that RED places them as firm favourites at home to supposed title rivals Spurs. City have a 72% chance of victory, and the most likely scoreline is 2-0 (12%).

RED has also been scouring the web looking for other sources of predictions. One such source is bookmakers. In a fit of insecurity, the computer has decided that each week it will sense check its forecasts, by comparing the predicted probabilities of result outcomes with what it can estimate and infer about bookmaker predictions from the average of their current online odds.

The table below gives the rest of the Model’s forecasts for Round 2 of the Premier League, along with forecast probabilities estimated from current online bookmaker odds.

R2 Premier League forecast probabilities, RED.
  • Expected Goals: the forecast average number of goals the Model expects for Home and Away teams
  • Outcome Probs: the model predicted % chance of either a Home or Away win, with 100 minus these two numbers being the % chance of a draw.
  • Score Picks: the most likely forecast scoreline, as well as the most likely scoreline conditional on the most likely result (home win, draw or away win) occurring.
  • Home wins / Draws / Away wins: the predicted % chance of various potential scoreline outcomes of the match.
  • Mean odds: Estimates of online bookmakers’ average probability forecasts for the Home and Away teams to win, which could be compared with Outcome Probs.
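As a rough sketch of how the columns described above relate to one another, the Python below builds a grid of scoreline probabilities and reads the outcome probabilities and score picks off it. It assumes two independent Poisson distributions as a stand-in, which is NOT RED's Bi-P-VAR-X model; the expected goals (1.5 and 1.1), the 0 to 8 goal range and all function names are illustrative assumptions.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k goals under a Poisson distribution."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def scoreline_grid(home_xg, away_xg, max_goals=8):
    """Scoreline probabilities under independent Poisson goals (an assumption),
    truncated at max_goals per team and renormalised to sum to one."""
    grid = {(h, a): poisson_pmf(h, home_xg) * poisson_pmf(a, away_xg)
            for h in range(max_goals + 1) for a in range(max_goals + 1)}
    total = sum(grid.values())
    return {score: p / total for score, p in grid.items()}


def result_of(score):
    home, away = score
    return "home" if home > away else "away" if away > home else "draw"


def summarise(grid):
    """Return outcome probabilities, the unconditional score pick, and the
    score pick conditional on the most likely result."""
    outcome = {"home": 0.0, "draw": 0.0, "away": 0.0}
    for score, p in grid.items():
        outcome[result_of(score)] += p
    pick = max(grid, key=grid.get)
    likely_result = max(outcome, key=outcome.get)
    conditional_pick = max((s for s in grid if result_of(s) == likely_result),
                           key=grid.get)
    return outcome, pick, conditional_pick


grid = scoreline_grid(1.5, 1.1)
outcome, pick, conditional_pick = summarise(grid)
print({k: round(v, 3) for k, v in outcome.items()})
print("most likely scoreline:", pick)
print("most likely scoreline given the most likely result:", conditional_pick)
```

With these illustrative inputs the most likely scoreline is 1-1 even though a home win is the most likely result, and the conditional pick is 1-0. This is why the tables report both picks.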

Championship, R3 (16-18 Aug, 2019)

RED has produced its predictions for R3 of the Championship. The unconditional most likely scoreline prediction in every match this round is 1-1, reflecting the general competitive balance in the division, and the uncertainty about each team’s current relative strengths.

RED has also been scouring the web looking for other sources of predictions. One such source is bookmakers. In a fit of insecurity, the computer has decided that each week it will sense check its forecasts, by comparing the predicted probabilities of result outcomes with what it can estimate and infer about bookmaker predictions from the average of their current online odds.

The table below gives the rest of the Model’s forecasts for Round 3 of the English Championship, along with forecast probabilities estimated from current online bookmaker odds.

Championship R3 probability forecasts, RED
  • Expected Goals: the forecast average number of goals the Model expects for Home and Away teams
  • Outcome Probs: the model predicted % chance of either a Home or Away win, with 100 minus these two numbers being the % chance of a draw.
  • Score Picks: the most likely forecast scoreline, as well as the most likely scoreline conditional on the most likely result (home win, draw or away win) occurring.
  • Home wins / Draws / Away wins: the predicted % chance of various potential scoreline outcomes of the match.
  • Mean odds: Estimates of online bookmakers’ average probability forecasts for the Home and Away teams to win, which could be compared with Outcome Probs.

RED goes into Europe

Just as Britain, led by Boris Johnson, beats an ill-tempered retreat from Europe, RED is going the opposite way.

Matches from the very beginning of the Spanish, French, Italian and German leagues have been added to the database for the top two divisions, enabling a much better calculation of Elo strengths than was the case last season (see for example RED’s Champions League forecasts).

It seemed as a result a little rude not to make forecasts for these leagues given the effort to collect and merge in the data. This week, German, French and Spanish leagues begin, and we’ve created forecasts in the same way we create them for English leagues – they are in the tables below. The 3-letter acronyms may be a little unhelpful in places; let us know any glaring errors.

For the Bundesliga the table is a little patchy, especially on bookmakers.

First, La Liga:

Then the Bundesliga:

Then Ligue 1:

Spain’s La Liga2:

France’s Ligue 2 (lacking bookmaker odds):

Germany’s Bundesliga 2:

League Two, Round 3 (August 17)

Two of the three teams that have won both of their matches so far meet on Saturday in League Two – Exeter and Swindon. The bookies see it as too close to call (36% to 34.6%), but RED thinks Exeter (46.8%) are clear favourites. A 1-0 home win and a 1-1 draw are almost equally likely (12.4% and 12.6%).

There’s a Battle of the Roses clash at Valley Parade on Saturday as relegated Bradford face troubled Oldham. Bradford have started with two draws, Oldham with two defeats. Both could do with three points. RED gives Bradford a 43.2% chance, the bookies a 52.5% chance. The respective numbers for Oldham are 30.7% and 21.3%. A 1-1 draw is most likely at 12.4%, and two Bradford wins (1-0 and 2-1) are more likely than the most likely away win (0-1, 8.7%).

The bookmakers see the match at Valley Parade as the most lopsided, but RED thinks that back in Greater Manchester the most uneven match will take place, with new league entrants Salford expected to beat Port Vale – a 1-0 home win at 12.3%, and Salford 61.6% for the win.

League One, Round 3 (August 17)

League One reaches gameweek 3, except if you’re Bury, who have had yet another match postponed.

Sunderland face another blockbuster tie against another ex-Premier League team as they host 2008 FA Cup winners Portsmouth in an early kick-off. It promises to be a tight match, though with the Rokerites having the edge. RED gives Sunderland a 42.2% chance of winning, Portsmouth a 31.6% chance. A 1-0 home win comes in at a 10.5% chance, but a 1-1 draw is most likely.

This week, RED reports average bookmaker odds, as presented on Oddsportal.com and corrected for the overround. These are presented in the final two columns of the table. Recent research we’ve done suggests that RED can fare reasonably well against bookmakers. One thing to bear in mind about these comparisons is that they are what we loosely interpret as probabilities. They don’t correspond to any particular bookmaker, and as such are not recommendations for any kind of betting strategy. We provide them as it is interesting to compare RED’s forecasts with those of the bookmakers.

There are some substantial differences between RED and the bookmakers; RED much more strongly favours Wimbledon against Southend, whereas the bookmakers like the look of Blackpool against Oxford more than RED does. RED and the bookies are agreed that promoted Lincoln will defeat already-struggling Southend, and the bookies fancy Bolton even less (10%) than RED does (23%) at Tranmere.

Championship, R2 (10 Aug, 2019)

RED has reflected on the first round of the English Football League, updated its parameters, and spat out a set of outcome predictions for the second round of fixtures.

The computer’s local team, Reading FC, had a tough start to the season at home last weekend, losing 1-3 despite their opponents being down to ten men by the end. Unsurprisingly then the Model suggests they could struggle again this weekend on their travels to Hull, with just a 27% chance of a win. Hull are expected to score 1.5 goals to Reading’s 1.0. However, the most likely unconditional scoreline outcome is 1-1 (12.5%).

Derby County got off to the perfect start under the new leadership of Phillip Cocu, winning away against Huddersfield. They also now have the maverick coaching and playing skills of a certain Wayne Rooney to look forward to from January. On Saturday they welcome Swansea City to Pride Park Stadium. Derby are favourites, with a 46% chance of a win, though the most likely scoreline is 1-1 (12.5%). Derby fans can expect 1.4 goals from their team, but also that they will concede 1.1.

The table below gives the rest of the Model’s forecasts for Round 2 of the English Championship.

Championship forecasts R2 2019/20

The Model’s forecasts for R2 (10 Aug) of the English Championship are in the table below:

  • Expected Goals: the forecast average number of goals the Model expects for Home and Away teams
  • Outcome Probs: the predicted % chance of either a Home or Away win, with 100 minus these two numbers being the % chance of a draw.
  • Score Picks: the most likely forecast scoreline, as well as the most likely scoreline conditional on the most likely result (home win, draw or away win) occurring.
  • Home wins / Draws / Away wins: the predicted % chance of various potential scoreline outcomes of the match.