The development of the Scorecasting Model is only a small part of a wider research project we are undertaking. As part of this project, we have been working with user data from an online sports prediction and fantasy games platform called Superbru (over 1.5 million players). We have been analysing the judgement forecasts of players of Superbru’s English Premier League Predictor Game from 2017/18. We have now written our first research paper based on these data, which we have titled “Going with your Gut: The (In)accuracy of Forecast Revisions in a Football Score Prediction Game.”
The paper asks whether players of the prediction game who revise their scoreline picks as kick-off approaches end up with more accurate forecasts. At first glance, this ought to be a no-brainer: of course revising scoreline picks should improve forecast accuracy, because more information about an upcoming football match becomes available over time, including injuries to key players and even starting lineups. But there are several possible sources of bias which could affect scoreline tips (or judgement forecasts) and their revisions.
In this research, we find that football tipsters should stick with their gut instincts. This appears to be true quite generally when it comes to forecasting football match scores, even after accounting for differences in forecasting ability between individuals and differences in predictability between football matches. Revising a forecast (i.e. not sticking with their gut instincts) left the tipsters only 80% as likely to forecast a correct match scoreline compared with when they stuck with their first predictions. In those cases where game players did revise their forecasts, their initial scoreline picks were just as good on average as the picks of players who didn’t make any revisions. We also found evidence of how game players managed to do worse when they revised their forecasts: their revisions were excessive, perhaps because they overreacted to some new and salient piece of information about the upcoming match.
These results have some similarities with those found more widely in the academic literature on behavioural forecasting. We hope to use them to guide possible field experiments among communities of sports tipsters (forecasters). One interesting question is whether we can find ways to improve the power of crowds in forecasting.
We are also carrying out other research on how to evaluate the football score judgement forecasts made by tipsters, and how forecasting behaviour and evaluation should respond to differences or changes in the “rules of the game” being played by a sports tipster. In general, we find football matches particularly strange objects to forecast, as we have touched on in this blog before. This is mostly explained by the simple fact that frequently the most likely scoreline in any given football match will conflict with the most likely result. There are other situations where this can also be true, and where forecasts could have somewhat greater socio-economic importance.
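To make the scoreline-versus-result conflict concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that home and away goals are independent Poisson counts; this is a common textbook simplification and not necessarily how the Scorecasting Model works. The goal rates (1.6 and 1.1) and the function names are hypothetical.

```python
from math import exp, factorial


def poisson_pmf(k, rate):
    """Probability of exactly k goals when goals follow a Poisson(rate) law."""
    return exp(-rate) * rate ** k / factorial(k)


def scoreline_probs(home_rate, away_rate, max_goals=10):
    """Probability of each (home, away) scoreline up to max_goals per team.

    Assumption: the two teams' goal counts are independent.
    """
    return {
        (h, a): poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
        for h in range(max_goals + 1)
        for a in range(max_goals + 1)
    }


def result_probs(probs):
    """Collapse scoreline probabilities into home win / draw / away win."""
    totals = {"home win": 0.0, "draw": 0.0, "away win": 0.0}
    for (h, a), p in probs.items():
        if h > a:
            totals["home win"] += p
        elif h == a:
            totals["draw"] += p
        else:
            totals["away win"] += p
    return totals


# Hypothetical expected goals for a modest home favourite.
probs = scoreline_probs(home_rate=1.6, away_rate=1.1)
results = result_probs(probs)

best_scoreline = max(probs, key=probs.get)
best_result = max(results, key=results.get)

print("Most likely scoreline:", best_scoreline, round(probs[best_scoreline], 3))
print("Most likely result:", best_result, round(results[best_result], 3))
```

With these illustrative rates, the single most likely scoreline is 1-1, a draw, at roughly 12%, yet the most likely result is a home win at just under half. A tipster rewarded for exact scores and one rewarded for correct results would therefore sensibly make different picks for the same match.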