After the shocking scoreline between Atletico and Real Madrid last week, I thought it would be interesting to see how likely each scoreline is.
For this purpose, I extracted the scorelines from the last 7 seasons (2007 to 2014) for the Spanish, Italian, German, French and English top divisions.
The data comes from the data set described in the first football article (from Soccerway.com and Football-Data.co.uk). The code I used is on GitHub.
Let’s take a first look at the frequency of each result: home win, draw or away win.
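As a rough illustration of how such frequencies can be computed, here is a minimal Python sketch. It is not the code from the repository: the function name `outcome_frequencies` and the input format (a list of `(home_goals, away_goals)` pairs) are assumptions made for this example, and the sample matches are invented.

```python
from collections import Counter


def outcome_frequencies(matches):
    """Return the share of home wins, draws and away wins.

    `matches` is assumed to be an iterable of (home_goals, away_goals)
    pairs; this input format is hypothetical, chosen for illustration.
    """
    counts = Counter()
    for home_goals, away_goals in matches:
        if home_goals > away_goals:
            counts["home win"] += 1
        elif home_goals == away_goals:
            counts["draw"] += 1
        else:
            counts["away win"] += 1

    total = sum(counts.values())
    if total == 0:
        return {}
    # Counter returns 0 for missing keys, so every outcome is always present.
    return {k: counts[k] / total for k in ("home win", "draw", "away win")}


# Invented sample, only to show the call; not real results.
sample = [(2, 1), (0, 0), (1, 3), (4, 0)]
print(outcome_frequencies(sample))
```

Running the same function on the matches of each league separately would give the per-league breakdown.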
Whatever league we choose, the home side seems less likely to lose than the away side, and home wins are the most common outcome. All charts are similar, but we can identify some interesting trends:
After having seen the results, let’s have a look at each specific scoreline.
Scorelines featuring fewer than two goals are more common than exotic ones, and the more exotic the scoreline, the less likely it is.
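For readers who want to reproduce this kind of count, the sketch below shows one way to tally exact scorelines in Python. It is an assumption-laden example rather than the code used for this article: `scoreline_frequencies` is a hypothetical helper, the input is assumed to be a list of `(home_goals, away_goals)` pairs, and the sample matches are made up.

```python
from collections import Counter


def scoreline_frequencies(matches):
    """Return (scoreline, share) pairs, most frequent first.

    `matches` is assumed to be an iterable of (home_goals, away_goals)
    pairs; scorelines are written home-first, e.g. "2-1".
    """
    counts = Counter(f"{home}-{away}" for home, away in matches)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(score, n / total) for score, n in counts.most_common()]


# Invented sample, only to show the call; not real results.
sample = [(1, 0), (1, 1), (1, 0), (0, 4)]
for score, share in scoreline_frequencies(sample):
    print(f"{score}: {share:.0%}")
```

Keeping the home score first means that 1-0 and 0-1 are counted as different scorelines, which matters given the home advantage seen in the results.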
The distribution of scorelines also differs noticeably from one league to another. For example:
As you can see, I didn’t spend a lot of time on data interpretation and the current version of this article is more about providing the data than doing real analytics. I hope to add more insights in the future.