Friday, June 18, 2010

Fun with World Cup Soccer Statistics

As a teenager I was curious about which minutes in a soccer game are the most likely to have goals scored. I wrote a computer program that kept a database of all the goals scored in the Israeli soccer league for an entire year. I diligently went through the sports sections of the newspapers and entered every goal and the minute in which it was scored (feeling very mature that I was able to ignore my strong feelings about some of these goals). I calculated the statistics I was looking for, and the answer was: minute 65 was the most goal-rich minute.
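
For the curious, a tally like this takes only a few lines today. The sketch below is not the original program; the sample goals, player names and function names are all made up for illustration, and it simply counts goals per minute and picks the busiest one (earliest minute wins a tie).

```python
from collections import Counter

# Hypothetical sample: (scorer, minute) pairs, invented for illustration.
# These are NOT real match data.
goals = [
    ("Player A", 65),
    ("Player B", 12),
    ("Player C", 65),
    ("Player D", 88),
    ("Player A", 12),
    ("Player E", 65),
]


def goals_per_minute(goals):
    """Count how many goals were scored in each minute."""
    return Counter(minute for _, minute in goals)


def richest_minute(goals):
    """Return (minute, count) for the minute with the most goals.

    Ties are broken in favor of the earliest minute.
    """
    counts = goals_per_minute(goals)
    return min(counts.items(), key=lambda mc: (-mc[1], mc[0]))


minute, count = richest_minute(goals)
print(f"Most goal-rich minute: {minute} ({count} goals)")
```

With real data, the same `Counter` could feed a histogram of goals by minute.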

Now, a few decades later, as the 2010 World Cup begins, I find myself asking the same question, or rather, marveling at how easy it is to capture the data, compute the statistics and share them with everyone in the world.

Using Google Fusion Tables, the tool developed by my team at Google, I created the visualization below. We're updating the underlying table as more goals are scored, so you'll always see the latest stats.

But that's not the end of it. Fusion Tables is a tool for data integration. We found some data on and joined it with our own table, and then created more interesting visualizations.

This one shows the height of the goal-scoring players. Read into it what you want.

This one shows the distribution of goal scoring among defenders, forwards and midfielders.

And finally, this visualization shows the clubs at which the goal scorers play.
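
The join behind these last three visualizations can be sketched in plain Python. This is only an illustration of the idea, not how Fusion Tables does it: the rows, the column names (`scorer`, `name`, `position`, `height_cm`) and the helper function are all assumptions invented for the example. It matches each goal to its scorer's record by name and then counts goals by position.

```python
from collections import Counter

# Hypothetical goals table, invented for illustration.
goals = [
    {"scorer": "Player A", "minute": 65},
    {"scorer": "Player B", "minute": 12},
    {"scorer": "Player A", "minute": 80},
    {"scorer": "Player Z", "minute": 30},  # no matching player record
]

# Hypothetical player table from a second source, also invented.
players = [
    {"name": "Player A", "position": "Forward", "height_cm": 180},
    {"name": "Player B", "position": "Defender", "height_cm": 188},
]


def join_goals_with_players(goals, players):
    """Inner-join goals to players on scorer name.

    Goals whose scorer has no player record are dropped.
    """
    by_name = {p["name"]: p for p in players}
    joined = []
    for goal in goals:
        player = by_name.get(goal["scorer"])
        if player is not None:
            joined.append({**goal, **player})
    return joined


joined = join_goals_with_players(goals, players)

# Once joined, each goal carries the scorer's attributes,
# so grouping by position (or height, or club) is a one-liner.
goals_by_position = Counter(row["position"] for row in joined)
print(dict(goals_by_position))
```

Joining on names is fragile in practice (spelling variants, players who share a name), so a real table would want a more reliable key.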