Math Formulas (via trindade.joao under the Creative Commons license)
One topic that we've covered for years here on Sounder at Heart is the outsized impact of luck on the result of a single game of soccer. In our series on parity in the 2010 offseason we pointed out that in terms of game parity (that is, the likelihood that a worse team will beat a better team due to luck) soccer is rivaled among major sports only by baseball. Major League Baseball overcomes this effect through sheer volume: with 162 games a season, even a large luck factor in a single game evens out and is overcome by skill over many repetitions. But soccer games are only played once or twice a week, and it can take a while for those luck factors to even out. Consider that the St. Louis Cardinals, who would go on to win the World Series, lost 8 out of 11 games in one stretch in August. Now imagine if those 11 games were a third of their season.
To account for that, we like to look at stats that are more numerous and more consistent than goals to get a sense for how teams are really playing. Thanks to Opta's coverage of the league over the last two seasons we have a wealth of statistics to examine, so I decided to draw up a power rankings system that's based not on the standings or the subjective eye but on those stats that correlate well with winning over the long run.
After a lot of stat grinding, I've found a good set of stats that don't overlap too much but correlate well with goals over time. They're shots in the box, duels, and passing percentage in the offensive third. These are all stats that correlate well with winning (between 0.5 and 0.6 each over the course of a season) and they're nicely divided over different areas of the game. Duels Won will account for teams excellent in defense, Offensive Pass Pct accounts for teams excellent in playmaking, and Shots in the Box accounts for teams good in the attack.
Note that we've tracked net shots for a while now, but it turns out that net shots in the box correlates even better with results and helps discount teams that generate a lot of low-danger shots when they're pressing for an equalizer (or if they just like to do that, like Kansas City in 2009).
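For anyone who wants to check correlations like these themselves, here's a minimal sketch of a Pearson correlation in plain Python. The function name `pearson` and the five-team sample numbers are made up for illustration (they are not Opta data), and I'm assuming Pearson's coefficient is the right measure here.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and of squared deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Hypothetical season averages for five teams (illustrative only, not real data)
net_shots_in_box = [2.5, 1.0, 0.0, -1.0, -2.5]
points_per_game = [1.9, 1.2, 1.5, 1.1, 0.8]
r = pearson(net_shots_in_box, points_per_game)
```

A value of `r` near 1 means the stat rises and falls with results, a value near 0 means it tells you very little.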
So my methodology is to calculate the net value of each of those stats (team value minus the opponent's value), weight it for recency using a linear weighting (formally, if a team has played n games, the first game is weighted as 1/n, the second as 2/n, and so on until the most recent game is weighted n/n, or 1), and normalize the values. In the end, a team gets 100 points for being first in all 3 stats, down to 0 points for being last in all 3.
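The steps above can be sketched roughly as follows. The linear weights (game i of n gets i/n) come straight from the description, but the rest is an assumption for illustration: I'm dividing by the sum of the weights to get a weighted average, using min-max scaling across teams as the normalization, and averaging the three scaled stats. The function names, the stat keys and the three-team sample data are all hypothetical.

```python
def recency_weighted_average(net_values):
    """Weighted mean of per-game net values (oldest first).

    Game i of n gets weight i/n, so the latest game has weight 1.
    Dividing by the sum of weights is an assumption, not from the article.
    """
    n = len(net_values)
    if n == 0:
        raise ValueError("need at least one game")
    weights = [(i + 1) / n for i in range(n)]
    return sum(w * v for w, v in zip(weights, net_values)) / sum(weights)

def min_max(values):
    """Scale values to 0..1 (assumed normalization); ties everywhere map to 0.5."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def power_scores(per_game_nets):
    """Map {team: {stat: [net value per game]}} to {team: score from 0 to 100}."""
    teams = list(per_game_nets)
    stats = list(per_game_nets[teams[0]])  # assumes every team has the same stats
    totals = {team: 0.0 for team in teams}
    for stat in stats:
        averages = [recency_weighted_average(per_game_nets[t][stat]) for t in teams]
        for team, scaled in zip(teams, min_max(averages)):
            totals[team] += scaled
    # Average the scaled stats and stretch to a 0..100 scale
    return {team: 100 * totals[team] / len(stats) for team in teams}

# Hypothetical two-game sample for three made-up teams (not real data)
sample = {
    "Team A": {"duels": [2, 4], "shots_in_box": [1, 3], "off_pass_pct": [5, 5]},
    "Team B": {"duels": [0, 0], "shots_in_box": [0, 0], "off_pass_pct": [0, 0]},
    "Team C": {"duels": [-2, -4], "shots_in_box": [-1, -3], "off_pass_pct": [-5, -5]},
}
scores = power_scores(sample)
```

With this scaling, a team that leads all three stats lands at 100 and a team that trails in all three lands at 0, which matches the endpoints described above. Everything in between depends on the normalization you pick.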
The results, including some surprises, are below the jump.
|Team||Net Duels||Net Shots in the Box||Net Offensive Pass Pct||Points|
|Sporting Kansas City||4.72||1.9||6.47||85|
|San Jose Earthquakes||1.3||3.69||5.52||83|
|Seattle Sounders FC||3.16||3.47||2.66||82|
|Real Salt Lake||1.3||-0.07||1.95||59|
|Los Angeles Galaxy||-2.98||0||1.71||47|
|New England Revolution||0.3||-1.85||-0.08||44|
|New York Red Bulls||-1.48||-2.87||-0.42||34|
The results don't differ wildly from the standings or other power rankings out there, which is good. If I came up with a methodology that said that Chivas USA was the best team in the league, I'd have to throw it away. But despite the general similarity to other rankings, there are a few significant differences.
First, Toronto FC is not last. Despite losing all 8 of their league games, having the worst goals against in the league, and being tied for the lowest goals for, they're actually statistically just a below average team. And in fact they're the third best team in the league at net shots in the box, with almost 3 more shots than their opponent per game! It's possible that their accuracy on those shots (and the accuracy they allow their opponents) is uniquely bad. It's also possible they've been really unlucky. It's worth noting here that the same roster beat the LA Galaxy in CCL competition and beat the Impact in the Canadian Championship, so there's some evidence that this roster can win games. It'll be interesting to see whether they regress during the season or whether they'll continue to get such poor results despite decent totals in the stats we're tracking.
Another interesting divergence is Columbus: a middling Eastern Conference team in the standings and a basement team in most power rankings, yet statistically we have them as the 5th best team in the league. Perhaps their defeat of Dallas this weekend is some validation for that result, though we also show FC Dallas as the worst team in the league. That's a big drop for last year's Supporters' Shield contender.
Another team we rank much lower than their results would suggest is New York. They're currently at the top of the East with 4 wins in a row, and yet their stats are pretty poor. In fact, they're the worst team in the league at net shots in the box. Thierry Henry and Kenny Cooper have been pretty lethal on the shots they do get, but is that sustainable over a full season? If not, and if they keep giving up so many shots to their opponents, their reign at the top of the East is likely to be short.
We'll keep track of these results week to week and see if they have any predictive power. If not, we'll have to fix up the model for teams that can manage to win or lose consistently despite the stats saying they should do otherwise. And if they do have predictive power. . . Vegas, baby. Vegas.