Ever since I started getting serious about my affection for soccer, I've been perplexed by the lack of useful statistical data out there. Graham (as you've hopefully read) has been doing rather exhaustive work over at his Chelsea blog, doing his darndest to make up for lost time.
I'm sure you've also noticed that we have several fans who have been posting various projection tools almost every week, and I have tried my hand at it a couple of times as well.
Most of these tools use goals for and goals against as the main predictor of future performance. To be sure, this is probably a solid way to do projections, but I really believe there has to be a better way.
As anyone who follows soccer knows, goals can be fickle. They tend to come in bunches and often at unpredictable intervals.
With that in mind, I've been trying to find a relationship between more predictable indicators and goals. I haven't finished my research, but I'm starting to believe that shots on goal may end up being a better predictor of future goals than anything else we're currently using.
After looking at the seasonal data from the years 2004-present, I've found some pretty interesting information:
- Shots of any kind turn into goals about 11.4 percent of the time. The lowest of any year I researched was 10.5 percent in 2004 (which was an extreme outlier) and the highest was 11.9.
- Shots on goal turn into goals about 25.8 percent of the time. This actually has a correlation value of about .711, which I'm assured (I'm really not a statistician, nor do I pretend to be one) means there's some legitimate predictive value there.
- Using this information, I was able to come up with predicted goals for/against. The median and average differences between actual goals scored and predicted goals scored were both less than one goal.
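For anyone who wants to play along at home, here is a minimal sketch of the arithmetic, assuming the prediction is nothing more than shots on goal multiplied by the 25.8 percent conversion rate. The function names and the sample shot totals are made up for illustration; they are not real team data.

```python
# Rates quoted above, from the 2004-present seasonal data.
GOALS_PER_SHOT = 0.114          # goals per shot of any kind
GOALS_PER_SHOT_ON_GOAL = 0.258  # goals per shot on goal


def predicted_goals(shots_on_goal, rate=GOALS_PER_SHOT_ON_GOAL):
    """Expected goals from a count of shots on goal (simple multiplication)."""
    return shots_on_goal * rate


def implied_on_goal_share():
    """Share of all shots that end up on goal, implied by the two rates."""
    return GOALS_PER_SHOT / GOALS_PER_SHOT_ON_GOAL


# Hypothetical team: 120 shots on goal for, 100 against (made-up numbers).
goals_for = predicted_goals(120)
goals_against = predicted_goals(100)
print(f"predicted goals for: {goals_for:.2f}")
print(f"predicted goals against: {goals_against:.2f}")
print(f"predicted differential: {goals_for - goals_against:+.2f}")
print(f"implied share of shots on goal: {implied_on_goal_share():.1%}")
```

Dividing the two rates suggests that roughly 44 percent of all shots end up on goal, which serves as a handy sanity check on the numbers.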
Admittedly, there's still a lot of refinement that needs to go on here to make it really valuable, not the least of which is coming up with a legitimate way to translate goal differential into points won. I know this information is out there, but I've yet to see it applied to MLS. Despite these obvious holes, I figure I'm at least at the point where it's worth sharing how this model sees the Sounders finishing the season.
The Sounders, as has been documented often here, have not been doing a great job of getting shots on goal this season. Still, they've been outshooting opponents by a decent margin, and that differential has made up for the fact that they put about 2 percent fewer of their shots on goal than their opponents do.
Using this model, the Sounders would outscore opponents 11.48-10.74. That's only the sixth-best margin, but it is better than the Quakes (rounded-off differential of -4), Fire (-4), Toronto FC (-3) and Rapids (-1). With the Sounders merely having to keep pace with two of those teams to make the playoffs, that would seem to bode well. For whatever it's worth, the Sounders are currently underperforming their predicted goal differential by six goals.
We can tweak our model a little bit to recognize the Sounders' change in form (during their six-match unbeaten streak the Sounders have been getting 54 percent of their shots on goal). If we assume they can manage to play at the historic league average of putting 44 percent of their shots on goal for the rest of the season, their predicted goals scored would rise to 12.88, giving them a three-goal net advantage over Colorado.
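Here is a rough sketch of how that tweak works, assuming predicted goals are total shots times the share put on goal times the 25.8 percent conversion rate. The shot total below is a placeholder, not the Sounders' actual number, and the 40 percent line is included only for comparison.

```python
CONVERSION = 0.258  # goals per shot on goal, from the seasonal data above


def projected_goals(total_shots, on_goal_share, conversion=CONVERSION):
    """Goals expected from total shots, given the share that land on goal."""
    return total_shots * on_goal_share * conversion


# Hypothetical: 100 shots over the remaining matches (placeholder value).
shots = 100
for share in (0.40, 0.44, 0.54):
    goals = projected_goals(shots, share)
    print(f"{share:.0%} of shots on goal -> {goals:.2f} predicted goals")
```

The point is simply that the projection scales in a straight line with the on-goal share, so a few percentage points make a noticeable difference over nine matches.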
Although we haven't worked the model all the way out, we do know that an even goal differential would predict a 3-3-3 record over the Sounders' final nine matches. I think we can safely assume our predicted goal differential would translate to about 12-15 points, giving the Sounders 41-44 points for the season or just enough to make the playoffs.
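The points math here is easy to check, assuming the standard three points for a win and one for a draw. Note that the current total of 29 points is not stated anywhere above; it is only implied by subtracting the 12-15 projected points from the 41-44 season totals.

```python
def points(wins, draws):
    """League points: three for a win, one for a draw, none for a loss."""
    return 3 * wins + draws


# A 3-3-3 record over nine matches (three wins, three losses, three draws).
even_record = points(wins=3, draws=3)
print(even_record)

# Implied current total (41 - 12 = 29); an inference, not a stated figure.
CURRENT_POINTS = 29
for projected in (12, 15):
    print(f"{projected} more points -> {CURRENT_POINTS + projected} for the season")
```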
It's also worth noting that the team that comes out best in the model is FC Dallas, which has a predicted goal differential of about +4. The team that comes out worst is New England at -6. The team that is most overperforming its predicted goal differential is Real Salt Lake at +19. The team that is most underperforming is DC United at -16.
Like I said, this model still needs a lot of tweaking and is far from finished.
I still don't have the data to prove that SOG is any more stable game-to-game than goals. I also don't have the data to prove that SOG is a better predictor of future goals than past goals are.
One other piece of data that I'm currently missing is shots-on-goal allowed. On this issue, I'm of two minds. I've long believed that offensive shots on goal are the product of skillful players. I'm less convinced this is the case when it comes to defense. I recognize that a certain number of shots on goal can be prevented by strong defense, but I find it plausible that preventing shots on goal is also about luck. Obviously, more data is needed to figure this out.
Anyway, Dave and I believe there is some promising information in all this research. If nothing else, we think we can derive individual +/- ratings off shots that are more valuable than simply using goal differential. We'll see where this goes.