More fun(?) with numbers

(Crossposted to Seattle Soccer Scene)

At the risk of turning Sounder at Heart into nerd central by adding onto malcontentjake's and his own contributions, Jeremiah invited me to add some detail to the simulated MLS season results I dropped into a comment.

At its core, the idea is to predict the final standings by pseudo-randomly simulating out the remaining games in the season. I say pseudo, because the idea is to predict as accurately as possible the likely outcome of each game. But it's random, because game results are sensitive to the tiniest factors (like, for example, Terry Vaughn's obscene decision to award a penalty after Jason Yeisley fell all over himself in injury time in Dallas) so it's better to take a large number of random results and average out the outcomes. Smarter people than me call these Monte Carlo experiments and they're frequently used for sporting events, which have a lot of randomness.

It has some benefits over more formulaic models (like PPG) in that it takes into account the results of specific games. For example, a game between two playoff contenders might be critical to the final standings, but it's impossible for both teams to win: a fact that a simulation will capture, while a PPG model will assign both teams their average points. On the other hand, the results tend not to be dramatically different from PPG, and the simulation can be a lot of work and headaches to set up.
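To make the PPG comparison concrete, here is a minimal Python sketch. The function names, point totals, and win/draw probabilities are all made up for illustration; they are not taken from my actual model. It shows that a PPG projection hands both contenders their average points for a head-to-head game, while a simulated game can only split the points in ways a real game allows.

```python
import random

def ppg_projection(points, played, remaining):
    """Project final points by extending the current points-per-game rate."""
    return points + (points / played) * remaining

def simulate_head_to_head(points_a, points_b, p_a_win, p_draw, rng):
    """Play one remaining game between A and B and return both final totals.

    p_a_win and p_draw are hypothetical probabilities; B wins otherwise.
    """
    roll = rng.random()
    if roll < p_a_win:
        return points_a + 3, points_b        # A wins
    if roll < p_a_win + p_draw:
        return points_a + 1, points_b + 1    # draw
    return points_a, points_b + 3            # B wins

if __name__ == "__main__":
    # Two hypothetical teams on 40 points after 20 games, one game left,
    # and that game is against each other.
    print("PPG:", ppg_projection(40, 20, 1), ppg_projection(40, 20, 1))
    rng = random.Random(2010)
    print("Simulated:", simulate_head_to_head(40, 40, 0.4, 0.3, rng))
```

PPG credits each team with 2 more points, 4 in total, which no single game can produce. The simulated game always hands out 3 points (a win) or 2 (a draw) between the two teams.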

Ken at Sports Club Stats does the same sort of work, and I think it's required reading for any sports fan, but I had a couple of complaints about the SCS methodology. First, he predicts outcomes based on previous winning percentage, whereas I prefer to predict point totals (or in MLS, goal totals) and determine the wins from that. In a winning percentage model, a team that wins 2-1 and a team that wins 5-0 are given the same credit, when the second team is almost certainly more dangerous. Also, it's not entirely clear what to do with ties. Also also, home-field advantage is a leaguewide constant, when in fact each team gets a slightly different advantage (or lack thereof) at home. In a goals model, you capture scoreline domination, you have no problem with ties, you can derive a unique home-field advantage for each team, and it's easier to differentiate a team's defensive contribution (its ability to prevent goals) from its offensive contribution (its ability to score goals). Another problem with SCS is that it doesn't weight recent results more heavily. Teams change over time, and no team has changed more dramatically than the Sounders in the last few weeks, so it wouldn't be appropriate to weight their early season results equally with their recent results.

So I decided to turn my complaints into lemonade and set up my own version, after having done something similar for MLB in the past. Without going into ridiculous detail, the fact that soccer scores are clustered around 1 and 0 and are not normally distributed prevents an easy Gaussian (mean and standard deviation) random score generation that might work for the NBA or baseball. So instead I use a sampling algorithm. Imagine that I write each team's goal results for each game on slips of paper (so LA would get one 4, a 3, and a bunch of 2s, 1s, and 0s) and divide them into piles by home and away and by goals allowed and goals scored. Then for each game I take the home team's home scoring pile and dump it into a bucket, dump the away team's away goals-allowed pile into the same bucket, swizzle it all around, and pull a number. Voila, that's the home team's simulated score for that game. Obviously it's an overly simplified way to predict a single game's results, and I wouldn't recommend running to Vegas with it, but over a large number of simulations it averages into dividing the good teams from the bad.
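The slips-of-paper idea can be sketched in a few lines of Python. This is an illustration only: the dictionary keys and the sample goal lists are hypothetical, and I'm assuming the away team's score is drawn the mirror-image way (its away scoring pile plus the home team's home goals-allowed pile).

```python
import random

def draw_goals(scored_pile, allowed_pile, rng):
    """Pool one team's goals-scored slips with the opponent's
    goals-allowed slips and pull one slip at random."""
    bucket = list(scored_pile) + list(allowed_pile)
    return rng.choice(bucket)

def simulate_game(home, away, rng):
    """Return (home_goals, away_goals) for one simulated game."""
    # Home score: home team's home scoring + away team's away goals allowed.
    home_goals = draw_goals(home["home_scored"], away["away_allowed"], rng)
    # Away score: assumed to be the mirror image of the home draw.
    away_goals = draw_goals(away["away_scored"], home["home_allowed"], rng)
    return home_goals, away_goals

if __name__ == "__main__":
    # Made-up goal histories, one number per game played.
    home = {"home_scored": [4, 3, 2, 2, 1, 0], "home_allowed": [0, 0, 1, 1, 0, 2]}
    away = {"away_scored": [0, 1, 1, 0, 2], "away_allowed": [2, 1, 3, 0, 1]}
    rng = random.Random(7)
    print(simulate_game(home, away, rng))
```

Because every draw comes straight from real scorelines, the simulated scores keep the lumpy, mostly-0-and-1 shape of actual soccer results without having to fit a distribution.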

Once the basic process is in place, you can introduce a couple of weighting factors. First, you could give the offense's results more weight than the defense's results (by giving them twice as many slips of paper in the bucket), depending on how much offensive vs defensive factors contribute to scoring output. Barring running a full regression to figure out the weight, assuming that they're evenly weighted is a fine approximation, and that's what I do. Second, you can introduce a recency bias (by adding duplicates of the slips of paper for recent games). I currently bias it on a sliding scale, so that the first game of the season gets a weight of 1 and the most recent game gets a weight of 2 (and the middle game therefore gets 1.5, etc). Again, it would take some real math to figure out the best weight for more recent results, but double seems fair and, if anything, a little conservative.
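Here is a rough Python sketch of both weighting ideas. Instead of physically duplicating slips, it passes weights to `random.choices`, which draws each slip in proportion to its weight. The function names and the `offense_weight` parameter are hypothetical, and the sketch assumes each pile is listed oldest game first.

```python
import random

def recency_weights(n_games, oldest=1.0, newest=2.0):
    """Linear sliding scale: the first game gets `oldest`, the latest `newest`."""
    if n_games == 1:
        return [newest]
    step = (newest - oldest) / (n_games - 1)
    return [oldest + step * i for i in range(n_games)]

def weighted_draw(scored, allowed, rng, offense_weight=1.0):
    """Pull one slip from the combined bucket, favoring recent games.

    offense_weight=1.0 treats offense and defense evenly; 2.0 would be
    like giving the offense twice as many slips in the bucket.
    """
    slips = list(scored) + list(allowed)
    weights = [w * offense_weight for w in recency_weights(len(scored))]
    weights += recency_weights(len(allowed))
    return rng.choices(slips, weights=weights, k=1)[0]

if __name__ == "__main__":
    print(recency_weights(5))   # [1.0, 1.25, 1.5, 1.75, 2.0]
    rng = random.Random(3)
    # Made-up piles, oldest game first.
    print(weighted_draw([0, 1, 1, 2, 3], [2, 1, 1, 0, 0], rng))
```

With a five-game history the middle game gets a weight of 1.5, matching the sliding scale described above.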

So you pull your slips over and over for each game, then calculate the standings. Then you do it again. Then about a hundred thousand more times — fortunately computers don't complain about doing repetitive tasks like this — and you average out the final standings from each of those runs. And you get something like this:

Team          Avg Points   Avg Position   Playoff Probability
Galaxy        61.7525       1.1374        1.0
Crew          51.7989       2.892         0.9981
RSL           51.1936       3.1877        0.9942
FC Dallas     47.8517       4.495         0.964
Red Bulls     44.262        6.176         0.8279
Earthquakes   42.2922       6.9051        0.7442
Toronto FC    40.6291       7.964         0.6009
Sounders FC   39.8713       8.3891        0.5404
Rapids        39.9017       8.7607        0.4782
Fire          36.7126      10.5494        0.2603
Dynamo        35.3238      11.0773        0.1823
Chivas        35.6175      11.2548        0.1869
Wizards       33.714       12.2774        0.0835
Union         33.5659      12.2855        0.0882
Revolution    31.6692      13.0951        0.0503
DC United     24.6577      15.5535        0.0006

Avg Pts for Playoffs: 40.3896
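For anyone who wants to try the repeat-and-average step themselves, here is a minimal Python sketch of the outer loop. It is not my actual code: `play_game` stands in for whatever per-game score generator you use, `playoff_spots` is a plain top-N cutoff that ignores conference rules, and ties in the standings are broken at random rather than by the real MLS tiebreakers.

```python
import random
from collections import defaultdict

def points_for(goals_for, goals_against):
    """Standard league points: 3 for a win, 1 for a draw, 0 for a loss."""
    if goals_for > goals_against:
        return 3
    if goals_for == goals_against:
        return 1
    return 0

def run_simulations(current_points, fixtures, play_game, n_runs,
                    playoff_spots, seed=0):
    """Return {team: (avg points, avg position, playoff fraction)}."""
    rng = random.Random(seed)
    teams = list(current_points)
    total_points = defaultdict(float)
    total_position = defaultdict(float)
    playoff_count = defaultdict(int)
    for _ in range(n_runs):
        points = dict(current_points)           # start from today's table
        for home, away in fixtures:             # play out the remaining games
            home_goals, away_goals = play_game(home, away, rng)
            points[home] += points_for(home_goals, away_goals)
            points[away] += points_for(away_goals, home_goals)
        order = teams[:]
        rng.shuffle(order)                      # random tiebreak (assumption)
        order.sort(key=lambda t: points[t], reverse=True)
        for position, team in enumerate(order, start=1):
            total_points[team] += points[team]
            total_position[team] += position
            if position <= playoff_spots:
                playoff_count[team] += 1
    return {t: (total_points[t] / n_runs,
                total_position[t] / n_runs,
                playoff_count[t] / n_runs) for t in teams}

if __name__ == "__main__":
    # Toy league with made-up points and a coin-flip-ish score generator.
    table = {"A": 30, "B": 28, "C": 27}
    games = [("A", "B"), ("B", "C"), ("C", "A")]
    toy_game = lambda home, away, rng: (rng.choice([0, 1, 1, 2]),
                                        rng.choice([0, 0, 1, 2]))
    for team, row in run_simulations(table, games, toy_game, 10000, 2).items():
        print(team, row)
```

The three numbers returned per team correspond to the three columns in the table: average points, average finishing position, and the fraction of runs that ended in a playoff spot.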

FanPosts only represent the opinions of the poster, not of Sounder at Heart.


Comments

Man, this is great stuff

I’m honestly not mathematically inclined enough to weigh in on its merits, but I do love this stuff.

Because if it's not Love | Then it's the bomb ... | That will bring us together

by Jeremiah Oshan on Aug 1, 2010 6:24 PM PDT

too funny that you don't want to turn this into a geek place

It’s way too late for that

I am not a Supporter | I am not a Fan | I am a Sounder
Sounder At Heart

by Dave Clark on Aug 1, 2010 8:03 PM PDT via mobile

That ship has not only sailed

But it’s made several trips back and forth, and narrowly avoided an iceberg here and there.

by CarlosT on Aug 1, 2010 10:47 PM PDT

Honestly, this is why I read these kinds of blogs

If I wanted to read sports colloquialisms, I’d stick to newspapers (with kind exception to Meyers). I read this blog (and USSM, LL, and FG) because I like the analytical side of sports, and I’m not smart enough to figure this kind of stuff out on my own.

by J Sep on Aug 1, 2010 10:58 PM PDT

I am all geeked up

Thanks for the read. Hopefully, you will continue to update it when more data comes in every week.

by Coug1990 on Aug 1, 2010 10:53 PM PDT

Dude....

Thanks a million, now I’ve got a headache!

by swansuite on Aug 2, 2010 7:32 AM PDT

The first thing that jumps out at me

compared to what Jeremiah did earlier, is the massive difference in the Union’s position. This projection hardly gives the Union a chance. Very interesting stuff.

by agtk on Aug 3, 2010 8:06 AM PDT

The thing with mine...

Is that it’s more an illustration than accurate portrayal of the whole season. The Union are hot right now, which my little exercise really shows. The chances of them keeping it up all year are admittedly slim, which this illustrates much more effectively.

Because if it's not Love | Then it's the bomb ... | That will bring us together

by Jeremiah Oshan on Aug 3, 2010 8:10 AM PDT
