At almost exactly this time last year, inspired by the tremendous amount of roster churn the Sounders were going through due to injuries (sound familiar?), I wrote a stats article examining the correlation between roster stability and winning. Read it if you have a free five minutes, but the short version is that yes, there was a correlation. Stable rosters tend to win more.
But at the time I wasn't particularly happy with my formula for roster churn. I actually used two different ones. One was just the number of unique starters, but that's not a great fit for what I was looking for. In particular, it doesn't count how often those guys at the end of the bench are playing. If they start once, they get counted the same as if they start 8 matches. And the value tended to approach the total MLS roster size for all teams as the season wore on and even deep bench guys got an occasional start here or there.
The other was the average number of starters per match that didn't play in the previous match. That's more churny, but punishes teams that have a small pack of reserve players they rotate in regularly. If we're trying to measure how out of sorts a team is because players are unfamiliar with each other, you wouldn't include players who actually play often together. In particular this punished RSL, who had a high churn value for just this reason.
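For anyone who wants to see the two older measures side by side, here's a rough Python sketch. The function names and the tiny three-match lineup list are made up for illustration, and I'm assuming "didn't play in the previous match" means "didn't start it", since starts are the only data in play here.

```python
def unique_starters(lineups):
    """Count distinct players who started at least one match."""
    starters = set()
    for lineup in lineups:
        starters |= lineup
    return len(starters)


def avg_new_starters(lineups):
    """Average number of starters per match who did not start the previous match.

    The first match has no previous match, so it is left out of the average
    (an assumption of this sketch).
    """
    changes = [len(curr - prev) for prev, curr in zip(lineups, lineups[1:])]
    return sum(changes) / len(changes) if changes else 0.0


# Hypothetical data: players are numbered 1-12 and one reserve (12)
# rotates in for player 11 in the second match only.
lineups = [
    set(range(1, 12)),          # players 1-11
    set(range(1, 11)) | {12},   # 12 replaces 11
    set(range(1, 12)),          # 11 returns
]

print(unique_starters(lineups))   # 12
print(avg_new_starters(lineups))  # 1.0
```

Note how a single familiar reserve rotating in and out registers as a change every match under the second measure, which is the problem described above.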
So after a year to think about it, I've come up with a metric I prefer, though it's a little more complicated to calculate. If you imagine player start data as a curve with the most common starters on the left and the least common starters on the right, it will (by definition) trend down. But on a team with a small group of players that start often, the curve will be narrow and steep as it drops quickly from the regular starters to the lesser-used subs. On a team with a bunch of players mixing into the starting rotation regularly, the curve will be flat and stretch out further to the right. Essentially, you want to calculate how much the 'mass' of the curve is concentrated among the regular starters.
Fortunately, there's a formula for just such a thing: the centroid of the curve. The short version is it's where the center of mass of the shape of the curve would be (assuming it had uniform thickness). The wider the shape, the further out to the right the center would be. So we just want to calculate the horizontal value of the centroid. Fortunately, our 'curves' aren't really curves, they're a bunch of discrete values next to each other, so we don't need to use integrals. We line the players up from most starts to least, multiply each player's rank by his starts, add those products up, and divide that total by the total number of starts. Voila: the horizontal coordinate of the centroid.
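The calculation just described fits in a few lines of Python. This is only a sketch: `centroid_x` is a name I've made up, and the two start lists are invented 10-match seasons (110 total starts each), not real team data.

```python
def centroid_x(starts):
    """Horizontal centroid of a starts-per-player chart.

    Players are ranked from most starts (rank 1) to fewest; the result is
    sum(rank * starts) / sum(starts).
    """
    ranked = sorted(starts, reverse=True)
    total = sum(ranked)
    if total == 0:
        raise ValueError("need at least one start")
    weighted = sum(rank * s for rank, s in enumerate(ranked, start=1))
    return weighted / total


# Hypothetical 10-match seasons, 11 starters per match = 110 starts each.
steep = [10] * 10 + [8, 2]           # settled XI, one spot shared
flat = [10] * 6 + [7] * 5 + [3] * 5  # five spots rotating among ten players

print(centroid_x(steep))  # about 6.02
print(centroid_x(flat))   # about 6.68
```

The flatter, wider chart pushes the centroid further to the right, which is exactly the behavior we want from a churn measure.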
Here are some images to give you an idea of what I mean. By this measure, the highest full season churn in the three years of data I have is Toronto FC's 2011 season. Here's the chart of their players, from most started to least:
You can see that it's wide (there are a lot of players) and flat: the 13th-18th players are still getting a lot of starts. In comparison, here's the lowest full-season churn I found, Sporting Kansas City last season:
It's comparatively short and steep. After the first 11, the number of starts quickly drops down to 0. To give you an idea of how the data differs, Toronto's 22nd most frequent starter was Eric Avila, who still started 8 matches, nearly a quarter of the season. Sporting's 22nd most frequent starter was... nobody. They only used 21 starters last season. Only 19 got more than 1 start and only 16 got as many as Avila's 8. So that looks like a pretty solid contrast in roster churn and closer to what I was trying to get at originally.
Note that the minimum possible centroid coordinate is 6. That's what it would be if you only had 11 starters and they all started every match. In that case, the 'curve' would just be a rectangle 11 players wide. For that reason, I subtract 6 from the values to get the final churn value. Zero means there's no roster rotation. The theoretical maximum coordinate in a 34-match season would be 187.5 (the average rank of 374 one-time starters) if you started 11 new players every match, but obviously that'll never come close to happening. In practice the churn values almost never go over 5 (from a coordinate value of 11). Sporting's low number last season was 2.17. Toronto's absurd score in 2011 was 6.82. They tend to go up as the season goes on (especially as the midseason transfer window changes the rosters). Counting the partial 2013 season, the lowest value in the data is this season's Houston Dynamo, which is 1.6 and looks like this:
That's pretty close to a rectangle. The Dynamo have been remarkably consistent and fortunate enough to mostly avoid injuries.
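The two reference points for the churn scale (a team with no rotation at all, and a team that starts 11 brand new players every match) can be checked with a short Python sketch. The names `churn_value`, `BASELINE` and friends are my own, and it assumes 11 starters per match and a 34-match season.

```python
STARTERS_PER_MATCH = 11
MATCHES = 34

# Centroid of a rectangle 11 players wide: the average of ranks 1..11.
BASELINE = (STARTERS_PER_MATCH + 1) / 2  # 6.0


def churn_value(starts):
    """Centroid of the ranked starts chart, minus the no-rotation baseline."""
    ranked = sorted(starts, reverse=True)
    centroid = sum(r * s for r, s in enumerate(ranked, start=1)) / sum(ranked)
    return centroid - BASELINE


# Same 11 players start every match: a perfect rectangle.
no_rotation = [MATCHES] * STARTERS_PER_MATCH

# 11 new players every match: 374 players with one start each.
all_new = [1] * (MATCHES * STARTERS_PER_MATCH)

print(churn_value(no_rotation))  # 0.0
print(churn_value(all_new))      # 181.5 (a centroid coordinate of 187.5)
```

Real seasons sit very close to the bottom of that range, which is why a difference of a point or two in churn is a big deal.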
So now that I have a numeric value I'm more comfortable with, does it actually mean anything for performance? Here's a scatterplot correlating a team's seasonal churn value with how many points per game they earned in that season:
As you can see, there's a distinct shape to the plot. As the churn goes up, the number of points goes down. Note that the causality could go in either direction. In fact, it almost certainly goes in both. Teams that are forced to rotate the roster due to injuries and other absences will struggle. And teams that are struggling will tend to rotate the roster and acquire new players to try to find a winning combination. Also note how off the chart that Toronto season was. It's such an outlier that if you remove it, the correlation actually gets substantially better. Toronto ruins everything.
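If you want to try the with-and-without-the-outlier comparison yourself, here's a minimal Python sketch. I'm assuming a plain Pearson correlation coefficient as the measure, and the (churn, points per game) pairs below are invented placeholders, not the real data behind the plot, so don't read anything into the numbers it prints.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical (churn, points per game) pairs; the last one plays the
# role of an extreme high-churn outlier.
seasons = [
    (1.8, 1.70), (2.2, 1.85), (2.6, 1.50), (3.0, 1.40),
    (3.4, 1.20), (3.9, 1.10), (6.8, 0.95),
]

churn = [c for c, _ in seasons]
ppg = [p for _, p in seasons]

r_all = pearson_r(churn, ppg)
r_trimmed = pearson_r(churn[:-1], ppg[:-1])  # drop the outlier season

print(f"with outlier:    {r_all:.2f}")
print(f"without outlier: {r_trimmed:.2f}")
```

A negative value means points per game fall as churn rises; the closer to -1, the tighter the relationship.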
Here's another look at the data, as quadrants centered on the median x and y values (and with that Toronto season kicked out so it's more compact):
Most points are in the top left quadrant (teams that didn't have much churn and had good seasons) or the bottom right (teams that had a lot of churn and didn't have good seasons). The interesting seasons are those in the other two quadrants, and I've labeled some of them. The bottom left are teams that had consistent rosters but still lost. The worst offender is the Rapids last season, and that rings a bell. Last season the Crunchy Power Rankings identified Colorado as a team that was consistently playing well but not getting good results. This suggests that maybe one of the reasons they were playing well was their roster consistency. And their presence in that quadrant is another angle on a season that I think was mostly just bad luck.
In the top right quadrant are the teams of particular interest — teams that won despite significant roster churn. The biggest example is the 2011 Galaxy, who had nearly 2 points per game despite the 6th highest churn value that season. They would eventually eliminate the Sounders from the postseason. And despite repeating as champions the next season, that 2011 season is no doubt Arena's greatest achievement thus far as a club manager.
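The quadrant split is easy to reproduce: take the median churn and the median points per game, then see which side of each a season falls on. Here's a hedged Python sketch. The team labels and numbers are placeholders I made up, and sending a season that sits exactly on a median to the low side is my own arbitrary choice.

```python
from statistics import median


def quadrants(seasons):
    """Label each season by its position relative to the median churn and PPG.

    `seasons` maps a label to a (churn, points_per_game) pair.
    """
    med_churn = median([churn for churn, _ in seasons.values()])
    med_ppg = median([ppg for _, ppg in seasons.values()])
    labels = {}
    for name, (churn, ppg) in seasons.items():
        vertical = "top" if ppg > med_ppg else "bottom"
        horizontal = "right" if churn > med_churn else "left"
        labels[name] = f"{vertical} {horizontal}"
    return labels


# Hypothetical seasons: (churn value, points per game).
example = {
    "Team A": (1.9, 1.8),  # stable and winning
    "Team B": (2.1, 1.0),  # stable but losing
    "Team C": (3.6, 1.7),  # churning but winning
    "Team D": (4.0, 0.9),  # churning and losing
}

for name, label in quadrants(example).items():
    print(name, "->", label)
# Team A -> top left
# Team B -> bottom left
# Team C -> top right
# Team D -> bottom right
```

The "bottom left" and "top right" labels are the ones worth a closer look, since those seasons go against the overall trend.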
Speaking of great coaches, the other data point that stands out there is Real Salt Lake's current season. Here are the top 5 teams in churn this season:
|2013 Roster Churn|
|Real Salt Lake||3.68|
Anything stand out there? Three of those teams are beyond awful. Easily the three worst teams in the league this year. Vancouver is having a pretty good season so far. Real Salt Lake has the best PPG in the league. How could RSL survive so much starter rotation? One thing we know is that they have a very well-established and consistent tactical system. Real Salt Lake have consistently played a 4-diamond-2 under Kreis no matter who's on the team. When Javier Morales was out with an extended injury, they didn't change their tactics. They just inserted Luis Gil and other players at the top of the midfield. This offseason they went through a massive roster overhaul to get under the cap, but didn't change tactics. They just plugged new players into the existing system. Their motto is 'the team is the star'.
They recruit players for the system and if one of them is lost to injury or international duty or a transfer, Kreis plugs in the replacement. That's the sort of consistency that allows a team to churn the roster and still maintain consistency on the field. Players know what the plan is and where the other players will be, even if the names on the jerseys change.
So let's talk about the Sounders. Seattle has a high churn value this season, but not near the top. Their 2.9 is 8th worst in the league. But after this weekend, with at least Gspurning newly out, it will no doubt go up. The difference with Seattle, and one reason why they're currently out of the playoffs and RSL is leading the standings despite a higher churn value, is that they don't seem to have a system they can plug players into.
Instead, for the few years of their MLS existence, Seattle's MO seems to have been to go get the best players they can find and then adapt the tactic to fit them. In the first couple of years, particularly post-Ljungberg, Montero's ability in the middle of the field and the speed on the wings meant the team was attacking on the flanks to get behind the defense and center it for a shot. But Mauro Rosales doesn't fit that system at all. He's just a really talented player who they got for a song. It's hard to argue against that, but now the speed on the right is gone. And when Zakuani had his leg broken, his replacement we'd recruited on the left was Alvaro Fernandez. A good player, but not in the style of Zakuani. So now the team transforms into one built on getting the ball to the wings for a cross to the far post or back to the top of the box. Then we add Eddie Johnson, who's a fantastic player. But now the team wants to play head tennis since he's so good in the air. So now the offense is built around Mauro's crosses in the air. Which only work when EJ is on the field, because we failed to recruit more players who could play EJ's style. And when Mauro was (often) injured, his replacement was nothing like Mauro Rosales, so the team either kept trying that system with the wrong players or had to adapt to a new system on the fly. Now, with Montero gone, we're trying to do it again. Obafemi Martins is another tremendous player, but he's nothing like Montero and the team is having to adjust again.
To some extent, that's just the consequence of having really good players. David Beckham's backup was nothing like David Beckham, Landon Donovan's backup was nothing like Donovan, and Robbie Keane's backup was... (actually Mike Magee is kind of Keane-esque). The trick with the Galaxy was that they were so good when those players were there that they made up for their poor results when they (or Omar Gonzalez) weren't. The trick with irreplaceable stars is a) you have to get lucky in not losing them much and b) you have to win so much when you have them that you can get by with the results when they're missing.
Right now Seattle has neither the luck to keep their best players on the field consistently nor the consistent system to get by without them. And it's showing.