Thursday, August 25, 2011

The Value of Returning Possessions

The praise has been lavish.

Some have gone so far as to make the 2011-12 edition of the Crimson a pre-season Top 50 team, while almost all have made Harvard the prohibitive favorite in the Ivy League.

What, exactly, separates the Crimson from Princeton, which shared last year's Ivy title with Harvard and secured the league's automatic bid?

The simple answer is returnees. Harvard returns everybody from last season's team, while the Tigers lost two of their top four players from a team that already had one of the shallowest benches in Division I basketball. While that simple response is correct as far as it goes, it does nothing to quantify the impact of retaining and losing players. That is the task we will undertake in this analysis.

Using a data set covering the past 15 years of Ivy basketball, we can apply regression analysis to estimate how returning players affect the results of league teams.

The metric for judging outcomes will be Adjusted Pythagorean Win Percentage, while the metric for judging returning talent will be total possessions used during the previous season. There will be a little bit of noise here. For example, a player can miss an entire season with an injury while a senior fills in at a similar level. When the original player returns after that senior departs, the team receives no credit for him, since he logged minimal possessions the previous year. Regression analysis should smooth out that noise and provide us with the general trend.

The initial model would set the Change in Adj Pythag equal to Returning Possessions plus a constant, but there's one other general force at play here. It is very difficult to be consistently very good or very bad (beyond one's general run-rate) at anything, and a statistical force commonly known as regression to the mean will tug teams back away from the extremes. Thus, we'll take a team's previous-year Z-Score (where the team falls in relation to the average Ivy team, based on the average Ivy standard deviation), center those results around zero and add that as a variable in the regression.
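As a rough illustration of the Z-Score step, the sketch below standardizes the 2011 Adj Pythag values listed later in this post. It assumes a single-season league mean and a population standard deviation. The "average Ivy standard deviation" used in the actual regression may well be computed across multiple seasons, so these values are not necessarily the ones the model was fed.

```python
from statistics import mean, pstdev

# 2011 Adj Pythag values, copied from the team list later in this post.
ADJ_PYTHAG_2011 = {
    "Brown": 0.273,
    "Columbia": 0.311,
    "Cornell": 0.387,
    "Dartmouth": 0.076,
    "Harvard": 0.721,
    "Penn": 0.405,
    "Princeton": 0.711,
    "Yale": 0.422,
}


def z_scores(values):
    """Center each team on the league mean and scale by the league SD.

    Assumption: single-season mean and population SD, which may differ
    from the multi-year figures behind the post's regression.
    """
    league = list(values.values())
    mu = mean(league)
    sigma = pstdev(league)
    return {team: (value - mu) / sigma for team, value in values.items()}


scores = z_scores(ADJ_PYTHAG_2011)
for team, z in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{team:10s} {z:+.2f}")
```

By construction the scores sum to zero, which is the "centered around zero" property the regression relies on.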

Using that model, we get an R-squared of just north of 0.4, which can be taken to mean that our model explains a little more than 40 percent of the year-over-year change in team ability. That's not bad, considering that luck will be a sizable portion of the remainder and that team-specific factors will affect the results as well.

The model outputs are as follows:

Predicted Change in Adj Pythag = -.353 + Returning Possessions * .546 + Last Year's Adj Pythag Z-Score * -.179.
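The fitted equation can be wrapped in a small function for experimenting with different inputs. This is only a sketch: it assumes Returning Possessions is expressed as a share of last season's possessions between 0 and 1 (the post does not state the unit), and the function and constant names are made up for illustration.

```python
# Coefficients as reported in the post.
INTERCEPT = -0.353
COEF_RETURNING = 0.546
COEF_Z_SCORE = -0.179


def predicted_change(returning_share, prev_z_score):
    """Predicted change in Adj Pythag.

    Assumption: returning_share is the fraction (0 to 1) of last
    season's possessions used by players who are coming back.
    """
    return (INTERCEPT
            + COEF_RETURNING * returning_share
            + COEF_Z_SCORE * prev_z_score)


def break_even_share(prev_z_score):
    """Returning share at which the predicted change is exactly zero."""
    return -(INTERCEPT + COEF_Z_SCORE * prev_z_score) / COEF_RETURNING


# A league-average team (Z-Score of 0) that returns everyone.
print(round(predicted_change(1.0, 0.0), 3))

# Share a league-average team must return just to hold steady.
print(round(break_even_share(0.0), 3))
```

Under that assumption, a league-average team would need to return about 65 percent of its possessions just to hold its Adj Pythag steady, and a team that was above average the year before would need to return more.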

The variables were statistically significant at the 99 percent level. The way to interpret the results is that returning possessions have a strong positive effect on the predicted change in Adj Pythag. Teams that were below the mean Adj Pythag in the previous year should see a positive effect from that alone, while teams that were above the mean last year should regress a bit.

Plugging the 2011 results into the model, we can see where each team is expected to land in 2012. (The rankings in parentheses are the implied Pomeroy rankings based on the final 2011 Division I table.)

Brown - Was: .273 (247th); Exp. Change: -.009; Exp. 2012: .264 (247th)
Columbia - Was: .311 (233rd); Exp. Change: .049; Exp. 2012: .360 (206th)
Cornell - Was: .387 (197th); Exp. Change: .047; Exp. 2012: .434 (180th)
Dartmouth - Was: .076 (326th); Exp. Change: .153; Exp. 2012: .229 (261st)
Harvard - Was: .721 (83rd); Exp. Change: .111; Exp. 2012: .832 (55th)
Penn - Was: .405 (189th); Exp. Change: .005; Exp. 2012: .410 (189th)
Princeton - Was: .711 (86th); Exp. Change: -.096; Exp. 2012: .615 (116th)
Yale - Was: .422 (186th); Exp. Change: .062; Exp. 2012: .485 (165th)
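For readers who want to work with these numbers, here is a minimal sketch that checks each line adds up (previous value plus expected change equals the expected 2012 value, allowing for rounding) and sorts the teams into the projected order. The values are copied from the list above; the helper names are made up for illustration.

```python
# (team, 2011 Adj Pythag, expected change, expected 2012 Adj Pythag)
PROJECTIONS = [
    ("Brown", 0.273, -0.009, 0.264),
    ("Columbia", 0.311, 0.049, 0.360),
    ("Cornell", 0.387, 0.047, 0.434),
    ("Dartmouth", 0.076, 0.153, 0.229),
    ("Harvard", 0.721, 0.111, 0.832),
    ("Penn", 0.405, 0.005, 0.410),
    ("Princeton", 0.711, -0.096, 0.615),
    ("Yale", 0.422, 0.062, 0.485),
]


def max_rounding_gap(rows):
    """Largest difference between (was + change) and the listed 2012 value."""
    return max(abs(was + change - expected)
               for _, was, change, expected in rows)


def projected_order(rows):
    """Teams sorted from highest to lowest expected 2012 Adj Pythag."""
    ranked = sorted(rows, key=lambda row: row[3], reverse=True)
    return [team for team, _, _, _ in ranked]


print(projected_order(PROJECTIONS))
print(f"largest gap: {max_rounding_gap(PROJECTIONS):.3f}")
```

The only line that does not sum exactly is Yale's, which is off by .001, consistent with the published figures having been rounded to three decimals.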

Here is a refresher on where the Adj Pythag was expected to fall for each team in our initial study:

1. Harvard - 0.7980 (64th)
2. Yale - 0.5558 (139th)
3. Princeton - 0.5530 (140th)
4. Penn - 0.4301 (180th)
5. Cornell - 0.3878 (196th)
6. Columbia - 0.3448 (213th)
7. Brown - 0.3318 (217th)
8. Dartmouth - 0.1590 (294th)

The regression analysis displays the same general pattern as the initial projections, except for Yale. In the initial projection, there was a two-dog race for second, between Yale and Princeton, while the regression analysis points to the Tigers as alone in second, distant from both league-leading Harvard and the third-place Bulldogs.

Following those three, we get the same Cornell, Penn and Columbia cluster (though with the Big Red holding a slight lead rather than the Quakers). Brown falls further back in seventh in this analysis, while Dartmouth is expected to be much stronger than the initial projection indicated.
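The size of each disagreement between the two approaches can be checked directly. The sketch below subtracts the initial projection from the regression estimate for each team, using only the figures quoted in this post; the names are made up for illustration.

```python
# Initial-study Adj Pythag projections, as listed above.
INITIAL = {
    "Harvard": 0.7980, "Yale": 0.5558, "Princeton": 0.5530,
    "Penn": 0.4301, "Cornell": 0.3878, "Columbia": 0.3448,
    "Brown": 0.3318, "Dartmouth": 0.1590,
}

# Regression-based expected 2012 Adj Pythag, as listed above.
REGRESSION = {
    "Harvard": 0.832, "Yale": 0.485, "Princeton": 0.615,
    "Penn": 0.410, "Cornell": 0.434, "Columbia": 0.360,
    "Brown": 0.264, "Dartmouth": 0.229,
}


def projection_gaps(regression, initial):
    """Regression estimate minus initial projection, per team."""
    return {team: regression[team] - initial[team] for team in initial}


gaps = projection_gaps(REGRESSION, INITIAL)

# Print the teams with the largest disagreement first.
for team, gap in sorted(gaps.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{team:10s} {gap:+.3f}")
```

Yale shows the largest gap, roughly .07 lower under the regression, with Dartmouth a close second in the opposite direction.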

So far, we've looked at the 2011-2012 season three different ways: the initial projection, the historical thumbnail and the regression analysis approach.

Blending the three, the only team that looks tough to peg is Yale, which has landed almost as high as 110th and as low as 165th. From an Ivy standings point of view, the Bulldogs have been anywhere from a clear second-place team to approaching the cluster for third through sixth. The remainder of the teams have slotted pretty nicely: Harvard in first; Princeton in second or third; Penn, Cornell and Columbia in fourth through sixth; Brown in seventh; and Dartmouth bringing up the rear.

The 14-Game Tournament is just that: 14 games. The sample size is small enough that crazy things can and do happen. At least the different types of analysis can provide us with a good starting point from which to judge these eight teams as we head into the fall. From there, the bouncing ball takes over and the force of luck will guide these squads to their ultimate destinies.
