Friday, February 20, 2015

An NCAA Selection Process For The 21st Century

Three Top 50 wins. An RPI of 36. An SOS of 41. A 12-3 road/neutral record.

Get ready. The college basketball dialogue is about to morph into an endless string of incomplete metrics meant to summarize a team's worthiness to claim one of the 36 at-large bids to the NCAA Tournament.

The current selection infrastructure is ripe for mockery, though it's not the RPI's fault. That formula was designed in a different era, when computing power was some trivial fraction of what it is today. For some reason, despite endless evidence to support better metrics, the NCAA has maintained its reliance on that shortcut formula as its method of categorizing team performance for its hard-working selection committee.

Of course, all that an incomplete metric does is breed mistrust in statistics, giving people an excuse to substitute even more flawed observational anecdotes in support of whatever pre-existing biases they walked into the room with.

What I present here today is a new metric that rolls all of the familiar terminology surrounding bubble teams into one set of numbers, which itself controls for the things that have previously been addressed separately: average quality of opponent, wins against top teams and location of game.

The system I will show is a compromise. It is inherently less than perfect. If we define "perfect" as seeding the field where, to the best of our knowledge, each higher rated seed would be expected to defeat those below it, then we should just have Vegas seed the field and be done with it. To be sure, that would keep disasters like last year from happening in the future, where two four seeds were among the three teams with the best odds to win the title. Or from an Ivy perspective, it would keep a true 9-seed in Harvard from being given a 12, and having to face a true 5- or 6-seed and a true 1- or 2-seed to reach the Sweet 16.

It's clear, however, that despite the fact that Vegas puts its money where its mouth is every day, the decision makers in the college basketball community, who are never held accountable for their assessments of team quality, would never trust Vegas with such a task.

So, from the outset, we are doomed to a system that is less than perfect. The good news is that I have a way to get us much closer than the patchwork set of metrics we have today.

There are a lot of different takes on what the NCAA selection process should reward, but fundamentally, it all boils down to picking at-large teams that have the potential to win games in the Big Dance. That's why Top 50 and Top 100 wins become so vitally important - they are perceived as signals that a team has the capability to win games in March.

Myriad problems arise with that metric, however. Teams play wildly different numbers of those types of games, making it difficult to compare a team like Harvard, which could wind up needing an at-large with a 2-2 or 3-2 mark against the Top 50, with a team like NC State, which is already 4-7 versus the Top 50 with more such games to come. Then, there's the issue of venue. The Wolfpack have so far played just two of their 11 Top 50 games as true road contests, while Harvard has already played two of its three Top 50 contests on the road.

There is a simple way to cut through all of that. We can score the outcome of each game, taking the final margin and adjusting for the quality of the opponent and the venue. That score, whether it be ESPN's BPI Game Score or Ken Pomeroy's game Pythagorean Win Percentage, can tell us exactly the quality of team that on average would produce the same result in those circumstances.

For instance, Harvard's 57-46 win over Dartmouth earned a BPI Game Score of 86.7, which equates to the performance of a roughly Top 10 team in that game. While that might seem shocking to some, it actually checks out quite nicely if we look at a comparable upcoming game. Oklahoma, currently No. 10 in Pomeroy, will be about a 10-point favorite on Saturday at Texas Tech, currently No. 170 in Pomeroy. Dartmouth currently checks in at No. 172 in Pomeroy.

Aggregating those Game Scores across the season, we can start to get a sense of the percentage of time that a team plays at different levels of quality. Here are the top 10 teams by percentage of games in which they played like a Top 25 equivalent team:

Team              Top 25   Top 50   Top 100
Kentucky          100%     100%     100%
Gonzaga           96%      100%     100%
Wisconsin         92%      92%      92%
Duke              90%      90%      90%
Villanova         88%      88%      92%
Virginia          87%      96%      100%
Arizona           87%      87%      91%
Louisville        86%      86%      90%
Kansas            85%      88%      92%
Utah              84%      84%      89%

Sure enough, this list contains nine of the Top 10 teams in the Pomeroy Ratings and all 10 from the BPI. Then again, these 10 teams were not only locks to get into the tournament, but were also all pretty likely to land on the Sweet 16 seed lines. The fact that this system agrees with those ratings is an important baseline to have as we move down the list to where this can be more useful.
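For readers who want to tinker, the aggregation step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the post does not publish the Game Score cutoffs that correspond to playing like a Top 25, Top 50 or Top 100 team, so the `TIER_CUTOFFS` values, the function name `tier_percentages` and the ten example scores are all hypothetical placeholders.

```python
from typing import Dict, List

# Hypothetical Game Score cutoffs for each tier. The real cutoffs are not
# given in the post; these numbers are placeholders for illustration only.
TIER_CUTOFFS: Dict[str, float] = {"Top 25": 80.0, "Top 50": 70.0, "Top 100": 55.0}


def tier_percentages(game_scores: List[float],
                     cutoffs: Dict[str, float] = TIER_CUTOFFS) -> Dict[str, float]:
    """Return the share of games (0-100) scored at or above each tier cutoff."""
    if not game_scores:
        raise ValueError("need at least one game")
    n = len(game_scores)
    return {
        tier: 100.0 * sum(score >= cutoff for score in game_scores) / n
        for tier, cutoff in cutoffs.items()
    }


# A made-up ten-game season, not real data.
example_scores = [86.7, 91.0, 72.5, 60.1, 40.0, 83.3, 77.8, 95.2, 55.0, 30.4]
example_pcts = tier_percentages(example_scores)
# example_pcts == {"Top 25": 40.0, "Top 50": 60.0, "Top 100": 80.0}
```

Because every Top 25-level game also clears the Top 50 and Top 100 cutoffs, the three percentages can never decrease from left to right, which matches the pattern in the table above.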

Looking at the percentage of games for which a team registered a Game Score equivalent to a Top 25, Top 50 and Top 100 team, I decided that any team that ranked in the top 50 of all three deserved to be in the tournament field. Ranking that highly in all three means that not only can this particular team play really well a lot of the time, but it also can play decently well a high percentage of the time when it's not showing itself at its best.

That cut yielded 42 teams in the NCAA field.

Team              Top 25   Top 50   Top 100  Top 25 Rank  Top 50 Rank  Top 100 Rank
Kentucky          100%     100%     100%     1            1            1
Gonzaga           96%      100%     100%     2            1            1
Wisconsin         92%      92%      92%      3            4            7
Duke              90%      90%      90%      4            5            11
Villanova         88%      88%      92%      5            8            6
Virginia          87%      96%      100%     6            3            1
Arizona           87%      87%      91%      6            9            8
Louisville        86%      86%      90%      8            11           10
Kansas            85%      88%      92%      9            7            5
Utah              84%      84%      89%      10           12           12
Notre Dame        79%      89%      89%      11           6            12
Northern Iowa     74%      87%      96%      12           9            4
Iowa State        74%      74%      84%      13           21           19
North Carolina    72%      76%      88%      14           18           14
Baylor            71%      71%      81%      15           27           25
Arkansas          70%      70%      70%      16           30           46
Wichita State     70%      78%      87%      17           15           15
Butler            70%      78%      83%      17           15           21
Dayton            68%      73%      82%      19           22           23
Texas             68%      77%      82%      19           17           23
VCU               68%      72%      72%      21           26           39
Ohio State        65%      83%      91%      22           13           8
Oklahoma          65%      70%      74%      22           32           36
Michigan State    65%      74%      74%      22           20           36
Maryland          65%      70%      78%      22           32           29
Georgetown        64%      73%      86%      26           22           16
SMU               64%      73%      86%      26           22           16
West Virginia     63%      79%      79%      28           14           26
Providence        63%      71%      71%      28           29           44
NC State          61%      65%      74%      30           40           36
LSU               60%      65%      75%      32           42           34
Georgia           59%      73%      86%      35           22           16
Purdue            59%      64%      68%      35           44           50
Saint Mary's      59%      68%      77%      35           37           32
Old Dominion      58%      68%      68%      38           35           48
St. John's        57%      67%      71%      39           38           40
Ole Miss          57%      70%      83%      40           32           21
Texas A&M         55%      70%      75%      43           30           34
BYU               54%      75%      83%      44           19           20
Florida           54%      67%      79%      44           38           26
Davidson          53%      68%      79%      47           35           28
Indiana           52%      71%      71%      48           27           40
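
The first cut can be expressed as a short Python sketch. The ranks in the table appear to use standard competition ranking, where tied teams share the best rank and the next rank is skipped (for example 1, 1, 3), so the sketch does the same. The names `Profile`, `competition_ranks` and `first_cut` are hypothetical helpers for illustration, not part of any published tool.

```python
from typing import Dict, List, Tuple

Profile = Tuple[float, float, float]  # (Top 25 %, Top 50 %, Top 100 %)


def competition_ranks(values: Dict[str, float]) -> Dict[str, int]:
    """Rank teams high-to-low; tied teams share the best rank (1, 1, 3, ...)."""
    ordered = sorted(values.items(), key=lambda kv: kv[1], reverse=True)
    ranks: Dict[str, int] = {}
    for position, (team, value) in enumerate(ordered, start=1):
        if position > 1 and value == ordered[position - 2][1]:
            # Same value as the team just above: reuse that team's rank.
            ranks[team] = ranks[ordered[position - 2][0]]
        else:
            ranks[team] = position
    return ranks


def first_cut(profiles: Dict[str, Profile], max_rank: int = 50) -> List[str]:
    """Teams ranked max_rank or better in all three tiers."""
    per_tier = [
        competition_ranks({team: p[i] for team, p in profiles.items()})
        for i in range(3)
    ]
    return [
        team for team in profiles
        if all(ranks[team] <= max_rank for ranks in per_tier)
    ]


# Five rows copied from the table above. Ranks are computed within this toy
# subset only, so they differ from the national ranks shown in the table.
sample: Dict[str, Profile] = {
    "Kentucky": (100, 100, 100),
    "Gonzaga": (96, 100, 100),
    "Wisconsin": (92, 92, 92),
    "Virginia": (87, 96, 100),
    "Arkansas": (70, 70, 70),
}
survivors = first_cut(sample, max_rank=3)  # ["Kentucky", "Gonzaga"]
```

Run against every Division I team with the default `max_rank=50`, the same filter would correspond to the rule that produced the 42 teams listed above.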

In a normal year, roughly 20 of the automatic bids will not come from this group, meaning that there will be somewhere between five and 10 at-large bids left to be claimed after this first step. The next step will be to decide the contenders for those final at-large bids.

Given that the differences in the percentage of games played like a Top 25 team shrink dramatically after allowing that first bulk set of teams in, it's important to expand our discussion to the percentages of games played like a Top 50 and Top 100 team in order to break what are essentially statistical ties. This is where things become more of an art than a science, but it's a discussion that is based on a more comprehensive set of metrics than ever before.

I re-sorted all of the teams by their average rank across the three categories (percentage of games played like a Top 25, Top 50 and Top 100 team) and then considered the bubble to be the best remaining unselected teams, adding them until I hit 68 total teams. Since there were 42 teams admitted to the field based on being ranked Top 50 in all three categories, I added 26 teams to the bubble. In this example, given that there are 20 automatic bids that are not part of the initial 42-team set, that leaves us with six at-large spots for those 26 teams on the bubble.
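
The bubble arithmetic is simple enough to check in code. The following is a minimal Python sketch, assuming each team's average rank has already been computed; `build_bubble` and `open_at_large_spots` are hypothetical names, and the four-team example uses placeholder labels rather than real teams.

```python
from typing import Dict, List

FIELD_SIZE = 68  # total NCAA Tournament field


def build_bubble(avg_rank: Dict[str, float], locks: List[str],
                 field_size: int = FIELD_SIZE) -> List[str]:
    """Best remaining teams by average rank, enough to reach field_size."""
    locked = set(locks)
    remaining = sorted((t for t in avg_rank if t not in locked),
                       key=lambda t: avg_rank[t])
    return remaining[: max(field_size - len(locked), 0)]


def open_at_large_spots(n_locks: int, auto_bids_outside_locks: int,
                        field_size: int = FIELD_SIZE) -> int:
    """Spots left for bubble teams after locks and outside automatic bids."""
    return field_size - n_locks - auto_bids_outside_locks


# Numbers from the post: 42 first-cut teams, 20 automatic bids from outside.
bubble_size = FIELD_SIZE - 42                                         # 26
spots = open_at_large_spots(n_locks=42, auto_bids_outside_locks=20)  # 6

# Toy example with placeholder teams and a three-team "field".
toy_bubble = build_bubble({"A": 10.0, "B": 43.3, "C": 48.3, "D": 43.7},
                          locks=["A"], field_size=3)  # ["B", "D"]
```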

Let's take a deeper look at those 26 teams:

Team              Top 25   Top 50   Top 100  Top 25 Rank  Top 50 Rank  Top 100 Rank
Xavier            60%      64%      68%      32           43           55
San Diego State   61%      65%      65%      30           40           61
Oklahoma St       60%      60%      65%      32           49           64
Vanderbilt        48%      62%      67%      57           45           56
Cincinnati        57%      57%      65%      40           60           61
Miami (FL)        43%      62%      71%      78           45           40
Murray State      48%      57%      67%      57           55           56
G. Washington     48%      57%      67%      57           55           56
Bowling Green     40%      60%      70%      80           49           46
Minnesota         50%      55%      65%      51           63           64
Boise State       50%      55%      65%      51           63           64
Seton Hall        50%      54%      67%      51           72           56
Colorado State    50%      50%      68%      51           81           50
Stanford          43%      52%      78%      74           79           29
Syracuse          45%      55%      68%      67           68           50
Cent Michigan     47%      59%      65%      65           51           70
Oregon            48%      57%      62%      57           55           75
Harvard           47%      58%      63%      64           54           73
Kansas St         38%      52%      76%      84           76           33
Florida St        57%      57%      57%      40           60           96
Rhode Island      45%      55%      65%      69           63           64
Alabama           45%      55%      65%      69           63           64
Green Bay         44%      61%      61%      72           47           77
SF Austin         50%      57%      57%      51           55           91
Iowa              52%      52%      62%      48           76           75
Illinois          50%      55%      59%      51           68           81

Clearly Xavier, San Diego State and Oklahoma State should be three of the next six teams admitted. The general consensus in the Bracketology community agrees, as at Bracketmatrix.com, those three teams are averaging a 10-seed, an 8-seed and a 6-seed, respectively.

That's where things get interesting, though. Teams like Illinois and Iowa are solidly in most people's brackets, yet this Game Score metric shows that roughly 40 percent of the time, they don't even play like Top 100 teams. Even more interesting are teams like UCLA and Temple - also in most people's brackets - which didn't even make this broader bubble list, as they have failed to play like Top 100 teams roughly 43 percent of the time. Potentially most interesting are the teams generally agreed to be barely on the wrong side of the bubble, like Pitt and Tulsa, which have worse Game Score profiles in each of the levels of quality than basically every team listed on the bubble above.

Then, there's the other side of things - teams that have actually played quite consistently well, but are tremendously undervalued by the incomplete metrics used today. Teams like Murray State, Bowling Green, Central Michigan, Harvard, Green Bay and Stephen F. Austin aren't really on anyone's bubble, but have recorded as high a percentage of strong performances as, if not a higher one than, other teams in that discussion.

And that's where this methodology drastically outperforms the current RPI Top 50/100 metrics. The opportunities for any of those six mid-major programs to earn "wins that matter" according to the current system mostly dried up once conference play began in earnest in January.

From that perspective, no one can tell how well they're playing, given that a team like Harvard is 0-0 against Top 50 competition and 1-0 against the Top 100 in January and February combined. The Game Score metric, however, can reveal that the Crimson has played like a Top 25, Top 50 and Top 100 team in 40%, 50% and 60% of its last 10 games, respectively. Those marks are good enough for ranks in the 60s in all three, which isn't necessarily at-large worthy, but is easily worthy of the bubble conversation, given the performance of the other teams in that conversation.

Now, I know the instant reaction of some folks to this concept will be that you shouldn't count the performance of a team in games that it should easily win. Who cares if you win by 20, 25 or 30? I'm willing to concede this point, so long as we all agree on what constitutes an easily winnable game. Lots of pundits want to draw this line at No. 150 (usually irrespective of whether the game is at home or on the road). That's crazy, as it is roughly as difficult to win on the road at a No. 150 team as it is to beat a No. 50 team at home.

For this discussion, I have defined an easily winnable game as one in which a bubble team would be expected to win 95 percent of the time. Thus, I eliminated all home wins against teams ranked in the bottom 100 and all road wins against teams in the bottom 20 nationally. (If you lose one of these games, however, it still counts against you). All of the data above contains these exclusions, so the mid-majors shown above haven't earned their place by destroying awful teams.
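
As a rough illustration, the exclusion rule can be written as a small Python filter. Several things here are assumptions rather than details from the post: the `Game` record and the `counts_toward_profile` name are hypothetical, the Division I team count of 351 is used to turn "bottom 100" and "bottom 20" into rank cutoffs, and neutral-site games are kept because the post does not say how they are handled.

```python
from dataclasses import dataclass

N_TEAMS = 351  # assumed number of Division I teams; adjust for the season


@dataclass
class Game:
    opponent_rank: int  # 1 = best team nationally
    site: str           # "home", "road" or "neutral"
    won: bool


def counts_toward_profile(game: Game, n_teams: int = N_TEAMS) -> bool:
    """False only for wins treated as easily winnable under the rule above."""
    if not game.won:
        return True  # losses always count, even against weak teams
    if game.site == "home":
        # Drop home wins over the bottom 100 (ranks 252-351 with 351 teams).
        return game.opponent_rank <= n_teams - 100
    if game.site == "road":
        # Drop road wins over the bottom 20 (ranks 332-351 with 351 teams).
        return game.opponent_rank <= n_teams - 20
    return True  # neutral sites: not specified in the post, kept by assumption


# Made-up schedule, not real data.
schedule = [
    Game(opponent_rank=300, site="home", won=True),   # excluded
    Game(opponent_rank=300, site="home", won=False),  # counts (a loss)
    Game(opponent_rank=300, site="road", won=True),   # counts
    Game(opponent_rank=340, site="road", won=True),   # excluded
]
kept = [g for g in schedule if counts_toward_profile(g)]
```

Only the games in `kept` would then feed into the Top 25, Top 50 and Top 100 percentages.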

While this methodology isn't as perfect as letting Vegas seed the field, it is at least a simple-to-understand, ideological cousin of the way in which teams are rated by the linesmakers. With just those three simple categories above, we can process in one place all of the information that RPI, SOS, records vs. Top 50/100 and road/neutral record each try to capture separately and on an ad hoc basis.

What's more, it provides a fair way to understand what is going on in the 80-90 percent of games currently not being considered at all for teams from traditional one-bid conferences. That will go the longest way toward leveling the playing field in terms of vying for at-large bids in March.

1 comment:

  1. Fascinating analysis. What I read into it is that Harvard ought to be considered for a second NCAA bid if it somehow failed to win the Ivy title. But, of course, the committee won't go that far over the cliff. Harvard's team is at least as likely to win a first round game as its last two entrants, assuming similar seeding.
