Tuesday, February 15, 2011

A New Approach to Seeding: Top Line Edition

Last year around this time, I wrote a post describing a more structured way to determine seeding for the NCAA tournament. With all the debate over the last week or so about who should be #1 in the polls -- and, more importantly, which four teams are deserving of a #1 seed in March -- I thought I'd run the numbers for the squads in the conversation, and see how each of them looks using my approach.

To quickly summarize what I'm doing here, I've picked a team that is sitting around the bubble to use as a baseline which we can compare each team to. At the moment I'm using Richmond, which appears to be squarely on the bubble and currently ranks 55th in the KenPom ratings.

Using those same KenPom ratings, I ran Richmond through each relevant team's schedule, to see what their expected record would be against those opponents in those locations. For example, the Spiders would be expected to go 16.1-9.9 against Ohio St.'s slate, for a winning percentage of .620. In real life, Ohio St. has gone 25-1 against these opponents, for a W% of .962, so they come out at +.342 vs. the baseline.
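To make the arithmetic concrete, here is a minimal sketch of that comparison. The function names are mine, and the per-game win probabilities (which would come from the KenPom ratings plus a location adjustment) are treated as given inputs rather than computed here; the only real numbers used are the Ohio St. figures quoted above.

```python
def expected_wins(baseline_win_probs):
    """Expected wins for the baseline team: the sum of its per-game win
    probabilities against a schedule. The probabilities are assumed inputs;
    how they are derived from ratings is not modeled here."""
    return sum(baseline_win_probs)


def margin_vs_baseline(wins, losses, baseline_expected_wins):
    """Return (aW%, eW%, aW% - eW%) for one team's schedule."""
    games = wins + losses
    actual_pct = wins / games
    expected_pct = baseline_expected_wins / games
    return actual_pct, expected_pct, actual_pct - expected_pct


# Ohio St.: 25-1 in real life, Richmond expected to win 16.1 of those 26 games.
aw, ew, diff = margin_vs_baseline(25, 1, 16.1)
print(f"aW% {aw:.3f}  eW% {ew:.3f}  margin {diff:+.3f}")
```

With the rounded 16.1 expected wins the eW% lands a hair under the .620 quoted above, which is presumably just rounding; the margin still comes out at +.342.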

I did this for each of the eight teams on the top two lines of the most recent Bracketology, and got the following results (through Tuesday's games):

aW%: Actual winning percentage
eW%: Richmond's expected W% against same schedule

(Green always means good -- high aW%, low eW% -- and vice versa for red.)

I don't think it comes as much of a surprise that Ohio St. comes out on top here. Even after their loss on Saturday, it seemed like everyone -- except the people who actually vote in the polls -- agreed that they still had a resume which few could match.

After that, it gets interesting. Pitt is #2 mostly on the strength of their two road wins last week (@WVU, @Nova); they were only +.284 prior to that impressive stretch. I think a big advantage of this approach is being able to quantify things like that; despite those two wins, Pitt didn't budge in either poll last week, staying at #4.

I'm guessing the most surprising thing about the table above is that San Diego St. is five spots ahead of Texas. After seeing this, I wondered if I wasn't using a high enough baseline. After all, I originally came up with this to compare bubble teams, but now that we're focusing on #1 seeds, Richmond isn't all that relevant.

It's possible that with too low a baseline, Texas isn't getting enough relative credit for something like winning at Kansas; Richmond wins a very low % of the time against any very strong team, so it's tough to differentiate winning @ KU from winning at, say, BYU by using the Spiders.

That problem is easily solved, of course. Here is the same analysis using Pitt (#5 in KenPom) as the baseline:

Some changes on the margins, with a couple teams flipping spots, but nothing major.

So it's clear that no matter what baseline we use, SDSU is going to come out ahead of Texas. And really, this shouldn't be that much of a surprise. Obviously the Aztecs have two fewer losses than the Longhorns, which does not help matters for Texas. You'd think, playing in the Big 12, they could make this up with SOS, but not so much. KenPom actually has SDSU (35th) as having a more difficult schedule than Texas (37th). My approach -- which I think is more relevant than just averaging opponents' ratings -- doesn't quite agree, but SOS still makes up less than half the gap in aW%.

The argument at this point would probably be, "But they won at Kansas; what's SDSU's best win, at UNLV?" Fair point. To further understand what's going on here, we can compare the two teams' resumes on a more granular level.

Below are basically just the inputs that go into the tables above. If the baseline is 80% to win a game and you win, you get +.2; if you lose, it's -.8. Do this for every game for both teams, rank them, and you get:

+/-: Debit/credit for each game
Sum: Running total of debits/credits for each team

(I took Texas' game vs. Navy out for ease of comparison; the baseline on that one is 99.6%, and the Longhorns won by 31. And yes, SDSU really did play IUPUI on a neutral floor on two separate occasions.)
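The debit/credit rule is simple enough to sketch in a few lines. This is only an illustration under my own naming: `game_credit` and `running_totals` are hypothetical helpers, and the schedule passed in is a list of (baseline win probability, did the team win) pairs that you would have to supply yourself.

```python
from itertools import accumulate


def game_credit(baseline_win_prob, won):
    """Credit for a win is 1 - p, debit for a loss is -p, where p is the
    baseline team's chance of winning that game."""
    return (1.0 - baseline_win_prob) if won else -baseline_win_prob


def running_totals(games):
    """Rank a team's games from best credit to worst debit and pair each
    with the running sum, mirroring the +/- and Sum columns.

    `games` is a list of (baseline_win_prob, won) tuples."""
    credits = sorted((game_credit(p, won) for p, won in games), reverse=True)
    return list(zip(credits, accumulate(credits)))


# The 80% example from the text, plus the Navy game (baseline 99.6%, a win).
print(game_credit(0.80, True))    # roughly +.2
print(game_credit(0.80, False))   # -.8
print(game_credit(0.996, True))   # roughly +.004, which is why it barely matters
```

One handy check: the final running total divided by the number of games equals aW% minus eW%, since summing (result - p) over a schedule is just wins minus expected wins.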

That win in Lawrence sure helps; in fact, by these numbers it's more valuable than SDSU's two best Ws combined. This trend continues, as each of Texas' best wins is better than its Aztec counterpart.

After 21 games (there are 24 total), the Longhorns have built up a substantial lead. But then there are those pesky three losses*. Not only that, but each of their losses is worse than SDSU's single defeat; playing in Provo is nearly as tough as it gets.

*-It may look weird to see UConn as their worst loss, rather than USC, but once again HCA is rather important (and the Huskies are rather overrated).

As you can see, all of Texas' gains relative to the Aztecs are erased with the three losses, and then some.

I wouldn't argue that SDSU is a better team than Texas; if they played on a neutral court tomorrow, Rick Barnes' squad would unquestionably be favored, and rightfully so. But, with the policy that is currently in place for seeding teams, I think you have to really twist the facts to argue that the Longhorns are more deserving of a #1 seed than the Aztecs at this point.
