NCAA Mock Selection Recap: Part V: The Great SOS Air-Ball
Published by Chris R (Christopher Rieman)
02-20-2015

What do UDPride, Jerry Palm, Warren Nolan, CollegeRPI.com, CBS Sports, Ken Pomeroy, RealTimeRPI.com, SBNation, most of the players, coaches, athletic directors, fans, reporters, broadcast journalists, and even Wikipedia all have in common?

Beyond random probability and freaky coincidence, we’ve all been led to believe – simultaneously and to exacting specificity – that Strength of Schedule (SOS) is a multi-level rating computation based on:

(2/3) × Opponents' Winning Percentage + (1/3) × Opponents' Opponents' Winning Percentage.
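That widely believed formula, expressed as a quick sketch (the winning-percentage inputs below are hypothetical, purely for illustration):

```python
# The SOS formula everyone believed the NCAA used (a sketch).
# opp_wp: a team's opponents' winning percentage (LEVEL 2)
# opp_opp_wp: opponents' opponents' winning percentage (LEVEL 3)
def assumed_sos(opp_wp: float, opp_opp_wp: float) -> float:
    return (2 / 3) * opp_wp + (1 / 3) * opp_opp_wp

# Example: opponents win 60% of their games, their opponents win 50%.
print(round(assumed_sos(0.60, 0.50), 4))  # 0.5667
```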

It’s not.

How almost everyone in the college basketball-speaking world not only arrived at the same misunderstanding at the same time, but also collectively failed to uncover this oversight over the span of many years, is one of the great air-balls in hoops ignorance. Nobody bothered to ask the question, so nobody bothered to question the formula. Somewhere along the line a stat geek got it all wrong and everyone followed suit like cattle to the slaughterhouse.

Until last week.

The happy accident that transpired in Indianapolis and led us to epiphany was happenstance that bred suspicion and ultimately reached shocking surprise. We weren't exactly looking for an error; it's hard to spot something you're not consciously on the prowl for. How it occurred requires a deeper rewind to sketch a broader context.

THE DATA


There are many online sources that publish the RPI on a recurring basis. Some publish weekly while others publish daily. Fewer still publish the SOS alongside it – and that omission list includes the NCAA, for reasons still unknown. If you want to match your SOS to theirs, you can't. In addition, nobody (to our knowledge) goes an extra granular step and publishes the raw SOS values broken out by each factored computation level in the SOS equation. The SOS rating determines SOS rank, and without the rating alongside it, matching one source's SOS ranking to another is an open-ended question – there's a chance the rank from two or more published SOS sources matches in spite of the rating and not because of it. In effect, fact-checking rank without rating is like trying to fact-check net earnings without profit and loss.

With nothing to match to (at least publicly) at the NCAA's Web site, the best alternative was matching to other SOS sources. But that also has drawbacks. Due to time-shifting as well as incomplete or inaccurate results, RPIs and SOSs published by the most widely-known Internet sources tend to vary slightly from one day to the next. Sanity-checking your own SOS data against another source with its own data-entry protocols and update schedule is like trying to hit a moving target. One fat-fingered score in the entire seasonal database would throw off the RPI and SOS numbers of most teams from one source to the next. Regardless, nobody else publishes the SOS rating broken down by computational level anyway, leaving no surrogate reference to audit the variables we needed.

PEELING THE ONION


After arriving in Indianapolis, we had no inkling that anything was amiss, and it only came to our attention late in the process as at-large spots progressively filled up. Until then, most of the teams earning enough votes to enter the Field of 68 had résumés that were far easier calls to make, sparing us from diving into the most granular weeds of analytical data.

But that wasn't the case on Friday afternoon when tendering Bullpen votes to send bubble teams to the Big Dance. Evaluating the last eight or 10 teams in the potential field turned into a hair-raising and hair-splitting part of the process that required an extraordinary level of deep-woods interrogation to separate one team from another. To do that, we examined the NCAA's Nitty Gritty Report on each team – a report so exhaustive it borders on overkill, and the only place (to our knowledge) where raw RPI computational values existed beyond our own database. Still, we weren't necessarily looking for a discrepancy. As far as we knew, their data and our data – which was supposedly everyone else's data – had matched since the beginning of time.

Only when Dayton reappeared in the Bullpen on Friday afternoon did we re-visit the Nitty Gritty Report and spot a mismatch between UDPride data and the NCAA’s. We noticed it because our laptop displaying the UDPride RPI sat at our desk as an additional reference to supplement the NCAA’s. Nobody yelled “Eureka!”; it was more like “WTF?”

NOT SO FAST


Might the conflicting data have an explanation as simple as a small discrepancy in the database (games played, wrong venue, fat-fingered W/L results)? When we compared RPI ratings and rankings for UD and several other teams, however, the UDPride and NCAA RPIs matched perfectly.

The UDPride and NCAA databases had to be congruent – identical RPI data for all 351 teams demands that the databases match. Had the RPIs also been inconsistent between us and the NCAA, chances were the problem was data entry rather than computational, because data-entry errors affect both RPI and SOS at the same time. That wasn't the case. We still had more questions than answers, but we knew only the SOS computations were incongruent with the NCAA's. Yet we had no idea why or how, because everyone used the NCAA's formula – ourselves included.

SNOOKERED


Another member of the media suggested a conversation with JD Hamilton (Asst. Director of Media and Stats) for further guidance. Hamilton was out of the room at the time, so we approached JD's boss, David Worlock. We had worked with David during the season to sanity-check the RPI only (because the NCAA does not publish SOS). After regurgitating the details and pleading for further clarification on the NCAA Selection Committee's official RPI and SOS computations, we finally had some answers, and Worlock was kind enough to supply them.
The RPI computation, as expected, matched our algorithm (and everyone else’s in the basketball universe). Then this:

“But I believe SOS is just second-level points. There is no third-level computation like the RPI.”


Worlock left the room to do some additional fact-checking. He returned a few minutes later with a dusty email from the catacombs that answered the same question for someone else several years prior:

“According to this, SOS is your opponents’ winning percentage. That’s it.”


THE GORY DETAILS


Unlike the RPI, which tabulates on three levels:

25% winning percentage
50% opponents' winning percentage
25% opponents' opponents' winning percentage

Strength of Schedule (SOS) is apparently defined only by 100% of LEVEL 2:

CORRECT: (100%) × Opponents' Winning Percentage (LEVEL 2).
INCORRECT: (67%) × Opponents' Winning Percentage (LEVEL 2) + (33%) × Opponents' Opponents' Winning Percentage (LEVEL 3)
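Side by side, the three levels and how the RPI and the two SOS definitions combine them look like this (a minimal sketch; the level values are hypothetical and this is our reading of the formulas above, not NCAA code):

```python
# LEVEL 1: wp   = team's own winning percentage
# LEVEL 2: owp  = opponents' winning percentage
# LEVEL 3: oowp = opponents' opponents' winning percentage
def rpi(wp: float, owp: float, oowp: float) -> float:
    """25% LEVEL 1 + 50% LEVEL 2 + 25% LEVEL 3."""
    return 0.25 * wp + 0.50 * owp + 0.25 * oowp

def sos_correct(owp: float, oowp: float) -> float:
    """The NCAA's actual definition: 100% LEVEL 2, nothing else."""
    return owp

def sos_incorrect(owp: float, oowp: float) -> float:
    """The formula everyone copied: 2/3 LEVEL 2 + 1/3 LEVEL 3."""
    return (2 / 3) * owp + (1 / 3) * oowp

wp, owp, oowp = 0.750, 0.580, 0.520
print(round(rpi(wp, owp, oowp), 4))        # 0.6075
print(round(sos_correct(owp, oowp), 4))    # 0.58
print(round(sos_incorrect(owp, oowp), 4))  # 0.56
```

Note that the same two inputs produce different SOS ratings, which is exactly why our rankings and the Nitty Gritty Report's could not be reconciled.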

We returned to our chair and reconciled the NCAA’s raw SOS values on the Nitty Gritty Report using the proper formula and they were spot-on. While relieved to solve the riddle, disbelief remained:

How could so many respected, well-intentioned sports giants with reputations beholden to accuracy -- with contacts, resources, and bank accounts far exceeding our home-brewed/hamster-wheeled product -- get it wrong for so long and never question the formula or spot the discrepancy?


THERE ARE NO STUPID QUESTIONS


This was our first opportunity to stare at the raw NCAA data responsible for SOS ratings, and yet we recognized a problem right away. CBS Sports is spending several billion dollars to franchise and profit from March Madness over the next 10 years, but their direct inside relationship got them no closer to the truth. Are CBS's Selection Sunday stat boxes displaying team SOS rankings based on the wrong math too? We know this: CBSSports.com is using the incorrect formula because we've already checked. So are most other sources.

Had we not been at the MSP and had the luxury of digesting the NCAA’s comprehensive SOS data sets that don’t appear in their daily RPI rankings – nor had the comprehensive data from our own SOS computations to compare and contrast – chances are the oversight (or intentional blindness) would live on in perpetuity. It was a perfect storm of the right people at the right time with the right data standing in the same room.

MOUNTAINS OR MOLE HILLS?


Does it even matter? In short, yes.

At the very least, it’s a failing grade by all of us to repeatedly get it wrong; I take some share of the blame. Unlike others however, we spotted the problem and raised a red flag as soon as illuminating data materialized. In all likelihood, it’s the same data others didn’t notice or didn’t care about. What’s worth doing is worth doing right and the NCAA has never made a conscious effort (as far as we know) to disguise the formula. They assumed everyone had it right while the stat geeks never imagined they had it wrong.

It does raise the question, however: what else is mainstream popular media overlooking that's misguided at best and flat-out erroneous at worst? UDPride is nothing more than a Web site community held together by duct tape and zip ties. The burden to spot and correct a blunder like this should never have fallen this far, landing in the lap of unpaid, half-baked college basketball fans moonlighting as Grantland Rice wannabes in their discretionary hobby time.

It matters because getting things right still counts for something.

BEYOND THE PHILOSOPHICAL


It also matters because the course correction should empower the schedule-makers to (finally) get a handle on SOS and start playing by the same math as the NCAA. That alone is a step forward in the direction of transparency. To illustrate some of the changes, let's narrow the focus to non-conference scheduling and non-con SOS:

Assume Dayton plays 10 non-conference opponents:

The incorrect non-con SOS calculation:
Each opponent's record was worth 6.7% of the non-con SOS (a 1/10 share of the 67% LEVEL 2 weight)

The corrected non-con SOS calculation:
Each opponent's record is worth 10% of the non-con SOS (a 1/10 share of the full 100% LEVEL 2 weight)

It might not sound like a lot, but it's a 50% weighted increase on non-con opponents compared to the prior calculation for LEVEL 2. The benefit of LEVEL 3 in the incorrect calculation was its stabilizing force. While hundreds of opponents' opponents' games each counted for far less and – in totality – were valued at just 33%, they dispersed the scheduling risk over far more results, minimizing the potential for gaming the math with a few high-quality opponents to counter an otherwise disappointing non-conference schedule. In short, LEVEL 3 was the incorrect SOS's version of precious metals: it quietly stabilized the system without being overly intrusive. LEVEL 3 alone didn't make or break an SOS, but it did place some limits on how far teams could manipulate the system. Now, rather than several hundred games driving the SOS bus, it's only a few dozen.
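The per-opponent arithmetic above can be checked in a few lines (assuming the same hypothetical 10-game non-conference slate):

```python
# Per-opponent weight in a 10-game non-conference schedule (a sketch).
n_opponents = 10

# Under the incorrect formula, LEVEL 2 carried only 2/3 of the SOS,
# so each opponent's record was a 1/10 slice of that 67%.
old_weight = (2 / 3) / n_opponents

# Under the correct formula, LEVEL 2 is the whole SOS.
new_weight = 1.0 / n_opponents

print(f"old: {old_weight:.1%}")                        # old: 6.7%
print(f"new: {new_weight:.1%}")                        # new: 10.0%
print(f"increase: {new_weight / old_weight - 1:.0%}")  # increase: 50%
```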

CURRENT EXAMPLES

Here’s a comparison of SOS variance of a few schools using the incorrect and correct formulas. Some are helped while others are penalized.

Non-Con SOS Variance (old/new):
Dayton 93/119
Rhode Island 200/237
Oklahoma 85/113
Drexel 127/160
Northern Iowa 119/142
San Diego State 56/82
Arizona 76/107
California 137/115
Tennessee 82/111
Texas State 164/117

Overall SOS variance (old/new):
Dayton 113/125
Rhode Island 130/168
George Washington 138/169
Massachusetts 36/24
Texas Christian 126/154
Wyoming 207/244
Pepperdine 148/179

THE BOTTOM LINE


The revised SOS calculation will make a difference. How much remains to be seen, but we know this much: non-conference opponents matter 50% more than most schools realized and that’s a wealth of knowledge gained. Schools will adjust their scheduling by modeling off the new math, giving them greater clarity and a better understanding of resulting SOS outcomes. But it may also push aside scheduling opportunities as the big boys insulate themselves even further. We’re already seeing it.

It's important to have a system where everyone knows the rules of engagement, and the RPI/SOS computations are no exception. They are significant tools used by the NCAA to help choose the Field of 68. But that doesn't mean the NCAA should take the blame. While the computations are proprietary formulas and the NCAA is free to use whatever equations they choose – not to mention change them as they see fit – those formulas were never under lock and key.

The disconnect fell on the public at large and too many mainstream media sources more important than us walked past the same data we stumbled upon and never threw a challenge flag. That’s all we did and all that was needed because the NCAA was more than happy to provide the answers.

The SOS computation change will go into effect on the UDPride RPI at midnight Saturday morning. As for everyone else? That's their business.
  #1  
By flyerfever on 02-23-2015, 12:33 PM
So, does this mean abandon a home game and schedule a top tier top ten team at their place, knowing their SOS helps us even if we lose? I remember one year we played at Maryland, Lefty and Len Bias, and lost, I think a one pointer. Seems to me the talk was that we got in that year for many reasons, but the loss at College Park was seen as a factor in our selection. We had the n.tz to travel and put it on the line. Who was on that team?