Swish or Miss: The Role of Data Bias in NCAA Basketball Predictions

Apr 26, 2023 | BI/Analytics

The 2023 college basketball season has crowned two unexpected champions, with the LSU women’s and UConn men’s teams hoisting trophies in Dallas and Houston, respectively.

I say unexpected because, before the season began, neither one of these teams was thought of as a title contender. Both were given 60-1 odds to win the whole thing, and media and coaches polls weren’t giving them much respect.

Still, teams have been proving rankings and polls wrong since they first came around in the 1930s. And being atop rankings doesn’t guarantee success.

Since the expansion of the men’s basketball tournament in 1985, only six teams ranked preseason No. 1 in the AP Poll have won the title. It’s almost more of a curse than a blessing at that point.

How many of these rankings and polls are out there?

Even though we have access to a plethora of well-regarded rankings from individual journalists like ESPN’s Charlie Creme and Jeff Borzello, Big Ten Network’s Andy Katz, and Fox Sports’ John Fanta, three polls are widely recognized.

Chief among them is the aforementioned AP Top 25 Poll, compiled from the votes of 61 sports journalists from across the country.

Then you have the USA Today Coaches Poll consisting of 32 Division I head coaches, one from each of the conferences that receive an automatic bid to the NCAA tournament. And the newest addition is the Student Media Poll, run out of Indiana University. This is a poll of student journalist voters who cover sports at their university daily.

These three groups will all look at teams with similar criteria, particularly before a single game is played. Without anyone scoring a point, media and coaches alike have to use the data that is accessible and make their early predictions.

Here are some of the most common criteria:

Previous season results

It makes sense, right? Whoever was best last season will likely be just as good. Well… between graduation, the transfer portal, and the world of one-and-done basketball, many rosters experience significant overhauls in the offseason.

When a team hits the top of the preseason rankings, odds are it has retained most of its key players. North Carolina, which went on to miss the NCAA tournament entirely, was selected No. 1 in all three preseason polls after finishing as the runner-up in 2022 and returning four starters.

Experience

Veterans are crucial to any sport. But in a sport with such a long season to get through, upwards of 30 games a year, experience matters even more.

Iowa women’s basketball made its longest-ever run in the tournament this year. Beyond the talent on the team, the Hawkeyes’ first five played 92 games together as starters. That’s unheard of in today’s game.

It’s no surprise a team like that can make a deep run, and it’s a big reason Iowa was picked between No. 4 and No. 6 ahead of the season.

Strong recruiting class

Basketball is, arguably, the collegiate sport where a freshman can have the most impact. Limited roster spots and the rise of pro-ready players have seen many first-years become instant superstars.

And it shows in the polls. Eight of the top-10 men’s recruiting classes were represented in all three preseason polls.

The star factor

Big-time players are a major reason we watch college basketball. The top four men’s teams going into the season featured four of the biggest names in the sport: Armando Bacot (North Carolina), Drew Timme (Gonzaga), Marcus Sasser (Houston), and Oscar Tshiebwe (Kentucky).

South Carolina, led by reigning national player of the year Aliyah Boston, was almost a unanimous No. 1 in the preseason women’s polls, garnering 85 of 88 possible first-place votes across the three polls.

Where do polls differ?

Journalists and coaches who are responsible for rankings will use some combination of these factors while adding some of their own reasoning.

A journalist or student journalist who covers the Big 12 on a day-to-day basis might rank a team from that conference differently because they likely see all of that team’s highs and lows. A national media member who only pays attention after a big win could easily overrate that team.

For instance, Kevin McNamara had UConn the highest of anyone in the preseason AP Poll, at No. 15. McNamara covers sports in New England and is based in Providence, Rhode Island, and Providence men’s basketball plays in the Big East with UConn. He likely saw more of the Huskies than his counterparts did, and he looks all the wiser because of it.

On the other side, a coach might be inclined to rank a team higher if that team beat their own squad. A loss to a stronger team makes the coach’s own team look better, and it comes with a ready rationale: “Well, they must be good if they beat us!”

Although we’re all working with a lot of the same data when looking at these teams, there is rarely a total consensus. Each person who votes in these polls brings their own experience and biases, or puts their own weight on different factors.

Even as we’ve moved further into analytics-led rankings, the predictions aren’t much more successful. KenPom has become the gold standard in statistics-based basketball rankings. It ranks all 363 Division I men’s teams by adjusted efficiency margin (derived from offensive and defensive efficiency per 100 possessions and team possessions per game).

KenPom was, rightfully, more wary of North Carolina, ranking it No. 9 preseason. But it had UConn as low as anyone, at No. 27.
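To make "efficiency per 100 possessions" concrete, here is a minimal Python sketch of a raw efficiency margin. It is a simplification, not KenPom's method: the published ratings also adjust for things like opponent strength, which this sketch ignores. The function names and the season totals are hypothetical and chosen only for illustration.

```python
def efficiency_per_100(points: float, possessions: float) -> float:
    """Return points per 100 possessions."""
    if possessions <= 0:
        raise ValueError("possessions must be positive")
    return 100.0 * points / possessions


def raw_efficiency_margin(points_for: float, points_against: float,
                          possessions: float) -> float:
    """Offensive efficiency minus defensive efficiency, unadjusted.

    Assumes both teams use roughly the same number of possessions,
    so one possession count serves for offense and defense.
    """
    offense = efficiency_per_100(points_for, possessions)
    defense = efficiency_per_100(points_against, possessions)
    return offense - defense


# Hypothetical season totals, not real team data.
margin = raw_efficiency_margin(points_for=2600, points_against=2300,
                               possessions=2400)
print(round(margin, 1))
```

A positive margin means the team outscores opponents on a per-possession basis, which lets a slow-paced team and a fast-paced team be compared on the same scale.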

Where were our champions ranked preseason?

LSU: Coaches No. 14, AP No. 16, Student No. 17

UConn: Received votes but unranked in all three
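For anyone who wants to compare the polls side by side, the preseason ranks listed here can be summarized with a few lines of Python. This is only a sketch: the `summarize` helper is hypothetical, and treating "unranked" as a missing value rather than assigning it a number is an assumption.

```python
from statistics import mean

# Preseason ranks quoted in this article; None means unranked in that poll.
preseason = {
    "LSU": {"Coaches": 14, "AP": 16, "Student": 17},
    "UConn": {"Coaches": None, "AP": None, "Student": None},
}


def summarize(ranks):
    """Return the mean rank and the spread across polls.

    Returns None when the team is unranked in every poll.
    """
    ranked = [r for r in ranks.values() if r is not None]
    if not ranked:
        return None
    return {"mean": mean(ranked), "spread": max(ranked) - min(ranked)}


for team, ranks in preseason.items():
    print(team, summarize(ranks))
```

LSU's three ranks average out to a little under No. 16, with three spots between its highest and lowest placement. UConn has no consensus rank at all.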

Needless to say, no one was prepping a victory parade in Storrs or Baton Rouge based on the early poll releases. But, as I said early on, teams have been proving rankings and polls wrong since they first came around.

In doing so, they expose some of the misconceptions pollsters have about those teams and about what it takes to win a championship.