File size: 2,738 Bytes
5f26252
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
# NBA Predictions

This repo contains AI model code and weights which predicts the outcome of NBA games. Its output represents the chance that a given point spread will occur.

The model requires 8 players on the home and away teams, plus their ages, as input. It will then output probabilities for each point spread between -20 and +20 points, from the home team's point of view.

For example, the following text and chart shows the model predicting the home team with a 77% chance to win and a 14% chance of winning by 20 or more points. This kind of chart is indicative of a dominant team playing at home. Most games will have more of a bell curve shape to them.

![NBA prediction graph](prediction.png)

## Installation

I recommend installing Python 3.11.8, as that is what the repo was written / tested in. The code will likely work with most recent versions of Python, though.

Once you have Python installed, run `pip install -r requirements.txt`. It will take a while to install dependencies if you don't already have PyTorch cached.

## Usage

The `example.ipynb` notebook shows how to use the model to predict the final game of the 2023-24 NBA season - a game between the Dallas Mavericks and Boston Celtics. It will output the chart above.

To change the players and their ages, you must reference the `player_tokens.csv` and `age_tokens.csv` files.

For example, if you wanted to subtract Kristaps Porzingis from Boston's team and swap who was home / away, you would take the token representing Porzingis `4416` out of the `home_team_tokens` list, and replace him with, say, Payton Pritchard `4999`. You would then have to look up Pritchard's age (26), find the corresponding age token in `age_tokens.csv`, which is `11`, and replace Porzingis' age token (which is the second to last token).

To swap home and away, you could replace the variables containing all of the player and age tokens, or just set the `swap_home_away` variable to `True`. The results are as follows:

![NBA Finals prediction without Porzingis](porzingis-swapped-for-pritchard.png)

As you can see, Dallas' win probability improved from 23% to 35%, and their chance of being blown out by 20+ points decreased from 14% to 10%. Clearly, the model thinks Porzingis is important to the Celtics' chances, but still considers Boston to be the superior team without him.

## Training Process

I downloaded data from stats.nba.com using the [https://github.com/swar/nba_api](swar/nba_api) package to get information on minutes played, game outcomes, and a few other dimensional elements to make everything fit together. Then, I ran a custom PyTorch training loop to train the model(s) on their chosen loss objective (spread, money line, or spread probability).