/r/NBAanalytics

3,879 Subscribers

7

Finished our first version of a Player ELO system.

2 Comments
2024/12/23
02:45 UTC

6

Analytics of an extended 3PT line

TL;DR: How would extending the 3PT line and removing corner threes affect the analytical approach to the game, and would it actually improve the viewing experience?

The idea of extending the three-point line isn’t new, but it feels more relevant than ever. TV ratings are down, and many fans have grown frustrated with the high volume of threes in today’s game. For a long time, I thought extending the line was a shortsighted or overly simplistic solution, but I’ve started to reconsider.

If the NBA were to extend the three-point line, the corner three would likely end up out of bounds unless the court is widened. Alternatively, the corner three could simply be removed. Let’s assume for this discussion that the line is extended far enough (perhaps to 25 feet) to maintain its value without being overly enticing, and that corner threes are eliminated.

In such a league, what might change? Would we see a resurgence of more traditional roles, such as power forwards operating as a second big like Duncan beside Robinson? Would shot variety truly increase, increasing the value of plays like post-ups or midrange isolations?

One of my main concerns with the current game is the lack of shot diversity and the significant impact of shooting variance. It can be deflating to see a team lose purely because of poor shooting luck, as somewhat happened with OKC in the NBA Cup finals. While shooting variance has always been a part of the game’s charm, I’m not sure I want a league so heavily dictated by it.

Another concern is the potential link between today’s extreme spacing and the rise in player injuries. The constant movement and reliance on spacing might be putting undue strain on players. (Note that this is hypothetical and I am not a qualified expert on this matter.)

I grew up in the pace-and-space era, so I fully appreciate the appeal of the long ball. However, I wonder if we’ve reached a point where the emphasis on three-point shooting is diminishing the overall experience. A change like this could address some of these concerns, but I’d love to hear what others—especially from the stats and analytics community—think.

1 Comment
2024/12/21
16:26 UTC

3

Yahoo Fantasy API Access

Hello everyone, I wanted to play with the Yahoo Fantasy API just to see what I can make as a fun project to sharpen my JavaScript and Python skills. Unfortunately, I can’t gain access to the API itself. I have watched YouTube tutorials and followed the uberfastman wrapper instructions to no avail. Basically, I follow all the instructions and everything seems to go well, but the login to authenticate myself never pops up. I have gotten errors on errors and even tried the Docker solution for the uberfastman wrapper with no luck. Any advice would be greatly appreciated, and I would gladly share any more info needed. Thank you in advance.

0 Comments
2024/12/16
17:09 UTC

5

Is there a simple explanation for BPM?

I've even watched videos, and they don't want to just summarize the units used. I'm imagining that it's "the difference in the points the team scores that season, with and without the player on the court, adjusted to account for 100 possessions." The problem is that the video I saw seemed to say that this isn't it, because the narrator was unhappy that this allowed a starter to look better because he plays mostly with the starters, and other starters are also missing from the court at the same time. Of course the team suffers with the substitutes playing, and all the starters look wonderful overall. Maybe I'm misunderstanding even that.

3 Comments
2024/12/14
22:14 UTC

14

Sports Analytics Resume / Personal Projects

Hello, has anyone in this sub landed an internship or any job in the sports industry (preferably NBA) as a data scientist, basketball analytics assistant, or something among those roles on the operations side (not the business side), and is willing to share their resume or link some of the projects that helped land the job? I’m trying to strengthen my resume to help me get some callbacks.

9 Comments
2024/12/13
03:04 UTC

2

How might I reconcile the difference between my First Basket probability equations?

Hey guys, would like to start by saying I am absolutely no mathematician, if i'm just way off, please let me know. Also, when I refer to any sort of Field Goal, it's a first basket attempt. If the FG is not a first basket attempt, it's not factored in at all. To simplify, both equations are technically the same, but with one having more inputs, I'll start with the smaller one.

First Basket Implied Probability = p(c) + ((b * p)(1- c))

p = (Player total FGA / Team total FGA) * Player FG%. Player Implied Probability

  • If I've selected a specific shot value (FT, FG2, FG3): p = (Player FGxA / Team total FGA) * Player FGx%
    • here, x equals either a free throw, two point attempt, or three pointer.

c = (Team Center's Tip Win % + Opponent Center's Tip Loss %) / 2. Tip Win Rate

b = (Opponent FG Miss % + Team Defensive Stop %) / 2. Ball Back Chance

  • Defensive Stop of course means no score from the opposing team on their first basket attempts

Let's use Jaylen Brown's chance to score first basket against the grizzlies this evening, no specific shot value.

Jaylen has taken 7 of the Celtics' 26 total first basket attempts, making 3 of his 7.
p = (7/26) * 0.42857 = .1154 = 11.54%

I've selected Kristaps and JJJ as our centers. KP is 1-3 and JJJ is 11-3.
c = (1/4 + 3/14) / 2
= (.25 + .2143) / 2
= .2321 = 23.21%

The C's are only allowing 12/32 first basket attempts, while the Grizzlies are shooting 15/35.
b = (20/35 + 20/32) / 2
= (.5714 + .625) / 2
= .5982 = 59.82%

so First Basket Implied Probability = .1154(.2321) + ((.5982 * .1154)(1 - .2321))
= .0268 + (.069 * .7679)
= .0268 + .053
= .0798 = 7.98%
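For anyone who wants to plug in their own scenarios, the simple equation above can be sketched in Python roughly like this. The function and argument names are made up for illustration; the inputs are the first-basket counts quoted in this example.

```python
def simple_first_basket(player_fga, player_fgm, team_fga,
                        center_tip_wins, center_tips,
                        opp_center_tip_losses, opp_center_tips,
                        opp_fga, opp_fgm,
                        allowed_fga, allowed_fgm):
    """First Basket Implied Probability = p*c + (b*p)*(1 - c)."""
    # p: player's share of team first-basket attempts times his FG%
    p = (player_fga / team_fga) * (player_fgm / player_fga)
    # c: average of our center's tip win % and the opposing center's tip loss %
    c = (center_tip_wins / center_tips
         + opp_center_tip_losses / opp_center_tips) / 2
    # b: average of the opponent's miss % and our defensive stop %
    b = ((opp_fga - opp_fgm) / opp_fga
         + (allowed_fga - allowed_fgm) / allowed_fga) / 2
    return p * c + (b * p) * (1 - c)


# Jaylen 3/7 of the Celtics' 26 attempts, KP 1-3, JJJ 11-3,
# Grizzlies shooting 15/35, Celtics allowing 12/32.
prob = simple_first_basket(7, 3, 26, 1, 4, 3, 14, 35, 15, 32, 12)
print(f"{prob:.4f}")
```

Keeping the three inputs (p, c, b) as separate lines makes it easy to print each one and compare it against the hand calculation.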

Hopefully that wasn't entirely wrong. Onto the "drill-down" equation. It's the same thing fundamentally, but each variable has a bunch of sub variables now. We'll use the same game and scenario as our example. Again, all FG and FTs I'm referring to are first basket attempts. edit: I do have a separate route of code for if a specific basket is selected, but i'm already yappin enough so i'll leave the explanation of it out as it's not relevant in this example.

First Basket Implied Probability
= (PlayerImplied% * TipWin%) + ((BallBack% * PlayerImplied%) * (1 - TipWin%))

PlayerImplied% = (p * .8) + (opD * .2)
p = (Player FT% * (Player FTA/Team FTA) * (Team FTA/Team total Attempts))
+ (Player FG2% * (Player FG2A/Team FG2A) * (Team FG2A/Team total Attempts))
+ (Player FG3% * (Player FG3A/Team FG3A) * (Team FG3A/Team total Attempts))
opD = (against Opponent FT% * (Opponent FTA allowed/Opponent total Attempts allowed))
+ (against Opponent FG2% * (Opponent FG2A allowed/Opponent total Attempts allowed))
+ (against Opponent FG3% * (Opponent FG3A allowed/Opponent total Attempts allowed))

TipWin% = (Team Center's Tip Win% * weight) + (Opponent Center's Tip Loss% * (1 - weight))
weight = Team Center's total Tips / (Team Center's total Tips + Opponent Center's total Tips)

BallBack% = (teamD * .7) + (opOff * .3)
teamD = (Team forced FT Miss% * (Team FTA allowed/Team total Attempts allowed))
+ (Team forced FG2 Miss% * (Team FG2A allowed/Team total Attempts allowed))
+ (Team forced FG3 Miss% * (Team FG3A allowed/Team total Attempts allowed))
opOff = (Opponent FT miss% * (Opponent FTA/Opponent total Attempts))
+ (Opponent FG2 Miss * (Opponent FG2A/Opponent total Attempts))
+ (Opponent FG3 Miss * (Opponent FG3A/Opponent total Attempts))

This one will take a lot of yappin but let's get it. Start with PlayerImplied%

Jaylen is 1/5 on FG2 and 2/2 of FG3s; 7 total attempts. Celtics have 0 FTA, 11 FG2A, and 15 FG3A; 26 total attempts. The Grizzlies have allowed 0 FTA, 9 FG2As and 10 FG3A; 19 total allowed attempts. The Grizz opponents are shooting 5/9 from 2 and 3/10 from deep against them; 8/19 total.

p = (0 * 0 * 0) + (1/5 * 5/11 * 11/26) + (2/2 * 2/15 * 15/26)
= 0 + (.2 * .455 * .423) + (1 * .133 * .577)
= .0385 + .0769
= .1154 = 11.54%

opD = (0 * 0) + (5/9 * 9/19) + (3/10 * 10/19) This value is the opponents odds of allowing a basket
= 0 + (.56 * .4737) + (.3 * .5263)
= .2653 + .1579
= .4232 = 42.32%

PlayerImplied% = (.1154 * .8) + (.4232 * .2) = .1769 = 17.69%

Now onward to TipWin%. Same variables as before from up there, but i will repeat. I've selected Kristaps and JJJ as our centers. KP is 1-3 and JJJ is 11-3.

weight = 4 / (4 + 14) = 4/18 = .2222 = 22.22%

TipWin% = (1/4 * .2222) + (3/14 * (1 - .2222))
= (.25 * .2222) + (.2143 * .7778)
= .0556 + .1667 = .2223 = 22.23%
side note - that's weird... i did not expect it to equal the weight...
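On the side note: with this particular weight, the tip denominators cancel, so TipWin% reduces to (team center wins + opponent center losses) / (all tips by both centers). Here that is (1 + 3) / 18, which matches the weight 4 / 18 only because 1 + 3 happens to equal KP's 4 total tips. A small check, with made-up function names:

```python
def tip_win_pct(team_wins, team_tips, opp_losses, opp_tips):
    """Weighted TipWin% as defined in the drill-down equation."""
    weight = team_tips / (team_tips + opp_tips)
    return (team_wins / team_tips) * weight + (opp_losses / opp_tips) * (1 - weight)


def pooled_tip_win_pct(team_wins, team_tips, opp_losses, opp_tips):
    """Algebraically identical form: pooled favorable outcomes over pooled tips."""
    return (team_wins + opp_losses) / (team_tips + opp_tips)


# KP is 1-3 (4 tips); JJJ is 11-3 (14 tips, 3 losses).
kp_vs_jjj = tip_win_pct(1, 4, 3, 14)   # 4/18
weight = 4 / (4 + 14)                  # also 4/18, but only by coincidence

# Hypothetical 2-2 center against the same opponent: the result moves, the weight does not.
other = tip_win_pct(2, 4, 3, 14)       # 5/18
print(kp_vs_jjj, weight, other)
```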

And finally...BallBack%! Remember, the Cs are allowing 12/32 first baskets and the Grizzlies are shooting 15/35. The Celtics have allowed 2 FTAs, 20 FG2As and 10 FG3As. Their opponents have missed 0, 11 and 9 respectively. Simplified, opponents are 2/2 on FTs, 9/20 on FG2s and 1/10 on FG3s against the Celtics.

The Grizzlies have 1 FTA, 17 FG2As and 17 FG3As. We'll be looking at their miss %, so 0/1, 7/17, and 13/17 respectively.

teamD = 0 + (11/20 * 20/32) + (9/10 * 10/32)
= (.55 * .625) + (.9 * .3125)
= .3438 + .2813
= .625 = 62.5%

opOff = 0 + (7/17 * 17/35) + (13/17 * 17/35)
= (.4118 * .4857) + (.7647 * .4857)
= .2 + .3714 (im rounding up .199999999)
= .5714 = 57.14%

BallBack% = (.625 * .7) + (.5714 * .3)
= .4375 + .17142
= .6089 = 60.89%

let's put this all together, goodness that was a wall of text, apologies and thank you if you're still with me.
First Basket Implied% = (PlayerImplied% * TipWin%) + ((BallBack% * PlayerImplied%) * (1 - TipWin%))

(.1769 * .2223) + ((.6089 * .1769) * (1 - .2223))
= .0392 + (.1075 * .7777)
= .0392 + .0836
= .1228 = 12.28%
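Here is a rough Python sketch of the drill-down calculation with unrounded fractions, so rounding can be ruled out as the source of the gap. The helper names and dictionary layout are made up for illustration, and the .7/.3 BallBack weights are the ones used in the worked numbers above. Unrounded, PlayerImplied% comes out near .1765 and the final value is still about 12.28%. Most of the gap to the simple equation seems to come from PlayerImplied%: blending in 20% of opD, a team-level rate near 42%, lifts the 11.54% player share to roughly 17.7% before the tip and ball-back terms are applied.

```python
def player_share(player, team_attempts):
    """p: sum over shot types of FG% * (player att / team att) * (team att / team total)."""
    team_total = sum(team_attempts.values())
    p = 0.0
    for kind, (made, att) in player.items():
        if att == 0 or team_attempts[kind] == 0:
            continue  # no attempts of this type, so the term is 0
        p += (made / att) * (att / team_attempts[kind]) * (team_attempts[kind] / team_total)
    return p


def weighted_rate(lines):
    """Sum over shot types of rate * (attempts / total attempts)."""
    total = sum(att for _, att in lines.values())
    return sum((n / att) * (att / total) for n, att in lines.values() if att)


# (count, attempts) per shot type, taken from the numbers in this post.
jaylen = {"FT": (0, 0), "FG2": (1, 5), "FG3": (2, 2)}
celtics_attempts = {"FT": 0, "FG2": 11, "FG3": 15}
grizzlies_allowed = {"FT": (0, 0), "FG2": (5, 9), "FG3": (3, 10)}        # makes allowed
celtics_forced_misses = {"FT": (0, 2), "FG2": (11, 20), "FG3": (9, 10)}  # opponent misses
grizzlies_misses = {"FT": (0, 1), "FG2": (7, 17), "FG3": (13, 17)}

p = player_share(jaylen, celtics_attempts)
op_d = weighted_rate(grizzlies_allowed)
player_implied = p * 0.8 + op_d * 0.2
tip_win = (1 / 4) * (4 / 18) + (3 / 14) * (14 / 18)
ball_back = weighted_rate(celtics_forced_misses) * 0.7 + weighted_rate(grizzlies_misses) * 0.3
first_basket = player_implied * tip_win + (ball_back * player_implied) * (1 - tip_win)
print(f"p={p:.4f} PlayerImplied={player_implied:.4f} FirstBasket={first_basket:.4f}")
```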

So the first equation got me 7.98%, while the second equation got me 12.28%. While I would love to see bigger numbers, I'm not quite sure what to make of such a large difference. Of course the differences vary by scenario, but I feel as if the second equation is overstating each player's chance of making the first basket. There are probably some rounding errors in this post, as for some of the calculations I was just using a calculator and others were taken straight from debugging the code that generates this, so there shouldn't be much of a margin of error in that department.

Please let me know if you have any thoughts or feedback , or also if you have any scenarios you want me to plug in. Again, if you made it here, thank you!

11 Comments
2024/12/07
23:22 UTC

2

Analyzing NBA Players' Average Points in the First 3 Minutes

wassup everyone,

I’m working on a project to analyze NBA players' performance, specifically looking at their average points scored within the first 3 minutes of a game. I’m using data from Kaggle and would appreciate some help figuring out the best way to calculate this.

Here’s what I have so far:

  • I’ve downloaded player data, but I’m having trouble isolating the stats for just the first 3 minutes of each game.
  • I'm using R Studio, and I’m not sure how to approach extracting and aggregating the data specifically for this time frame.

If anyone has experience with similar analyses or knows how to filter data for this specific metric, I’d love to hear your thoughts and suggestions!

Thanks in advance!
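One possible approach, sketched in Python rather than R to show the logic only: filter play-by-play scoring events to the first period with at least 9:00 left on a 12-minute clock, sum points per player, and divide by games played. The field names (period, clock, player, points), the "MM:SS" clock format, and the sample rows are all assumptions; the actual Kaggle columns may differ.

```python
from collections import defaultdict


def seconds_remaining(clock):
    """Convert an 'MM:SS' period clock into seconds left in the period."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)


def avg_points_first_3_min(events, games_played):
    """Average points per game scored in the first 3 minutes of the 1st quarter."""
    cutoff = 12 * 60 - 3 * 60  # 9:00 left in a 12-minute quarter
    totals = defaultdict(int)
    for ev in events:
        if ev["period"] == 1 and seconds_remaining(ev["clock"]) >= cutoff:
            totals[ev["player"]] += ev["points"]
    # Divide by games played so games with no early points still count as zeros.
    return {player: totals[player] / n for player, n in games_played.items() if n}


# Made-up scoring events for illustration only.
events = [
    {"game_id": "G1", "player": "A", "period": 1, "clock": "11:20", "points": 2},
    {"game_id": "G1", "player": "A", "period": 1, "clock": "09:00", "points": 3},
    {"game_id": "G1", "player": "A", "period": 1, "clock": "08:59", "points": 2},
    {"game_id": "G2", "player": "A", "period": 1, "clock": "10:05", "points": 2},
    {"game_id": "G1", "player": "B", "period": 2, "clock": "11:30", "points": 3},
]
averages = avg_points_first_3_min(events, {"A": 2, "B": 1})
print(averages)
```

The same filter-then-aggregate shape carries over to R: filter on period and clock, group by player, then summarise.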

9 Comments
2024/12/05
06:00 UTC

3

Predicting Rebound Chances before 2013

I'm working on a project to determine the best rebounders since 2000. The NBA player tracking stats ( https://www.nba.com/stats/players/rebounding ) include a neat statistic called "Rebound Chances" dating back to 2013-14. From that season onward, I have been able to analyze the best and worst "rebounders above average" dividing rebounds by rebound chances.

Is there any way I can estimate rebound chances for players over the prior 13 seasons? I've developed a couple of regression models, but the errors, especially for the top rebounders, have been too large for my liking. I appreciate any ideas.

2 Comments
2024/11/29
23:17 UTC

3

Basketball related SaaS?

Hi all,

As the title says, does anyone have experience or success in SaaS within the sports industry?

I’ve been in SaaS for 8 years, working across different areas with experience in growth, product, marketing, and data. While I’ve enjoyed it, I haven’t yet found a product I’m truly passionate about.

I’m really into sports, especially basketball, and I feel like my skills could fit well in sports tech. I focus on full-funnel growth - customer journeys, experiments, optimizing onboarding, improving retention, refining pricing strategies, driving user acquisition, and more

Has anyone worked in the sports space? Whether it’s analytics, fantasy, or something else, I’d love to hear your experiences or recommendations. Thanks!

4 Comments
2024/11/27
16:55 UTC

11

Injury counts this season compared to previous seasons

Injury announcements this season felt much more excessive than previous years. Because of this feeling, I wanted to understand if there really was a difference, and how big it was if it existed.

I obtained injuries for the last twelve years and compared the weekly average to the weekly injury counts so far this season. Week four this season had 161 individual announcements, which is substantial compared to the previous years' average of around 114.

Note - I use the word "around" because I'm using loess regression to smooth and approximate a distribution, as opposed to calculating the mean.

8 Comments
2024/11/27
07:56 UTC

3

Player ELO scores

Maybe this group can help out. I have been wondering if, similar to chess, it is possible to compute ELO ratings for players in the NBA.

Starting from the simple premise that the only thing that matters to win in basketball is points, players increase their ELO if they are on the court when their team scores and decrease it if the opposing team scores. Their ELO increases more if they play against players who have high ELO scores and if they play with players who have low ELO scores.

Individual stats like points made, assists, etc. as well as the final score of the game do not directly influence the score for a player.

It's basically a refinement of +/-, but the ELO for a player is influenced by who they are on the court with, both on their own and on the opposing team. This means that a player with a negative +/- can still have a good score if he lifts the performance of his team enough compared to when he is not on the court.
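To make the idea concrete, here is a minimal sketch of one way such an update could work. It is not the script used for the ranking below: the standard chess-style expected-score formula, the 400 scale, the K factor, the zero starting rating, and scaling by points scored are all assumptions.

```python
from statistics import mean


def expected_score(rating_a, rating_b, scale=400.0):
    """Standard Elo expectation that side A 'wins' against side B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def update_on_points(ratings, scoring_five, conceding_five, points=1, k=1.0):
    """Apply one scoring event to the ten players on the court; returns the rating change."""
    avg_scoring = mean(ratings.get(p, 0.0) for p in scoring_five)
    avg_conceding = mean(ratings.get(p, 0.0) for p in conceding_five)
    # The lower the scoring lineup was rated relative to the other one, the bigger the gain.
    delta = k * points * (1.0 - expected_score(avg_scoring, avg_conceding))
    for p in scoring_five:
        ratings[p] = ratings.get(p, 0.0) + delta
    for p in conceding_five:
        ratings[p] = ratings.get(p, 0.0) - delta
    return delta


ratings = {}
home = ["h1", "h2", "h3", "h4", "h5"]   # hypothetical player ids
away = ["a1", "a2", "a3", "a4", "a5"]
first = update_on_points(ratings, home, away)
second = update_on_points(ratings, home, away)
print(first, second)
```

Because the change depends on lineup averages, a player gains more when his teammates are rated low and the opponents are rated high, which matches the description above.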

Running a simple script on the play by play data for the 2023/2024 season, I got this ranking (I am only listing players who were part of at least 2000 point events during the season). Scores for all players of the GSW are also below.

My hunch is that something like this has been tried before, but I was not able to find it online.
Any thoughts are welcome. If you have links to related work, that would be great.

Rank|Player|team|ELO|PlusMinus|

1|Jalen Brunson|NYK|164.989|523|
2|Domantas Sabonis|SAC|144.153|85|
3|Joel Embiid|PHI|131.135|311|
4|Stephen Curry|GSW|128.244|167|
5|Donovan Mitchell|CLE|123.504|324|
6|Paul George|LAC|120.999|435|
7|Nikola Jokic|DEN|120.41|693|
8|Luka Doncic|DAL|110.306|416|
9|Kyrie Irving|DAL|107.632|390|
10|Shai Gilgeous-Alexander|OKC|106.825|669|
11|OG Anunoby|NYK|105.868|392|
12|Bogdan Bogdanovic|ATL|103.405|124|
13|Sam Hauser|BOS|93.0653|582|
14|Fred VanVleet|HOU|89.457|183|
15|Anthony Edwards|MIN|87.967|503|
16|D'Angelo Russell|LAL|86.2392|239|
17|Franz Wagner|ORL|85.7235|234|
18|Jimmy Butler|MIA|84.7006|214|
19|Dereck Lively II|DAL|83.2346|242|
20|Rudy Gobert|MIN|82.3989|506|
21|Jose Alvarado|NOP|81.9439|240|
22|Josh Giddey|OKC|81.8426|416|
23|Victor Wembanyama|SAS|81.3141|-142|
24|Tyrese Maxey|PHI|80.3999|295|
25|Tyrese Haliburton|IND|80.3552|334|
26|LeBron James|LAL|79.9548|239|
27|Andre Drummond|CHI|79.2179|31|
28|Isaiah Joe|OKC|77.1368|364|
29|Deandre Ayton|POR|75.6101|-319|
30|Lauri Markkanen|UTA|75.1353|27|
31|Alperen Sengun|HOU|74.8062|49|
32|De'Aaron Fox|SAC|73.802|249|
33|Norman Powell|LAC|72.6751|189|
34|Al Horford|BOS|72.0673|563|
35|Jalen Williams|OKC|71.559|449|
36|Darius Garland|CLE|71.0129|37|
37|Maxi Kleber|DAL|70.8659|137|
38|Giannis Antetokounmpo|MIL|70.5533|337|
39|Derrick White|BOS|68.5541|688|
40|Jayson Tatum|BOS|65.2705|757|

Player|team|ELO|PlusMinus|

Stephen Curry|GSW|128.244|167|
Brandin Podziemski|GSW|60.9323|262|
Chris Paul|GSW|42.639|95|
Kevon Looney|GSW|36.7251|38|
Moses Moody|GSW|24.3353|99|
Draymond Green|GSW|14.0305|149|
Klay Thompson|GSW|11.1006|-3|
Jonathan Kuminga|GSW|-24.3173|103|
Dario Saric|GSW|-38.9761|-22|
Trayce Jackson-Davis|GSW|-46.9671|22|
Andrew Wiggins|GSW|-50.7264|-84|

4 Comments
2024/11/26
19:03 UTC

4

Jared McCain is 1 of only 3 starters in the NBA averaging at least 25 PPG, 5 APG, and 4 3PM, and he’s the only rookie doing it. He's also averaging 26 PPG as a starting rookie, which is the most since Michael Jordan. Finally, he's leading the Sixers in points and threes. (Per StatMuse)

0 Comments
2024/11/23
16:23 UTC

30

yooooooo found a few new NBA endpoints! Plus all the endpoints I use currently

I haven't found a new endpoint in forever, but here they are in all their glory:

Daily Lineups: https://stats.nba.com/js/data/leaders/00_daily_lineups_20241121.json
It looks like you can replace that date with anything less than or equal to today's date.

Player Transactions: https://stats.nba.com/js/data/playermovement/NBA_Player_Movement.json

edit: I just found the full regular and preseason schedule as well! https://cdn.nba.com/static/json/staticData/scheduleLeagueV2_1.json

My already known endpoints are:

Gambling Odds: https://cdn.nba.com/static/json/liveData/odds/odds_todaysGames.json
Today's Scoreboard (12pm EST refresh): https://cdn.nba.com/static/json/liveData/scoreboard/todaysScoreboard_00.json
*Play by Play: https://cdn.nba.com/static/json/liveData/playbyplay/playbyplay_0022400247.json
*Box Score: https://cdn.nba.com/static/json/liveData/boxscore/boxscore_0022400247.json
*Playoff Picture: https://stats.nba.com/stats/playoffbracket?LeagueID=00&SeasonYear=2024&State=0
https://stats.nba.com/stats/playoffbracket?LeagueID=00&SeasonYear=2024&State=1
https://stats.nba.com/stats/playoffbracket?LeagueID=00&SeasonYear=2024&State=2
Broadcasts: https://cdn.nba.com/static/json/liveData/channels/v2/channels_00.json

*Box Score and Play By Play: Replace 22400247 with the game_id of the desired game. For example, to view the Cavs/Celtics game from the other night, replace it with 22400021. In most cases, the game_id will be between 2__00001 and 2__01230; replace __ with the last two digits of the year the season starts in (24, 23, 22, etc.), as in the examples below. These two endpoints only go back to 2019-2020, I believe.
Some examples:
https://cdn.nba.com/static/json/liveData/boxscore/boxscore_0022400176.json
https://cdn.nba.com/static/json/liveData/playbyplay/playbyplay_0022400196.json
2023: https://cdn.nba.com/static/json/liveData/boxscore/boxscore_0022301170.json
2022: https://cdn.nba.com/static/json/liveData/playbyplay/playbyplay_0022200879.json
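If it helps, here is a small Python sketch for building these URLs. The helper names are made up, and the two-digit season code is treated as the year the season starts in, judging by the example URLs (0022400247 is a 2024-25 game). These are unofficial endpoints, so the User-Agent header in fetch_json is a guess at what might be needed, not a documented requirement.

```python
import json
from urllib.request import Request, urlopen

CDN = "https://cdn.nba.com/static/json/liveData"


def regular_season_game_id(season_start_year, game_number):
    """Build a 10-digit regular season game_id, e.g. (2024, 247) -> '0022400247'."""
    return f"002{season_start_year % 100:02d}{game_number:05d}"


def boxscore_url(game_id):
    return f"{CDN}/boxscore/boxscore_{game_id}.json"


def play_by_play_url(game_id):
    return f"{CDN}/playbyplay/playbyplay_{game_id}.json"


def daily_lineups_url(yyyymmdd):
    return f"https://stats.nba.com/js/data/leaders/00_daily_lineups_{yyyymmdd}.json"


def fetch_json(url, timeout=10):
    """Download and parse one endpoint (performs a network call)."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


game_id = regular_season_game_id(2024, 247)
print(boxscore_url(game_id))
print(play_by_play_url(game_id))
print(daily_lineups_url("20241121"))
```

Calling fetch_json(boxscore_url(game_id)) should then return the parsed JSON, assuming the endpoint is still reachable.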

*Playoff Picture: It's been a minute since i looked at these, so i can't quite recall the difference, but i know one stores the play-in data and one doesn't. State 2 shows everything, but not the play-in if i'm remembering correctly. These three endpoints will return data from 1970 to now.

If you have any questions, i'll try to answer best i can!

10 Comments
2024/11/21
22:40 UTC

2

Mann v Lamelo Analytics

After watching Hornets games, it's my suspicion that Mann plays better than LaMelo at PG. He has a better +/- and actively looks to get his teammates involved before trying to score. His defense is also better than LaMelo's. I was hoping members of this forum could help me generate analytics on this. I could be wrong, but my theory is that if the Hornets trade LaMelo, they will become a playoff contender and a genuine threat in the East.

1 Comment
2024/11/20
04:38 UTC

5

NBA Future Analytics Stars program

Hi everyone!

I'm pretty new to Reddit, so please excuse me if this is not the correct forum to post the question.

I know that last year the NBA organized the NBA Future Analytics Stars program, and I was wondering if they will run it this year too. Some months ago I checked their website:

https://pages.beamery.com/nbateamcareers/page/nba-future-analytics-stars-program

and it said that the application would be open from Nov 11 to Nov 22, but some weeks ago this info disappeared and now it doesn't say anything.

Does anyone here have further information about the program? Has the NBA discontinued it?

Thanks in advance!

0 Comments
2024/11/15
10:15 UTC

5

Where to find live in-game win probability charts?

Hello. I'd love to know where I can find live in-game win probability CHARTS of NBA games. I know there are sites that have those charts like 24 hours after a game is done. But I am looking for LIVE charts in-game (something like what baseball has with fangraphs or baseball savant.....but better because their chart UIs stink), NOT charts after the game is over. Extra points if there is a place that has live in-game charts that are log. Thanks

4 Comments
2024/11/15
00:19 UTC

7

Any way to get "with or without" stats of combinations of players?

For NHL there are sites where you can pick multiple players from a team and it would show you the stats when they play together, when they play without one of those players, without two of those players and so on. Is there a way to view this for NBA?

For example: This site for NHL

4 Comments
2024/11/13
09:34 UTC

6

Dashboard to view player shots over different seasons (1996 to now) with different situations, locations, shot types, etc in 3D.

Here's an example:

https://preview.redd.it/bupb3ewnuc0e1.png?width=3016&format=png&auto=webp&s=3dd1b712a5c047e9f524209b972c256abfe70a2e

There are a bunch of filters, plus some other graphs below them, for viewing trends and tendencies.

https://nbashotanalysis.streamlit.app/

4 Comments
2024/11/11
23:16 UTC

10

Quick dashboard and article I put together looking at who was clutch last year. Very subjective but an interesting way to look at things.

2 Comments
2024/11/08
15:55 UTC

4

Do we have any reliable RAPTOR formulation available?

RAPTOR from 538 was one of my favorite advanced metrics, and I know there are guys like Neil Paine who have created similar models. I just wonder if there is a formulation available so we could maybe rebuild it?

Cheers

5 Comments
2024/11/04
07:05 UTC

12

Lineup Optimization

Source: https://www.vacstats.com/lab

Hello Statheads,

I'm sharing this web app I created because it's a fun way to put together hypothetical lineups (e.g. Curry on the Lakers or a team of all Jokics), but I also have a larger idea for those interested.

Ultimately I would love to create a 538/kenpom-type site for the NBA, and am interested in anyone who would like to "join forces" or add onto what currently exists in this base site.

If anyone is interested, send a DM!

Also, feedback is appreciated if you feel so inclined to provide any.

5 Comments
2024/10/28
19:57 UTC

2

Dunksandthrees new dashboard

The new dashboard has a predictive section and has data for this year, but when you look at the shooting data from game 1 it doesn't align at all. So is the current 24/25 data just smoothed 23/24 data? And is the single-game EPM uncorrected info, and thus not indicative of game stats? https://dunksandthrees.com/epm

6 Comments
2024/10/25
19:37 UTC

6

Pushed NBA stats/betting model site live, would love feedback

Hi everyone. We finally have the basic features for www.sharpsresearch.com live

It's pretty bare-bones at the moment, with a lot of stuff we are still working on.

Right now it has 4 features when viewing a match

Moneyline prediction

  • basic prediction on who will win the game. We trained the model with 13 features on 2008 - present games.

Starting lineup strengths

  • We trained a bunch of models on starting lineups. We used the regression coefficients of the top 5 features from the models and multiplied and summed them up for each player.

Similarity search

  • This is pretty cool. We scan all the historical games and look for the 10 most similar games to the matchup that is loaded. It's basically a cosine similarity + k-nearest neighbours algo.

Daily updated NBA elos (/nba/datasets).

  • Our own engineered Elo.
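For readers curious about the similarity search, the general technique (cosine similarity plus a top-k nearest-neighbour lookup) can be sketched as below. This is a generic illustration, not the site's code: the feature vectors, game ids, and function names are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar_games(query, history, k=10):
    """Return the k (similarity, game_id) pairs closest to the query vector."""
    scored = [(cosine_similarity(query, vec), game_id) for game_id, vec in history.items()]
    scored.sort(reverse=True)  # highest similarity first
    return scored[:k]


# Made-up two-feature vectors for three historical games.
history = {"g1": [1.0, 0.0], "g2": [0.9, 0.1], "g3": [0.0, 1.0]}
top = most_similar_games([1.0, 0.0], history, k=2)
print(top)
```

In practice the features would usually be standardized first, since cosine similarity is sensitive to features on very different scales.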

Right now I'm working on

  • o/u models
  • spreads
  • model breakdowns (so users can see the calibration, confusion matrix etc)

Thanks to the community here. I've definitely learned from a few of you.

6 Comments
2024/10/23
01:51 UTC

2

Ahead of the Game: Intro to Points Over Expected

Hey yall, I had Jackson McGuire on my podcast today to talk about his new player metric, Points Over Expected. What sets this metric apart from most others is that it takes into account how much a player is being paid. Hope yall enjoy

0 Comments
2024/10/20
19:00 UTC

6

Why do the advanced metrics hate Jalen Johnson?

JJ is projected by lots of outlets to be on the MIP radar. The Greek chorus is basically unanimous that he is good at basketball. You can watch him play and see he is good at lots of things. So my question is: why do BPM, LEBRON, EPM, and literally any metric that tries to better estimate performance see him as just fine? What's the deal?

19 Comments
2024/10/20
15:36 UTC

5

Beginner coder

Hey everyone! I really want to start on some NBA analytics projects on my own, but it seems the best way to go about this is coding against an API instead of copying and pasting into Excel. Realistically, how long will it take me to get a basic understanding of coding so I can start to mess around with stats and have some fun?

3 Comments
2024/10/16
02:20 UTC

4

Question about calculation of PER and PIE

Hello, everybody,

I know these advanced stats are far from perfect, but that's ok, I'm just playing with them as part of creating stats for smaller European competitions.

The PER calculation is made up of several parts, I'll try to simplify it.

uPER = (1 / MP) *
[ 3P
+ (2/3) * AST
+ (2 - factor * (team_AST / team_FG)) * FG
+ (FT * 0.5 * (1 + (1 - (team_AST / team_FG)) + (2/3) * (team_AST / team_FG)))
- VOP * TOV
- VOP * DRB% * (FGA - FG)
- VOP * 0.44 * (0.44 + (0.56 * DRB%)) * (FTA - FT)
+ VOP * (1 - DRB%) * (TRB - ORB)
+ VOP * DRB% * ORB
+ VOP * STL
+ VOP * DRB% * BLK
- PF * ((lg_FT / lg_PF) - 0.44 * (lg_FTA / lg_PF) * VOP) ]

I want to ask what exactly some of the parameters in this formula mean:

  • team_AST - is this the total number of ASTs for a particular team for the entire season, or is it the team's average per game?
  • lg_FT - is this the total number of FTs of all teams in the league, or is it an average per game or an average per team?
  • Parameters for players (FGA / ORB...) - I assume this is a player's total for the whole season?

Then a question regarding PIE, which uses this formula:

(PTS + FGM + FTM - FGA - FTA + DREB + (.5 * OREB) + AST + STL + (.5 * BLK) - PF - TO) / (GmPTS + GmFGM + GmFTM - GmFGA - GmFTA + GmDREB + (.5 * GmOREB) + GmAST + GmSTL + (.5 * GmBLK) - GmPF - GmTO)

  • The PTS figure is the total number of points a player scored in a season?
  • The figure for Gm (e.g. GmPTS) means what exactly? The average number of points per game?
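As a sketch of how the PIE formula can be coded, the same expression is applied once to the player's line and once to the Gm line, with GmFGM counted once. It assumes the Gm-prefixed terms are totals for both teams over the same games the player's totals cover; because PIE is a ratio, it should not matter whether both lines are single-game or summed over a season, as long as they cover the same games. The stat lines below are invented for illustration.

```python
def pie_terms(s):
    """Expression shared by the numerator and denominator of the PIE formula."""
    return (s["PTS"] + s["FGM"] + s["FTM"] - s["FGA"] - s["FTA"]
            + s["DREB"] + 0.5 * s["OREB"] + s["AST"] + s["STL"]
            + 0.5 * s["BLK"] - s["PF"] - s["TO"])


def pie(player, game):
    """Player's share of the game totals; `game` holds the Gm-prefixed values."""
    return pie_terms(player) / pie_terms(game)


# Invented numbers, not real box scores.
player_line = {"PTS": 20, "FGM": 8, "FTM": 2, "FGA": 15, "FTA": 3, "DREB": 5,
               "OREB": 2, "AST": 4, "STL": 1, "BLK": 2, "PF": 3, "TO": 2}
game_line = {"PTS": 200, "FGM": 75, "FTM": 35, "FGA": 170, "FTA": 45, "DREB": 70,
             "OREB": 20, "AST": 45, "STL": 15, "BLK": 10, "PF": 40, "TO": 25}
value = pie(player_line, game_line)
print(f"{value:.4f}")
```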

Thank you for your help.

9 Comments
2024/10/06
16:56 UTC

12

NBA Dataset to run SQL queries

Is there an NBA dataset available to run sql queries on? The one on kaggle by Wyatt doesn't seem up to date, unless I'm doing something wrong there. Thanks!

7 Comments
2024/10/02
20:34 UTC

2

Player Impact Estimate for download?

Does anyone know where you can download PIE (Player Impact Estimate) for individual players? Doesn't seem to be on basketball-reference

13 Comments
2024/09/27
07:15 UTC

25

Do NBA Draft Combine Metrics Predict NBA Success?

edit: looks like the pics/visualizations aren’t showing up in this post on mobile for some reason, but you can see them here: https://www.formulabot.com/blog/do-nba-draft-combine-metrics-predict-nba-success

Kevin Durant, one of the greatest scorers in NBA history, famously couldn't put up a single rep of 185 on the bench at the combine. That begs the question--do combine metrics matter? Do they meaningfully predict NBA success in any way?

(Spoiler alert: not really)

Methodology

Data collection:

  • Combine metrics: I used Python to scrape combine results from 2000-2023 from NBA.com, narrowing down the metrics to max vertical leap, lane agility time, three-quarter court sprint, and bench press. I also wanted to include wingspan and weight, so I calculated wingspan and weight ratios to adjust for height confounding.
  • NBA success: I decided to operationalize NBA "success" via Bball Index's all-in-one advanced impact metrics, LEBRON, which is further broken down into O-LEBRON and D-LEBRON for offensive and defensive impact, respectively. I scraped all 3 in R to use as outcome variables in my analyses.
  • Data pre-processing was conducted in R.

Analyses:

  • I ran linear regression analyses predicting all 3 outcomes from all 6 combine metrics individually (total of 18 models)
  • I then broke down each analysis by position for a total of 90 models.
  • I also ran a random forest model predicting the 3 outcomes from all 6 combine metrics combined.
  • All analyses were conducted using Formula Bot's chat feature. You can view the chat log here.

Results

Linear regression analyses (all positions):

After adjusting for multiple comparisons, only D-LEBRON was significantly associated with select metrics:

https://preview.redd.it/don1ir7rdsqd1.png?width=1300&format=png&auto=webp&s=f828029d6523e1032f68f091e6c7d3546c26c8f1

Surprisingly, vertical leap was negatively associated with D-LEBRON, while slower lane agility and three-quarter court sprint times were positively associated with D-LEBRON.
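For readers unfamiliar with the adjustment for multiple comparisons mentioned above, here is a minimal sketch of two common corrections, Bonferroni and Benjamini-Hochberg. The post does not say which method was used, so both are shown purely as illustrations, and the p-values are invented.

```python
def bonferroni(pvals):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (controls the false discovery rate)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented p-values, e.g. from four of the single-metric regressions.
raw = [0.01, 0.04, 0.03, 0.005]
bonf = bonferroni(raw)
bh = benjamini_hochberg(raw)
print(bonf)
print(bh)
```

With 18 or 90 models, Bonferroni gets strict quickly, which fits with few effects surviving the adjustment.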

Linear regression analyses (by position):

After adjusting for multiple comparisons, no single regression was significant due to small sample sizes and low statistical power.

https://preview.redd.it/ta6j9vwvesqd1.png?width=1300&format=png&auto=webp&s=eb215911111279f7e8ec4370fc248817c3c269dc

But if we ignore multiple comparison adjustments, there were some interesting significant effects:

  • Three-quarter court sprint time was negatively associated with both LEBRON and O-LEBRON (i.e., quicker times, higher LEBRON) for point guards only (not pictured above). The effect size for O-LEBRON was the largest in our entire dataset at -0.38.

  • Wingspan ratio was positively associated with D-LEBRON for power forwards and especially centers. The effect size for centers was 0.14, which was larger than the effect for any other position.

Here's a more in-depth visualization of the latter effect:

https://preview.redd.it/8lb3fpt8fsqd1.png?width=1300&format=png&auto=webp&s=b8828094cd7a16f8132c97db655e51938fbc17ef

Random forest models:

https://preview.redd.it/fbkoz7kdfsqd1.png?width=1228&format=png&auto=webp&s=9edde8eccaa8a66a64464e51b68e920be7c32bec

The LEBRON and O-LEBRON models were terrible fits (i.e., no meaningful prediction), but the D-LEBRON model had a decent fit, with all 6 combine metrics collectively explaining around 8% of the variance in defensive impact.

Takeaways

  • For offense, three-quarter sprint speed is the only metric that might reliably translate to NBA success—but only for point guards.
  • For defense, all metrics combined provide a little bit of predictive utility, explaining about 8% of the total variance in D-LEBRON.
    • Looking at the metrics individually, slow lane agility times and a high weight ratio seem to be the most important overall for D-LEBRON, although there are inconsistent effects (some positive, some negative) depending on position.
  • Wingspan ratio is the only metric with a consistent positive association with D-LEBRON across all positions. The effect is especially pronounced for centers.

A more in-depth write-up of my analyses and findings is available here: https://www.formulabot.com/blog/do-nba-draft-combine-metrics-predict-nba-success

5 Comments
2024/09/24
17:02 UTC
