/r/NBAanalytics
For NBA Statheads.
TL;DR: How would extending the 3PT line and removing corner threes affect the analytical approach to the game, and would it actually improve the viewing experience?
The idea of extending the three-point line isn’t new, but it feels more relevant than ever. TV ratings are down, and many fans have grown frustrated with the high volume of threes in today’s game. For a long time, I thought extending the line was a shortsighted or overly simplistic solution, but I’ve started to reconsider.
If the NBA were to extend the three-point line, the corner three would likely end up out of bounds unless the court is widened. Alternatively, the corner three could simply be removed. Let’s assume for this discussion that the line is extended far enough (perhaps to 25 feet) to maintain its value without being overly enticing, and that corner threes are eliminated.
In such a league, what might change? Would we see a resurgence of more traditional roles, such as power forwards operating as a second big like Duncan beside Robinson? Would shot variety truly increase, raising the value of plays like post-ups or midrange isolations?
One of my main concerns with the current game is the lack of shot diversity and the significant impact of shooting variance. It can be deflating to see a team lose purely because of poor shooting luck, as arguably happened with OKC in the NBA Cup final. While shooting variance has always been a part of the game’s charm, I’m not sure I want a league so heavily dictated by it.
Another concern is the potential link between today’s extreme spacing and the rise in player injuries. The constant movement and reliance on spacing might be putting undue strain on players. (Note that this is hypothetical and I am not a qualified expert on this matter.)
I grew up in the pace-and-space era, so I fully appreciate the appeal of the long ball. However, I wonder if we’ve reached a point where the emphasis on three-point shooting is diminishing the overall experience. A change like this could address some of these concerns, but I’d love to hear what others—especially from the stats and analytics community—think.
Hello everyone. I wanted to play with the Yahoo Fantasy API as a fun project to sharpen my JavaScript and Python skills. Unfortunately, I can’t gain access to the API itself. I have watched YouTube tutorials and followed the instructions for the uberfastman wrapper, to no avail. I follow every step and everything seems to go well, but the login prompt to authenticate myself never pops up. I have gotten error after error and even tried the Docker setup for the uberfastman wrapper, with no luck. Any advice would be greatly appreciated, and I'm happy to share any more info that's needed. Thank you in advance.
I've even watched videos, and none of them simply state the units used. I'm imagining that it's "the difference in the points the team scores that season, with and without the player on the court, adjusted per 100 possessions." The problem is that the video I saw seemed to say that this isn't it, because the narrator was unhappy that this allows a starter to look better simply because he plays mostly with the other starters, who also tend to be off the court at the same time he is. Of course the team suffers with the substitutes playing, and all the starters look wonderful overall. Maybe I'm misunderstanding even that.
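For reference, the raw on/off reading guessed at above can be sketched as follows. This is only one common interpretation (net points per 100 possessions, on court minus off court), and the stint records and field names (`lineup`, `pts_for`, `pts_against`, `poss`) are hypothetical, not from any particular data source.

```python
def net_rating(points_for, points_against, possessions):
    """Point differential per 100 possessions."""
    return 100.0 * (points_for - points_against) / possessions


def on_off(stints, player):
    """Team net rating with the player on the court minus net rating with him off.

    Returns None if the player was always on or always off the court.
    """
    on = [s for s in stints if player in s["lineup"]]
    off = [s for s in stints if player not in s["lineup"]]
    if not on or not off:
        return None

    def rating(rows):
        return net_rating(
            sum(r["pts_for"] for r in rows),
            sum(r["pts_against"] for r in rows),
            sum(r["poss"] for r in rows),
        )

    return rating(on) - rating(off)


# Made-up stints for one team, purely for illustration.
stints = [
    {"lineup": {"A", "B"}, "pts_for": 60, "pts_against": 50, "poss": 50},
    {"lineup": {"B", "C"}, "pts_for": 40, "pts_against": 45, "poss": 50},
]
print(on_off(stints, "A"))  # 30.0
```

Note that nothing here adjusts for who else is on the floor, which is exactly the starters-with-starters problem the narrator was complaining about.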
Hello, has anyone in this sub landed an internship or any job in the sports industry (preferably NBA) as a data scientist, basketball analytics assistant, or something similar on the operations side (not the business side), and is willing to share their resume or link some of the projects that helped land the job? I’m trying to strengthen my resume to help me get some callbacks.
Hey guys, would like to start by saying I am absolutely no mathematician, if i'm just way off, please let me know. Also, when I refer to any sort of Field Goal, it's a first basket attempt. If the FG is not a first basket attempt, it's not factored in at all. To simplify, both equations are technically the same, but with one having more inputs, I'll start with the smaller one.
First Basket Implied Probability = p(c) + ((b * p)(1- c))
p = Player Implied Probability = (Player total FGA / Team total FGA) * Player FG%
c = Tip Win Rate = (Team Center's Tip Win % + Opponent Center's Tip Loss %) / 2
b = Ball Back Chance = (Opponent FG Miss % + Team Defensive Stop %) / 2
Let's use Jaylen Brown's chance to score first basket against the grizzlies this evening, no specific shot value.
Jaylen has taken 7 of the Celtics' 26 total attempts, making 3 of his 7.
p = (7/26) * 0.42857 = .1154 = 11.54%
I've selected Kristaps and JJJ as our centers. KP is 1-3 and JJJ is 11-3.
c = (1/4 + 3/14) / 2
= (.25 + .2143) / 2
= .2321 = 23.21%
The C's are only allowing 12/32 first basket attempts, while the Grizzlies are shooting 15/35.
b = (20/35 + 20/32) / 2
= (.5714 + .625) / 2
= .5982 = 59.82%
so First Basket Implied Probability = .1154(.2321) + ((.5982 * .1154)(1 - .2321))
= .0268 + (.069 * .7679)
= .0268 + .053
= .0798 = 7.98%
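A minimal sketch of the smaller equation in Python, using the same Jaylen Brown inputs as above. The function and variable names are just for illustration and are not taken from the code that generated these numbers.

```python
def first_basket_probability(p, c, b):
    """p*c + (b*p)*(1 - c): score off a won tip, or off the ball coming back."""
    return p * c + (b * p) * (1 - c)


p = (7 / 26) * (3 / 7)        # player share of attempts * player FG%
c = (1 / 4 + 3 / 14) / 2      # KP tip win % averaged with JJJ tip loss %
b = (20 / 35 + 20 / 32) / 2   # Grizzlies miss % averaged with Celtics stop %

prob = first_basket_probability(p, c, b)
print(round(prob, 4))  # 0.0798
```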
Hopefully that wasn't entirely wrong. Onto the "drill-down" equation. It's the same thing fundamentally, but each variable has a bunch of sub variables now. We'll use the same game and scenario as our example. Again, all FG and FTs I'm referring to are first basket attempts. edit: I do have a separate route of code for if a specific basket is selected, but i'm already yappin enough so i'll leave the explanation of it out as it's not relevant in this example.
First Basket Implied Probability
= (PlayerImplied% * TipWin%) + ((BallBack% * PlayerImplied%) * (1 - TipWin%))
PlayerImplied% = (p * .8) + (opD * .2)
p = (Player FT% * (Player FTA/Team FTA) * (Team FTA/Team total Attempts))
+ (Player FG2% * (Player FG2A/Team FG2A) * (Team FG2A/Team total Attempts))
+ (Player FG3% * (Player FG3A/Team FG3A) * (Team FG3A/Team total Attempts))
opD = (against Opponent FT% * (Opponent FTA allowed/Opponent total Attempts allowed))
+ (against Opponent FG2% * (Opponent FG2A allowed/Opponent total Attempts allowed))
+ (against Opponent FG3% * (Opponent FG3A allowed/Opponent total Attempts allowed))
TipWin% = (Team Center's Tip Win% * weight) + (Opponent Center's Tip Loss% * (1 - weight))
weight = Team Center's total Tips / (Team Center's total Tips + Opponent Center's total Tips)
BallBack% = (teamD * .7) + (opOff * .3)
teamD = (Team forced FT Miss% * (Team FTA allowed/Team total Attempts allowed))
+ (Team forced FG2 Miss% * (Team FG2A allowed/Team total Attempts allowed))
+ (Team forced FG3 Miss% * (Team FG3A allowed/Team total Attempts allowed))
opOff = (Opponent FT miss% * (Opponent FTA/Opponent total Attempts))
+ (Opponent FG2 Miss * (Opponent FG2A/Opponent total Attempts))
+ (Opponent FG3 Miss * (Opponent FG3A/Opponent total Attempts))
This one will take a lot of yappin but let's get it. Start with PlayerImplied%
Jaylen is 1/5 on FG2 and 2/2 of FG3s; 7 total attempts. Celtics have 0 FTA, 11 FG2A, and 15 FG3A; 26 total attempts. The Grizzlies have allowed 0 FTA, 9 FG2As and 10 FG3A; 19 total allowed attempts. The Grizz opponents are shooting 5/9 from 2 and 3/10 from deep against them; 8/19 total.
p = (0 * 0 * 0) + (1/5 * 5/11 * 11/26) + (2/2 * 2/15 * 15/26)
= 0 + (.2 * .455 * .423) + (1 * .133 * .577)
= .0385 + .0769
= .1154 = 11.54%
opD = (0 * 0) + (5/9 * 9/19) + (3/10 * 10/19) This value is the opponents odds of allowing a basket
= 0 + (.56 * .4737) + (.3 * .5263)
= .2653 + .1579
= .4232 = 42.32%
PlayerImplied% = (.1154 * .8) + (.4232 * .2) = .1769 = 17.69%
Now onward to TipWin%. Same variables as before from up there, but i will repeat. I've selected Kristaps and JJJ as our centers. KP is 1-3 and JJJ is 11-3.
weight = 4 / (4 + 14) = 4/18 = .2222 = 22.22%
TipWin% = (1/4 * .2222) + (3/14 * (1 - .2222))
= (.25 * .2222) + (.2143 * .7778)
= .0556 + .1667 = .2223 = 22.23%
side note - that's weird... i did not expect it to equal the weight...
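On that side note: as far as I can tell this is a coincidence of these particular records rather than a bug. Writing W and L for tips won and lost and T for total tips (shorthand introduced here, not part of the formulas above), the weighted average collapses to a pooled rate:

```latex
\begin{aligned}
\text{TipWin\%}
  &= \frac{W_{team}}{T_{team}} \cdot \frac{T_{team}}{T_{team}+T_{opp}}
   + \frac{L_{opp}}{T_{opp}} \cdot \frac{T_{opp}}{T_{team}+T_{opp}} \\
  &= \frac{W_{team}+L_{opp}}{T_{team}+T_{opp}}
   = \frac{1+3}{4+14} = \frac{4}{18} \approx .2222
\end{aligned}
```

The weight is also 4/18 only because KP's wins plus JJJ's losses (1 + 3) happen to equal KP's total tips (4).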
And finally...BallBack%! Remember, the Cs are allowing 12/32 first baskets and the Grizzlies are shooting 15/35. The Celtics have allowed 2 FTAs, 20 FG2As and 10 FG3As. Their opponents have missed 0, 11 and 9 respectively. Simplified, opponents are 2/2 on FTs, 9/20 on FG2s and 1/10 on FG3s against the Celtics.
The Grizzlies have 1 FTA, 17 FG2As and 17 FG3As. We'll be looking at their miss %, so 0/1, 7/17, and 13/17 respectively.
teamD = 0 + (11/20 * 20/32) + (9/10 * 10/32)
= (.55 * .625) + (.9 * .3125)
= .3438 + .2813
= .625 = 62.5%
opOff = 0 + (7/17 * 17/35) + (13/17 * 17/35)
= (.4117 * .4857) + (.7647 * .4857)
= .2 + .3714 (im rounding up .199999999)
= .5714 = 57.14%
BallBack% = (.625 * .7) + (.5714 * .3)
= .4375 + .17142
= .6089 = 60.89%
let's put this all together, goodness that was a wall of text, apologies and thank you if you're still with me.
First Basket Implied% = (PlayerImplied% * TipWin%) + ((BallBack% * PlayerImplied%) * (1 - TipWin%))
(.1769 * .2223) + ((.6089 * .1769) * (1 - .2223))
= .0392 + (.1075 * .7777)
= .0392 + .0836
= .1228 = 12.28%
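The drill-down can be sketched the same way, with exact fractions instead of hand rounding. This is a sketch under a few assumptions: the helper `weighted_rate` is a hypothetical name, the BallBack% blend uses the .7 and .3 weights from the worked step above, and p is written in its cancelled form to avoid dividing by the Celtics' 0 FTA.

```python
def weighted_rate(rates, attempts):
    """Blend per-shot-type rates by each type's share of total attempts."""
    total = sum(attempts)
    return sum(rate * att / total for rate, att in zip(rates, attempts))


# Shot types are ordered FT, FG2, FG3 throughout.
# p: (player att / team att of type) * (team att of type / team total)
# cancels to player att / team total, so no division by a 0-FTA team.
player_pct = [0.0, 1 / 5, 2 / 2]
player_att = [0, 5, 2]
team_total_att = 26
p = sum(pct * att / team_total_att for pct, att in zip(player_pct, player_att))

op_d = weighted_rate([0.0, 5 / 9, 3 / 10], [0, 9, 10])
player_implied = 0.8 * p + 0.2 * op_d

weight = 4 / (4 + 14)
tip_win = (1 / 4) * weight + (3 / 14) * (1 - weight)

team_d = weighted_rate([0 / 2, 11 / 20, 9 / 10], [2, 20, 10])
op_off = weighted_rate([0 / 1, 7 / 17, 13 / 17], [1, 17, 17])
ball_back = 0.7 * team_d + 0.3 * op_off

first_basket = player_implied * tip_win + (ball_back * player_implied) * (1 - tip_win)
print(round(first_basket, 4))  # 0.1228
```

One thing the sketch makes visible: each attempt-weighted blend reduces to the overall rate (teamD is just 20/32, opOff is 20/35, opD is 8/19), so the per-shot-type breakdown only changes the result through the .8/.2 and .7/.3 weights.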
So the first equation got me 7.98%, while the second equation got me 12.28%. While i would love to see bigger numbers, I'm not quite sure what to make of such a large difference. Of course the differences vary by scenario, but i feel as the second equation is overstating each player's percentage at making the first basket. There are probably some rounding errors in this post as for some of the calculations i was just using a calculator, and others were taken straight from when i was debugging my code that generates this, shouldn't be much of a margin of error in that department.
Please let me know if you have any thoughts or feedback , or also if you have any scenarios you want me to plug in. Again, if you made it here, thank you!
wassup everyone,
I’m working on a project to analyze NBA players' performance, specifically looking at their average points scored within the first 3 minutes of a game. I’m using data from Kaggle and would appreciate some help figuring out the best way to calculate this.
Here’s what I have so far:
If anyone has experience with similar analyses or knows how to filter data for this specific metric, I’d love to hear your thoughts and suggestions!
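One possible way to do the filtering, as a rough sketch: it assumes play-by-play rows with hypothetical fields `game_id`, `period`, `clock` (time remaining in the quarter as "MM:SS"), `player`, and `points`. The actual Kaggle column names will differ, and averaging over every game in which the player appears in the data is also an assumption.

```python
from collections import defaultdict


def clock_to_seconds(clock):
    """Convert an 'MM:SS' time-remaining string to seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)


def avg_points_first_3_min(rows, window_seconds=180, quarter_seconds=720):
    """Average points per game scored in the first `window_seconds` of period 1."""
    early_points = defaultdict(int)
    games_seen = defaultdict(set)
    for row in rows:
        games_seen[row["player"]].add(row["game_id"])
        elapsed = quarter_seconds - clock_to_seconds(row["clock"])
        if row["period"] == 1 and elapsed <= window_seconds:
            early_points[row["player"]] += row["points"]
    return {p: early_points[p] / len(games) for p, games in games_seen.items()}


# Made-up scoring events, purely for illustration.
rows = [
    {"game_id": "g1", "period": 1, "clock": "10:30", "player": "A", "points": 2},
    {"game_id": "g1", "period": 1, "clock": "09:00", "player": "A", "points": 3},
    {"game_id": "g1", "period": 1, "clock": "08:59", "player": "A", "points": 2},
    {"game_id": "g2", "period": 2, "clock": "11:00", "player": "A", "points": 2},
    {"game_id": "g2", "period": 1, "clock": "11:40", "player": "B", "points": 3},
]
print(avg_points_first_3_min(rows))  # {'A': 2.5, 'B': 3.0}
```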
Thanks in advance!
I'm working on a project to determine the best rebounders since 2000. The NBA player tracking stats ( https://www.nba.com/stats/players/rebounding ) include a neat statistic called "Rebound Chances" dating back to 2013-14. From that season onward, I have been able to analyze the best and worst "rebounders above average" dividing rebounds by rebound chances.
Is there any way I can estimate rebound chances for players over the prior 13 seasons? I've developed a couple of regression models, but the errors, especially for the top rebounders, have been too large for my liking. I appreciate any ideas.
Hi all,
As the title says, does anyone have experience or success in SaaS within the sports industry?
I’ve been in SaaS for 8 years, working across different areas with experience in growth, product, marketing, and data. While I’ve enjoyed it, I haven’t yet found a product I’m truly passionate about.
I’m really into sports, especially basketball, and I feel like my skills could fit well in sports tech. I focus on full-funnel growth - customer journeys, experiments, optimizing onboarding, improving retention, refining pricing strategies, driving user acquisition, and more
Has anyone worked in the sports space? Whether it’s analytics, fantasy, or something else, I’d love to hear your experiences or recommendations. Thanks!
Injury announcements this season felt much more excessive than previous years. Because of this feeling, I wanted to understand if there really was a difference, and how big it was if it existed.
I obtained injury announcements for the last twelve years and compared the weekly average to weekly injury counts this season so far. Week four this season had 161 individual announcements, which, compared to the previous years' average of around 114, is substantial.
Note - I use the word "around" because I'm using loess regression to smooth and approximate a distribution, as opposed to calculating the mean.
Maybe this group can help out. I have been wondering if, similar to chess, it is possible to compute Elo ratings for players in the NBA.
Starting from the simple premise that the only thing that matters for winning in basketball is points, players gain Elo if they are on the court when their team scores and lose Elo if the opposing team scores. Their Elo increases more if they play against players who have high Elo scores and if they play with players who have low Elo scores.
Individual stats like points made, assists, etc. as well as the final score of the game do not directly influence the score for a player.
It's basically a refinement of +/-, but a player's Elo is influenced by who they are on the court with, both on their own team and on the opposing team. This means that a player with a negative +/- can still have a good score if he lifts the performance of his team enough compared to when he is not on the court.
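To make the idea concrete, here is a minimal sketch of one way such an update could work. It is not the script behind the rankings in this post: the lineup-average rating, the starting rating of 0, the `k` factor, and the 400-point scale are all assumptions borrowed from standard chess-style Elo.

```python
def expected_score(rating_a, rating_b, scale=400.0):
    """Standard Elo expectation that side A beats side B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def update_on_score(ratings, scoring_lineup, conceding_lineup, points, k=1.0):
    """Apply one scoring event; every player on the court shares the same delta."""
    scoring_avg = sum(ratings.get(p, 0.0) for p in scoring_lineup) / len(scoring_lineup)
    conceding_avg = sum(ratings.get(p, 0.0) for p in conceding_lineup) / len(conceding_lineup)
    # The less expected the basket, the bigger the swing: strong opponents
    # and weak teammates both lower the scoring lineup's expectation.
    delta = k * points * (1.0 - expected_score(scoring_avg, conceding_avg))
    for p in scoring_lineup:
        ratings[p] = ratings.get(p, 0.0) + delta
    for p in conceding_lineup:
        ratings[p] = ratings.get(p, 0.0) - delta
    return delta


ratings = {}
first = update_on_score(ratings, ["A", "B"], ["X", "Y"], points=2)
print(first, ratings["A"], ratings["X"])  # 1.0 1.0 -1.0
```

With everyone starting level, a two-point basket moves each scorer up by 1.0 and each defender down by 1.0; once ratings diverge, the same basket is worth more to an underdog lineup than to a favorite.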
Running a simple script on the play by play data for the 2023/2024 season, I got this ranking (I am only listing players who were part of at least 2000 point events during the season). Scores for all players of the GSW are also below.
My hunch is that something like this has been tried before, but I was not able to find it online.
Any thoughts are welcome. If you have links to related work, that would be great.
Rank|Player|team|ELO|PlusMinus|
1|Jalen Brunson|NYK|164.989|523|
2|Domantas Sabonis|SAC|144.153|85|
3|Joel Embiid|PHI|131.135|311|
4|Stephen Curry|GSW|128.244|167|
5|Donovan Mitchell|CLE|123.504|324|
6|Paul George|LAC|120.999|435|
7|Nikola Jokic|DEN|120.41|693|
8|Luka Doncic|DAL|110.306|416|
9|Kyrie Irving|DAL|107.632|390|
10|Shai Gilgeous-Alexander|OKC|106.825|669|
11|OG Anunoby|NYK|105.868|392|
12|Bogdan Bogdanovic|ATL|103.405|124|
13|Sam Hauser|BOS|93.0653|582|
14|Fred VanVleet|HOU|89.457|183|
15|Anthony Edwards|MIN|87.967|503|
16|D'Angelo Russell|LAL|86.2392|239|
17|Franz Wagner|ORL|85.7235|234|
18|Jimmy Butler|MIA|84.7006|214|
19|Dereck Lively II|DAL|83.2346|242|
20|Rudy Gobert|MIN|82.3989|506|
21|Jose Alvarado|NOP|81.9439|240|
22|Josh Giddey|OKC|81.8426|416|
23|Victor Wembanyama|SAS|81.3141|-142|
24|Tyrese Maxey|PHI|80.3999|295|
25|Tyrese Haliburton|IND|80.3552|334|
26|LeBron James|LAL|79.9548|239|
27|Andre Drummond|CHI|79.2179|31|
28|Isaiah Joe|OKC|77.1368|364|
29|Deandre Ayton|POR|75.6101|-319|
30|Lauri Markkanen|UTA|75.1353|27|
31|Alperen Sengun|HOU|74.8062|49|
32|De'Aaron Fox|SAC|73.802|249|
33|Norman Powell|LAC|72.6751|189|
34|Al Horford|BOS|72.0673|563|
35|Jalen Williams|OKC|71.559|449|
36|Darius Garland|CLE|71.0129|37|
37|Maxi Kleber|DAL|70.8659|137|
38|Giannis Antetokounmpo|MIL|70.5533|337|
39|Derrick White|BOS|68.5541|688|
40|Jayson Tatum|BOS|65.2705|757|
Player|team|ELO|PlusMinus|
Stephen Curry|GSW|128.244|167|
Brandin Podziemski|GSW|60.9323|262|
Chris Paul|GSW|42.639|95|
Kevon Looney|GSW|36.7251|38|
Moses Moody|GSW|24.3353|99|
Draymond Green|GSW|14.0305|149|
Klay Thompson|GSW|11.1006|-3|
Jonathan Kuminga|GSW|-24.3173|103|
Dario Saric|GSW|-38.9761|-22|
Trayce Jackson-Davis|GSW|-46.9671|22|
Andrew Wiggins|GSW|-50.7264|-84|
I haven't found a new endpoint in forever, but here they are in all their glory:
Daily Lineups: https://stats.nba.com/js/data/leaders/00_daily_lineups_20241121.json
It looks like you can replace that date with anything less than or equal to today's date.
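A small sketch for building that URL for any date. The helper name is made up; only the URL pattern comes from the link above, and actually fetching it is left out.

```python
from datetime import date

DAILY_LINEUPS = "https://stats.nba.com/js/data/leaders/00_daily_lineups_{yyyymmdd}.json"


def daily_lineups_url(day):
    """Fill the YYYYMMDD slot of the daily lineups endpoint."""
    return DAILY_LINEUPS.format(yyyymmdd=day.strftime("%Y%m%d"))


print(daily_lineups_url(date(2024, 11, 21)))
```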
Player Transactions: https://stats.nba.com/js/data/playermovement/NBA_Player_Movement.json
edit: I just found the full regular and preseason schedule as well! https://cdn.nba.com/static/json/staticData/scheduleLeagueV2_1.json
My already known endpoints are:
Gambling Odds: https://cdn.nba.com/static/json/liveData/odds/odds_todaysGames.json
Today's Scoreboard (12pm EST refresh): https://cdn.nba.com/static/json/liveData/scoreboard/todaysScoreboard_00.json
*Play by Play: https://cdn.nba.com/static/json/liveData/playbyplay/playbyplay_0022400247.json
*Box Score: https://cdn.nba.com/static/json/liveData/boxscore/boxscore_0022400247.json
*Playoff Picture: https://stats.nba.com/stats/playoffbracket?LeagueID=00&SeasonYear=2024&State=0
https://stats.nba.com/stats/playoffbracket?LeagueID=00&SeasonYear=2024&State=1
https://stats.nba.com/stats/playoffbracket?LeagueID=00&SeasonYear=2024&State=2
Broadcasts: https://cdn.nba.com/static/json/liveData/channels/v2/channels_00.json
*Box Score and Play By Play: Replace 22400247 with the game_id of the desired game. For example, to view the Cavs/Celtics game from the other night, replace it with 22400021. In most cases, the game_id will be between 2__00001 and 2__01230. Replace __ with the last two digits of the year the season starts in (24, 23, 22, etc...). These two endpoints only go back to 2019-2020, I believe.
Some examples:
https://cdn.nba.com/static/json/liveData/boxscore/boxscore_0022400176.json
https://cdn.nba.com/static/json/liveData/playbyplay/playbyplay_0022400196.json
2023: https://cdn.nba.com/static/json/liveData/boxscore/boxscore_0022301170.json
2022: https://cdn.nba.com/static/json/liveData/playbyplay/playbyplay_0022200879.json
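The game_id pattern can be wrapped in a couple of helpers. This is a sketch with made-up function names; `season_code` is the two-digit number seen in the example IDs (24, 23, 22), and the "002" prefix is simply what the regular-season URLs in this post share.

```python
BASE = "https://cdn.nba.com/static/json/liveData"


def game_id(season_code, game_number):
    """Build a 10-character id: '002' + two-digit season code + five-digit game number."""
    return f"002{season_code:02d}{game_number:05d}"


def boxscore_url(gid):
    return f"{BASE}/boxscore/boxscore_{gid}.json"


def playbyplay_url(gid):
    return f"{BASE}/playbyplay/playbyplay_{gid}.json"


print(boxscore_url(game_id(24, 247)))
print(playbyplay_url(game_id(22, 879)))
```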
*Playoff Picture: It's been a minute since i looked at these, so i can't quite recall the difference, but i know one stores the play-in data and one doesn't. State 2 shows everything, but not the play-in if i'm remembering correctly. These three endpoints will return data from 1970 to now.
If you have any questions, i'll try to answer best i can!
After watching Hornets games, it's my suspicion that Mann plays better than LaMelo at PG. He has a better +/- and actively looks to get his teammates involved before trying to score. His defence is also better than LaMelo's. I was hoping members of this forum could help me generate analytics on this. I could be wrong, but my theory is that if the Hornets trade LaMelo, they will become a playoff contender and a genuine threat in the East.
Hi everyone!
I'm pretty new to Reddit, so please excuse me if this is not the correct forum to post the question.
I know that last year the NBA organized the NBA Future Analytics Stars program, and I was wondering if they will run it this year too. Some months ago I checked their website:
https://pages.beamery.com/nbateamcareers/page/nba-future-analytics-stars-program
and it said that the application would be open from Nov 11 to Nov 22, but some weeks ago this info disappeared and now it doesn't say anything.
Does anyone here have any further information about the program? Has the NBA discontinued it?
Thanks in advance!
Hello. I'd love to know where I can find live in-game win probability CHARTS of NBA games. I know there are sites that have those charts like 24 hours after a game is done. But I am looking for LIVE charts in-game (something like what baseball has with fangraphs or baseball savant.....but better because their chart UIs stink), NOT charts after the game is over. Extra points if there is a place that has live in-game charts that are log. Thanks
For NHL there are sites where you can pick multiple players from a team and it would show you the stats when they play together, when they play without one of those players, without two of those players and so on. Is there a way to view this for NBA?
For example: This site for NHL
Here's an example:
There are a bunch of filters and some other graphs below to view some trends and tendencies.
RAPTOR from 538 was one of my favorite advanced metrics, and I know there are guys like Neil Paine who have created similar models. I just wonder if there is a formulation available so we could maybe rebuild it?
Cheers
Source: https://www.vacstats.com/lab
Hello Statheads,
I'm sharing this web app I created because it's a fun way to put together hypothetical lineups (e.g. Curry on the Lakers or a team of all Jokics), but I also have a larger idea for those interested.
Ultimately I would love to create a 538/kenpom-type site for the NBA, and I'm interested in anyone who would like to "join forces" or add onto what currently exists in this base site.
If anyone is interested, send a DM!
Also, feedback is appreciated if you feel so inclined to provide any.
The new dashboard has a predictive section and has data for this year, but when you look at the shooting day from game 1 it completely doesn't align. So are the current 24/25 data just smoothed 23/24 data? And the single game epm is uncorrected info, thus not indicative of game stats? https://dunksandthrees.com/epm
Hi everyone. We finally have the basic features for www.sharpsresearch.com live
It's pretty bare-bones at the moment, with a lot of stuff we are still working on.
Right now it has 4 features when viewing a match
Moneyline prediction
Starting lineup strengths
Similarity search
Daily updated NBA elos (/nba/datasets).
Right now I'm working on
Thanks to the community here. I've definitely learned from a few of you.
Hey yall, I had Jackson McGuire on my podcast today to talk about his new player metric, Points Over Expected. What sets this metric apart from most others is that it takes into account how much a player is being paid. Hope yall enjoy
JJ is projected by lots of outlets to be on the MIP radar. The Greek chorus is basically unanimous that he is good at basketball. You can watch him play and see he is good at lots of things. So my question is: why do BPM, LEBRON, EPM, and literally any metric that tries to better estimate performance see him as just fine? What's the deal?
Hey everyone! I really want to start on some NBA analytics projects on my own, but it seems the best way to go about this is coding against an API instead of copying and pasting into Excel. Realistically, how long will it take me to get a basic understanding of coding so I can start to mess around with stats and have some fun?
Hello, everybody,
I know these advanced stats are far from perfect, but that's ok, I'm just playing with them as part of creating stats for smaller European competitions.
The PER calculation is made up of several parts, I'll try to simplify it.
uPER = (1 / MP) *
[ 3P
I want to ask what some of these parameters mean exactly in this formula:
Then a question regarding PIE, where is this formula:
(PTS + FGM + FTM - FGA - FTA + DREB + (.5 * OREB) + AST + STL + (.5 * BLK) - PF - TO) / (GmPTS + GmFGM + GmFTM - GmFGA - GmFTA + GmDREB + (.5 * GmOREB) + GmAST + GmSTL + (.5 * GmBLK) - GmPF - GmTO)
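For anyone who wants to compute it, here is a minimal sketch of the PIE formula with each game total counted once. The dictionary keys mirror the abbreviations in the formula, and the stat lines are made-up numbers for illustration only.

```python
def pie_component(s):
    """The numerator expression of PIE, applied to any stat line."""
    return (
        s["PTS"] + s["FGM"] + s["FTM"] - s["FGA"] - s["FTA"]
        + s["DREB"] + 0.5 * s["OREB"] + s["AST"] + s["STL"]
        + 0.5 * s["BLK"] - s["PF"] - s["TO"]
    )


def pie(player, game):
    """Player's share of the same expression computed on game totals (both teams)."""
    return pie_component(player) / pie_component(game)


# Hypothetical player line and game totals.
player = {"PTS": 20, "FGM": 8, "FTM": 2, "FGA": 15, "FTA": 2, "DREB": 5,
          "OREB": 2, "AST": 4, "STL": 1, "BLK": 2, "PF": 3, "TO": 2}
game = {"PTS": 200, "FGM": 80, "FTM": 20, "FGA": 150, "FTA": 20, "DREB": 50,
        "OREB": 20, "AST": 40, "STL": 10, "BLK": 20, "PF": 30, "TO": 20}
print(pie(player, game))
```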
Thank you for your help.
Is there an NBA dataset available to run sql queries on? The one on kaggle by Wyatt doesn't seem up to date, unless I'm doing something wrong there. Thanks!
Does anyone know where you can download PIE (Player Impact Estimate) for individual players? Doesn't seem to be on basketball-reference
edit: looks like the pics/visualizations aren’t showing up in this post on mobile for some reason, but you can see them here: https://www.formulabot.com/blog/do-nba-draft-combine-metrics-predict-nba-success
Kevin Durant, one of the greatest scorers in NBA history, famously couldn't put up a single rep of 185 on the bench at the combine. That raises the question: do combine metrics matter? Do they meaningfully predict NBA success in any way?
^(Spoiler alert: not really)
Data collection:
Analyses:
Linear regression analyses (all positions):
After adjusting for multiple comparisons, only D-LEBRON was significantly associated with select metrics:
Surprisingly, vertical leap was negatively associated with D-LEBRON, while slower lane agility and three-quarter court sprint times were associated with higher D-LEBRON.
Linear regression analyses (by position):
After adjusting for multiple comparisons, no single regression was significant due to small sample sizes and low statistical power.
But if we ignore multiple comparison adjustments, there were some interesting significant effects:
Three-quarter court sprint time was negatively associated with both LEBRON and O-LEBRON (i.e., quicker times, higher LEBRON) for point guards only (not pictured above). The effect size for O-LEBRON was the largest in our entire dataset at -0.38.
Wingspan ratio was positively associated with D-LEBRON for power forwards and especially centers. The effect size for centers was 0.14, which was larger than the effect for any other position.
Here's a more in-depth visualization of the latter effect:
Random forest models:
The LEBRON and O-LEBRON models were terrible fits (i.e., no meaningful prediction), but the D-LEBRON model had a decent fit, with all 6 combine metrics collectively explaining around 8% of the variance in defensive impact.
A more in-depth write-up of my analyses and findings is available here: https://www.formulabot.com/blog/do-nba-draft-combine-metrics-predict-nba-success