/r/Sabermetrics

Sabermetrics - The search for objective knowledge about baseball through the analysis of empirical evidence.

Sabermetrics Analysis
Baseball Prospectus
Beyond the Box Score
Fangraphs
Hardball Times
High Heat Stats
Tom Tango
Tango Tiger Wiki
Balls and Strikes
Baseball Think Factory
Baseball Analysts
The Physics of Baseball, Alan Nathan
Baseball HQ Research and Analysis
Sabermetrics 101: Introduction to Baseball Analytics
Data Sources
Retro Sheet
Sean Lahman Database
DingerDB
Fangraphs
Baseball Reference
Stat Corner
Baseball Heat Maps
Pitch F/X
Brooks Baseball Pitch f/x
Baseball Savant
TexasLeaguers
Books
The Book: Playing the Percentages in Baseball
The Hidden Game of Baseball
Baseball Between the Numbers
Extra Innings: More Baseball Between the Numbers
The Bill James Historical Baseball Abstract
Curve Ball
The Baseball Economist
The Numbers Game
The Extra 2% - Jonah Keri
Big Data Baseball
Dollar Sign on the Muscle
Analyzing Baseball Data with R
Baseball Hacks: Tips & Tools for Analyzing and Winning with Statistics
The Sabermetric Revolution: Assessing the Growth of Analytics in Baseball
Trading Bases
AL East      AL Central   AL West
Yankees      Tigers       Oakland
Orioles      White Sox    Rangers
Rays         Royals       Angels
Blue Jays    Indians      Mariners
Red Sox      Twins        Astros

NL East      NL Central   NL West
Nationals    Reds         Giants
Braves       Cardinals    Dodgers
Phillies     Brewers      D-Backs
Mets         Pirates      Padres
Marlins      Cubs         Rockies
Related Subreddits
/r/baseball
/r/baseballstats
/r/fantasybaseball
/r/sultansofstats
/r/sportsanalytics
/r/footballstrategy
/r/nflstatheads
Misc.
/r/Sabermetrics Weekly Stat Discussions
Reddit Markdown Primer - how to make charts, other stuff in reddit

13,529 Subscribers

9

Estimating the cost of pitch tipping?

Is anyone familiar with any attempts to quantify the expected cost of pitch tipping? My group chat sent this tweet

https://x.com/jomboy_/status/1842062696847393120?s=46&t=WHf4nK-muUXyQhXDAWyXMA

They suggested Devin Williams got rocked because of this, but after watching the video I remained a bit skeptical because the tell was so subtle. I also watched the video in the first comment, in which Trevor May walks through David Bednar’s performance and argues that he was tipping his pitches (which I can get on board with, given the more visible changes and his continued steep drop in performance this year).

But as an explanation for a one-game blowup, it seems unlikely that Williams went all year without tipping his pitches (or tipped them without any team picking up on it) until the Mets caught on in the postseason.

So I was trying to approximate the likelihood by using Bednar’s year-over-year change in expected ERA to guesstimate the impact on performance and assess the relative likelihoods, but I was wondering whether anyone else has done this more quantitatively and systematically.

1 Comment
2024/10/04
23:19 UTC

8

What Was Different About 2024?

So, over the summer, as an experiment, I tried to come up with a run prediction formula based solely on XBH. Without getting too technical, I assigned a value to 2B+3B, a value to HR, and a value to HR per 2B+3B. I didn't factor in BB rate or exit velocity. I based my values solely on 2023 league averages.

Once I set this up, I went team by team for 2023 and found that my formula's correlation with total runs was about 95.5 percent, almost identical to the "technical" Runs Created formula based on Bill James's work, and more predictive than OPS. I then tested my formula on every team in 2022, which led to a 97.1% correlation, and every team in 2021, which came out at 96.2%. While I haven't yet gone team by team prior to 2021, I tested it against league averages for each year from 2010 to 2019, and this still produced a correlation of 95.5%, so I had hope that I might be on to something.

However, when I crunched the team-by-team 2024 numbers, the James model produced its usual 96%, whereas my model suddenly dropped to 90%. Specifically, it tended to underrate good offenses and overrate bad ones to a much larger degree than in the three previous years. So my question is: what was different about this season that could have led to this result? What would cause a 96% correlation based on 110 samples to dip to 90% in this year's 30 samples? Searching everything available on FanGraphs, I didn't notice anything that seemed obviously different this season.

As an aside, have any of you tried a similar experiment? And if so, what did you find?
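
One rough way to judge whether a dip from about 96% to 90% is more than sampling noise is a confidence interval for a correlation computed from only 30 teams. The sketch below uses the Fisher z-transformation. It assumes the percentages quoted above are Pearson correlations and that the 30 team totals are independent; the function name `fisher_ci` is made up for illustration.

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a Pearson correlation r from n samples."""
    z = math.atanh(r)               # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)     # standard error on the z scale
    lower = math.tanh(z - z_crit * se)
    upper = math.tanh(z + z_crit * se)
    return lower, upper

# Assumed inputs: the 2024 correlation (0.90) across 30 teams.
lower, upper = fisher_ci(0.90, 30)
print(f"95% CI for r: ({lower:.3f}, {upper:.3f})")
```

If the earlier seasons' value falls inside the printed interval, the 2024 drop is hard to distinguish from chance; if it falls outside, a real change in the run environment becomes more plausible.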

5 Comments
2024/10/03
18:49 UTC

4

WPA chart that has a log scale?

I was talking to a friend about today's Mets-Braves game as compared to the 2014 Royals-A's game, and we were visually comparing the WPA charts. I suggested that WPA charts would show the action better on a log scale, since, say, a 3-run homer in a 1-0 game in the third inning makes the chart swing steeply from something like 65% to 30% despite not really making for a "crazy" game.
Does anyone know where I can find something like that? Or maybe the best way to download a CSV/Excel file of individual games' WPAs so I can do it myself?
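
A minimal sketch of one way to rescale a win-probability series yourself. It swaps in a log-odds (logit) transform as one interpretation of "log scale", which may not be exactly what the post has in mind. The sample win probabilities are hypothetical, and `log_odds` is an illustrative name.

```python
import math

def log_odds(win_prob, eps=1e-6):
    """Map a win probability in (0, 1) to log-odds; clamp to avoid infinities at 0 and 1."""
    p = min(max(win_prob, eps), 1 - eps)
    return math.log(p / (1 - p))

# Hypothetical win-probability series for one team over a game.
win_probs = [0.50, 0.65, 0.30, 0.95, 0.99]
rescaled = [log_odds(p) for p in win_probs]

for p, z in zip(win_probs, rescaled):
    print(f"{p:.2f} -> {z:+.2f}")
```

On this scale, a swing of a given number of percentage points looks smallest around 50% and largest near 0% or 100%, so early-inning swings are visually compressed relative to late-game ones.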

3 Comments
2024/09/30
20:46 UTC

2

Where to find 80's splits?

Are there any sites where I can search L/R batting splits for the '80s? FanGraphs only shows them on a league-wide scale for 21st-century players. BRef shows them for individual players, but I can't find where to search them on a league-wide scale there either.

Not a specifically sabermetric question, but I assumed this subreddit would be the better one to ask.

Edit: To be more specific, I want to sort through players by splits (similar to how you can on FanGraphs for seasons in the past 20 years).

3 Comments
2024/09/29
21:34 UTC

2

3D Pitch Trajectory

I was wondering whether there is publicly available code to recreate a 3D pitch trajectory plot from Trackman data.

I've seen Scott Powers' work (https://github.com/saberpowers/predictive-pitch-score/blob/main/package/predpitchscore/R/get_quadratic_coef.R) and I'm creating a dataframe for it; I just want to be able to plot the pitches and see their trajectories.
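
A minimal sketch of turning quadratic (constant-acceleration) coefficients into points along a trajectory. It assumes each pitch is described by an initial position, velocity, and acceleration per axis, with y measured toward the mound and the front of the plate at 17/12 ft. The function name and the sample pitch are hypothetical and are not taken from the linked repository.

```python
import math

def pitch_trajectory(x0, y0, z0, vx0, vy0, vz0, ax, ay, az,
                     y_plate=17 / 12, n_points=50):
    """Return n_points (x, y, z) positions from the start point to the front of the plate."""
    # Solve y0 + vy0*t + 0.5*ay*t^2 = y_plate for the first positive time.
    a, b, c = 0.5 * ay, vy0, y0 - y_plate
    if a == 0:
        t_plate = -c / b
    else:
        disc = math.sqrt(b * b - 4 * a * c)
        roots = [(-b - disc) / (2 * a), (-b + disc) / (2 * a)]
        t_plate = min(t for t in roots if t > 0)

    points = []
    for i in range(n_points):
        t = t_plate * i / (n_points - 1)
        points.append((
            x0 + vx0 * t + 0.5 * ax * t * t,
            y0 + vy0 * t + 0.5 * ay * t * t,
            z0 + vz0 * t + 0.5 * az * t * t,
        ))
    return points

# Hypothetical fastball: positions in ft, velocities in ft/s, accelerations in ft/s^2.
path = pitch_trajectory(-1.5, 50.0, 6.0, 5.0, -135.0, -6.0, -10.0, 28.0, -15.0)
xs, ys, zs = zip(*path)
```

The three coordinate lists can then be passed to a 3D plotting call, for example matplotlib's `plot(xs, ys, zs)` on an axes created with `fig.add_subplot(projection="3d")`.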

0 Comments
2024/09/29
20:34 UTC

3

I created a new Stat for Relievers. What do you think of it? The Standard Relief Outing

1 Comment
2024/09/29
03:31 UTC

4

Introducing The PCV. I Created a new pitching stat for starting pitchers.

2 Comments
2024/09/29
01:02 UTC

16

Can someone explain why Judge Off is so much higher than Ohtani?

Noob sabermetrics enjoyer here. Let me start by saying I'm in no way bashing Judge; I think he is amazing.

I'm looking at fWAR. I was wondering if someone can point out why Judge's Off value is 96.2, or 16.3 points higher than Ohtani's, which is 79.9. Off is computed by adding Batting Runs + BsR. In the latter, Ohtani crushes Judge (9.2 vs. -0.5; Ohtani is the second-best baserunner in MLB), so this means that their Batting Runs values are Ohtani 70.7 vs. Judge 96.7!!! A difference of 26 points.

Now, of course there's a reason for it; it is math. I just want to understand better what counts toward Batting Runs. Is this because of +4 HR, +14 RBI, and +0.016 points of average? Or is there something else I'm missing?

PS: Are RBI counted in Off? Or does the computation account for the fact that they strongly depend on teammates getting on base?
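
The arithmetic above can be checked with a short sketch. The second function shows the standard wRAA formula, (wOBA - league wOBA) / wOBA scale × PA, which FanGraphs' Batting Runs build on before park and league adjustments; wOBA is built from weighted walks, hit-by-pitches, singles, doubles, triples, and home runs, so RBI are not an input. The wOBA, scale, and PA values below are hypothetical, not either player's actual numbers.

```python
def batting_from_off(off, bsr):
    """Back out Batting Runs from Off, given Off = Batting + BsR."""
    return off - bsr

def wraa(woba, lg_woba, woba_scale, pa):
    """Weighted runs above average: per-PA wOBA edge converted to runs, times PA."""
    return (woba - lg_woba) / woba_scale * pa

# Values quoted in the post.
judge_bat = batting_from_off(96.2, -0.5)
ohtani_bat = batting_from_off(79.9, 9.2)
gap = judge_bat - ohtani_bat
print(round(judge_bat, 1), round(ohtani_bat, 1), round(gap, 1))

# Hypothetical hitter: .400 wOBA in a .320 league, wOBA scale 1.25, 600 PA.
example = wraa(woba=0.400, lg_woba=0.320, woba_scale=1.25, pa=600)
print(round(example, 1))
```

Because wRAA multiplies a per-PA rate edge by plate appearances, a gap in Batting Runs can come from a higher wOBA, more plate appearances, or both.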

13 Comments
2024/09/28
15:59 UTC

2

Baseball Savant Help

It appears the rolling xwOBA charts for pitchers have been replaced by a "movement profiles" chart. I have been searching for how to switch back or find the same charts that they used to post. Does anyone know how to find these red/blue xwOBA charts?

4 Comments
2024/09/27
01:49 UTC

2

Two Sabermetrics Questions

  1. What is the one sabermetric stat that most correlates with total runs scored for a team in a season?

  2. At what point in a season do "expected" stats start to correlate with actual numbers? In other words, if an xwOBA-wOBA split is large after the first 30 games, do they usually come close to each other by the 80th game?

7 Comments
2024/09/26
23:47 UTC

1

Pull information from MLB.com pages

Each mlb.com team has an injury and roster moves page (not an article) like this one for the Braves:

https://www.mlb.com/news/braves-injuries-and-roster-moves

All of the teams' pages can be found from the links here:

https://www.mlb.com/injury-report

I'd love to find a way to see whether any new information has been added to them, or to export all the text from them to a doc (e.g., Google Docs) so I could search it by date. Any suggestions? Thanks.
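
A minimal sketch of one way to detect changes: fetch each page, fingerprint its text, and compare against the fingerprint saved from the previous run. The function names are illustrative, and the sketch assumes the pages can be fetched with a plain HTTP request; pages that embed timestamps or other dynamic markup in their raw HTML may register as changed even when the injury text has not.

```python
import hashlib
import urllib.request

def fetch(url, timeout=30):
    """Download a page and return its body as text."""
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")

def fingerprint(text):
    """Hash the text after collapsing whitespace, so formatting-only changes are ignored."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def has_changed(old_fingerprint, new_text):
    """True when the new text no longer matches the stored fingerprint."""
    return fingerprint(new_text) != old_fingerprint

# Example with in-memory text; in practice new_text would come from fetch(url).
saved = fingerprint("Sept. 25: Player A placed on the 10-day IL")
print(has_changed(saved, "Sept. 25: Player A placed on the 10-day IL"))
print(has_changed(saved, "Sept. 26: Player A activated from the IL"))
```

Saving each day's text to a dated file alongside the fingerprint would also give a searchable history.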

3 Comments
2024/09/26
18:53 UTC

7

Individual Pitch Velocity & Spin Rate Correlation Data

I'm sure we've all heard that pitchers tend to spin it better when they throw harder but it's definitely more nuanced than that.

This covers every pitch type in the majors and minors since 2020 that was thrown at least 200 times. Included are the correlation, slope, and intercept of velo and spin rate for each pitch. I also set up a few more columns for perspective: the min, median, and max of velo and spin rate; the expected spin at the min, median, and max velo; and the expected spin from 65 to 105 mph. I added a few pivot tables to help sort through the data. If you just want to use it to see which random minor league guys spin the best breakers, though, go ahead.

It's immediately apparent that there is quite a bit of variance in how spin changes with velocity. Some guys consistently run high correlations while many others have basically none. Most people gain some spin as they throw harder, but some guys gain a ton while some guys actually lose spin.

Definitely more to investigate here. This could be good for investigating how an individual pitcher's stuff will change in varying roles.

https://docs.google.com/spreadsheets/d/1hxWx6e81YR4_VeEaIRYPZ_qEG39DVrlJj3ST1J8LEWE/edit?usp=sharing
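
A minimal sketch of how the correlation, slope, and intercept columns could be computed for one pitcher's pitch type, assuming an ordinary least-squares fit of spin rate on velocity. This is not necessarily how the spreadsheet was built, and the sample velocities and spin rates are made up.

```python
import math

def fit_velo_spin(velos, spins):
    """Return (correlation, slope, intercept) for spin rate regressed on velocity."""
    n = len(velos)
    mean_v = sum(velos) / n
    mean_s = sum(spins) / n
    sxx = sum((v - mean_v) ** 2 for v in velos)
    syy = sum((s - mean_s) ** 2 for s in spins)
    sxy = sum((v - mean_v) * (s - mean_s) for v, s in zip(velos, spins))
    slope = sxy / sxx
    intercept = mean_s - slope * mean_v
    corr = sxy / math.sqrt(sxx * syy)
    return corr, slope, intercept

def expected_spin(velo, slope, intercept):
    """Spin rate predicted by the fitted line at a given velocity."""
    return intercept + slope * velo

# Hypothetical fastballs: velocity in mph, spin rate in rpm.
velos = [90.0, 92.0, 94.0, 96.0]
spins = [2200.0, 2240.0, 2280.0, 2320.0]
corr, slope, intercept = fit_velo_spin(velos, spins)
print(corr, slope, intercept, expected_spin(95.0, slope, intercept))
```

A slope near zero with a low correlation would correspond to the pitchers described above whose spin barely moves with velocity, and a negative slope to those who lose spin as they throw harder.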

1 Comment
2024/09/26
01:35 UTC

2

Stuff+ Model validity

Are Stuff+ models even worth looking at for evaluating MLB pitchers? Every model I've looked into, whether logistic regression, random forest, or XGBoost (which is what's used in industry), has an extremely small R^2 value. In fact, I've never seen a model with an R^2 value > 0.1.

This suggests that the models cannot accurately predict changes in run expectancy for a pitch based on its characteristics (velo, spin rate, etc.), and that the conclusions we take away from them, especially about increasing pitchers' velo and spin rates, are not that meaningful.

Adding pitch sequencing, batter statistics, and pitch location adds a lot more predictive power to these types of pitching models, which is why Pitching+ and Location+ exist as model alternatives. However, even adding these variables does not increase the R^2 value significantly.

Are these types of X+ pitching statistics ill-advised?

7 Comments
2024/09/25
00:01 UTC

5

Jackson Jobe - MiLB Pitch Metrics & Stuff

I've been experimenting with stuff models, pitch classification, and minor league pitch data. I need to do more with tuning and validating but current performance looks quite good and I will definitely have more to show y'all 'eventually'. Until then, with Jackson Jobe on his way to Detroit, I wanted to look at his milb stuff. Some data below for the fellow autists.

He’s sitting 96-97 mph with the fastball the last two years and is a premium fastball spinner. However, that's slightly stifled by his being a short-extension guy with an average release height. He's started cutting his fastball a bit this year; it's giving him better seam effects, but he’s also lost some spin and movement. That should help him against same-handed hitters (SHH), but it looks worse against opposite-handed hitters (OHH).

He's been a 3,000+ rpm breaking ball guy before, but he’s lost a little spin on the breakers in '24 as well. The shape is basically identical, though. A cutter-slider sits around 90 mph, and a big sweeper around 83. A mid-80s changeup seems unremarkable.

His median pitches look 50-65 grade on the 20-80 scale, but his 95th-percentile-and-above pitches look elite, and he is going to be pitching in the bullpen for now. Some control metrics don't love his use of any pitch, but nothing looks particularly bad. His profile honestly looks like a younger, higher-octane Randy Vásquez. Not the most flattering comp, but overall still exciting.

If this stuff interests y'all, leave some more names for me. Minor leaguers must have pitched in AAA or FSL-A.

https://docs.google.com/spreadsheets/d/1JTBAFxldDFENi3iWugQucg5-Jeq53CNkUq4N_gw8MBg/edit?usp=sharing

1 Comment
2024/09/24
09:11 UTC

1

Reaction Time Measurement

Are any of you aware of a paper (or other published piece) that provides a way to measure reaction time to pitches?

Would the beginning of bat movement be a good estimator for this?

Having a solid estimator for the time it takes for a batter to decide whether to swing or not would be awesome.

Looking forward to any ideas you all have!

1 Comment
2024/09/23
18:01 UTC

4

Does the number of at bats influence WAR?

Given two players, if all their rate stats are equal (batting average, walks per 9, strikeouts per 9, ...) and their hit results (singles, doubles, ...) are the same in proportion to at bats, would the player with the higher number of at bats have a higher WAR?

5 Comments
2024/09/17
01:05 UTC

1

Baseball Savant Help

I want to download every pitch from this season thrown by pitchers who have thrown over 500 pitches. I thought I had this; however, when I downloaded the CSV file, it only gave me 25,000 rows. I was expecting the row count to be in the hundreds of thousands. How can I do this?
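
A common workaround for a per-download row cap is to split the season into short date windows, download each window separately, and concatenate the results. The sketch below only generates the windows. It assumes the 25,000-row limit applies per query and that a one-week window stays under it; the window length and season dates are assumptions to adjust.

```python
from datetime import date, timedelta

def date_windows(start, end, days=7):
    """Split [start, end] into consecutive, non-overlapping windows of at most `days` days."""
    windows = []
    current = start
    while current <= end:
        stop = min(current + timedelta(days=days - 1), end)
        windows.append((current, stop))
        current = stop + timedelta(days=1)
    return windows

# Hypothetical two-week span; each (start, end) pair becomes one search/download.
for window_start, window_end in date_windows(date(2024, 3, 28), date(2024, 4, 10)):
    print(window_start.isoformat(), window_end.isoformat())
```

After combining the per-window files, the 500-pitch filter can be applied locally by counting rows per pitcher. The pybaseball package's `statcast(start_dt, end_dt)` function, as far as I know, pulls data in small date chunks in a similar way.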

3 Comments
2024/09/16
21:57 UTC

2

MLB Player Plate Appearance log (w/ RBI)

Hi, I am looking for data that will have a row for each plate appearance by a batter and the result of that plate appearance, specifically including if an RBI was recorded on that play.

For example, for Marcell Ozuna, I can get his game logs anywhere, but when I break it down to a play log or plate appearance log, I can't find whether an RBI was recorded. This applies to the FanGraphs Play Log (https://www.fangraphs.com/players/marcell-ozuna/10324/play-log?position=OF) and Savant's Statcast search. Yes, they tell me in a text field whether someone scored, but an RBI is not credited every time someone scores. I also could not find a play log on Baseball Reference (maybe I am missing it).

Thanks

1 Comment
2024/09/16
18:58 UTC

5

Issue with scraping Baseball Savant in baseballr package

As the title says, I've been having an issue with scraping Baseball Savant using baseballr. I presume this has to do with the addition of the bat-speed-based columns. If anyone has a workaround or a fix, please let me know.

https://preview.redd.it/xncsfb4ov7pd1.png?width=1890&format=png&auto=webp&s=f65547cc5de809eb4668edaab5c7c44f77f73001

5 Comments
2024/09/16
18:46 UTC

11

Bill James-invented stats

Question for the older baseball fans who might be in this sub: was there ever a vocal opposition to the metrics invented by Bill James?

James is the originator of game score, range factor, similarity scores, power/speed, and MANY other measures which are now widely accepted and available on virtually any baseball stats resource (whether or not they're all that useful in 2024).

Considering that in modern times there are older, more traditional baseball fans who still haven't even tried to understand WAR, outs above average, etc., it's easy to imagine a bloc of old heads who fully opposed James' statistical innovations.

It can be frustrating to hear MLB Network analysts reject even the simplest advanced metrics and complain about "launch angle ruining baseball," and I'm curious if fans, broadcasters, and writers shit on Bill James back in the day.

Any response appreciated

23 Comments
2024/09/16
01:05 UTC

0

Leaguewide splits versus velocities?

I'm writing a paper for school about Tommy John surgery (TJ) and the endless pursuit of velocity. I wanted to include a bit about splits versus higher velocities to argue that some of that overthrowing is grounded in analytics, but I can't figure out how to find the league-wide slash line versus different pitch velocities, whether on Savant, Baseball Reference splits, or FanGraphs. Any help would be greatly appreciated.

2 Comments
2024/09/14
20:51 UTC

10

Game-by-game WAR changes

Is there any public site that tracks a player's changes in WAR on a game-by-game basis? Specifically, I'm interested in seeing how WAR accrues and diminishes throughout the season in a game log-type format, but WAR isn't included among the statistics on either BBRef's or FanGraphs' game log pages.

I'm not the data scientist that a lot of you in this community seem to be (so I'm not about to do coding to create such a tool myself) but I'm deeply intrigued by statistical analysis of the game nonetheless and this would be helpful in getting a better understanding of how game performance translates to WAR totals. As it stands now, I can only watch a specific player's WAR total fluctuations daily and then surmise how the last game affected it. It would be much more useful if I could look back at the whole season and view the changes.

4 Comments
2024/09/10
16:38 UTC

0

Error with pybaseball pulling records from baseball reference

4 Comments
2024/09/09
15:34 UTC

20

A new tool to evaluate uncertainty in WAR

I recently developed a site to show the uncertainty between different WAR implementations: https://clearingthefog.github.io/pages/player_comparisons.html

It combines and permutes the WAR components of Baseball Reference, FanGraphs, and Baseball Prospectus to estimate uncertainty of each player's WAR totals, and lets you compare players head to head.

https://preview.redd.it/26q8srqehmnd1.png?width=1690&format=png&auto=webp&s=0335397263445212f80d8be81164ffed1b4905c4

https://preview.redd.it/wkxwpfsrhmnd1.png?width=1696&format=png&auto=webp&s=0c0130fcdc27f25291cf18f02d8a992fb332f712

I've included some example figures, but the site has lots more (and accompanying explanatory text). I'd be curious to get some feedback from you sabermetricians before I try to share it with the general public.

Tom Tango approved! https://x.com/tangotiger/status/1832818215338094624
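
To illustrate the general idea of combining and permuting components, here is a rough sketch that takes one value per component from each of three sources and forms every possible mix. It is only a guess at the flavor of the approach and is not the site's actual method; the component names and values are hypothetical.

```python
from itertools import product
from statistics import mean, pstdev

# Hypothetical component values (in wins) for one player from three sources.
components = {
    "batting":     {"BRef": 5.0, "FanGraphs": 5.5, "BP": 4.5},
    "baserunning": {"BRef": 0.5, "FanGraphs": 0.25, "BP": 0.0},
    "fielding":    {"BRef": 1.0, "FanGraphs": -0.5, "BP": 0.5},
}

def permuted_totals(components):
    """Sum every combination that picks one source's value for each component."""
    per_component = [list(by_source.values()) for by_source in components.values()]
    return [sum(combo) for combo in product(*per_component)]

totals = permuted_totals(components)
print(len(totals), min(totals), max(totals))
print(round(mean(totals), 2), round(pstdev(totals), 2))
```

The spread of the resulting totals gives a rough sense of how much a player's WAR depends on which implementation measures each component.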

24 Comments
2024/09/08
17:49 UTC
