Hi everyone, kind of an odd way to start, but this topic is kind of odd, if awfully fun, too. So here goes.
I've been looking at some Baseball Savant stuff over the past six months or so, which sounds pretty normal, right? Thing is, after digging into it in depth, I've realized just how basic some of its coverage is, nice as it is to have. I started with rotations per minute (RPM), which is harmless enough until you realize that someone who led the league in RPM didn't necessarily lead it in Rotations Per Pitch (RPP; or per appearance, if that's your preference, but I much prefer pitch), due to the Bauer unit, so named after Cleveland's Trevor. Velocity plays way too big a part in this stat, and it also has a practicality issue. From my findings, someone like Chapman would have to throw roughly 150 fastballs (I have the exact number, don't want to share just yet) for those fastballs to add up to Statcast's chosen unit of one minute in flight, which, well, he's not going to do in one outing, nor would any starter. After all, Bumgarner's league-leading 3791 pitches last season only worked out to an average of 111.5 pitches across his 34 starts.
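A rough sketch of the RPM-versus-RPP idea, to make the velocity effect concrete. This is not the exact calculation referenced above (that number hasn't been shared); it assumes constant velocity with no drag and a hypothetical 55-foot release-to-plate distance, and the function names and sample spin rates are made up for illustration.

```python
# Rough sketch: rotations per pitch (RPP) from spin rate and velocity.
# Assumptions (illustrative only): constant velocity over the whole flight
# and a release-to-plate distance of 55 feet.

FEET_PER_MILE = 5280
SECONDS_PER_HOUR = 3600


def flight_time_seconds(velocity_mph, distance_ft=55.0):
    """Approximate time the ball spends in the air, ignoring drag."""
    feet_per_second = velocity_mph * FEET_PER_MILE / SECONDS_PER_HOUR
    return distance_ft / feet_per_second


def rotations_per_pitch(spin_rpm, velocity_mph, distance_ft=55.0):
    """Rotations the ball completes between release and the plate."""
    return spin_rpm / 60.0 * flight_time_seconds(velocity_mph, distance_ft)


def pitches_to_fill_a_minute(velocity_mph, distance_ft=55.0):
    """Number of pitches needed for total flight time to reach 60 seconds."""
    return 60.0 / flight_time_seconds(velocity_mph, distance_ft)


if __name__ == "__main__":
    # Hypothetical pitches: same spin rate, different velocity.
    fast = rotations_per_pitch(spin_rpm=2400, velocity_mph=100)
    slow = rotations_per_pitch(spin_rpm=2400, velocity_mph=80)
    print(f"100 mph: {fast:.2f} rotations, 80 mph: {slow:.2f} rotations")
    print(f"Pitches per airborne minute at 100 mph: {pitches_to_fill_a_minute(100):.0f}")
```

Under these assumptions the slower pitch completes more rotations at the same RPM, simply because it is in the air longer, and a 100 mph fastball needs well over a hundred pitches to accumulate a minute of flight, which is in the same ballpark as the "roughly 150" figure above.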
What I found interesting is the case of Jeff Manship, who had a really strong case made for him in his story on The Ringer, and who is somehow now with the NC Dinos of the KBO instead of in MLB. Three pitches isn't remotely a great sample, but as I tweeted to author Ben Lindbergh, the difference between this guy and Chapman in RPP is incredible if it holds up over a longer duration.
This isn't where RPP stops, since we've seen RPM become a big organizational component of what certain teams (Philadelphia, Houston) value in pitchers, even though a better sample of information exists on what is happening each time a guy actually throws a pitch.
On the other side of things, what about when the bat actually connects with the ball? Given that batters control their BABIP more than pitchers do, this applies to them materially more. Anyway, I was looking to improve wRC+ tonight, realizing it gets its roots from wRAA, and ultimately from wOBA, with seasonal and park weights in various places in the respective equations. I was looking at the fact that, say, a single last season was weighted to be worth 0.878. Brief aside: that's a new major-league low from 1871 to present, breaking the previous year's .881, with all of the top (bottom) 10 in wOBA coming in the last 14 seasons. But as we know, not all singles are created equal. Some are absolute rockets off the Green Monster, hit by a slow runner, that nobody would catch; others could be a bunt single where maybe the third baseman waited to see if it rolled foul, but which had a decent chance of being fielded in time to throw out, say, the opposing pitcher. And yet, those are weighted entirely the same in the backbone of how we view baseball stats, even good ones like wRC+. It makes ZERO sense to me, as it should to you.
I haven't remotely completed this yet, having just thought of it in the past 12 hours, but what if we look at this in terms of exit velocity? Barrels aren't going to be the whole equation, because those are by far the minority of plate appearances. More specifically, among batters with 190+ Batted Ball Events (BBEs), even MLB's 2016 leader in barrels, Khris Davis, only barreled the ball in 10.7% of his plate appearances. What about the other 89.3% of his PAs? Let alone these three specific players: Billy Burns (220 BBEs), Dee Gordon (234), or Jed Lowrie (256)? What do they have in common? They were the only three to have 190+ BBEs, yet zero barrels. Given this page on Savant, we have the exit velocity and launch angle for literally every batter, which is outstanding but not enough, in light of the fact that we can compare those two items on a graphs page with any exit velocity and launch angle of our choosing, giving us a batting average and spray map for literally every kind of batted ball and letting us track 100% of PAs instead of a fraction of them. Basically, I think we should rewrite how we view the fundamentals of basic batting stats, working them into the more complex sabermetrics once the core/basic/fantasy stats like BA are recorded, at least to us. Unfortunately, the natural flaw here is that we can only go back as far as Savant has been around, but thankfully I think it's here to stay as an outstanding foundational resource for better statistics. There's also the issue of shifts, to name a prominent example, since they affect a given batter's resulting outs, and I had no luck finding specifics on what percentage of the time a batter faces a specific defense (infield in, shifts, no doubles, OF shaded a certain direction, etc.), which would be wonderful to have, both for the sake of the batters and for manager evaluation. If someone could steer me toward further info on player (not league) specifics and percentages, I would be indescribably thankful to you.
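To show what "a batting average for every kind of batted ball" could look like in practice, here is a minimal sketch that bins batted balls by exit velocity and launch angle, computes a hit rate per bin, and sums those rates over one batter's batted balls. It is only an illustration, not the equation mentioned in this post: the bin widths, function names, and every data point in the usage note are hypothetical.

```python
from collections import defaultdict


def bucket(exit_velocity_mph, launch_angle_deg, ev_width=5, la_width=10):
    """Map a batted ball to an (EV, LA) bin, identified by each bin's lower edge."""
    ev_bin = int(exit_velocity_mph // ev_width) * ev_width
    la_bin = int(launch_angle_deg // la_width) * la_width
    return ev_bin, la_bin


def hit_rate_by_bucket(batted_balls):
    """Hits divided by batted balls for every (EV, LA) bin.

    batted_balls: iterable of (exit_velocity_mph, launch_angle_deg, is_hit).
    """
    totals = defaultdict(lambda: [0, 0])  # bin -> [hits, batted balls]
    for ev, la, is_hit in batted_balls:
        key = bucket(ev, la)
        totals[key][0] += int(is_hit)
        totals[key][1] += 1
    return {key: hits / n for key, (hits, n) in totals.items()}


def expected_hits(player_batted_balls, rates):
    """Sum the hit rate of each ball's bin; bins never seen contribute 0."""
    return sum(rates.get(bucket(ev, la), 0.0) for ev, la in player_batted_balls)
```

A usage example with made-up numbers: if `league = [(102.0, 24.0, True), (103.5, 27.0, True), (101.0, 22.0, False), (68.0, -12.0, False), (66.0, -15.0, True)]`, then `hit_rate_by_bucket(league)` gives one rate for the hard-hit fly-ball bin and another for the soft grounder bin, and `expected_hits([(104.0, 25.0), (66.5, -14.0)], rates)` credits a batter with the sum of the two. Fixed-width bins are the simplest choice here; with a full season of data, narrower bins or smoothing would likely be needed, and none of this accounts for shifts or runner speed.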
I only work three hours a week, on Thursday, and only have one major project to finish up (some writing for Inside the Pylon for this Friday), so the goal is to get this done sooner rather than later. Given that, to my knowledge, the Statcast pages I've listed only cover results from the 2016 season, I'd like to run this project next week (12th-18th) for literally every single batter from last season, break it into XBH and types of batted ball (LD, FB, GB), and rewrite the honestly flawed (if very solid in conceptualization) metrics we run/use today and take too much for granted. BABIP especially came to mind here and was the basis for this project. Naturally, I don't think one season of player statistics would be enough, but it would serve as a baseline to lay longer-term groundwork, as supported by BP's Russell A. Carleton, who noted various specific stabilization rates while also avoiding (and thus helping me avoid) the specific flaws Pizza Cutter ran into. This, in turn, should produce actual findings of undervalued players as early as next winter, using Carleton's sample size minimums. I've solved for the equation I want to run on this, so that won't be an issue. The only potential problem is just how dang long this is going to take, but given the impact it should have across all of MLB specifically, that's a positive, even if it's something only I find interesting. Knowing how we usually operate here, I kinda doubt I'll be the only one.
Where do I go with these findings, which I'm projecting to have done by, hopefully, month's end? I think they dramatically impact how accurately we evaluate certain players. Far as I know, this isn't being done elsewhere, and I haven't the slightest connection in this sport, be it my local SABR (no transportation), FO folks, or otherwise. Thanks. This is my new pet.