Do pitchers actually want to aim for a donut hole?

CSteinhardt

"Steiny"
Lifetime Member
SoSH Member
Dec 18, 2003
3,202
Cambridge
I've been playing around with some of the statcast data and trying to figure out how well we can predict the success of a pitch from the tracking data alone. More broadly, the idea behind FIP was to recognize that pitchers do not control the defenses behind them, so that any attempt to measure pitcher skill should take the "luck" of whether their defense was successful out of the metric. However, there are other forms of luck as well. Sometimes a hanging breaking ball doesn't get punished. And sometimes an excellent pitch ends up over the fence.

So, the idea is to train models to predict what the typical outcome of a pitch should be. I've ended up taking an approach similar to pitching+, but with a few modifications because I wanted to arrive at the same idea independently. In particular, I feel strongly that the release point cannot be used in trained algorithms like this, because the release point ends up being too identifiable to a specific pitcher in many cases. Thus, you end up using the actual results of some of a pitcher's pitches to predict the others, which is a much different problem than using the results of similar ones. Or, to put it another way, pitches thrown with precisely deGrom's release point are really, really good, because many of those are thrown by deGrom, who is really, really good. But if you took another pitcher and adjusted their release point to match deGrom's, they wouldn't be expected to have similar success.
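To make that concrete, here is a minimal sketch of the kind of model being described, trained on synthetic stand-ins for the tracking features. All feature names, distributions, and coefficients below are invented for illustration (this is not the actual model), and there is deliberately no release-point column, for the leakage reason just described:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000

# Synthetic stand-ins for per-pitch tracking features.
# Deliberately NO release-point columns.
X = np.column_stack([
    rng.normal(93, 4, n),     # release_speed (mph)
    rng.normal(0, 8, n),      # horizontal break (in)
    rng.normal(0, 8, n),      # induced vertical break (in)
    rng.normal(0, 0.8, n),    # plate_x (ft)
    rng.normal(2.5, 0.8, n),  # plate_z (ft)
])
# Fake "runs saved" target with a mild dependence on velocity and location
y = 0.01 * (X[:, 0] - 93) - 0.02 * np.abs(X[:, 3]) + rng.normal(0, 0.1, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor().fit(X_tr, y_tr)
print("held-out R^2:", round(model.score(X_te, y_te), 2))
```

With real Statcast data you would swap in the actual per-pitch columns and a run-value target; the point of the sketch is only the shape of the setup, not the numbers.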

I thought it might be fun to put a thread together with some of the stuff I've been learning, as well as some of the things which I'm still trying to understand. Feel free to either move the thread or tell me to add it to existing threads someplace if there's a better fit.

I thought I'd start out with something which I really didn't expect to find here and which goes against what I learned about pitching as a kid: the whole donut hole thing isn't actually quite right. What I thought I knew about pitching was that the couple of inches around the border of the strike zone belonged to the pitcher and everything else belonged to the hitter. That is, the value of a pitch might look something like this (produced from the model I've been working on):

78618

Here, the pitch score is approximately in runs saved compared with an average MLB pitch in 2023, and a positive score is good for the pitcher. This is from the catcher's point of view to a right-handed hitter, and zone_z is the actual height of the pitch scaled to the hitter's strike zone. Something similar is shown in the Fangraphs version of pitching+; here is their equivalent location map for a 3-2 sinker (they break this up by count and pitch type):

1708873161420.png

Very few actual MLB pitchers, including the excellent ones, actually make a donut hole like this. For example, here are the locations of all MLB pitches in 2023 vR along with Kevin Gausman and Aroldis Chapman (just a couple of examples):
2023location_pitch_countALL_vR.jpg 78621 78620
Chapman in particular just takes incredible stuff and tries to hit the zone with it. But even good MLB pitchers seem to be mostly capable only of hitting the zone, not also of consistently hitting the corners. The model still thinks he's an excellent pitcher, basically rating his stuff as being so valuable that even when he throws the ball down the middle, it's about an average MLB pitch in terms of expected result.

Anyway, a donut hole appears to be the goal when we assume, like FIP does, that the pitcher has no control over balls in play. It's still important to miss the center of the zone, because balls that are thrown middle-middle tend to end up disproportionately over the fence. But if we look at what happens based on the tracking after a ball is put in play, something different pops up:
78623
At first I was a bit puzzled by this, but in retrospect it actually seems obvious what is happening. The barrel of the bat is a fixed distance from the hitter's shoulders, so when an MLB hitter makes contact, the distance from the shoulders is essential. There's a velocity dependence as well, because of how quickly a hitter can turn on a pitch. For example, this is the same thing for fastballs and for changeups:
78624 78625
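The shoulder-distance idea can be illustrated with a toy calculation. The shoulder pivot location and zone coordinates below are round numbers I invented for illustration (catcher's view, feet, right-handed hitter), not measured values:

```python
import numpy as np

# Guessed shoulder pivot for a RHH, catcher's-view (plate_x, plate_z) in feet
shoulder = np.array([-2.2, 4.2])

locations = {
    "middle-middle": (0.0, 2.5),
    "up-and-away":   (0.7, 3.4),
    "down-and-in":   (-0.7, 1.6),
    "up-and-in":     (-0.7, 3.4),
    "down-and-away": (0.7, 1.6),
}

for name, (x, z) in locations.items():
    r = np.hypot(x - shoulder[0], z - shoulder[1])
    print(f"{name:14s} {r:.2f} ft from the shoulder")
```

With these made-up numbers, middle-middle, up-and-away, and down-and-in all land roughly 2.8-3.0 ft from the pivot (near the barrel), while up-and-in is much closer (about 1.7 ft, jamming the hitter) and down-and-away much farther (about 3.9 ft, off the end of the bat). That's the arc the heat maps seem to trace.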
But the idea is the same. And if we look at the same thing for lefties, of course, it's just a mirror image:
78626
And I think realizing this also helps to explain why a pitcher like Blake Snell is so effective. Here's his pitch map vR in 2023:
78627
When I tried to project his success using the donut hole idea, it thought Snell was around the 50th best SP in MLB in 2023 on expected results. Including the effect of command on pBABIP, Snell ends up 8th. So whether you want to sign Snell is also something of a test of whether you treat pBABIP as a skill or purely as luck. This same metric also really likes Pivetta, by the way.

Anyway, I'd love to get some feedback on things I could try with this model, both in terms of improving it and also what might be useful questions to ask. This post is a bit long as is, so one thing I'll leave to a future post is that I found the first 10 or so pitches of a SP's outing are potentially predictive of the rest of the outing in terms of pitch quality. So, I wonder whether teams could use real-time tracking data to occasionally decide that a pitcher just doesn't have his best stuff that day, turn the outing into a side session, and bring him back in a few days.

Oh, and the Devers HR above is one of the best pitches by this metric to have been hit for a HR in 2023. And here is the pitch with the highest pitch score (since the count is included in predicted runs gained/lost, it's really the nastiest 3-2 pitch).
 

zenax

Member
SoSH Member
Apr 12, 2023
360
A problem with calling balls and strikes is the plate. The strike zone is an imaginary volume 17 inches wide, but the actual plate is only a full 17 inches wide from its front edge to points 8.5 inches back along each side. Over its final 8.5 inches, it tapers to a point. So a late-breaking pitch could be visibly outside the plate for the first half of its path, then move into the invisible part of the zone over the final part. Or, a pitch over the top of the zone could dip into it in that invisible part. A 90-mph pitch would only be in the invisible part for about 5.4 milliseconds.
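That 5.4-millisecond figure checks out with a quick back-of-the-envelope calculation:

```python
# How long a 90 mph pitch spends over the tapered back 8.5" of the plate
in_per_s = 90 * 5280 * 12 / 3600   # 90 mph in inches per second (1584)
t_ms = 8.5 / in_per_s * 1000       # time to cover 8.5 inches, in ms
print(f"{t_ms:.1f} ms")            # → 5.4 ms
```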

Another problem with calling balls and strikes is that umpires cannot get an accurate view of most pitches because they are looking at them at an angle. For example, an umpire looks down to track a low pitch; how well can he tell whether the top of the ball clips the zone at the front edge of the plate?
 

CSteinhardt

"Steiny"
Lifetime Member
SoSH Member
Dec 18, 2003
3,202
Cambridge
Not sure how much interest there is, but thought I'd share another conclusion I was drawing from playing around with this. Please feel free to let me know if I should move this somewhere else / take to another board.

I was looking at an effect which has been known for a couple of decades, where pitchers get less effective the third time through the batting order. There has been considerable discussion and debate over which of two possible explanations is the primary cause:

Explanation 1: Hitters learn from their previous PAs and pitchers only have a limited number of tricks, so by the third time through the order, the hitters are better at handling that particular pitcher. If this is correct, then there is a big difference between the 18th and 19th hitters, and removing the pitcher after 18 makes a significant difference. This feels a bit counterintuitive to me at the major league level, since the hitters get so much scouting information and even have been using pitching machines that try to replicate individual pitchers. But it seems a lot more intuitive at lower levels.

Explanation 2: This isn't really about the hitters. It's just that the pitcher has thrown another 30-45 pitches following their previous time through the order, so the pitcher is a lot more tired. If this is correct, then pitch count is very important, but there's nothing magical about the third time through the order, just that the third time through the order is later in the game than the second time.

My understanding is that the early results on this seemed to favor the first explanation. However, reading more recent stuff seems to favor the second explanation. The author of the previous article I linked has a followup a few years later arguing that it's entirely pitcher fatigue. And there's a more academic study from last year which draws the same conclusion.

However, a big part of the problem is being able to separate out pitch quality from pitch outcomes which can depend upon luck and small samples. So, I tried attacking this with the pitch score metric I described in the first post of the thread.

I looked at all outings of at least 90 pitches, to try to eliminate the survivor bias that comes from the fact that data on, e.g., a 25th hitter only exists for outings which were good enough to face that many hitters.
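In pandas terms, that selection looks something like the sketch below. The column names are my guesses at a generic per-pitch schema (not actual Statcast field names):

```python
import pandas as pd

def tto_curve(df: pd.DataFrame, min_pitches: int = 90) -> pd.Series:
    """Mean pitch score by batter faced, restricted to long outings.

    Only outings of at least `min_pitches` pitches are kept, so the curve
    for late batters isn't built solely from starts that went well enough
    to last that long (survivor bias).
    """
    outing_len = df.groupby(["game_pk", "pitcher"])["pitch_number"].transform("max")
    long_outings = df[outing_len >= min_pitches]
    return long_outings.groupby("batter_seq")["pitch_score"].mean()
```

Here `batter_seq` would be 1 for the first hitter the starter faces, so 1-9, 10-18, and 19-27 correspond to the first, second, and third times through the order.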

78957

Like before, this pitch score is runs saved per pitch, so a positive number is better for the pitcher. I should probably do a more careful version of this and try to get the uncertainties right at some point as well. But the point is clear -- pitchers do tire, and over the course of an appearance, they get worse.

However, unlike those recent results, looking at it in terms of pitch quality finds that the hitter effect is more important. Here's the same thing (note the change in y-axis) with the difference between actual and expected runs saved per pitch overlaid in red:
78958

The hitter effect here is far more significant (these are the same blue values as above). As you might imagine, the top of the order is generally harder for pitchers to face and they benefit from facing the bottom of the order. But if we compare the times through the order, we can also see that there indeed seems to be a significant penalty. And it's not just the third time through the order -- it's the second time as well.

Moreover, it's not for every hitter. The top of the order gets significantly better, much more so than the dropoff in pitch quality. But the bottom of the order is pretty much just always bad, and doesn't really improve with extra at-bats. Or, to put it another way, it seems that good hitters make good adjustments and bad hitters don't. Which does make sense, I guess. But that adjustment is much larger than the difference in pitch quality, so that it's really the hitter who is most responsible.

That only some hitters seem to make the adjustment might also explain why different versions of the analysis seem to give different answers. What this suggests is actually a bit complex. Perhaps the answer is that the right strategy is to use an opener to get through the top of the order, but that the follower can go through the bottom of the order three times if they're effective. Or, maybe even the second time through the order penalty means that the optimal usage of a pitching staff is something much different. Maybe the ideal setup is an opener for around 5 hitters, then a follower 1.5 times through the order who can throw every 3rd/4th day, and then a third pitcher who faces around 9 hitters?

Anyway, hoping to get some thoughts on both whether this makes sense / what I might be missing and also other questions worth exploring with a metric like this.
 

Njal

New Member
Apr 23, 2010
14
This is super interesting, thanks! One comment and some questions/suggestions.

1. I initially found the analysis confusing because the wOBA graphs are colored oppositely from the rest -- there, the red corresponds to pitcher failure, whereas in the others, red corresponds to pitcher success. It might be clearer to recolor, even at the expense of changing the red/large linkage.

2. The Snell map seems to suggest that at least some pitchers can hit a corner pretty consistently. If they can do that, it's not crazy to think they can make a donut hole, it's just that we might need to look a bit harder to find it in the data.

3. The general analysis seems to suggest that for a given batter handedness and pitch type, there are 1-2 corners that are good places for a pitcher to put the ball. What if you made a dataset consisting of (a) good pitchers (e.g. FIP < some cutoff), (b) throwing a particular pitch to (c) batters of a given handedness? If you got Snell-type graphs that would be cool, particularly if the hot corner moved around according to your model.
 

BaseballJones

ivanvamp
SoSH Member
Oct 1, 2015
24,787
Very interesting stuff. I'm going to offer this hypothesis to @CSteinhardt 's question: It is due to pitcher fatigue and, specifically, the style of pitching we now see in MLB (guys throwing with greater spin and with much higher average velocity). Starting pitchers used to throw slower than they do now, and that was partially because they weren't as good athletes in general as they are now, but partly because starters knew they were expected to go deep into games, which meant they couldn't max effort each pitch, which meant that they "pitched to contact" more, hoping to induce weak contact rather than just strike guys out all the time, which meant less stress on their arms, which meant that they had more control over their pitches late in games.

Guys today max effort seemingly every pitch, and that wears them down in a shorter time frame, which means that even if they can hump it up to max velocity in the 6th inning, they generally don't have as much control, and balls get hit harder when there is worse location.

Maybe all this pitching adjustment is due to the quality of hitting - they feel like they HAVE to get strikeouts. Or maybe it's just how they want to pitch. But clearly pitchers are throwing more pitches over fewer numbers of innings than they used to, and when you look at a bunch of starters in the 70s and 80s, you don't see the radical differences pitching the third time through the order compared with the first.
 

SirPsychoSquints

Member
SoSH Member
Jul 13, 2005
5,146
Pittsburgh, PA
Very interesting stuff. I'm going to offer this hypothesis to @CSteinhardt 's question: It is due to pitcher fatigue and, specifically, the style of pitching we now see in MLB (guys throwing with greater spin and with much higher average velocity). Starting pitchers used to throw slower than they do now, and that was partially because they weren't as good athletes in general as they are now, but partly because starters knew they were expected to go deep into games, which meant they couldn't max effort each pitch, which meant that they "pitched to contact" more, hoping to induce weak contact rather than just strike guys out all the time, which meant less stress on their arms, which meant that they had more control over their pitches late in games.

Guys today max effort seemingly every pitch, and that wears them down in a shorter time frame, which means that even if they can hump it up to max velocity in the 6th inning, they generally don't have as much control, and balls get hit harder when there is worse location.

Maybe all this pitching adjustment is due to the quality of hitting - they feel like they HAVE to get strikeouts. Or maybe it's just how they want to pitch. But clearly pitchers are throwing more pitches over fewer numbers of innings than they used to, and when you look at a bunch of starters in the 70s and 80s, you don't see the radical differences pitching the third time through the order compared with the first.
In 2023, the league had a tOPS+ of 113 facing a pitcher the third time through the order.
2013: 113
2003: 111
1993: 109
1983: 110
1973: 106
1963: 109
1953: 107

https://www.baseball-reference.com/leagues/split.cgi?t=b&lg=MLB&year=2023#all_times

So there has been an exaggeration over the years, but it's always been there.
 

BaseballJones

ivanvamp
SoSH Member
Oct 1, 2015
24,787
In 2023, the league had a tOPS+ of 113 facing a pitcher the third time through the order.
2013: 113
2003: 111
1993: 109
1983: 110
1973: 106
1963: 109
1953: 107

https://www.baseball-reference.com/leagues/split.cgi?t=b&lg=MLB&year=2023#all_times

So there has been an exaggeration over the years, but it's always been there.
Yes and even back in the day, starters still got tired. But the exaggeration you mention here is what I would have expected. So this checks out to me.
 

jarules1185

New Member
Jul 14, 2005
577
But if we look at what happens based on the tracking after a ball is put in play, something different pops up:
View attachment 78623
At first I was a bit puzzled by this, but in retrospect it actually seems obvious what is happening. The barrel of the bat is a fixed distance from the hitter's shoulders, so when a MLB hitter makes contact, the distance from the shoulders is essential. There's a velocity dependence as well, because of how quickly a hitter can turn on a pitch. For example, this is the same thing for fastballs and for changeups:
So this is essentially saying that if a pitch has been hit (disregarding swinging strikes, which I assume are still more frequent on the red corners here than middle-middle), that up-and-away and down-and-in pitches are very nearly as bad outcome-wise (from a pitcher's perspective) as middle-middle ones, right?

Essentially that all possible swing sweet spots form an arc with a midpoint of the hitter's shoulder, and that arc intersects the middle of the zone, the up-and-away corner of the zone, and down-and-in corner. Meaning up-and-in and down-and-away pitches are quantitatively objectively superior to up-and-away and down-and-in pitches, in a very general sense. Obviously pitch type and a million other variables come into play on any given pitch.

Definitely has an intuitive physiological feel, and I've never thought about it this way before.
 

CSteinhardt

"Steiny"
Lifetime Member
SoSH Member
Dec 18, 2003
3,202
Cambridge
In 2023, the league had a tOPS+ of 113 facing a pitcher the third time through the order.
2013: 113
2003: 111
1993: 109
1983: 110
1973: 106
1963: 109
1953: 107

https://www.baseball-reference.com/leagues/split.cgi?t=b&lg=MLB&year=2023#all_times

So there has been an exaggeration over the years, but it's always been there.
Thanks for sharing that. It's interesting that it seems to have sped up a bit recently.

This makes a lot of sense if the primary reason is pitcher fatigue, since pitchers now tire a bit sooner than in earlier eras. But if my pitch score metric is correct in suggesting that the main effect is hitter familiarity, then this doesn't fit as well. Maybe if the effect is about 80% hitter and 20% pitcher, it's only the pitcher side which has grown? Or perhaps it's actually still on the hitter side, and the change is the ability of hitters to go back to the dugout and watch video between at-bats? It would be interesting to try to grab statcast data from the minors and look for something similar at a level where the hitters don't quite have the same tools - anybody know how to grab that?


So this is essentially saying that if a pitch has been hit (disregarding swinging strikes, which I assume are still more frequent on the red corners here than middle-middle), that up-and-away and down-and-in pitches are very nearly as bad outcome-wise (from a pitcher's perspective) as middle-middle ones, right?

Essentially that all possible swing sweet spots form an arc with a midpoint of the hitter's shoulder, and that arc intersects the middle of the zone, the up-and-away corner of the zone, and down-and-in corner. Meaning up-and-in and down-and-away pitches are quantitatively objectively superior to up-and-away and down-and-in pitches, in a very general sense. Obviously pitch type and a million other variables come into play on any given pitch.

Definitely has an intuitive physiological feel, and I've never thought about it this way before.
Yeah, I think that's the conclusion. Pretty much just that it comes down to the distance from the shoulders to the barrel of the bat, I guess? So it's really hard to get the good part of the bat on a ball that's up and in (typically jams the hitter) or low and away (hits the end of the bat) even if you hit it, while a pitch thrown on the other two corners is still hard to reach, but if you do reach it, you usually get good wood on it. Definitely not the way I was taught as a kid, though!
 

simplicio

Member
SoSH Member
Apr 11, 2012
5,318
I'm curious about how the projected pitch score graph would map against a pitch characteristics metric like Stuff+.

Also: why are we calling it a donut hole? Isn't it just a donut cause the hole is the middle-middle we're hoping to avoid?
 

CSteinhardt

"Steiny"
Lifetime Member
SoSH Member
Dec 18, 2003
3,202
Cambridge
I'm curious about how the projected pitch score graph would map against a pitch characteristics metric like Stuff+.

Also: why are we calling it a donut hole? Isn't it just a donut cause the hole is the middle-middle we're hoping to avoid?
That's a good question. In many ways it should be similar to Stuff+, since it's working similarly. Or, I guess to be more specific, it should be similar to Pitching+, since that combines both Stuff+ and Location+. However, because I did this independently, there are going to be several differences in methodology which might end up being important. There's one which I know of that ends up being very important - Stuff+ includes information about the release point.

This is actually a complex problem, so let me be a bit more (but hopefully not too) technical for a moment. The way that a model like this is put together involves a training set, where you get to see both pitch characteristics and the outcome, and then a test set, where you get to see characteristics but don't get to see the outcome. A successful model involves learning enough from the training set that you can do a good job of predicting the outcomes for the test set.

As you allow the model to be more complex, it can do a better and better job of describing the training data. So, for example, it might "learn" that the further a pitch is from the strike zone, the more likely it is that the outcome will be a ball, and that the closer a pitch is to the center of the strike zone, the more likely that the batter will swing at it. As the model learns this, it should become better both at the training set and the test set, since it's actually learning something about baseball.

However, there is a danger: if I allow the model to be too specific, it actually starts to overdescribe the training set in a way that no longer looks like baseball. For example, "if the pitch is closer to the center of the strike zone, it is less likely to be called a ball" is useful. But "if the pitch was thrown on May 6 by Adam Wainwright to Javier Baez on a 0-2 count in the 4th inning, it is likely to have been hit into play", while it does a better job of describing the training set, is very unlikely to help you model the test set. Often, this sort of overtraining actually makes the description of the test set worse, since you're replacing learning about baseball with learning about the details of the dataset.

The problem is that separating the two can be particularly tricky. For example, clearly it makes sense to evaluate knuckleballs differently than other pitches given their characteristics. However, in 2023 nearly all knuckleballs were thrown by Matt Waldron. And as with any pitcher, variation in the biomechanics is bad. So, if Matt Waldron misses his release point, it is a good predictor of a poor knuckleball. But if another knuckleball pitcher misses Matt Waldron's release point (since their best release point is likely different), that isn't an indication of a poor outcome. Trying to figure out how to separate these sorts of things is a real pain, and it makes up much of the actual work in machine learning.

From the description given of Stuff+, release point is a big part of their algorithm. I experimented with using it, and found that for what I was doing, it typically resulted in overtraining. That is to say, every pitch thrown with Tim Hill's release point was thrown by Tim Hill. And more generally, the release points are specific enough that this is true for more typical pitchers as well. I already know that 100% of the pitches with deGrom's release point are excellent, because deGrom is excellent. But if you took another MLB pitcher and adjusted their release point to make it more deGrom-like, it presumably wouldn't make them better. Being able to reproduce the combination of speed, movement, and command on his slider, on the other hand...
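One generic way to catch this kind of identity leakage (my sketch, not necessarily how Stuff+ or the model above is validated) is to cross-validate with the pitcher as the grouping variable, so no pitcher appears in both the train and test folds. The toy data below has a "release point" feature that is nearly constant per pitcher, with the outcome driven entirely by per-pitcher skill:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.default_rng(1)
n = 2000
pitcher_id = rng.integers(0, 40, n)           # 40 fake pitchers
# "Release point" that effectively encodes pitcher identity
release_x = pitcher_id * 0.1 + rng.normal(0, 0.01, n)
X = np.column_stack([release_x, rng.normal(0, 1, n)])
# Outcome depends only on who threw it, not on the features themselves
skill = rng.normal(0, 1, 40)
y = skill[pitcher_id] + rng.normal(0, 1, n)

model = GradientBoostingRegressor()
naive = cross_val_score(model, X, y, cv=5).mean()
grouped = cross_val_score(model, X, y, cv=GroupKFold(n_splits=5),
                          groups=pitcher_id).mean()
print(f"random folds R^2:  {naive:.2f}")   # looks good: release_x leaks identity
print(f"grouped folds R^2: {grouped:.2f}") # collapses: no identity leakage allowed
```

With random folds the model looks like it has predictive power, because it is really memorizing which pitcher threw each pitch; with pitcher-grouped folds that apparent skill disappears.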

So for those reasons I do end up with something which is similar to Pitching+ but very much not identical. I threw some of the info up here if you want to take a look: 2023 data for SP and for min. 300 pitches thrown. I'm still playing around with some of the training methodology and also very much with the presentation, so I apologize in advance for making a mess of the website and layout. But feel free to take a look and I'm happy to discuss possible improvements to the model (and presentation).

As for the donut hole, I did some looking and what I thought was a standard term is apparently something specific to the coaches I had as a kid. So perhaps if somebody knows how, we could change the thread title?
 

ShawnDingle

New Member
Apr 10, 2023
6
That's a good question. In many ways it should be similar to Stuff+, since it's working similarly. ...
Wow, thanks for the info.