Which Batter is More Productive?

Martin and Woods

New Member
Dec 8, 2017
84
In this case I'd tend to look at defining Productivity as total Bases (lower-case t as we're including walks) per PA, or BPA, which Baseball Jones and Yo La Tengo were alluding to. Using Yo La Tengo's numbers, I'd go with Player B:

Player A: 473 TB + 73 BB = 546 Bases, .546 BPA
Player B: 438 TB + 142 BB = 580 Bases, .580 BPA

And, given that we don't know things like how many times each hit into a DP or was thrown out stealing or while trying to take an extra base, etc., Player B also has more bases per Out:

Player A: 546 bases/635 Outs = .860 bases per Out
Player B: 580 bases/635 Outs = .913 bases per Out
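
For anyone who wants to check the arithmetic, here is a minimal sketch of the two rates above. It assumes the per-1000-PA totals quoted in this post; the function name `bases_rates` is just illustrative.

```python
# Per-1000-PA totals quoted in the post above (total bases, walks, outs).
PLAYERS = {
    "A": {"tb": 473, "bb": 73, "outs": 635},
    "B": {"tb": 438, "bb": 142, "outs": 635},
}
PA = 1000

def bases_rates(tb, bb, outs, pa=PA):
    """Return (bases, bases per PA, bases per out), counting a walk as one base."""
    bases = tb + bb
    return bases, bases / pa, bases / outs

for name, line in PLAYERS.items():
    bases, bpa, bpo = bases_rates(**line)
    print(f"Player {name}: {bases} bases, {bpa:.3f} BPA, {bpo:.3f} bases per out")
```

Because both players make the same 635 outs, the per-PA and per-out rankings cannot disagree here; they would only diverge if the out totals differed.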
 

koufax32

He'll cry if he wants to...
SoSH Member
Dec 8, 2006
9,265
Duval
The currencies are outs and bases. Runs are the products of accumulating bases. Outs is the “time” available to accumulate those bases in a way that leads to runs.
I can see this. You could make the case that bases are a currency because they are the opposite or absence of outs.
 

triptych

New Member
Jun 7, 2013
15
Easiest for me was to work from 1000 ABs (reduces the decimals). Assuming the non-specified items (K's, double-plays, etc) were impact equalized:
Stat                          Formula                      PLAYER A   PLAYER B
Hits                          1000*BA                      315        260
On-Base                       1000*OBP                     365        365
BBs                           OB-Hits                      50         105
At Bats                       1000-BBs                     950        895
Total Bases (excludes BBs)    AB*SLG                       489        460
Extra Bases                   TB-Hits                      174        200
Total Bases+Walks             TB+BBs                       539        565
1B                            -                            218        149
1BEB                          (1B*.32*.3)+(1B*.18*.6)      44         30
2B                            EB*.32                       56         64
2BEB                          2B*.32*.41                   7          8
3B                            (EB*.05)/2                   4          5
HR                            (EB*.63)/3                   37         42
TB+BBs+EB                     -                            590        603

MLB 2022:
Doubles account for 32% of extra bases (7,940), Triples for 5% (643*2), and Home Runs for 63% (5215*3). So I derived the doubles, triples and HRs based on those percentages.
32% of singles (8,324 of 25,877) were hit with runners on 1st, and 30% of those (2,524) resulted in more than one base taken by the runner on 1st. 18% of singles (4,785 of 25,877) were hit with runners on 2nd, and 60% of those (2,887) scored (1 extra base). So I derived extra bases taken on singles (1B EB) from that. (Did not try to factor in 1st to home).
32% of doubles (2,544 of 7,940) were hit with runners on 1st. 41% of those (1,062) scored (1 extra base). So I derived extra bases on doubles (2B EB) from that.

So, absent a major mental hole in my reasoning, over the course of a season (about 500 ab) Player B seems to have a very slight advantage of about 6 1/2 more bases advanced in this scenario. And that might even be reduced by runners that score on singles instead of just going 1st to 3rd.
 

BaseballJones

ivanvamp
SoSH Member
Oct 1, 2015
25,819
View attachment 60598

Instead of my made up (what I thought was simpler, but I screwed up all the calculations) scenario, just using this one above....

Player A (1000 pa)
- 0 bases (outs): 635
- 1 base (1b/bb): 259
- 2 bases (2b): 63
- 3 bases (3b): 11
- 4 bases (hr): 32

Player B (1000 pa)
- 0 bases (outs): 635
- 1 base (1b/bb): 237
- 2 bases (2b): 77
- 3 bases (3b): 15
- 4 bases (hr): 36

If we assume that on a walk, runners advance ONE base (and only when forced), on a single they advance TWO bases, and on a double they advance THREE bases:

If all 1000 plate appearances came with...

- Nobody on base
- Player A: 32 runs
- Player B: 36 runs

- Runner on 1st
- Player A: 138 runs
- Player B: 164 runs

- Runner on 2nd
- Player A: 324 runs
- Player B: 259 runs

- Runner on 3rd
- Player A: 324 runs
- Player B: 259 runs

- Runners on 1st and 2nd
- Player A: 430 runs
- Player B: 387 runs

- Runners on 2nd and 3rd
- Player A: 616 runs
- Player B: 482 runs

- Runners on 1st and 3rd
- Player A: 430 runs
- Player B: 387 runs

- Runners on 1st, 2nd, and 3rd
- Player A: 795 runs
- Player B: 752 runs

So in the vast majority of circumstances, given the above assumptions (which may not be accurate), Player A actually is likely to account for more runs than Player B. However, the most common scenario for a batter is to come up either with the bases empty or a runner on first, and in both those scenarios, Player B will account for more runs. I can't find data for what percentage of the time each base running scenario occurs per batter though.
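
Here is a minimal sketch that reproduces the run totals above. Two things in it are assumptions rather than anything stated in this post: the split of the "1 base" bucket into walks and singles uses the walk totals quoted earlier in the thread (73 and 142), and a walk only scores a run with the bases loaded, which is what the posted totals imply.

```python
# Per-1000-PA counts. The singles/walks split is an assumption (see note above).
PLAYERS = {
    "A": {"BB": 73, "1B": 186, "2B": 63, "3B": 11, "HR": 32},
    "B": {"BB": 142, "1B": 95, "2B": 77, "3B": 15, "HR": 36},
}
# Bases each runner advances on a hit, per the stated assumption.
ADVANCE = {"1B": 2, "2B": 3, "3B": 3, "HR": 4}

def runs_on_event(event, runners):
    """Runs scored on one event; `runners` is the set of occupied bases (1, 2, 3)."""
    if event == "BB":
        # Only forced runners move, so a walk scores a run only with the bases loaded.
        return 1 if runners == {1, 2, 3} else 0
    scored = sum(1 for base in runners if base + ADVANCE[event] >= 4)
    return scored + (1 if event == "HR" else 0)  # the batter scores on a homer

def total_runs(counts, runners):
    """Runs over all 1000 PA if every one came with the same runners on base."""
    return sum(n * runs_on_event(event, runners) for event, n in counts.items())

STATES = [set(), {1}, {2}, {3}, {1, 2}, {2, 3}, {1, 3}, {1, 2, 3}]
for runners in STATES:
    label = ", ".join(str(base) for base in sorted(runners)) or "empty"
    a = total_runs(PLAYERS["A"], runners)
    b = total_runs(PLAYERS["B"], runners)
    print(f"runners on {label}: A {a} runs, B {b} runs")
```

Weighting each state by how often it actually comes up would turn these eight totals into a single expected-runs figure, but that needs the base-state frequencies this post could not find.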


EDIT: While I was working on this, @triptych was working on his post above mine.
 

Yo La Tengo

Member
SoSH Member
Nov 21, 2005
1,032
The currencies are outs and bases. Runs are the products of accumulating bases. Outs is the “time” available to accumulate those bases in a way that leads to runs.
Runs are the product of accumulating bases in certain sequences. A team which puts up 3 walks and 3 strikeouts every inning would have a huge amount of bases and no runs.

So, I agree that accumulating bases is a key factor in scoring runs, but not all bases are created equal.
 

BaseballJones

ivanvamp
SoSH Member
Oct 1, 2015
25,819
Runs are the product of accumulating bases in certain sequences. A team which puts up 3 walks and 3 strikeouts every inning would have a huge amount of bases and no runs.

So, I agree that accumulating bases is a key factor in scoring runs, but not all bases are created equal.
That's right. That's why a home run (4 bases) is better than 4 singles (also 4 bases). The home run guarantees you at least one run, while 4 singles may produce no runs.

Nevertheless, you cannot score a run unless a team accumulates 4 bases (at least) in an inning.
 

koufax32

He'll cry if he wants to...
SoSH Member
Dec 8, 2006
9,265
Duval
Runs are the product of accumulating bases in certain sequences. A team which puts up 3 walks and 3 strikeouts every inning would have a huge amount of bases and no runs.

So, I agree that accumulating bases is a key factor in scoring runs, but not all bases are created equal.
The value of a base is not necessarily that it leads directly to runs. It’s valuable because it’s not making an out. That is the most important thing a batter can do. Getting outs is the most important thing a pitcher can do.
 

Yo La Tengo

Member
SoSH Member
Nov 21, 2005
1,032
So in the vast majority of circumstances, given the above assumptions (which may not be accurate), Player A actually is likely to account for more runs than Player B. However, the most common scenario for a batter is to come up either with the bases empty or a runner on first, and in both those scenarios, Player B will account for more runs. I can't find data for what percentage of the time each base running scenario occurs per batter though.
This is interesting, and I think shows that focusing solely on the "currencies" of outs and bases doesn't tell the whole story.

For example:
"However, the most common scenario for a batter is to come up either with the bases empty or a runner on first, and in both those scenarios, Player B will account for more runs."

This is only certain for the 4 extra homeruns player B hits in excess of player A's total. In all other circumstances, whether player B will "account for more runs" is not simply a function of the number of bases player B accumulates, but on the speed of the runner on base, if any, and the ability of subsequent batters to "accumulate bases" in a way that pushes player B across the plate.
 

BaseballJones

ivanvamp
SoSH Member
Oct 1, 2015
25,819
This is interesting, and I think shows that focusing solely on the "currencies" of outs and bases doesn't tell the whole story.

For example:
"However, the most common scenario for a batter is to come up either with the bases empty or a runner on first, and in both those scenarios, Player B will account for more runs."

This is only certain for the 4 extra homeruns player B hits in excess of player A's total. In all other circumstances, whether player B will "account for more runs" is not simply a function of the number of bases player B accumulates, but on the speed of the runner on base, if any, and the ability of subsequent batters to "accumulate bases" in a way that pushes player B across the plate.
Well yes, but since we aren't told the context of these 1000 plate appearances, we have to assume that all that stuff is equal between the two players. Obviously if one player comes up with runners on third while the other player always comes up with the bases empty, it's going to radically change their respective value. So assume all that evens out, and just isolate the numbers for these two players.
 

Yo La Tengo

Member
SoSH Member
Nov 21, 2005
1,032
The value of a base is not necessarily that it leads directly to runs. It’s valuable because it’s not making an out. That is the most important thing a batter can do. Getting outs is the most important thing a pitcher can do.
Our task is to figure out whether player A or player B is more productive, and, in this hypo, they make the same number of outs. So, we are left with assessing the value of different ways of accumulating bases.

Also, if player 1 hits 3 home runs in 10 plate appearances, that is more valuable than player 2 walking 4 times in 10 plate appearances, even though player 2 made one fewer out than player 1.
 

Max Power

thai good. you like shirt?
SoSH Member
Jul 20, 2005
8,462
Boston, MA
The value of a base is not necessarily that it leads directly to runs. It’s valuable because it’s not making an out. That is the most important thing a batter can do. Getting outs is the most important thing a pitcher can do.
Not allowing runs is the most important thing a pitcher can do. A pitcher who gives up one baserunner every other inning is doing a great job getting outs, but if that hit is a homer, he has a 4.50 ERA.

Like I mentioned upthread, Miggy and Schmidt are almost perfect for giving example stat lines for this. Their per 162 game averages are

Miguel Cabrera
View attachment 60604

Mike Schmidt
View attachment 60605

Cabrera put the ball in play more, so he hit into more double plays. If I were forced to choose, I'd probably go Schmidt. But I think I'd rather have one of each than two of either of them in my hypothetical lineup.
 

Yo La Tengo

Member
SoSH Member
Nov 21, 2005
1,032
Well yes, but since we aren't told the context of these 1000 plate appearances, we have to assume that all that stuff is equal between the two players. Obviously if one player comes up with runners on third while the other player always comes up with the bases empty, it's going to radically change their respective value. So assume all that evens out, and just isolate the numbers for these two players.
Assuming "stuff is equal between the two batters" is just the start in determining productivity, which is why I've been saying: since we don't know the context of the plate appearances, and we know that different, common scenarios favor one player or the other, we'd have to run a whole bunch of simulations that reflect the likelihood of all possible scenarios occurring in order to determine which batter is more productive.
 

Yo La Tengo

Member
SoSH Member
Nov 21, 2005
1,032
Related but slightly different question: who would you rather have going forward?
I'd pick player A, largely because I'd find them more entertaining, and I suspect, with all possible scenarios weighted appropriately, player A would be slightly more valuable.
 

Sandy Leon Trotsky

Member
SoSH Member
Mar 11, 2007
7,104
I'd pick player A, largely because I'd find them more entertaining, and I suspect, with all possible scenarios weighted appropriately, player A would be slightly more valuable.
I’m leaning towards that also…. It’s really an incremental difference here, and I’d just pick the more exciting player when it comes to such a small gulf. But then it comes down more to who the other players are- if it’s a bunch of Wil Middlebrooks then Player B is probably safer to have. If it’s closer to what we saw out of Casas, for example in a short ML sample, I’d be fine with player A.
 

nvalvo

Member
SoSH Member
Jul 16, 2005
22,274
Rogers Park
Can I just say that I've really enjoyed this thread, Fris? We should do more returns to Sabermetric first principles, especially in the tedium of the offseason.
 

Frisbetarian

♫ ♫ ♫ ♫ ♫ ♫
Moderator
SoSH Member
Dec 3, 2003
5,305
Off the beaten track
I’m really impressed by the well thought out responses and creative analyses here. I’m going to let the discussion continue for a while longer before I chime in.

One thing, though, as @OCD SS said above, you should assume both players’ ‘outs’ have an equal cost to the team. Also, don’t consider things like ‘clutch’ while making your argument. These 2 players are equivalent in all things except one has a higher batting average and the other has higher isolated power.
 

Frisbetarian

♫ ♫ ♫ ♫ ♫ ♫
Moderator
SoSH Member
Dec 3, 2003
5,305
Off the beaten track
Can I just say that I've really enjoyed this thread, Fris? We should do more returns to Sabermetric first principles, especially in the tedium of the offseason.
Thanks. I’ve really enjoyed it, as well. I’m spending extended time in southwest Mexico this winter, and intended to be surfing, mtn biking, exploring jungle waterfalls, and having long open ocean SUP adventures. But I tore my friggin’ left calf 10 days ago and am hobbled. The awesome responses here have helped keep me sane while I heal.
 

chrisfont9

Member
SoSH Member
I’m leaning towards that also…. It’s really an incremental difference here, and I’d just pick the more exciting player when it comes to such a small gulf. But then it comes down more to who the other players are- if it’s a bunch of Wil Middlebrooks then Player B is probably safer to have. If it’s closer to what we saw out of Casas, for example in a short ML sample, I’d be fine with player A.
I was kind of leaning toward player B because walks are a pretty repeatable skill, but the real answer would lie in some of the underlying metrics that we don't have in this hypothetical. Did Player A get some BABIP luck with all those singles, and thus the pick is B? Or if not, and their hard contact % is just really consistent and repeatable, then probably A.
 

tims4wins

PN23's replacement
SoSH Member
Jul 15, 2005
39,017
Hingham, MA
I’m leaning towards that also…. It’s really an incremental difference here, and I’d just pick the more exciting player when it comes to such a small gulf. But then it comes down more to who the other players are- if it’s a bunch of Wil Middlebrooks then Player B is probably safer to have. If it’s closer to what we saw out of Casas, for example in a short ML sample, I’d be fine with player A.
I was kind of leaning toward player B because walks are a pretty repeatable skill, but the real answer would lie in some of the underlying metrics that we don't have in this hypothetical. Did Player A get some BABIP luck with all those singles, and thus the pick is B? Or if not, and their hard contact % is just really consistent and repeatable, then probably A.
This might all come down to personal preference on approach as well. I'm far from a 3 true outcomes guy, but I would probably find the guy who gets more XBH more exciting. If their SLG is the same with a 60 point gap in BA, then would it not stand to reason that player B has a higher hard contact % / exit velo? E.g., I'd rather have Devers hitting rockets at .265 than Bogaerts hitting infield singles at .315 (extreme example, there was that one year when Bogaerts couldn't hit it out of the park).
 

chrisfont9

Member
SoSH Member
This might all come down to personal preference on approach as well. I'm far from a 3 true outcomes guy, but I would probably find the guy who gets more XBH more exciting. If their SLG is the same with a 60 point gap in BA, then would it not stand to reason that player B has a higher hard contact % / exit velo? E.g., I'd rather have Devers hitting rockets at .265 than Bogaerts hitting infield singles at .315 (extreme example, there was that one year when Bogaerts couldn't hit it out of the park).
For some reason my mind keeps going to Jim Rice as player A and Dewey Evans as player B. Not so much their peaks, but if you look at their 162-game averages, Rice was a 128 OPS+ with 52 walks (.854 OPS) and Evans was a 127 OPS+ with 86 walks (.840 OPS). Very similar tradeoffs to this thought exercise.

Somehow (relative longevity?) Rice has 20 fewer BWAR than Evans and only 4.3 of those are defense.
 

Rovin Romine

Johnny Rico
Lifetime Member
SoSH Member
Jul 14, 2005
25,961
Miami (oh, Miami!)
One thing, though, as @OCD SS said above, you should assume both players’ ‘outs’ have an equal cost to the team. Also, don’t consider things like ‘clutch’ while making your argument. These 2 players are equivalent in all things except one has a higher batting average and the other has higher isolated power.
Then I'd suspect our hypothetical A/B batter is just equal if we're white-rooming the costs and outcomes.

However, if we're postulating a specific hypothetical shared-team (instead of a generic unknowable one) I think we could make some guesses as to advancing runners and whatnot.

Like what if the rest of the team were clones of myself, and the other 8 batters could not drive in our hypothetical A/B batter against ML pitching? Clearly the A/B batter who hit the most HRs would be more valuable. The team might even actually win a game if the HR lined up with a pitching shutout. (Although my fielding isn't anything to write home about either.)
 

BuellMiller

New Member
Mar 25, 2015
457
I've got some simple-ish code that I've run in the past (fooling around with optimizing softball lineups back in the day), that does monte carlo simulations of a lineup to determine amount of runs scored per game and stdev.
In this case, I got the following results
.260 hitter: 6.57 runs/game, stdev 3.96
.315 hitter: 6.43 runs/game, stdev 3.95
Assumptions/parameters:
-100K games played, all nine innings.
-Each team's lineup is made up of 9 copies of the hitter in question
-rough estimates for chance for a base-runner to take an extra base that I may have pulled out of thin air (33% to go from 1st to third on a single (if third's "open" because a runner on 2nd scored on the hit as well), 50% chance to score from second on a single or 1st on a double when less than 2 outs, and 75% with two outs)
-no double plays or any other base running outs
-no other advances during an out (assume they're all K's or pop-outs)
-no reaching on errors
-no stolen bases or CS
-the odds of getting a 1b, 2b, etc. (which the random numbers are checked against) come from a simple linear regression from BA/OBP/SLG to those % values, fit on a small 12-player sample of 2022 stats.

This kind of meets my suspicions that the .260 hitter could be ever so slightly better (making same # of outs but getting more xbh), but the .315 hitter having slightly less variance (since they're more likely to drive runners in from 2nd and/or 3rd) (although, I was a little surprised they were that close, and I guess certainly still within error)
Of course, the simulation could be refined with other parameters (more realistic base-running, or using just 1 of that specific hitter in the lineup and then 8 other common randos)
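
For anyone who wants to play with this, here is a minimal sketch of a simulation along the lines described above. It is not the original code. The base-running odds follow the parameters listed in the post, but the event probabilities are an assumption: instead of the regression, they come from the 600-PA sample lines posted later in this thread, so the output should not be expected to match the 6.57/6.43 figures.

```python
import random
import statistics

# Event counts per 600 PA. ASSUMPTION: sample lines posted later in the thread,
# used here in place of the regression-based probabilities described above.
LINES = {
    ".315 hitter": {"OUT": 381, "BB": 44, "1B": 115, "2B": 34, "3B": 1, "HR": 25},
    ".260 hitter": {"OUT": 381, "BB": 85, "1B": 74, "2B": 24, "3B": 1, "HR": 35},
}

def play_game(line, rng, innings=9):
    """Runs scored in one game by a lineup of nine copies of the same hitter."""
    events, weights = zip(*line.items())
    runs = 0
    for _ in range(innings):
        outs = 0
        first = second = third = False
        while outs < 3:
            event = rng.choices(events, weights)[0]
            # Chance a runner scores from 2nd on a single or from 1st on a double.
            send = 0.75 if outs == 2 else 0.50
            if event == "OUT":
                outs += 1  # no double plays, no advances on outs
            elif event == "BB":
                runs += int(first and second and third)  # forced in
                third = third or (first and second)
                second = second or first
                first = True
            elif event == "1B":
                runs += int(third)
                third = False
                if second:
                    if rng.random() < send:
                        runs += 1
                    else:
                        third = True
                second = False
                if first:
                    # 33% to take third, but only if third base is open.
                    if not third and rng.random() < 0.33:
                        third = True
                    else:
                        second = True
                first = True
            elif event == "2B":
                runs += int(second) + int(third)
                third = False
                if first:
                    if rng.random() < send:
                        runs += 1
                    else:
                        third = True
                first, second = False, True
            elif event == "3B":
                runs += int(first) + int(second) + int(third)
                first, second, third = False, False, True
            else:  # HR
                runs += 1 + int(first) + int(second) + int(third)
                first = second = third = False
    return runs

def simulate(line, games=2000, seed=0):
    """Mean and standard deviation of runs per game over `games` simulated games."""
    rng = random.Random(seed)
    scores = [play_game(line, rng) for _ in range(games)]
    return statistics.mean(scores), statistics.stdev(scores)

for name, line in LINES.items():
    mean, stdev = simulate(line)
    print(f"{name}: {mean:.2f} runs/game, stdev {stdev:.2f}")
```

With a per-game standard deviation near 4, a gap of a tenth of a run needs a very large number of games to separate from noise, so bump `games` well past the default before reading anything into the ordering.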
 

nvalvo

Member
SoSH Member
Jul 16, 2005
22,274
Rogers Park
Thanks. I’ve really enjoyed it, as well. I’m spending extended time in southwest Mexico this winter, and intended to be surfing, mtn biking, exploring jungle waterfalls, and having long open ocean SUP adventures. But I tore my friggin’ left calf 10 days ago and am hobbled. The awesome responses here have helped keep me sane while I heal.
Oof. Well, at least you can still enjoy the cuisine.
 

StupendousMan

Member
SoSH Member
Jul 20, 2005
1,986
I've got some simple-ish code that I've run in the past (fooling around with optimizing softball lineups back in the day), that does monte carlo simulations of a lineup to determine amount of runs scored per game and stdev.

...

This kind of meets my suspicions that the .260 hitter could be ever so slightly better (making same # of outs but getting more xbh), but the .315 hitter having slightly less variance (since they're more likely to drive runners in from 2nd and/or 3rd) (although, I was a little surprised they were that close, and I guess certainly still within error)
Of course, the simulation could be refined with other parameters (more realistic base-running, or using just 1 of that specific hitter in the lineup and then 8 other common randos)
Excellent. I'd be willing to bet a six-pack of your favorite beer that one can construct a lineup of 8 semi-random batters which will cause player A to look better, but another lineup of 8 different semi-random players which will make player B look better. I suspect that it won't be possible to declare that either A or B is better, definitively, given a reasonable lineup of players with different capabilities.
 

JM3

often quoted
SoSH Member
Dec 14, 2019
18,715
I'd probably slightly prefer to have had Player A last season, but much prefer Player B next season, if that makes sense?

In other words, the extra bases teammates move up that flow with the higher batting average probably have a de minimis additional value, & the fact that Player B probably sees more pitches is much less important in modern baseball where start lengths are more dependent on trips through the order than # of pitches.

But Player B almost certainly has better projections for upcoming seasons because his skills are more likely to translate year over year & Player A is more likely to have run well on BABIP.
 

GrandSlamPozo

New Member
May 16, 2017
116
Player A is more productive - according to linear weights models like wOBA, doubles, triples and home runs are respectively worth approximately 1.4, 1.8 and 2.4 times as many runs as singles. Since slugging percentage places more weight on extra base hits than they are actually worth when it comes to producing runs, that means that if two players have the same slugging percentage, the one with the higher batting average will have produced more runs.
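
To put a number on "more weight than they are actually worth", here is a small sketch comparing the weights slugging percentage uses with the approximate run ratios quoted above (single = 1.0). The dictionary names are just for illustration.

```python
# Approximate run value relative to a single, as quoted in the post above.
RUN_RATIO = {"1B": 1.0, "2B": 1.4, "3B": 1.8, "HR": 2.4}
# Weight each hit type gets in slugging percentage (total bases).
SLG_WEIGHT = {"1B": 1, "2B": 2, "3B": 3, "HR": 4}

# How much SLG overstates each hit type relative to its run value.
overweight = {hit: SLG_WEIGHT[hit] / RUN_RATIO[hit] for hit in RUN_RATIO}

for hit, factor in overweight.items():
    print(f"{hit}: SLG credits {factor:.2f}x its run value relative to a single")
```

Note this only covers hits. Walks do not appear in SLG at all, so a comparison of two players with different walk totals needs them added back in separately.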
 

BaseballJones

ivanvamp
SoSH Member
Oct 1, 2015
25,819
What's better: A player that goes 4-4 with 4 singles, or a player that goes 1-4 with a home run?

Depends, right?

The player that goes 4-4 doesn't spend any outs to acquire his 4 bases. That's an enormous advantage. And a player who hits 4 singles MAY drive in as many as 8 runs (assuming players can score from second), depending on who is on base. And of course he may advance other runners as well.

But it's also true that a player that hits 4 singles may produce precisely zero runs. He could lead off every inning he comes up to bat with a single and get stranded, producing no runs for his team. At least the guy who hits the home run is guaranteed to produce at least one run, by definition.

These are fascinating questions. I still go back to my post #54 above to see the various scenarios for which batter is more productive.
 

trs

Member
SoSH Member
Aug 19, 2010
628
Madrid
It seems to me that most of the analysis so far has valued the two players about evenly. So, given that most respondents to the Twitter poll assumed the .315 hitter to be more productive, I suppose one takeaway is: "Nope! In fact both players are nearly equal..."

If in fact that's true.

What also stood out to me as an interesting point was:

Excellent. I'd be willing to bet a six-pack of your favorite beer that one can construct a lineup of 8 semi-random batters which will cause player A to look better, but another lineup of 8 different semi-random players which will make player B look better. I suspect that it won't be possible to declare that either A or B is better, definitively, given a reasonable lineup of players with different capabilities.
And if this is the case, then perhaps what we need to do is see which lineups would in fact benefit each player more. Then, if this goes beyond a purely theoretical exercise, we would look at our own lineups and see whether ours is more "compatible" with Player A or Player B and choose accordingly. If we stay more theoretical, we could maybe try to measure how likely those favorable lineups actually are. Perhaps Player A gets the benefit from a lineup full of doubles and triples hitters -- something that isn't very common, and Player B comparably benefits from lineups that frequently spit out a runner on first or no runners at all -- something more likely, and so we go with Player B.

Or, it's still all statistically insignificant and we settle this by asking them both the ingredients to a perfect pizza and make our choice based on that.
 

reggiecleveland

sublime
Lifetime Member
SoSH Member
Mar 5, 2004
28,426
Saskatoon Canada
The guy who walks more makes the pitcher throw more pitches. If it is as high as 5-6 extra pitches a game, over the course of a three-game series, that is essentially one extra inning the opposing team has to throw that series.
 

JM3

often quoted
SoSH Member
Dec 14, 2019
18,715
The guy who walks more makes the pitcher throw more pitches. If it is as high as 5-6 extra pitches a game, over the course of a three-game series, that is essentially one extra inning the opposing team has to throw that series.
It's super unlikely that there would be that large of a delta between their pitches per at bat. For reference purposes, the league leader last year averaged 4.32 pitches per plate appearance (Muncy), & the guy who was last in the league averaged 3.46 (Seager), so it's unlikely these 2 hypothetical hitters would have a larger spread than the 0.86 one that those two had, so 5 PAs per game would be about 13 extra pitches in a 3 game series. Which absolutely isn't nothing, but even that is probably a bit extreme in terms of an estimate.
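
The back-of-the-envelope math above, as a quick sketch. The 5 PA per game and the three-game series are the figures used in this post; the function name is illustrative.

```python
def extra_pitches(ppa_high, ppa_low, pa_per_game=5, games=3):
    """Extra pitches seen over a series by the more patient of two hitters."""
    return (ppa_high - ppa_low) * pa_per_game * games

# 2022 extremes quoted above: Muncy (4.32 P/PA) vs Seager (3.46 P/PA).
series_gap = extra_pitches(4.32, 3.46)
print(f"about {series_gap:.1f} extra pitches per three-game series")
```

That is an upper bound built from the two most extreme qualifying hitters, so the gap between two otherwise similar hitters would presumably be smaller.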
 

reggiecleveland

sublime
Lifetime Member
SoSH Member
Mar 5, 2004
28,426
Saskatoon Canada
It's super unlikely that there would be that large of a delta between their pitches per at bat. For reference purposes, the league leader last year averaged 4.32 pitches per plate appearance (Muncy), & the guy who was last in the league averaged 3.46 (Seager), so it's unlikely these 2 hypothetical hitters would have a larger spread than the 0.86 one that those two had, so 5 PAs per game would be about 13 extra pitches in a 3 game series. Which absolutely isn't nothing, but even that is probably a bit extreme in terms of an estimate.
Good point.
I guess my preference is the patient power hitter, but admit that preference is based on those players being undervalued for a long time.
You seem more versed in this than me, but I am guessing forcing high pitch counts as a team is a product almost entirely of OB%. Is that accurate?
 

JM3

often quoted
SoSH Member
Dec 14, 2019
18,715
Is that necessarily true?

Edit: IOW, is there a correlation between pitches seen and walks? A batter can see a lot of 3-2 counts and still strike out more than walk.
Since Chawson did the work on the Red Sox hitters sorting them by pitches per PA last year, let's just overlay their BB rates from last year onto it:

2022 Pitches/PA - BB%
Casas 4.29 - 20.0%
Dalbec 4.18 - 8.2%
Refsnyder 4.14 - 8.5%
Turner 4.04 (LAD) - 9.4%
JDM 4.00 - 8.7%
Story 4.00 - 8.1%
Cordero 3.98 - 10.2%
Bogaerts 3.97 - 9.0%
Duran 3.95 - 6.3%
Wong 3.95 - 8.9%
Hosmer 3.90 (SD+BOS) - 8.8%
Verdugo 3.87 - 6.5%
Duvall 3.86 (ATL) - 6.7%
Devers 3.71 - 8.1%
Kiké 3.70 - 8.5%
Alfaro 3.65 (SD) - 4.0%
Tapia 3.64 (TOR) - 3.7%
McGuire 3.53 (CWS+BOS) - 4.4%
Arroyo 3.50 - 4.3%
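
To answer the correlation question directly, here is a sketch that computes the Pearson correlation between the two columns above. The numbers are copied from the list; the helper is a plain textbook implementation, and 19 hitters from one roster is a small sample, so treat the result as a rough indication only.

```python
from math import sqrt

# (pitches per PA, BB%) for each hitter in the list above.
HITTERS = {
    "Casas": (4.29, 20.0), "Dalbec": (4.18, 8.2), "Refsnyder": (4.14, 8.5),
    "Turner": (4.04, 9.4), "JDM": (4.00, 8.7), "Story": (4.00, 8.1),
    "Cordero": (3.98, 10.2), "Bogaerts": (3.97, 9.0), "Duran": (3.95, 6.3),
    "Wong": (3.95, 8.9), "Hosmer": (3.90, 8.8), "Verdugo": (3.87, 6.5),
    "Duvall": (3.86, 6.7), "Devers": (3.71, 8.1), "Kiké": (3.70, 8.5),
    "Alfaro": (3.65, 4.0), "Tapia": (3.64, 3.7), "McGuire": (3.53, 4.4),
    "Arroyo": (3.50, 4.3),
}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def correlation(hitters):
    ppa, bb_rate = zip(*hitters.values())
    return pearson(ppa, bb_rate)

everyone = correlation(HITTERS)
# Casas is an outlier on a short sample, so check how much he drives the result.
without_casas = correlation({k: v for k, v in HITTERS.items() if k != "Casas"})
print(f"r = {everyone:.2f} (all), r = {without_casas:.2f} (without Casas)")
```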
 

Frisbetarian

♫ ♫ ♫ ♫ ♫ ♫
Moderator
SoSH Member
Dec 3, 2003
5,305
Off the beaten track
Thank you to all who contributed to this thread. The responses were quite impressive, and showed some creative, innovative use of statistics.

With all things equal between the 2 batters including the cost of outs, the real question here is are player A’s extra hits/higher batting average more valuable than player B’s extra walks and higher isolated power. A few people here have utilized wOBA to compare the hitters, and that is a great method. But it’s a little abstract for some folks, who want to know exactly how the data is calculated, so I’m going to go old school, and use linear weights (lwts).

I know this is really elementary for a lot of you, and I apologize if it’s repetitive. But I think explaining how linear weights works will help some people here understand why these 2 players have contributed the same amount of offense. They are equally productive.

My favorite mathematician is Andrey Markov, who was not only brilliant, but also sexy as hell. Markov is best known for Markov chains, which were based on the study of the probability of mutually dependent events. To simplify, an event which is influenced by past events and will influence future events would be considered Markovian. For example, the stock market would be considered Markovian, as would a blackjack game, while a coin flip would not. Baseball certainly is Markovian, in that what happens in a current at bat is influenced by past events, just as future events are influenced by present events. What I mean is that a single with two outs and no runners on base has a different value than a single with the bases loaded and no outs, with the events leading up to those respective singles being the difference, and the subsequent events also influencing the total run result. Clearly, all singles are not equal.

So how do we determine with any degree of accuracy the “value” of a single (or the result of any at bat)? The following chart will help answer that question. The chart gives the “net expected run value” for each of the 24 different situations a batter can see in an at bat, from none on, none out, to bases loaded, two outs. The run values are calculated by looking at each at bat over the course of multiple seasons and determining how many runs score subsequent to each of the situations. These totals are then averaged to give the “net expected run values” for each situation.

The following chart uses data from 2015 - 2018, and while the run expectancy numbers do change depending on run environment, the value of these 2 players will remain equal in all run environments (I couldn't resist including a picture of Markov and those sexy math eyes!):

View attachment 60660



This may be easier to understand when looking at the first batter of an inning--with no outs and none on, an average team will score 0.492 runs. Multiplying this by 9 innings in a game yields just over 4.4 runs, which is the average number of runs scored per game by Major League Baseball teams from 2015-18.

It’s worth taking a minute to look at the chart, as it will show the value of an out. For example, if you have a runner on 1st and no outs, the average runs scored is .859. If the team successfully bunts him to 2nd base, you now have a runner on 2nd and 1 out, which will on average yield .678 runs. That bunt ‘cost’ the team almost 2/10 of a run. And while there certainly are situations where the bunt is a good strategy, in most it is not. Outs are incredibly valuable in baseball. But I digress.
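
A small sketch of how the chart gets used, limited to the three run-expectancy values quoted in the text (the full 24-state chart is in the attachment and is not reproduced here). The state labels and the `run_value` helper are illustrative names.

```python
# Run-expectancy values quoted in the post (2015-2018); keys are (bases, outs).
RUN_EXPECTANCY = {
    ("empty", 0): 0.492,
    ("1st", 0): 0.859,
    ("2nd", 1): 0.678,
}

def run_value(before, after, runs_scored=0):
    """Change in expected runs from one play, plus any runs that scored on it."""
    return RUN_EXPECTANCY[after] - RUN_EXPECTANCY[before] + runs_scored

leadoff_single = run_value(("empty", 0), ("1st", 0))
sacrifice_bunt = run_value(("1st", 0), ("2nd", 1))
runs_per_game = 9 * RUN_EXPECTANCY[("empty", 0)]

print(f"leadoff single: {leadoff_single:+.3f} runs")
print(f"successful sacrifice bunt: {sacrifice_bunt:+.3f} runs")
print(f"expected runs per 9 innings: {runs_per_game:.3f}")
```

The same subtraction, applied to every play in a season and averaged by event type, is what produces the linear weights.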

Back to linear weights. To calculate linear weights, you take the change in expected run situation for all offensive events and divide them by the number of times that event occurred. For example, if you took the total positive change in expected runs generated by all singles hit in a season/multiple seasons and divided that by the total number of singles, you would get the average value of a single. Doing the same thing for doubles, triples, home runs, outs, etc., gives you the average value of all those offensive events. If you were to total the run values of all the hits, walks, hit by pitch, stolen bases and caught stealing, and outs over the course of a season, you would arrive at a sum close to zero (some plays, such as passed balls or balks are not a product of offense and are thus not accounted for in linear weights). Simply multiplying a batter's singles, doubles, outs, etc., by the calculated average value of those events will, therefore, give you the total number of runs a player has contributed above or below an average player.

The event values are as follows:

• singles +0.47
• doubles +0.76
• triples +1.03
• home runs +1.40
• walks/HBP +0.33
• intentional walks +0.185
• stolen bases +0.193
• caught stealing -0.437
• outs -0.271

Because we know the cost of player A's and player B's outs are identical, we can eliminate them from this comparison. A sample season for both players with 600 plate appearances would be:

Player   PA   AB    H   1B  2B  3B  HR  BB/HBP     BA    OBP    SLG
A       600  556  175  115  34   1  25      44  0.315  0.365  0.515
B       600  515  134   74  24   1  35      85  0.260  0.365  0.515


Multiplying each player's hit and walk/HBP contribution by the event values above gives us player A at 130.4 and player B at 131.1, less than 1 run contributed apart. They are equally productive. You can try different numbers that create the same slash lines, and you will always get the same result. Remember, these players batted in the same situations with the same teammates in the same parks, etc., etc.
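The arithmetic behind those two totals can be checked with a short sketch. The weights and counting stats come straight from the list and table above; the names `WEIGHTS`, `PLAYER_A`, `PLAYER_B`, and `batting_runs` are only illustrative, and outs are left out because they are identical for both players.

```python
# Linear weights from the list above (runs per event).
WEIGHTS = {"1B": 0.47, "2B": 0.76, "3B": 1.03, "HR": 1.40, "BB/HBP": 0.33}

# Sample 600-PA seasons from the table above.
PLAYER_A = {"1B": 115, "2B": 34, "3B": 1, "HR": 25, "BB/HBP": 44}
PLAYER_B = {"1B": 74, "2B": 24, "3B": 1, "HR": 35, "BB/HBP": 85}

def batting_runs(line, weights=WEIGHTS):
    """Sum of (event count x run value) over every event in a stat line."""
    return sum(weights[event] * count for event, count in line.items())

runs_a = batting_runs(PLAYER_A)
runs_b = batting_runs(PLAYER_B)
print(f"A: {runs_a:.1f}  B: {runs_b:.1f}  gap: {runs_b - runs_a:.2f}")
# A: 130.4  B: 131.1  gap: 0.66
```

Swapping in any other pair of stat lines with the same slash lines is just a matter of editing the two dictionaries.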

I hope this all makes sense. My wife often says I'm effectively illiterate in both Spanish and English. If you have any questions, please don't hesitate to ask.

Thanks for the nice distraction. Maybe next time we can do wins vs runs, or the role of luck in sports :rolleyes:
 

Max Power

thai good. you like shirt?
SoSH Member
Jul 20, 2005
8,462
Boston, MA
Multiplying each player's hit and walk/HBP contribution by the event values above gives us player A at 130.4 and player B at 131.1, less than 1 run contributed apart. They are equally productive. You can try different numbers that create the same slash lines, and you will always get the same result.
It seems like every attempt at running simulations or plugging in representative stat lines always results in player B being a tiny bit ahead. The difference is never significant, but it's always B.
 

dhappy42

Straw Man
Oct 27, 2013
15,913
Michigan
Thank you to all who contributed to this thread. The responses were quite impressive, and showed some creative, innovative use of statistics.

With all things equal between the 2 batters, including the cost of outs, the real question here is whether player A’s extra hits/higher batting average are more valuable than player B’s extra walks and higher isolated power. A few people here have utilized wOBA to compare the hitters, and that is a great method. But it’s a little abstract for some folks, who want to know exactly how the data is calculated, so I’m going to go old school and use linear weights (lwts).

I know this is really elementary for a lot of you, and I apologize if it’s repetitive. But I think explaining how linear weights work will help some people here understand why these 2 players have contributed the same amount of offense. They are equally productive.

My favorite mathematician is Andrey Markov, who was not only brilliant, but also sexy as hell. Markov is best known for Markovian Chains, which were based on the study of the probability of mutually dependent events. To simplify, an event which is influenced by past events and will influence future events would be considered Markovian. For example, the stock market would be considered Markovian, as would a blackjack game, while a coin flip would not. Baseball certainly is Markovian, in that what happens in a current at bat is influenced by past events, just as future events are influenced by present events. What I mean is that a single with two outs and no runners on base has a different value than a single with the bases loaded and no outs, with the events leading up to those respective singles being the difference, and the subsequent events also influencing the total run result. Clearly, all singles are not equal. So how do we determine with any degree of accuracy the “value” of a single (or the result of any at bat)? The following chart will help answer that question. The chart gives the “net expected run value” for each of the 24 different situations a batter can see in an at bat, from none on, none out, to bases loaded, two outs. The run values are calculated by looking at each at bat over the course of multiple seasons and determining how many runs score subsequent to each of the situations. These totals are then averaged to give the “net expected run values” for each situation.

The following chart uses data from 2015-2018, and while the run expectancy numbers do change depending on run environment, the value of these 2 players will remain equal in all run environments (I couldn't resist including a picture of Markov and those sexy math eyes!):

View attachment 60660



This may be easier to understand when looking at the first batter of an inning--with no outs and none on, an average team will score 0.492 runs. Multiplying this by 9 innings in a game yields just over 4.4 runs, which is the average number of runs scored per game by Major League Baseball teams from 2015-18.

It’s worth taking a minute to look at the chart, as it will show the value of an out. For example, if you have a runner on 1st and no outs, the average runs scored is .859. If the team successfully bunts him to 2nd base, you now have a runner on 2nd and 1 out, which will on average yield .678 runs. That bunt ‘cost’ the team almost 2/10 of a run. And while there certainly are situations where the bunt is a good strategy, in most it is not. Outs are incredibly valuable in baseball. But I digress.

Back to linear weights. To calculate linear weights, you add up the change in run expectancy produced by every occurrence of an offensive event and divide that total by the number of times the event occurred. For example, if you took the total change in expected runs generated by all singles hit over a season or multiple seasons and divided that by the total number of singles, you would get the average value of a single. Doing the same thing for doubles, triples, home runs, outs, etc., gives you the average value of each of those offensive events. If you were to total the values of all the hits, walks, hit by pitches, stolen bases, caught stealing, and outs over the course of a season, you would arrive at a sum close to zero (some plays, such as passed balls or balks, are not a product of offense and are thus not accounted for in linear weights). Simply multiplying a batter's singles, doubles, outs, etc., by the calculated average value of those events will, therefore, give you the total number of runs a player has contributed above or below an average player.

The event values are as follows:

• singles +0.47
• doubles +0.76
• triples +1.03
• home runs +1.40
• walks/HBP +0.33
• intentional walks +0.185
• stolen bases +0.193
• caught stealing -0.437
• outs -0.271

Because we know the cost of player A's and player B's outs are identical, we can eliminate them from this comparison. A sample season for both players with 600 plate appearances would be:

Player   PA   AB    H   1B  2B  3B  HR  BB/HBP     BA    OBP    SLG
A       600  556  175  115  34   1  25      44  0.315  0.365  0.515
B       600  515  134   74  24   1  35      85  0.260  0.365  0.515


Multiplying each player's hit and walk/HBP contribution by the event values above gives us player A at 130.4 and player B at 131.1, less than 1 run contributed apart. They are equally productive. You can try different numbers that create the same slash lines, and you will always get the same result. Remember, these players batted in the same situations with the same teammates in the same parks, etc., etc.

I hope this all makes sense. My wife often says I'm effectively illiterate in both Spanish and English. If you have any questions, please don't hesitate to ask.

Thanks for the nice distraction. Maybe next time we can do wins vs runs, or the role of luck in sports :rolleyes:
Brilliant explanation and analysis. Thanks.
 

JM3

often quoted
SoSH Member
Dec 14, 2019
18,715
Good point.
I guess my preference is the patient power hitter, but admit that preference is based on those players being undervalued for a long time.
You seem more versed in this than me, but I am guessing forcing high pitch counts as a team is a product almost entirely of OB%. Is that accurate?
I was kicking around some stuff on this in the Mondesi thread. In order for the league leader & the guy who is last in the league in pitches thrown to see the same # of pitches per out, the league leader would have to get on base 25% of the time while the guy who is last would have to get on base 40% of the time.
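One way to read that 25% vs. 40% comparison, as a sketch: if an out is approximated as any plate appearance where the batter does not reach base (a simplifying assumption that ignores double plays and outs on the bases), then pitches per out is pitches per PA divided by (1 - OBP). The function name and the 4.0 pitches-per-PA figure below are hypothetical and only there to check the break-even point.

```python
def pitches_per_out(pitches_per_pa, obp):
    """Pitches seen per out made, treating every non-on-base PA as one out."""
    return pitches_per_pa / (1.0 - obp)

# On-base rates from the post: 25% for the league leader in pitches seen,
# 40% for the hitter who sees the fewest.
leader_obp, trailer_obp = 0.25, 0.40

# Ratio of pitches per PA at which the two break even on pitches per out.
break_even_ratio = (1.0 - leader_obp) / (1.0 - trailer_obp)
print(f"{break_even_ratio:.2f}")  # 1.25

# Sanity check with a hypothetical 4.0 pitches per PA for the trailer.
trailer_ppa = 4.0
leader_ppa = trailer_ppa * break_even_ratio
print(round(pitches_per_out(leader_ppa, leader_obp), 3),
      round(pitches_per_out(trailer_ppa, trailer_obp), 3))
```

Under this simplification, the 25%/40% pairing corresponds to the leader seeing about 25% more pitches per plate appearance than the hitter in last place.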

I think raw pitch count is less important in today's baseball, where starters aren't going deep into games anyway, than it was when they were still expected to pitch 7+ innings per game & getting into the bullpen earlier against an effective pitcher was super important. If teams are focused more on trips through the lineup, actually getting on base becomes more important than making the other team throw more pitches.

& as to your question, I would say that high pitch counts forced and OBP are likely highly correlated but it wouldn't necessarily be the cause as there are various ways to go about the high OBP (1st pitch singles v. long drawn out strikeouts for example). & I think your thought process is generally correct that if all else is equal, making the other team throw more pitches is fine to use as a tiebreaker.

The most important thing about the whole thought experiment to me, though, is that it's not who is more valuable in the season they put up those #s...it's who would you rather have the next season. & that's Player B, by a lot.
 

dhappy42

Straw Man
Oct 27, 2013
15,913
Michigan
…The most important thing about the whole thought experiment to me, though, is that it's not who is more valuable in the season they put up those #s...it's who would you rather have the next season. & that's Player B, by a lot.
Why? If they are equally productive in terms of runs created and therefore, presumably, wins, why favor Player B at all, let alone by a lot?
 

JM3

often quoted
SoSH Member
Dec 14, 2019
18,715
Why? If they are equally productive in terms of runs created and therefore, presumably, wins, why favor Player B at all, let alone by a lot?
Because he is better at the things that are more predictive year-over-year like walks & exit velocity (probably), while Player A likely had a higher BABIP, which is more prone to fluctuation. The caveat is it's impossible to really know without adding strikeout rates to the mix.

So I guess it's an oversimplification to say definitely Player B next year. If strikeout rates are similar, though, it would be a no-brainer.
 

KillerBs

New Member
Nov 16, 2006
957
A couple questions, my apologies if this point has been addressed completely above.

How do we "know the cost of player A's and player B's outs are identical"? We know they both batted into the same number of outs (setting aside the GIDPs) but if one or the other made outs which are more or less likely to advance runners, would that not be potentially significant in measuring the batters' productivity? The same point goes for the relative nature of the singles. Again, if one of the hitters more consistently hit singles of a sort more likely to advance runners two bases would this not be at least theoretically relevant to the comparison? Note the difference here is not about the base/out scenarios faced by each batter in the real world, but their relative ability to alter the base/out state by way of their hits and outs.

Another way to put the same point perhaps: linear weights measures the average value of singles, doubles, etc and outs but we do not know if these two hitters were both average in this respect, even setting aside the base/out settings they were presented with.
 

BaseballJones

ivanvamp
SoSH Member
Oct 1, 2015
25,819
Without giving any context with respect to base runners and outs, all we can do is equalize those for both players, and just measure the players' performance themselves, in a vacuum.

I mean, a player that goes 0-162 seems like a VERY unproductive player, but if in every one of those 162 ab he drives in a runner from third with a groundout, he actually will be pretty useful in terms of actual run production. (162 rbi in 162 ab is pretty amazing, wouldn't you say?)

So since we aren't told of those base running and out contexts, we just have to look at what they themselves do.
 

Frisbetarian

♫ ♫ ♫ ♫ ♫ ♫
Moderator
SoSH Member
Dec 3, 2003
5,305
Off the beaten track
Without giving any context with respect to base runners and outs, all we can do is equalize those for both players, and just measure the players' performance themselves, in a vacuum.

I mean, a player that goes 0-162 seems like a VERY unproductive player, but if in every one of those 162 ab he drives in a runner from third with a groundout, he actually will be pretty useful in terms of actual run production. (162 rbi in 162 ab is pretty amazing, wouldn't you say?)

So since we aren't told of those base running and out contexts, we just have to look at what they themselves do.
If there are no outs and a runner on 3rd when your fictional batter grounds out and drives in the run, he has actually cost his team runs on average. From the run expectancy chart, following a runner on 3rd and no outs a team will score 1.357 runs on average. The run scores, and the batter gets credit for that, but he leaves a situation of 1 out and no one on base, where the average runs scored is 0.261 runs. 1 + 0.261 - 1.357 = -0.096 runs.

Outs are really valuable in baseball.
 

BaseballJones

ivanvamp
SoSH Member
Oct 1, 2015
25,819
If there are no outs and a runner on 3rd when your fictional batter grounds out and drives in the run, he has actually cost his team runs on average. From the run expectancy chart, following a runner on 3rd and no outs a team will score 1.357 runs on average. The run scores, and the batter gets credit for that, but he leaves a situation of 1 out and no one on base, where the average runs scored is 0.261 runs. 1 + 0.261 - 1.357 = -0.096 runs.

Outs are really valuable in baseball.
I know but I was just talking about context helping us understand how 0-162 can actually differ depending on the scenarios.
 

KillerBs

New Member
Nov 16, 2006
957
But the ground out scoring the runner from 3rd is undoubtedly more valuable (or less costly to future estimated runs) than a K, or pop out in that scenario.

I found this old Tangotiger tweet which provides linear weights for Ks vs "Field Outs" which I take to mean any batted ball out.

View: https://twitter.com/tangotiger/status/1038232219595235328/photo/1


From this, a K costs on average about 1/100 of a run more than a field out. I expect the GIDP are separated out here in this calculation, but idk.

Hence, if Player 1 had 200Ks and 181 field outs and Player 2 had 80Ks and 301 field outs (ie same number of outs) we could expect Player 2 to generate an extra 1.6 runs compared to Player 1 over the season. IOW, not enough to worry about.
 

JM3

often quoted
SoSH Member
Dec 14, 2019
18,715
Could you explain why these are so different?
I would assume it relates to the times when teams choose to IBB someone & deploying that option strategically at times that will lead to fewer additional runs, as opposed to accidentally walking & hitting people at suboptimal times, like to lead off an inning or with the bases loaded (unless it's Barry Bonds).
 

dhappy42

Straw Man
Oct 27, 2013
15,913
Michigan
Could you explain why these are so different?
Walks occur more or less at random. Intentional walks occur only when the fielding team believes there's a run-expectancy advantage to walking the batter, such as increasing the probability of a double play. You rarely, if ever, see a team intentionally walk in a run.
 

TapeAndPosts

Member
SoSH Member
Jul 21, 2006
601
You rarely, if ever, see a team intentionally walk in a run.
Intentionally walking the batter with the bases loaded has happened, to my knowledge, three times since WWII: Barry Bonds in 1998, Josh Hamilton in 2008, and Corey Seager in 2022. Some interesting facts:

* All three times, the team issuing the intentional walk won the game.
* In the 2022 case, the team issuing the intentional walk was already losing when they issued it, and still came back to win (!).
* 2/3 of these intentional walks were ordered by Joe Maddon. (The first was ordered by Buck Showalter.)

Okay, carry on!
 