The Other Fifteen

Eighty-five percent of the f---in' world is working. The other fifteen come out here.

A look at baserunning

I'll go ahead and be honest; baserunning is something I don't pay a lot of attention to. If a guy can hit the ball and field the ball, that's usually good enough for me. I mean, usually a guy has to be on the bases to run them, right?

But something that sticks in my craw is when people tell me that baserunning is one of the things "your stats can't measure." First of all, they're not my stats - I don't know that I've contributed one thing to the larger body of knowledge about baseball. Second - my damned stats sure as hell can measure your baserunning! And so that's what we're going to do here.

I should note that what I said above still holds true - I rely greatly upon the work and ideas of other, better people. I am not a sabermetrician, just someone who writes about sabermetrics. So it behooves me to say that I'm standing on the shoulders of giants here.

First of all, none of this would be possible without a database of play-by-play data. Therefore, the required boilerplate text:

"The information used here was obtained free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet at 20 Sunset Rd., Newark, DE 19711."

Truly, Retrosheet is gold-plated awesome.

Second, many acknowledgements to Dan Fox; you can read more about his (far superior) EqBRR stats on his blog or at Baseball Prospectus. Also of great help was Lee Panas of Tiger Tales; I highly recommend his excellent series on baserunning metrics. And in case you haven't noticed already, I rely very heavily on the research of Tom Tango.

Parsing the Retrosheet logs, I took a look at twelve different common baserunning events. Only the lead runner was considered; it's not fair to judge a runner based on how well the guy on base in front of him is running. (Dusty Baker infamously referred to this as "clogging the bases.")

Event Code | Normal outcome | XB
1B_2B | Runner on first advances to second on a single | Runner on first advances to third on a single
2B_3B | Runner on first advances to third on a double | Runner on first advances to home on a double
1B_GB | Runner on first stays on first after a groundout | Runner on first advances to second after a groundout
1B_FB | Runner on first stays on first after a flyout | Runner on first advances to second after a flyout
1B_NOTINPLAY | Runner on first stays on first on a ball not in play | Runner on first advances to second on a ball not in play
2B_3B | Runner on second advances to third on a single | Runner on second advances to home on a single
2B_GB | Runner on second stays on second after a groundout | Runner on second advances to third after a groundout
2B_FB | Runner on second stays on second after a flyout | Runner on second advances to third after a flyout
2B_NOTINPLAY | Runner on second stays on second on a ball not in play | Runner on second advances to third on a ball not in play
3B_GB | Runner on third stays on third after a groundout | Runner on third advances to home after a groundout
3B_FB | Runner on third stays on third after a flyout | Runner on third advances to home after a flyout
3B_NOTINPLAY | Runner on third stays on third on a ball not in play | Runner on third advances to home on a ball not in play

Every time one of those events occurred, several things got recorded into a table. First, I noted one of three outcomes: the "normal" outcome, the XB (or "extra base") outcome, or the runner being thrown out on the play. I also noted how many outs there were in the inning - the correct decision on whether or not to run is dependent upon the number of outs in the inning, and heads-up baserunners will change their baserunning strategy accordingly. All of those things are then assigned to a baserunner and totaled up. [More accurately, there's a set of SQL queries and an Excel spreadsheet that do most of the work.] Balls not in play include stolen bases, wild pitches and passed balls.
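
For anyone who wants to picture the bookkeeping, here's a minimal sketch in Python of the tallying step. It is not the SQL and Excel I actually used; the runner IDs and play rows are made up for illustration, and the names tally and opportunities are hypothetical.

```python
from collections import Counter

# Hypothetical play records: (runner_id, event_code, outs, outcome).
# These rows are invented for illustration; they are not Retrosheet data.
plays = [
    ("runnerA", "1B_2B", 1, "normal"),
    ("runnerA", "1B_2B", 1, "xb"),
    ("runnerA", "3B_FB", 0, "xb"),
    ("runnerB", "1B_2B", 1, "out"),
    ("runnerB", "1B_2B", 1, "normal"),
    ("runnerB", "1B_2B", 1, "normal"),
]


def tally(plays):
    """Count how often each (runner, event code, outs, outcome) occurred."""
    return Counter(plays)


def opportunities(counts, runner):
    """Total number of recorded events (the Opp. column) for one runner."""
    return sum(n for (r, _, _, _), n in counts.items() if r == runner)


counts = tally(plays)
print(counts[("runnerB", "1B_2B", 1, "normal")])
print(opportunities(counts, "runnerA"))
```

Keying the counts on the number of outs as well as the event and outcome keeps the situations separate, which matters because the run values differ by out state.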

Every play is then assigned a run value based upon its run expectancy. We'll use the scenario Tango uses to explain this. Let's say you have a runner on first base with one out. According to Tango's run expectancy chart, an average of .573 runs score in the remainder of an inning from that situation. Suppose that the next batter hits a single. (That would be an event code 1B_2B in my system.)

If the runner on first advances to second, then the run expectancy increases to .971. A better baserunner will, depending on the situation, advance to third instead; the run expectancy with runners at the corners and one out is 1.243. Sometimes, however, a baserunner will get thrown out advancing to third; with a runner at first and two outs, the run expectancy drops to .251.

[Note that we assume that the trail runner stays on first in either the extra base or thrown out scenarios; that's a convenient abstraction, one that introduces a small amount of inaccuracy in exchange for avoiding a large computational headache.]
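
To make the arithmetic concrete, here's a small Python sketch using only the four run expectancy figures quoted above. One assumption to flag: it treats the run value of an outcome as the change in run expectancy from the situation before the play. The choice of baseline washes out later, once the league average for the same opportunities is subtracted.

```python
# Run expectancy figures quoted in the text (from Tango's chart).
RE_BEFORE = 0.573  # runner on first, one out

RE_AFTER = {
    "normal": 0.971,  # runners on first and second, one out
    "xb": 1.243,      # runners at the corners, one out
    "out": 0.251,     # runner on first, two outs
}


def run_values(re_before, re_after):
    """Run value of each outcome as the change from the pre-play state.

    Assumption: baseline is the run expectancy before the single.
    """
    return {outcome: round(re - re_before, 3) for outcome, re in re_after.items()}


values = run_values(RE_BEFORE, RE_AFTER)
for outcome, value in values.items():
    print(f"{outcome}: {value:+.3f}")
```

Put another way, taking the extra base is worth .272 runs more than stopping at second (1.243 versus .971), while getting thrown out costs .720 runs relative to stopping (.971 versus .251).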

So, using the run expectancy formula, we assign run values to all of our event outcomes, based on the number of outs in the inning. For each runner, we multiply his outcomes by their run expectancy values, and sum up. Then, for a final step, we take a look at how the average baserunner would have performed given the same opportunities, and subtract that number from the sum. That gives us our +/- rating of runs above/below average.
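
Here's a rough Python sketch of that last step, under some loud assumptions: the tallies are invented, only the 1B_2B one-out bucket is filled in, and "how the average baserunner would have performed" is taken to mean the league-wide average run value per opportunity within each event and out-state bucket. The function names are hypothetical, and my actual spreadsheet may differ in the details.

```python
from collections import defaultdict

# Run values per (event_code, outs, outcome). Only this one bucket is
# filled in; the figures follow from the run expectancies in the text.
RUN_VALUE = {
    ("1B_2B", 1, "normal"): 0.398,
    ("1B_2B", 1, "xb"): 0.670,
    ("1B_2B", 1, "out"): -0.322,
}

# Made-up tallies for illustration: runner -> {(event, outs, outcome): count}
TALLIES = {
    "runnerA": {("1B_2B", 1, "normal"): 5, ("1B_2B", 1, "xb"): 4, ("1B_2B", 1, "out"): 1},
    "runnerB": {("1B_2B", 1, "normal"): 8, ("1B_2B", 1, "xb"): 1, ("1B_2B", 1, "out"): 1},
}


def league_average(tallies, run_value):
    """Average run value per opportunity in each (event_code, outs) bucket."""
    runs = defaultdict(float)
    opps = defaultdict(int)
    for counts in tallies.values():
        for (event, outs, outcome), n in counts.items():
            runs[(event, outs)] += n * run_value[(event, outs, outcome)]
            opps[(event, outs)] += n
    return {bucket: runs[bucket] / opps[bucket] for bucket in opps}


def plus_minus(counts, run_value, avg):
    """Runs above or below an average runner given the same opportunities."""
    actual = sum(n * run_value[key] for key, n in counts.items())
    expected = sum(n * avg[(event, outs)] for (event, outs, _), n in counts.items())
    return actual - expected


AVG = league_average(TALLIES, RUN_VALUE)
RATINGS = {runner: plus_minus(c, RUN_VALUE, AVG) for runner, c in TALLIES.items()}
for runner, rating in RATINGS.items():
    print(f"{runner}: {rating:+.3f}")
```

Because the average is computed from the same pool of runners, the ratings in this toy example sum to zero: one runner's plus is the other's minus.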

Since I'm a miserable computer programmer, all of my SQL queries take about forever to run, so I limited myself to the years 2004-2007. Fair warning: single season performances are subject to sample size issues, especially with things like baserunning.

Okay, let's take a look at the top 10 best and worst baserunning seasons:

Batter | Name | Year | Opp. | +/-
furcr001 | Rafael Furcal | 2004 | 298 | 12.46229
barfj003 | Josh Barfield | 2006 | 246 | 11.80434
pierj002 | Juan Pierre | 2007 | 368 | 11.08171
carrj001 | Jamey Carroll | 2006 | 295 | 10.95289
rollj001 | Jimmy Rollins | 2005 | 364 | 10.93031
milea001 | Aaron Miles | 2004 | 281 | 10.5188
crawc002 | Carl Crawford | 2004 | 400 | 10.2553
rollj001 | Jimmy Rollins | 2004 | 321 | 9.949088
durhr001 | Ray Durham | 2004 | 218 | 9.554747
milea001 | Aaron Miles | 2007 | 192 | 9.416688

Batter | Name | Year | Opp. | +/-
lawtm002 | Matt Lawton | 2005 | 245 | -7.7626
lee-d002 | Derrek Lee | 2007 | 221 | -7.92601
ortid001 | David Ortiz | 2005 | 157 | -8.12203
konep001 | Paul Konerko | 2004 | 161 | -8.21513
lowem001 | Mike Lowell | 2007 | 163 | -8.24073
sosas001 | Sammy Sosa | 2004 | 141 | -8.24097
sancf001 | Freddy Sanchez | 2005 | 170 | -8.45961
mattg002 | Gary Matthews, Jr. | 2006 | 282 | -9.21264
garkr001 | Ryan Garko | 2007 | 163 | -10.2327
berkl001 | Lance Berkman | 2004 | 213 | -10.3233

Makes sense, yes? Speedy middle infielders and outfielders run better than slow corner infielders and outfielders. Lee is a surprising entry on our trailers; he has a reputation for being a very good basestealer for a first baseman. But on the whole it doesn't seem very controversial.

What we also get here are the run values of being a good or poor baserunner. The difference between the best and worst over this four-year period was about 23 runs, or roughly two wins. Now two wins is nothing to scoff at, but it pales in comparison to defense or offense; nobody would look at this chart and say, "Gee, I think that the Sox should try and see if they can trade David Ortiz for Aaron Miles."
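
The back-of-the-envelope math behind that, as a quick Python sketch. The best and worst figures come straight from the tables above; the ten-runs-per-win conversion is the usual rule of thumb and is an assumption here, not something derived in this post.

```python
BEST = 12.46229    # Rafael Furcal, 2004
WORST = -10.3233   # Lance Berkman, 2004
RUNS_PER_WIN = 10.0  # assumption: common rule-of-thumb conversion

spread_runs = BEST - WORST
spread_wins = spread_runs / RUNS_PER_WIN

print(f"{spread_runs:.1f} runs, about {spread_wins:.1f} wins")
```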

And, since this is ostensibly a Cubs blog - your 2007 Chicago Cubs!

Name | Batter | Opp. | +/-
Derrek Lee | lee-d002 | 221 | -7.92601
Michael Barrett | barrm003 | 106 | -7.43955
Mark DeRosa | derom001 | 153 | -1.8259
Aramis Ramirez | ramia001 | 154 | -7.01669
Alfonso Soriano | soria001 | 206 | -3.55499
Ryan Theriot | therr001 | 219 | -3.70184
Jacque Jones | jonej003 | 149 | -0.68092
Cliff Floyd | floyc001 | 75 | 0.153609
Matt Murton | murtm001 | 84 | 1.03073
Mike Fontenot | fontm001 | 94 | -0.82379
Cesar Izturis | iztuc001 | 152 | 0.30366
Felix Pie | pie-f001 | 71 | 1.143167
Jason Kendall | kendj001 | 161 | -1.61157
Angel Pagan | pagaa001 | 74 | 2.719269
Daryle Ward | wardd002 | 44 | -0.61667
Koyie Hill | hillk002 | 15 | 0.13925
Ronny Cedeno | ceder002 | 11 | -0.14754
Geovany Soto | sotog001 | 17 | -1.63394

This will come as a surprise to nobody who watched the Cubs last season, but good Lord the Cubs sucked at running the bases. Daryle Ward was saved from himself on the basepaths by aggressive pinch-running by Lou Piniella; the same couldn't be said for Lee, Ramirez and Barrett. Oh, and that number for Soto in just 17 opportunities? Ah, catchers and their appreciable lack of knees.

The numbers on Soriano and Lee surprise me. Soriano was below average in 2006 as well (possibly a byproduct of an excessive number of caught stealings that occurred chasing the mythical 40/40), but was very solid the previous two seasons. Lee, on the other hand, has been a below-average baserunner the past four years.

Oh, and I just have to add - Ryan Theriot is a below-average baserunner. That is all.

Spreadsheet available for download.

The next step is to work out a Marcels-like projection system using this data, so that I can incorporate it into my WAR depth chart.


15 Responses to “A look at baserunning”

  1. # Anonymous pmayo

    It seemed like you alluded to this, but I'll go ahead and make the (probably redundant) point. It seems to me that a player like Lee, or Ramirez, for that matter, while they might be -7 on the basepaths, they more than offset that -7 with plus defense and run production. OTOH, a player like Theriot, who offers only average defense and very little run production cannot possibly offset even a -3.5.  

  2. # Blogger Samael2681

    I just love the list of top 3 baserunning Cubs:

    1. Angel Pagan
    2. Matt Murton
    3. Felix Pie

    To think that Matt Murton was in the top 3 on anything involving the Cubs last year, particularly something he's not really supposed to be that good at, is entertaining. Of course there are sample size issues, but it's still funny.

  3. # Blogger Samael2681

    Is there any way to get running totals for all of the seasons? That would be pretty useful, but having downloaded the spreadsheet, I'm pretty sure I'm looking at SFR to add the totals, but I just wanted to check that I was looking at it correctly.

  4. # Anonymous Maddog

    Lee's 2007 is an aberration. He was above average in the +/- system by James, which is very similar to this, in 2005 and 2006. His 2007 was absolutely terrible. He made 5 outs on the bases (excluding caught stealing), which tied him with Ryan Theriot and a few others for 2nd most BR outs in baseball. He made only 1 BR out in 2005 and 2006 combined.

    Like Sam said, there are sample size issues and frankly, the fact that Murton shows up above average is a huge red flag that something is flawed. We're talking about a guy who didn't score from 3rd base in the same at-bat on a passed ball, a wild pitch, and ground ball to 2nd base with less than 2 outs.  

  5. # Anonymous Sky

    Good stuff, Colin. Keep it coming.

    Would it be possible to separate the events into categories that would allow stolen base attempts to be measured separately from advancing on batted balls?

    Do the Cubs still have that crazy third base coach who sent everyone home and ran the Cubs into a ton of outs? If not, how about running a few seasons when he was the 3B coach comparing him to the new 3B coach in situations he could affect?  

  6. # Anonymous Eric

    I liked Angel Pagan as an extra outfielder. Why colitis somehow derailed his Cub career is a mystifier to me (laughs). Oh well. If I'm Felix Pie I'd be mighty upset that my name is the third best on that list.  

  7. # Blogger Colin Wyers

    pmayo - To be honest, the main reason I did this was to show that Ryan Theriot's superior baserunning wasn't offsetting his liabilities on offense. But I didn't expect him to be below average!

    The Murton figures do seem a bit odd, but he's been above-average for three years according to these rankings. If you can figure out what events I'm missing that turn Murton into a poor baserunner I'll add them to the events list, but I'm inclined to buy on the general conclusion. It's really a "smallest midget" sort of thing, really - he comes off looking good but hardly excellent. (Which is pretty much the story of Matt Murton's life; just like Ryan Theriot is the player I use as a living embodiment of replacement player, Matt Murton is my example of a perfectly league-average ballplayer.)

    Retrosheet stores coaching data separately from the event files, so it would take some work to get that integrated with the baserunning metrics. "Wavin'" Wendell Kim left the Cubs several years ago, anyway, after 2004. It might be a fun thing to do a WOWY on; I know Dan Fox has done something similar before.

    A running total would certainly be possible, though. And I could separate out the stolen bases from the rest of the events, but for what I'm using these for, I don't know if I see the value in doing so. It's probably the great debate about baserunning metrics, and I'd say I understand the arguments for doing so but I really don't.

  8. # Blogger Colin Wyers

    Oh, and on the topic of Derrek Lee - I know it's certainly an outlier, but I'm not sure it's an aberration. Lee's stock in range-based defensive ratings dropped across the board in 2007. It's possible that Lee is losing a step, both in the field and on the bases. You'd regress to the mean, obviously, but it's an area of (mild) concern.

  9. # Anonymous Maddog

    Yeah, Lee's lost a step, or whatever you want to call it. I've always looked at Lee and saw when others were calling him an athlete a guy who could trip over his own shoes 3 or 4 times per day. He's always seemed clumsy to me and his defense is wildly overrated at 1st. Age caught up with him, in my opinion, getting rid of any youth he might have displayed prior to 2007. Now he's just an old, tall, clumsy player. I expect his baserunning will be just as poor in the future. I shouldn't have used the word aberration though.

    As for Theriot, he's just dumb. He has the ability at the plate to work the count in his favor and doesn't. People who want to call Theriot a smart ballplayer are ignoring a shitload of evidence that suggests otherwise.

    The same could be said for Matt Murton. If I had a nickel every time a Cubs fan said how smart a ballplayer he was I'd be filthy rich.

    I think both he and Theriot are incredibly stupid ballplayers. Neither seems to have half a clue what the hell they're doing half the time.

    You want an example of how stupid Theriot is? When there's a runner on first and he hits a groundball that's probably going to be turned into two outs, he runs down the line while shaking his shoulders in and out. Just watch next time. If he didn't do that stupid nonsense he'd probably ground into half as many double plays as he does.

    Funny thing about these numbers, Colin. It isn't going to change a single person's mind about Ryan Theriot. Go show TheHawk your numbers and he'll say you're missing something and incapable of understanding baseball. Theriot fans are Theriot fans for life.

  10. # Anonymous DeRoMyHero


    Question: Does this data incorporate "Cedeno incidents" (i.e., getting picked off)?

    Also, could you please, pretty please explain to me why Lou thinks that DeRo isn't a good enough baserunner to hit at the top of the order despite his high OBP, but Riot is a great baserunner that needs to be at the top of the order despite his low OBP?

    Any insight you could give me into Lou's thinking would be greatly appreciated.


  11. # Blogger Colin Wyers

    DRMH: Stolen bases.

    And I'd have to double check, but runners getting picked off should be included in the balls not in play category.  

  12. # Blogger Wrigleyville

    as i am excel-less on my mac (too cheap, i guess), what do your statistics say about moises alou? especially '04ish.

    god he was terrible. or did it just seem that way?  

  13. # Blogger Colin Wyers

    He was pretty bad in 2004 - about five runs below average. The numbers have declined steadily since then as he gets on base less.

    If you don't want to fork out the cash for Excel, you could download OpenOffice for free.  

  14. # Anonymous Mike D.

    If you can figure out what events I'm missing that turn Murton into a poor baserunner I'll add them to the events list

    I understand this is simply anecdotal, but there was a game in Atlanta early last season where Murton was on third with nobody out and the Cubs trailing by two runs. On a chopper by Jock Jones hit to Chipper Jones, Murton froze along the third base line in spite of the fact that Chipper never looked back in conceding a run that was not the tying run. Chipper threw to first and was surprised to see Murton still on third. That was certainly not the first time Murton appeared to display a real lack of fluidity on the basepaths, although it jumps to mind as it was at a critical juncture of the game.

    Murton always struck me as someone who lacked smart baserunning instincts. He's not necessarily slow, but he certainly always seemed to be lacking lucid awareness. OTOH I suppose he doesn't get dinged too badly because this can be construed as simply being "conservative" and thus may have more value than some moron blindly taking an extra base when it's not warranted. I like Murton, and while I don't doubt the validity of this study, it doesn't convince me that Murton is anything less than a lousy baserunner.  

  15. # Blogger Lee Panas

    Good job Colin. It's nice to see somebody else looking at base running.

