First off, a caveat: I have no idea what exactly this data means. I don't know that it means anything, per se. But I think it's interesting nonetheless, and in lieu of sitting on it until I can make something of it, I'm going to share it as is - if any of you think you know what it means, please feel free to let me know.
Basically, I counted how many times each player got a single, double or triple off of a ground ball, then divided that by his total ground balls to get H/GB - in other words, his batting average on ground balls. Full results are available on EditGrid. 2007 only.
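For anyone who wants to see the H/GB idea spelled out, here is a minimal sketch in Python. It is not the query I actually ran: the (player, outcome) record layout, the outcome labels, and the function name are all made up for illustration, and real Retrosheet event files encode outcomes differently. The sample rows are toy data, not real 2007 results.

```python
from collections import defaultdict

# Hypothetical outcome labels; real Retrosheet event data uses its own codes.
HIT_OUTCOMES = {"single", "double", "triple"}

def hits_per_ground_ball(ground_balls):
    """Return {player: (hits, ground_balls, rate)} from (player, outcome) pairs.

    Every pair is assumed to describe one ground ball.
    """
    hits = defaultdict(int)
    total = defaultdict(int)
    for player, outcome in ground_balls:
        total[player] += 1
        if outcome in HIT_OUTCOMES:
            hits[player] += 1
    return {p: (hits[p], total[p], hits[p] / total[p]) for p in total}

# Toy input, invented for illustration only (not real players or real data).
sample = [
    ("batter_a", "single"), ("batter_a", "out"),
    ("batter_a", "out"), ("batter_a", "double"),
    ("batter_b", "out"), ("batter_b", "out"),
    ("batter_b", "single"), ("batter_b", "out"),
]

rates = hits_per_ground_ball(sample)
# Print a small leaderboard, best rate first.
for player, (h, gb, rate) in sorted(rates.items(), key=lambda kv: -kv[1][2]):
    print(f"{player}: {h}/{gb} = {rate:.3f}")
```

A playing-time minimum like the one used for the lists below would just be a filter applied before sorting.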
For a frame of reference, the league-average batting average on ground balls is .243. So, now, some leaders and trailers.
Top five, minimum 100 plate appearances:
- Matt Kemp, .442
- Jeff Salazar, .407
- Willy Taveras, .398
- Ichiro Suzuki, .377
- Omar Infante, .375
Bottom five, minimum 100 plate appearances:
- Jason Phillips, .083
- Ramon Martinez, .094
- Koyie Hill, .100
- Chad Tracy, .101
- Barry Bonds, .123
The first list is dominated by speedy outfielders and middle infielders; the second by catchers and corner infielders. I'm sure this surprises nobody.
Worst non-Koyie Hill Cub? Daryle Ward - shocking, I know - hitting only .171 on grounders. After that is... Felix Pie? That surprises the hell out of me, for one. Pie hit 68 ground balls, and only got hits on 12 of them. Seriously doubt that's sustainable.
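To put a rough number on how little 68 ground balls tells us, here is a quick back-of-the-envelope check. It treats each grounder as an independent trial at the league rate of .243, which is a simplifying assumption on my part and not a claim about how hits on grounders really work.

```python
import math

LEAGUE_AVG = 0.243            # league batting average on ground balls (from above)
hits, ground_balls = 12, 68   # Felix Pie's 2007 line (from above)

rate = hits / ground_balls

# Standard error of a proportion, ASSUMING each ground ball is an independent
# trial with the league-average chance of becoming a hit.
std_err = math.sqrt(LEAGUE_AVG * (1 - LEAGUE_AVG) / ground_balls)

# How many standard errors Pie's rate sits from the league average.
z = (rate - LEAGUE_AVG) / std_err

print(f"rate={rate:.3f} std_err={std_err:.3f} z={z:.2f}")
```

Under that assumption, Pie's .176 sits only about 1.3 standard errors below the league average, which is well within the range that plain luck could produce over a sample this small.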
Best Cub? Cliff Floyd at .327 - again, surprising the hell out of me. Next most? How about Matt Murton, at .295.
And, since I know everyone is wondering - Theriot hit .249 on ground balls. Aramis Ramirez hit .234 on ground balls last season, if you're interested in comparing that sort of thing. Soriano hit .279.
If I go ahead and expand my net a bit, we see Geovany Soto and his absurd .471 batting average on grounders. Probably not sustainable.
I want to just go ahead and remind everyone that this doesn't mean anything - at least, not without a lot more context than I've provided you with so far. (Any statistic that has Barry Bonds and Koyie Hill that close to each other on a leaderboard needs to be taken with a huge grain of salt.)
Simply glancing at the list, it seems to be roughly correlated with speed, which makes intuitive sense. (To actually figure out what the correlation is, we'd need a measure of speed - and nobody has published a list of 40-yard-dash times with Retrosheet IDs, sadly. I know about Speed Score, but quite frankly just looking at the description of how to calculate it hurts my brain.) I'm also pretty certain that there's a lot of noise in that list.
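If a usable speed measure did turn up, checking the relationship would be simple enough. Here is a sketch of a plain Pearson correlation in Python. The `pearson` helper is my own illustration, and the numbers in `speed` and `gb_avg` are placeholders invented purely to show the call; they are not real players and not real measurements.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and each variable's root sum of squares.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / (spread_x * spread_y)

# PLACEHOLDER values, invented for illustration only.
speed = [1.0, 2.0, 3.0, 4.0]       # some hypothetical speed measure
gb_avg = [0.20, 0.22, 0.27, 0.30]  # hypothetical batting averages on grounders

r = pearson(speed, gb_avg)
print(f"r = {r:.3f}")
```

With real inputs, the two lists would need to be matched up player by player, which is exactly where something like Retrosheet IDs would come in.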
To be honest, I learned more about SQL than I did about baseball doing this - which makes sense, as this was mostly a learning exercise to get more comfortable working with Retrosheet data.
From that perspective, it was largely a success - I learned how to automate a lot of things I've been stupidly doing by hand, which is great. I am starting to run up against limitations in my current setup, however - I only have a handful of fields from one year of Retrosheet data. Tomorrow I'm going to play around with a nifty script I've found to hopefully give myself a better sandbox to play with.