So, you want to talk about a player’s defense?

Remember: a good sabermetrician is like a good hunter cleaning his kill: he throws away as little as possible, taking care to use most of the animal. We have decades of information about players; why should we ever use only three and a half months’ worth of data in evaluating a player?

My process is based heavily on Tango’s Marcel forecasting system; that said, he had nothing to do with this, and any screwups in it are mine, not his. (For background on how a projection system works, here’s a decent writeup, if I do say so myself.)

Before going any further, I should note that I made this in about two hours. And I also made dinner in those two hours. And I had a side dish. So don’t expect anything on the order of PECOTA as far as complexity goes.

Here’s how it works. Every player’s zone rating data from 2005-2008 (yep, including everything before the All-Star break this year) is thrown into a mixer and weighted. I used a 5/4/3/2 weighting; I have no empirical basis for these weights other than that it’s what Marcel uses. Then throw in two seasons’ worth of the league average for the position. There’s your regression to the mean.
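A rough sketch of that mixing step, in Python. Several things here are assumptions that are not spelled out in the post: that the 5/4/3/2 weights run from the most recent season back to the oldest, that the weights are applied to raw plays and chances, and that "two seasons' worth" of league average means twice a full-season chance total at the league zone rating, added in with a weight of one. All of the names are made up for illustration.

```python
from dataclasses import dataclass

# Assumption: weights run newest to oldest (2008 gets 5, 2005 gets 2).
SEASON_WEIGHTS = {2008: 5, 2007: 4, 2006: 3, 2005: 2}


@dataclass
class SeasonLine:
    year: int
    chances: float  # balls hit into the fielder's zone
    plays: float    # plays made on those balls


def projected_zone_rating(seasons, league_zr, full_season_chances,
                          regression_seasons=2.0):
    """Weighted zone rating, regressed toward the league average.

    full_season_chances and regression_seasons are hypothetical
    parameters; the post only says "two seasons' worth".
    """
    weighted_plays = 0.0
    weighted_chances = 0.0
    for line in seasons:
        weight = SEASON_WEIGHTS.get(line.year, 0)
        weighted_plays += weight * line.plays
        weighted_chances += weight * line.chances

    # Regression to the mean: add league-average performance over
    # regression_seasons full seasons of chances.
    regression_chances = regression_seasons * full_season_chances
    weighted_plays += league_zr * regression_chances
    weighted_chances += regression_chances

    return weighted_plays / weighted_chances
```

Under these assumptions, a player with no data at all projects to exactly the league average, and a player with only a little data gets pulled most of the way back toward it.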

Aging curves are… forthcoming. Maybe. I’m still hashing out the details. (I’ve started work on zone-rating-based aging curves for fielders, but there are questions about how accurate they are, and they need to be smoothed out a bit more before they can be used in a projection system.)

So, data. Plays and runs above or below average are figured using the Dial method. For that, each player is assumed to have a full season’s worth of chances at the position, not the number of chances used to compute zone rating.
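For anyone who hasn't seen it, here is a minimal Python sketch of that conversion as I understand the Dial method: the gap between a player's zone rating and the league's is multiplied by a chance total to get plays above average, and plays are turned into runs with a per-play run value for the position. The function names are hypothetical, and the chance total and run value in the example are placeholders, not the figures used for the projections.

```python
def plays_above_average(player_zr, league_zr, full_season_chances):
    """Plays made above (or below) an average fielder at the position.

    full_season_chances is a full season's worth of chances at the
    position, not the chances behind the player's own zone rating.
    """
    return (player_zr - league_zr) * full_season_chances


def runs_above_average(plays, runs_per_play):
    """Convert plays to runs using a position-specific run value."""
    return plays * runs_per_play


# Placeholder inputs: a .850 fielder in a .800 league, 400 chances,
# and an assumed 0.75 runs per play.
plays = plays_above_average(0.850, 0.800, 400)
runs = runs_above_average(plays, 0.75)
```

Because every player gets the same full-season chance total, the run figures are rate stats scaled to a season rather than a statement about how much anyone will actually play.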

The next step beyond aging curves would probably be to incorporate at least some measure of speed scores into the projection. But I was hungry, and so instead you have the best projection system I could make in two hours, while still making dinner. It’s a start, at least.

(Also, lemme take this chance to plug my hitter and pitcher evaluations on GROTA, if you have an interest in such things regarding the Cubs. Hitter and pitcher projections are next on my plate.)

Labels: Defense, Projections

Good stuff, thanks. What's your source data? STATS ZR or BIS ZR or both? Are you using BIS's OOZ data? I'm hoping your answers are "both" and "yes".

Also, are you projecting playing time? If so, how? Are the fielding runs listed pro-rated over a certain number of chances, like a full season's worth?

I'm not projecting playing time - I suppose the CH column is a rough indicator of playing time, but I'd have to put a lot more work into it to actually call it a projection.

Fielding plays/runs are for a full season. I'm only using the STATS ZR right now - I don't have the BIS data with IDs to use for it, and so it'd take a few hours just to generate and validate the IDs off Hardball Times.