The Other Fifteen

Eighty-five percent of the f---in' world is working. The other fifteen come out here.


Projecting RZR

There are two breeds of vanilla, free-as-in-beer zone rating available in the world: STATS and BIS. I already have a dumb projection system for STATS ZR, which could be refined (aging curves and speed/tools scores are the two major refinements I’m musing over).

But first I wanted to introduce BIS’s RZR into it. And therein lies a dilemma, folks. Here are the league totals and averages for RZR and OZR (OOZ divided by BIZ) over the years available at The Hardball Times:

POS  YEAR   Plays   OOZ    BIZ    RZR    OZR
1B   2004    4070  1783   5406   .753   .330
1B   2005    4343  1940   5493   .791   .353
1B   2006    3877  2012   4851   .799   .415
1B   2007    4963  1048   6695   .741   .157
1B   2008    2871   847   3815   .753   .222
1B   Total  20124  7630  26260   .766   .291

2B   2004    9863  1203  12129   .813   .099
2B   2005   10403  1478  12825   .811   .115
2B   2006   10401  1211  12679   .820   .096
2B   2007   10120  1412  12192   .830   .116
2B   2008    6313   649   7693   .821   .084
2B   Total  47100  5953  57518   .819   .103

SS   2004    9872  1919  11995   .823   .160
SS   2005   10484  1948  12821   .818   .152
SS   2006   10809  1659  13218   .818   .126
SS   2007   10625  1912  13019   .816   .147
SS   2008    6353   999   7627   .833   .131
SS   Total  48143  8437  58680   .820   .144

3B   2004    6215  2074   9007   .690   .230
3B   2005    6813  2396   9271   .735   .258
3B   2006    7686  1636  10880   .706   .150
3B   2007    7221  1717  10623   .680   .162
3B   2008    4444  1003   6344   .701   .158
3B   Total  32379  8826  46125   .702   .191

CF   2004    9478  2034  11905   .796   .171
CF   2005   10266  1963  12590   .815   .156
CF   2006   10316  2002  11534   .894   .174
CF   2007   10886  1944  12264   .888   .159
CF   2008    5922  1583   6468   .916   .245
CF   Total  46868  9526  54761   .856   .174

LF   2004    7710   847  12242   .630   .069
LF   2005    8686   718  13712   .633   .052
LF   2006    7723  1634   8971   .861   .182
LF   2007    8014  1614   9373   .855   .172
LF   2008    4475  1076   5060   .884   .213
LF   Total  36608  5889  49358   .742   .119

RF   2004    8736   781  13442   .650   .058
RF   2005    9181   695  14161   .648   .049
RF   2006    8376  1686   9436   .888   .179
RF   2007    8418  1575   9597   .877   .164
RF   2008    4802  1205   5321   .902   .226
RF   Total  39513  5942  51957   .760   .114

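(2008 numbers will be slightly different from Studes’ numbers, as these are a few days old.) The projections for infielders are doable. But, as it stands, those outfield numbers are a horror show, taken by themselves: league RZR in left and right field sits around .63 to .65 in 2004-05 and then jumps to .85 or better from 2006 on, while BIZ falls by about a third between 2005 and 2006.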
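For anyone following along at home, the two rate columns fall straight out of the counting columns: RZR is Plays divided by BIZ, and OZR is OOZ divided by BIZ. Here is a quick Python check using the left field rows from the table. It is only a sketch to show the arithmetic, not the code behind the table, and the names in it are my own.

```python
# League totals for left field, copied from the table: (Plays, OOZ, BIZ).
LF_TOTALS = {
    2004: (7710, 847, 12242),
    2005: (8686, 718, 13712),
    2006: (7723, 1634, 8971),
    2007: (8014, 1614, 9373),
    2008: (4475, 1076, 5060),
}


def rzr(plays, biz):
    """Revised zone rating: plays made on balls in zone, per ball in zone."""
    return plays / biz


def ozr(ooz, biz):
    """Out-of-zone plays per ball in zone, as defined in the post."""
    return ooz / biz


# Rates for each season, rounded to three places like the table.
rates = {
    year: (round(rzr(plays, biz), 3), round(ozr(ooz, biz), 3))
    for year, (plays, ooz, biz) in LF_TOTALS.items()
}

for year, (r, o) in rates.items():
    print(year, f"RZR={r:.3f}", f"OZR={o:.3f}")
```

Running it reproduces the LF rows, including the jump in RZR between 2005 and 2006.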

So before we can make projections based upon RZR data, we first need to normalize it. I’m sure there are better ways than the one I’m using, but I don’t think I’m using the worst way either and it’s very expedient for my needs.

What I’m doing is dividing Plays, OOZ and BIZ by the totals for that season, and then multiplying by the averaged totals of all five years.
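A rough sketch of that normalization in Python, in case the one-sentence version is too terse. It assumes "the totals for that season" means the league totals at the player's position, it uses the left field rows from the table, and the player line at the bottom is made up purely for illustration. The names (`LF_TOTALS`, `AVG_TOTALS`, `normalize`) are hypothetical, not taken from the actual projection code.

```python
from statistics import mean

# League totals for left field, copied from the table: (Plays, OOZ, BIZ).
LF_TOTALS = {
    2004: (7710, 847, 12242),
    2005: (8686, 718, 13712),
    2006: (7723, 1634, 8971),
    2007: (8014, 1614, 9373),
    2008: (4475, 1076, 5060),
}

# Average league total for each column across the five seasons.
AVG_TOTALS = tuple(mean(column) for column in zip(*LF_TOTALS.values()))


def normalize(year, plays, ooz, biz):
    """Divide each stat by that season's league total, then multiply by
    the five-year average league total for the same stat."""
    season_totals = LF_TOTALS[year]
    return tuple(
        stat / season * average
        for stat, season, average in zip((plays, ooz, biz), season_totals, AVG_TOTALS)
    )


# Hypothetical 2005 left fielder: 250 plays, 20 OOZ, 400 BIZ (raw RZR .625).
norm_plays, norm_ooz, norm_biz = normalize(2005, 250, 20, 400)
print(f"raw RZR        = {250 / 400:.3f}")
print(f"normalized RZR = {norm_plays / norm_biz:.3f}")
```

One handy property of doing it this way: a league-average line for any season normalizes to the same five-year rate (.742 for LF, the Total row in the table), so a player's normalized RZR is effectively his raw RZR scaled by how that season's league rate compares to the five-year one.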

And, since I was rather short with the explanation the last time out, I’ll go ahead and spell out what I’m doing in full:

  1. First, as above, every player’s performance is “normalized” to an average of the past five seasons.
  2. Then, a weighted average of their past four seasons (05-08) is taken, with the most recent season being given a weight of 5, then 4, then 3, then 2.
  3. Two weights’ worth of a full season of league-average defensive performance is added as a regression-to-the-mean component.
  4. 5 + 4 + 3 + 2 + 2 = 16, so everything gets divided by 16. I wouldn’t exactly call the resulting number of chances a playing time projection, but it’s a rough guide to how much playing time a player might be expected to receive.
  5. Plays and Runs above average are figured for a full season’s performance, given the number of chances of the average player at that position from 04 through 08.
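The steps above can be sketched in Python roughly as follows. This is my reading of the recipe rather than the actual projection code, and it leans on a few assumptions: the inputs are already normalized (Plays, BIZ) pairs, a season the player missed counts as zeros, and "a full season's average performance" means the league RZR applied to a full season of chances. `full_season_biz` and the sample player are hypothetical numbers. The conversion to runs is left out because it needs a run value per play, which isn't given here.

```python
WEIGHTS = (5, 4, 3, 2)      # most recent season first: 2008, 2007, 2006, 2005
REGRESSION_WEIGHT = 2       # weights' worth of league-average play
TOTAL_WEIGHT = sum(WEIGHTS) + REGRESSION_WEIGHT   # 5 + 4 + 3 + 2 + 2 = 16


def project(seasons, league_rzr, full_season_biz):
    """Project a fielder from normalized (plays, biz) pairs, most recent first.

    Returns (projected BIZ, projected RZR, plays above average per full season).
    Use (0, 0) for a season the player did not play (an assumption).
    """
    # Regression component: league-average play over a full season of chances.
    average_plays = league_rzr * full_season_biz

    plays = sum(w * p for w, (p, _) in zip(WEIGHTS, seasons))
    biz = sum(w * b for w, (_, b) in zip(WEIGHTS, seasons))
    plays = (plays + REGRESSION_WEIGHT * average_plays) / TOTAL_WEIGHT
    biz = (biz + REGRESSION_WEIGHT * full_season_biz) / TOTAL_WEIGHT

    projected_rzr = plays / biz
    # Step 5: rate difference applied to an average player's full-season chances.
    plays_above_average = (projected_rzr - league_rzr) * full_season_biz
    return biz, projected_rzr, plays_above_average


# Hypothetical left fielder with three seasons of normalized data (2008, 2007, 2006)
# and no 2005 season. league_rzr is the LF Total row; 300 chances is a placeholder.
sample = [(240, 300), (250, 310), (200, 260), (0, 0)]
proj_biz, proj_rzr, proj_paa = project(sample, league_rzr=0.742, full_season_biz=300)
print(f"BIZ={proj_biz:.1f}  RZR={proj_rzr:.3f}  plays above average={proj_paa:+.1f}")
```

The projected rate always lands between the player's own weighted rate and the league rate, and the less a player has played, the closer to the league rate he ends up. A player with no history at all projects as exactly average, with a BIZ figure of two-sixteenths of a full season.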

And… here are the projections. You can compare them to the STATS ZR projections, if you’d like.

(Note: Currently only players with a Baseball Databank ID who have appeared in 2008 are included in either projection set. The next step is to take the rest of the players in the RZR set, map them to the appropriate STATS ID, and run both projections side by side for all players who played in 2008, and maybe some who haven’t yet but could.)

So what’s next? Like I said before, these could really benefit from aging curves. (While I’m on the topic, Jon Shepherd over at Camden Depot has published RZR aging curves which are worth taking a look at. I have my own ZR aging curves which I should really try and get straightened out.) I really should probably run “projections” for seasons past and see how they match up with what actually happened.

And I want to work on combining data from multiple positions. I’ve done some comparisons of players who have played multiple positions, and my feeling from looking at the data is that, for projecting a player’s zone rating, there really isn’t a lot of difference in difficulty between the outfield positions: it’s not really much harder to catch fly balls in center field than it is anywhere else, but there are a lot more fly balls to catch, and so a good fielder is worth a lot more there. But that’s worth exploring more, and there are some noteworthy sampling issues in that data; I find it hard to believe that a center fielder is below average as a first baseman defensively, for example. I should rerun this query on the RZR dataset soon and see what that looks like.
