The Other Fifteen

Eighty-five percent of the f---in' world is working. The other fifteen come out here.

Looking at forecast accuracy

I love player forecasts. Absolutely love them. But something's been bugging me for a while about them.

Several studies of forecast accuracy have been done; a good rundown of them is here. They all share one rather common flaw, however: all of them have excluded players below a certain level of playing time. I understand why it's done that way; pick your poison on how you want to evaluate ballplayers, and all of them require a certain sample size before you can have confidence you're approaching true-talent level. But we're introducing a bias here in only looking at forecasts where players were good enough to receive sufficient playing time to make it into the sample.

Why is this bad? Well, let's suppose we wanted to try and game our system to beat the test. (I want to note: I'm not accusing any forecaster of doing this; the people behind any of the forecasting systems I'm likely to mention here have track records of being reputable and "above board".) Here's what you would do: you'd set a "floor" to your forecasts, because any player below that floor is likely to fall out of the sample given the constraints of the evaluation.

Again, I don't think anyone is doing this. But there are other possible reasons a forecasting system could be too rosy, and the studies I linked to above would do an imperfect job of capturing that.

So how can we leave our bad/part-time players in the sample while still addressing our sample size concerns? Well, instead of a standard correlation, we can use a weighted correlation, using the formula provided here. I also tested weighted average error. Both were weighted by plate appearances. [I won't lie to you - I avoided using RMSE because, quite frankly, it scares me and I have no idea what it's doing; I am, after all, a liberal arts major.]
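To make the two measures concrete, here's a minimal sketch of a PA-weighted correlation and a PA-weighted average error. The function names are my own; this assumes the standard weighted-covariance form of the correlation, with all three sums normalized by total weight:

```python
from math import sqrt

def weighted_mean(x, w):
    # Weighted average: sum of (weight * value) over total weight
    return sum(wi * xi for wi, xi in zip(w, x)) / sum(w)

def weighted_corr(x, y, w):
    # Weighted Pearson correlation: weighted covariance over the
    # product of weighted standard deviations
    mx, my = weighted_mean(x, w), weighted_mean(y, w)
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y)) / sum(w)
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x)) / sum(w)
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y)) / sum(w)
    return cov / sqrt(vx * vy)

def weighted_avg_error(proj, actual, w):
    # Mean absolute error, weighted by plate appearances
    return sum(wi * abs(p - a) for wi, p, a in zip(w, proj, actual)) / sum(w)
```

With equal weights this reduces to the ordinary correlation, so full-time players simply count for more without part-timers being thrown out entirely.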

I took all players in the Baseball-DataBank who hit but didn't pitch in 2007. (This means "two-way players" like Scott Spiezio got left out of the sample. I'm willing to live with this.) I calculated wOBA (using the weights provided) for all of those players, for both the 2007 results and the Marcels forecast. For all players not given a projection by Marcels, I used the average wOBA of .338.
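For readers who haven't worked with wOBA before, here's a sketch of the calculation. These are the original coefficients published in The Book; the exact weights vary by season and implementation, so treat the numbers as illustrative rather than as exactly what was used here:

```python
def woba(nibb, hbp, singles, doubles, triples, hr, pa, rboe=0):
    """wOBA using the original weights from The Book.

    nibb = non-intentional walks, rboe = reached base on error.
    Coefficients vary by season/version; these are illustrative.
    """
    num = (0.72 * nibb + 0.75 * hbp + 0.90 * singles + 0.92 * rboe
           + 1.24 * doubles + 1.56 * triples + 1.95 * hr)
    return num / pa
```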

(Why Marcels? Because it was readily available and because it comes with IDs for all players that I can easily cross-reference with the Databank. PECOTA and ZiPS don't have IDs listed in the spreadsheet, and I can't get the 2007 CHONE projections at all.)

So, how did the monkey fare? Not well, I'm afraid:

[Table: Correlation and Avg. Error for Marcels, Marcels_2, and Marcels_3]

Marcels is just as described above. Marcels_2 excludes those players for which Marcels did not provide an actual forecast. Marcels_3 reduces the projected wOBA to bring the forecasts in line with the league average (a shockingly low .313 - I triple-checked those figures before I proceeded with the rest of the study).
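The post doesn't spell out exactly how the Marcels_3 adjustment works, but one natural reading is a multiplicative rescale so the PA-weighted mean of the projections matches the league average. A sketch, under that assumption (the function name is mine):

```python
def rescale_to_league(projections, pa, league_avg):
    # PA-weighted mean of the raw projections
    total_pa = sum(pa)
    proj_mean = sum(w * p for w, p in zip(pa, projections)) / total_pa
    # Shrink (or inflate) every projection by the same factor so the
    # weighted mean of the adjusted projections equals league_avg
    scale = league_avg / proj_mean
    return [p * scale for p in projections]
```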

[As for how to read those numbers: average error is how close Marcels came, expressed in points of wOBA. Correlation measures how closely vectors representing the two datasets match up, measured from -1 to 1. Jacob Cohen's guidelines for interpreting a correlation say that from .30 to .49 is a "medium" correlation; these are simply guidelines.]

The next step should be to add in one of the more "advanced" forecasting systems and see if they're any better than the monkey. I'll also be publishing the full dataset used so that people can... well, probably criticize my methodology. It's late and I'm tired of wrestling with EditGrid vs. Excel issues, so here's the data used, without any of the math that figures out correlation or average error.

(Special thanks to Larry Garfield for some help with SQL queries.)


2 Responses to “Looking at forecast accuracy”

  1. Angus Lau

    I am curious to know what issues you are having with EditGrid vs. Excel. Let's see if we can help out.

    EditGrid Team  

  2. Colin Wyers

    Angus -

    Thanks a lot for stopping by. The spreadsheet I'm using uses a lookup function to pull data from other spreadsheets, and when I convert it to a binary Excel file (I'm using Office 2007) and open it in EditGrid, those functions no longer work - obviously, because EditGrid can't access those other spreadsheets. I'm sure there's a workaround, I just haven't found it yet.
