The Other Fifteen

Eighty-five percent of the f---in' world is working. The other fifteen come out here.


Another site announcement

There's been quite a bit of radio silence around here recently, and hopefully that'll be corrected soon.

In the meantime, if you're interested in more data on Kosuke Fukudome than you can shake a stick at, well, find a stick and head on over to Goatriders.

Also, I've joined the crew over at Statistically Speaking, along with Brian Cartwright. I'm real excited to be working with Eric, Pizza and Brian. My first post is up, and I should be on a Friday schedule from here on out. If you're interested in things like FIP and BaseRuns, you should definitely take a look.

2008 pitcher Marcels

Like the hitters. Witness the power of this fully-operational widget!

I haven't updated my table since the 29th, in case you were wondering. New for the pitcher Marcels is a reliability score; higher is better. None of this is park-adjusted. ERA and FIP-ERA are both provided; I’d be happier with either Component ERA or BaseRuns converted to earned runs, I suspect, but both of those sound like work. (And, in fairness to me, would require a lot more data than I’m capturing now – I suspect I could get more pitching categories if I really, really desperately wanted to, but it would increase the amount of work involved exponentially. Remember – spidering these from Baseball-Reference gives me ID mappings and ages easily, both of which are insanely important to doing the projections.)

Remember: I am not projecting playing time, I am extrapolating playing time.

SQL code:

CREATE TABLE pitching
AS
SELECT ( CASE WHEN p.playerID is not null
        THEN p.playerID
        ELSE p.Player END ) AS playerID
    , 2008 AS yearID
    , 1 AS weight
    , p.Ag AS Age
    , SUM(p.G) AS G
    , SUM(p.GS) AS GS
    , SUM(p.H) AS H
    , SUM(p.ER) AS ER
    , SUM(p.HR) AS HR
    , SUM(p.BB) AS BB
    , SUM(p.SO) AS SO
    , SUM(p.HBP) AS HBP
    , SUM(p.IP) AS IP
    , SUM(p.BFP) AS BFP
from ( SELECT * from 7_29_08_pitching p
    LEFT JOIN ( select Player AS BPlayer, (AB+BB+SH+SF) AS PA from 7_29_08_batting ) b
    ON p.Player = b.BPlayer
    LEFT JOIN ( select bbrefID, playerID from bdb.master ) m
    ON p.Player = m.bbrefID ) p
WHERE ( p.PA < (p.BFP) OR p.PA IS NULL )
    AND p.BFP > 0
GROUP BY Player
UNION ALL
SELECT p.playerID
    , p.yearID
    , POW(0.999,(2008-p.yearID)*365) AS weight
    , (CASE WHEN m.birthMonth < 7 THEN ( p.yearID - m.BirthYear ) ELSE ( p.yearID - m.BirthYear - 1 ) END) AS Age
    , SUM(p.G) AS G
    , SUM(p.GS) AS GS
    , SUM(p.H) AS H
    , SUM(p.ER) AS ER
    , SUM(p.HR) AS HR
    , SUM(p.BB) AS BB
    , SUM(p.SO) AS SO
    , SUM(p.HBP) AS HBP
    , ROUND(SUM(p.IPouts)/3,1) AS IP
    , SUM(p.BFP) AS BFP
FROM
    ( SELECT * from bdb.pitching p
    LEFT JOIN ( select playerID AS bplayerID, yearID AS byearID, (b.AB+b.BB+b.SH+b.SF+b.HBP) AS PA
        from bdb.batting b WHERE b.yearID > 2004 ) b
        ON p.playerID = b.bplayerID AND p.yearID = b.byearID
        WHERE p.yearID > 2004) p, bdb.master m
WHERE p.playerID = m.playerID
    AND ( (p.PA) < (p.BFP) OR p.BFP IS NULL )
    AND (p.BFP) > 0
GROUP BY yearID, playerID;

CREATE TABLE average_pitch
AS
SELECT yearID
    , POW(0.999,(2008-p.yearID)*365) AS weight
    , SUM(p.G) AS G
    , SUM(p.GS) AS GS
    , SUM(p.H) AS H
    , SUM(p.ER) AS ER
    , SUM(p.HR) AS HR
    , SUM(p.BB) AS BB
    , SUM(p.SO) AS SO
    , SUM(p.HBP) AS HBP
    , SUM(p.IP) AS IP
    , SUM(p.BFP) AS BFP
FROM pitching p
GROUP BY yearID;

CREATE TABLE pitcher_league_average
AS
SELECT p.playerID
    , (SUM(a.weight)*SUM(a.G)*SUM(p.IP)) / (SUM(a.weight)*SUM(p.IP)) AS G
    , (SUM(a.weight)*SUM(a.GS)*SUM(p.IP)) / (SUM(a.weight)*SUM(p.IP)) AS GS
    , (SUM(a.weight)*SUM(a.H)*SUM(p.IP)) / (SUM(a.weight)*SUM(p.IP)) AS H
    , (SUM(a.weight)*SUM(a.ER)*SUM(p.IP)) / (SUM(a.weight)*SUM(p.IP)) AS ER
    , (SUM(a.weight)*SUM(a.HR)*SUM(p.IP)) / (SUM(a.weight)*SUM(p.IP)) AS HR
    , (SUM(a.weight)*SUM(a.BB)*SUM(p.IP)) / (SUM(a.weight)*SUM(p.IP)) AS BB
    , (SUM(a.weight)*SUM(a.SO)*SUM(p.IP)) / (SUM(a.weight)*SUM(p.IP)) AS SO
    , (SUM(a.weight)*SUM(a.HBP)*SUM(p.IP)) / (SUM(a.weight)*SUM(p.IP)) AS HBP
    , (SUM(a.weight)*SUM(a.BFP)*SUM(p.IP)) / (SUM(a.weight)*SUM(p.IP)) AS BFP
    , (SUM(a.weight)*SUM(a.IP)*SUM(p.IP)) / (SUM(a.weight)*SUM(p.IP)) AS IP
FROM pitching p, average_pitch a
WHERE p.yearID = a.yearID
GROUP BY playerID;

CREATE TABLE pitcher_league_average_prorated
AS
SELECT playerID
    , ( G / IP * 318 ) AS G
    , ( GS / IP * 318 ) AS GS
    , ( H / IP * 318 ) AS H
    , ( ER / IP * 318 ) AS ER
    , ( HR / IP * 318 ) AS HR
    , ( BB / IP * 318 ) AS BB
    , ( SO / IP * 318 ) AS SO
    , ( HBP / IP * 318 ) AS HBP
    , ( BFP / IP * 318 ) AS BFP
    , 318 AS IP
FROM pitcher_league_average;

CREATE TABLE player_Age_2008_pitching
AS
SELECT playerID
    , yearID
    , MAX(Age)
    , ( CASE
        WHEN ( MAX(Age)+(2008-yearID) ) > 29
        THEN 1 + ( 29 - ( MAX(Age)+(2008-yearID) ) )*0.003
        ELSE 1 + ( 29 - ( MAX(Age)+(2008-yearID) ) )*0.006 END ) AS Curve
FROM pitching
GROUP BY playerID;

CREATE TABLE player_ip_2008
AS
SELECT playerID, G, IP, (IP/110) AS IP_G, ROUND((IP/110)*52*3)/3 AS IP_LEFT from pitching WHERE yearID = 2008;

CREATE TABLE pitching_marcels_2008
AS
SELECT p.playerID
    , ROUND(( SUM(p.weight)*SUM(p.G) + w.G ) / ( SUM(p.weight)*SUM(p.IP) + 318 ) * i.IP_LEFT ) AS G
    , ROUND(( SUM(p.weight)*SUM(p.GS) + w.GS ) / ( SUM(p.weight)*SUM(p.IP) + 318 ) * i.IP_LEFT ) AS GS
    , ROUND(( SUM(p.weight)*SUM(p.H) + w.H ) / ( SUM(p.weight)*SUM(p.IP) + 318 ) * i.IP_LEFT / c.Curve) AS H
    , ROUND(( SUM(p.weight)*SUM(p.ER) + w.ER ) / ( SUM(p.weight)*SUM(p.IP) + 318 ) * i.IP_LEFT / c.Curve) AS ER
    , ROUND(( SUM(p.weight)*SUM(p.HR) + w.HR ) / ( SUM(p.weight)*SUM(p.IP) + 318 ) * i.IP_LEFT / c.Curve) AS HR
    , ROUND(( SUM(p.weight)*SUM(p.BB) + w.BB ) / ( SUM(p.weight)*SUM(p.IP) + 318 ) * i.IP_LEFT / c.Curve) AS BB
    , ROUND(( SUM(p.weight)*SUM(p.SO) + w.SO ) / ( SUM(p.weight)*SUM(p.IP) + 318 ) * i.IP_LEFT * c.Curve) AS SO
    , ROUND(( SUM(p.weight)*SUM(p.HBP) + w.HBP ) / ( SUM(p.weight)*SUM(p.IP) + 318 ) * i.IP_LEFT / c.Curve) AS HBP
    , ROUND(( SUM(p.weight)*SUM(p.BFP) + w.BFP ) / ( SUM(p.weight)*SUM(p.IP) + 318 ) * i.IP_LEFT ) AS BFP
    , i.IP_LEFT AS IP
    , (SUM(p.weight)*SUM(p.IP)) / ( SUM(p.weight)*SUM(p.IP) + 318 ) AS R
FROM pitching p, pitcher_league_average_prorated w, player_Age_2008_pitching c, player_ip_2008 i
WHERE p.playerID = w.playerID
    AND p.playerID = i.playerID
    AND p.playerID = c.playerID
GROUP BY playerID;

I’m less confident that I’m implementing Marcels correctly with these, simply because Tango has published less about them. But, outside of any errors I’ve made, the only thing left to do – I think - would be to implement real playing time projections. (I also have to backport the reliability calculation to the hitter projections, but that’s easy enough.) Past that… well, they wouldn’t be Marcels anymore. (Okay, so I should be projecting pitching stats by league as well.)

Later this weekend my intent is to take a couple of players and actually show you the calculations going on behind this code, so that anyone that’s unsure on exactly what’s going on here gets to see the guts of the system.

Labels: