Sabermetrics
- Linear weights – analysts have used multiple linear regression to determine the relative run value of baseball events (1B, 2B, 3B, HR, BB, HBP, etc.)
- One out is worth about -.25 runs
- OPS = OBP + SLG
- BRA = OBP * SLG
- XRR = (.50 * 1B) + (.72 * 2B) + (1.04 * 3B) + (1.44 * HR) + .33 * (HBP + BB) + .18 * SB - .32 * CS - .098 * (AB - H)
- LWTs = (.46 * 1B) + (.80 * 2B) + (1.02 * 3B) + (1.40 * HR) + .33 * (HBP + BB) + .30 * SB - .60 * CS - .25 * (AB - H)
- By Pete Palmer
- Measures Runs Above Average
- Base Runs, created by David Smyth
- More flexible
- Works across different run scoring environments (might even work in softball)
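The run estimators listed above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `BattingLine` container and the sample stat line are made up for the example, and the OBP and SLG formulas are the standard definitions, which these notes do not spell out. Base Runs is left out because the notes give no formula for it.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Counting stats for one season (hypothetical container, not from the notes)."""
    ab: int
    h: int
    doubles: int
    triples: int
    hr: int
    bb: int
    hbp: int
    sb: int
    cs: int
    sf: int = 0

    @property
    def singles(self) -> int:
        return self.h - self.doubles - self.triples - self.hr

    @property
    def total_bases(self) -> int:
        return self.singles + 2 * self.doubles + 3 * self.triples + 4 * self.hr


def obp(s: BattingLine) -> float:
    # Standard on-base percentage (assumed definition).
    return (s.h + s.bb + s.hbp) / (s.ab + s.bb + s.hbp + s.sf)


def slg(s: BattingLine) -> float:
    # Standard slugging percentage (assumed definition).
    return s.total_bases / s.ab


def ops(s: BattingLine) -> float:
    return obp(s) + slg(s)


def bra(s: BattingLine) -> float:
    return obp(s) * slg(s)


def xrr(s: BattingLine) -> float:
    # Weights copied from the XRR line in the notes.
    return (0.50 * s.singles + 0.72 * s.doubles + 1.04 * s.triples
            + 1.44 * s.hr + 0.33 * (s.hbp + s.bb)
            + 0.18 * s.sb - 0.32 * s.cs - 0.098 * (s.ab - s.h))


def lwts(s: BattingLine) -> float:
    # Palmer's linear weights; a single is worth .46 runs, an out -.25.
    return (0.46 * s.singles + 0.80 * s.doubles + 1.02 * s.triples
            + 1.40 * s.hr + 0.33 * (s.hbp + s.bb)
            + 0.30 * s.sb - 0.60 * s.cs - 0.25 * (s.ab - s.h))


# Made-up stat line, for illustration only.
line = BattingLine(ab=600, h=180, doubles=35, triples=5, hr=25,
                   bb=70, hbp=5, sb=10, cs=4, sf=5)
for name, fn in [("OPS", ops), ("BRA", bra), ("XRR", xrr), ("LWTs", lwts)]:
    print(f"{name}: {fn(line):.3f}")
```

The same stat line gives a much smaller number under LWTs than under XRR because LWTs charges -.25 per out and therefore measures runs above average, while XRR's smaller out penalty yields an absolute run estimate.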
Statistics
Regression to the Mean
- Think of sophomore slumps and SI cover jinx
- Often leads to incorrect interpretation of results
- We are often measuring outcomes, not talents or skills
- Outcomes are not necessarily a perfect depiction of talent. Outcomes can have a heavy element of luck.
- Envision a typical scatterplot. Even when there is strong correlation between two variables, there is always error above and below that line of best fit.
- Outcomes = Innate Skills & Talents (IS&T) + Error, Luck, Chance, Randomness (ELCR)
- IS&T changes over time; it fluctuates
- ELCR can change the outcome even when IS&T doesn’t change!
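A small simulation can make the Outcomes = IS&T + ELCR idea concrete. The sketch below is illustrative only: the league average of .260, the spreads for talent and luck, the number of players, and the seed are all assumptions chosen for the example, not measured values. Each simulated player's talent is held fixed across two seasons, so any change in outcome comes purely from luck.

```python
import random
from statistics import mean


def simulate(n_players=2000, talent_sd=0.020, luck_sd=0.030, seed=42):
    """Return (talents, year1, year2) where each outcome = talent + luck.

    All parameter values are assumptions made for this illustration.
    """
    rng = random.Random(seed)
    talents = [rng.gauss(0.260, talent_sd) for _ in range(n_players)]
    # Talent is identical in both years; only the luck draw differs.
    year1 = [t + rng.gauss(0, luck_sd) for t in talents]
    year2 = [t + rng.gauss(0, luck_sd) for t in talents]
    return talents, year1, year2


talents, year1, year2 = simulate()

# Pick the top 10% of players by year-1 outcome.
cutoff = sorted(year1, reverse=True)[len(year1) // 10 - 1]
top = [i for i, outcome in enumerate(year1) if outcome >= cutoff]

league_avg = mean(year1)
top_year1 = mean(year1[i] for i in top)
top_year2 = mean(year2[i] for i in top)
top_talent = mean(talents[i] for i in top)

print(f"league average, year 1:   {league_avg:.3f}")
print(f"top group, year 1:        {top_year1:.3f}")
print(f"top group, year 2:        {top_year2:.3f}")
print(f"top group, true talent:   {top_talent:.3f}")
```

The top group was selected partly for good luck in year 1, so its year-2 average drops back toward its true talent while staying above the league average. That is the pattern behind the sophomore slump and the SI cover jinx.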
Technology
SQL Commands
- Describe TableName – gives output of all field names, types, keys, etc. for the specified table
- You can select from two instances of the same table in SQL by giving each instance its own alias. For example, you can create one instance of the batting table “b12” where yearID=2012, and select data from that. You can then create a second instance “b13” where yearID=2013, and select data from that. Then use WHERE b12.playerID = b13.playerID to get stats side-by-side for the two years.
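The two-instance query described above can be tried without a database server. The sketch below uses Python's built-in sqlite3 module purely so it is self-contained (SQLite has no Describe command, so only the self-join is shown). The table layout is a cut-down assumption with just four columns, and the player IDs and numbers are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE batting (playerID TEXT, yearID INTEGER, AB INTEGER, H INTEGER)"
)
# Made-up rows, for illustration only.
conn.executemany(
    "INSERT INTO batting VALUES (?, ?, ?, ?)",
    [
        ("playerA", 2012, 500, 150),
        ("playerA", 2013, 520, 140),
        ("playerB", 2012, 450, 120),
        ("playerB", 2013, 480, 135),
        ("playerC", 2013, 300, 80),   # no 2012 row, so no match in the join
    ],
)

# Two aliases (b12, b13) of the same table, matched on playerID.
query = """
SELECT b12.playerID, b12.H AS h_2012, b13.H AS h_2013
FROM batting AS b12, batting AS b13
WHERE b12.yearID = 2012
  AND b13.yearID = 2013
  AND b12.playerID = b13.playerID
ORDER BY b12.playerID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
conn.close()
```

Only players with a row in both years appear in the result, which is usually what a year-over-year comparison needs.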
R Commands
- file.choose() – Opens dialog box in operating system to choose the file to load
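- DataFrameName["XXX"] = XYZ – adds a new field called "XXX" to a data set using the variable array XYZ
- lm(X ~ Y) – fits a linear model; here it is a linear regression with X as the response and Y as the predictor
- boxplot() – creates a box plot that shows min, max, median, and quartiles (by default, points more than 1.5 times the interquartile range from the box are drawn separately as outliers)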
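To see what a call like lm(X ~ Y) estimates in the one-predictor case, the least-squares line can be computed by hand. This is a sketch in Python rather than R, using the textbook closed-form formulas. The function name `fit_line` and the data are made up for the example, and the data are deliberately perfectly linear so the result is easy to check.

```python
from statistics import mean


def fit_line(predictor, response):
    """Ordinary least squares with one predictor.

    Returns (intercept, slope) for: response ~ intercept + slope * predictor.
    """
    mp, mr = mean(predictor), mean(response)
    sxy = sum((p - mp) * (r - mr) for p, r in zip(predictor, response))
    sxx = sum((p - mp) ** 2 for p in predictor)
    slope = sxy / sxx
    intercept = mr - slope * mp
    return intercept, slope


# Made-up team data: runs scored as an exact linear function of OPS.
team_ops = [0.700, 0.720, 0.750, 0.780, 0.800]
team_runs = [650, 680, 725, 770, 800]

intercept, slope = fit_line(team_ops, team_runs)
print(f"runs ~ {intercept:.1f} + {slope:.1f} * OPS")
```

Real data never falls exactly on the line, so a fit on actual seasons would leave residuals above and below it.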
History
Earnshaw Cook
- Wrote the book “Percentage Baseball” in 1964, the first full-length book on sabermetrics. It wasn’t very well written and would not stand up as a scientific finding.
- Suggested sending batters to the plate in order of their skill, best hitter first (this has proven to not be true)
- Suggested discarding platoon splits and just using best hitters (not true)
- Suggested starting the game with a relief pitcher (RP), pinch-hitting for him the first time through the order, then switching to the starting pitcher (SP)
- Book was too complicated and math-focused to be adopted
- Mathematical mistakes
- But he was a pioneer by doing this in the 1960s
- Princeton, engineer, professor
- Applied math to baseball
- 1964 Scoring Index, R = (Constant * (1B + BB + ROE + HBP - 2 * SH) * TB) / PA
- 1972 Scoring Index, R = ((H + BB + HBP) * (TB + SB - CS)) / PA
- Formulas are similar to OPS
- He analyzed run expectancy for the 24 game states
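As a rough illustration of the two technical items above, the sketch below evaluates the 1972 Scoring Index as written in these notes and enumerates the 24 game states (3 out counts times 8 base-runner configurations). The function name and the team totals are made up for the example. The 1964 version is omitted because its constant is not given here.

```python
from itertools import product


def scoring_index_1972(h, bb, hbp, tb, sb, cs, pa):
    """1972 Scoring Index as written in the notes: (on-base events * advancement) / PA."""
    return ((h + bb + hbp) * (tb + sb - cs)) / pa


# Made-up team season totals, for illustration only.
estimate = scoring_index_1972(h=1400, bb=500, hbp=50, tb=2200, sb=100, cs=40, pa=6200)
print(f"Scoring Index estimate: {estimate:.1f}")

# A game state is (outs, (runner on 1st, runner on 2nd, runner on 3rd)).
outs = (0, 1, 2)
base_configs = list(product((False, True), repeat=3))
game_states = [(o, bases) for o in outs for bases in base_configs]
print(f"{len(game_states)} game states")
```

A run expectancy table assigns an average number of runs scored through the end of the inning to each of these 24 states.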