Should I use this projection system or that one? Why mess around with the second-best system if you can easily determine the best, right?
If you search the web, you can locate previous studies that review the accuracy of baseball’s many projection models.
- Monster 2013 Projection Review, Rotovalue.com
- Evaluating 2013 Projections, Fangraphs Community, Will Larson
- Fantasy Baseball Accuracy, FantasyPros.com
- Fantasy Baseball Projections Review – 2012, Razzball.com
I Don’t Have Time To Read All That. Just Tell Me What They Say.
Understood. Here’s my summary:
- There are a lot of different approaches to projecting stats (Marcel, Steamer, ZiPS, Oliver, PECOTA, etc.):
  - Basic three-year weighted average with regression to league average
  - Weighted averages over more than three years, incorporating more advanced component metrics
  - Crowdsourcing
  - Aging curves
  - Similar-player modelling
- No single projection system is consistently better than the others in all the stat categories we care about for fantasy baseball
- The most accurate projection model changes from year to year
- But there are some that consistently perform well
- Some systems do well in projecting offensive statistics
- Some are better at pitching
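To make the first approach in the summary above a little more concrete, here is a minimal sketch of a three-year weighted average regressed toward league average. The function name, the 5/4/3 weights, the league rate, and the 1,200 plate appearances of regression are all illustrative assumptions of mine, not the published settings of any system named above.

```python
def weighted_regressed_rate(seasons, weights=(5, 4, 3),
                            league_rate=0.030, regression_pa=1200):
    """Project a per-plate-appearance rate from recent seasons.

    seasons: list of (count, plate_appearances), most recent season first.
    weights, league_rate and regression_pa are illustrative assumptions.
    """
    # Weight recent seasons more heavily than older ones.
    weighted_count = sum(w * count for w, (count, _) in zip(weights, seasons))
    weighted_pa = sum(w * pa for w, (_, pa) in zip(weights, seasons))

    # Regress to the mean by mixing in regression_pa plate appearances
    # of league-average performance.
    return (weighted_count + league_rate * regression_pa) / (weighted_pa + regression_pa)


# Made-up player: 30, 25 and 20 home runs in 600 plate appearances each.
hr_rate = weighted_regressed_rate([(30, 600), (25, 600), (20, 600)])
print(round(hr_rate * 600, 1), "projected HR per 600 PA")
```

The projected rate lands between the player’s own weighted rate and the league rate, which is the whole point of the regression step. The less playing time a player has, the harder he gets pulled toward league average.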
What Is Also True
A lot of research has been done on the effectiveness of combining or “aggregating” different projections or forecasts into one. This research was not done with only fantasy baseball in mind, but we can take advantage of it. Here’s one very interesting article on the topic (it’s from a website named “forecastingprinciples.com” and is a PDF of a study from the Wharton School of Business at Penn, so it has to be legit, right?).
The thinking behind aggregating projections is that the wisdom of many intelligent people looking over a lot of information can lead to better results than any one isolated model. When you average the projections together you’ll naturally be dampening the outliers from the individual models, and hopefully you’re also improving the accuracy as a whole.
The Actual Results
It may not be appropriate to boil a 15-page research paper down to a couple of sentences. But I’m going to do it anyway! Here’s what the PDF linked above concludes about the value of combining forecasts:
Combined forecasts are more accurate than the typical component forecast in almost all situations studied to date. Sometimes the combined forecast will surpass the best method.
So there you have it.
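One way to see why the first half of that conclusion holds for a simple average (this is my own back-of-the-envelope reasoning, not necessarily the argument the paper makes): let y be the stat a player actually produces, f_1 through f_n the individual projections, and f-bar their equal-weight average. The triangle inequality then gives:

```latex
\left| \bar{f} - y \right|
  = \left| \frac{1}{n} \sum_{i=1}^{n} (f_i - y) \right|
  \le \frac{1}{n} \sum_{i=1}^{n} \left| f_i - y \right|,
\qquad \text{where } \bar{f} = \frac{1}{n} \sum_{i=1}^{n} f_i .
```

In words, the absolute error of the averaged projection can never be larger than the average absolute error of the individual projections. The two are equal only when every system misses in the same direction. As soon as some systems are too high and others too low, the misses partly cancel and the average comes out ahead of the typical system.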
Suggestions on How To Combine
The article suggests the following:
- Combine forecasts using different methods
- Combine forecasts using different input data
- Use at least five methods, when possible
- Use formal procedures for combining (a mechanical, structured method)
- Use equal weights in the combination unless you have strong evidence to support unequal weighting
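The “formal procedure” and “equal weights” suggestions above can be sketched in a few lines. This is only an illustration under my own assumptions: the function name, the nested-dictionary layout, the system names, and every number in the sample data are made up, and only three systems are shown to keep it short (the article suggests at least five when possible).

```python
def combine_projections(systems):
    """Equal-weight average of every stat for every player.

    systems: {system_name: {player: {stat: projected_value}}}
    A player missing from a system is averaged over the systems
    that do project him.
    """
    totals = {}  # (player, stat) -> [running sum, number of systems]
    for projections in systems.values():
        for player, stats in projections.items():
            for stat, value in stats.items():
                entry = totals.setdefault((player, stat), [0.0, 0])
                entry[0] += value
                entry[1] += 1

    combined = {}
    for (player, stat), (total, count) in totals.items():
        combined.setdefault(player, {})[stat] = total / count
    return combined


# Hypothetical projections; the names and numbers are invented.
sample = {
    "System A": {"Player X": {"HR": 30, "SB": 10}, "Player Y": {"HR": 20}},
    "System B": {"Player X": {"HR": 26, "SB": 14}, "Player Y": {"HR": 24}},
    "System C": {"Player X": {"HR": 28, "SB": 12}},
}
print(combine_projections(sample))
```

Averaging over whichever systems cover a player is a design choice, not a rule from the article. You could just as reasonably drop any player who isn’t projected by every system.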
And these are the situations when combining is the most beneficial:
- Uncertainty about which forecasting method is the most accurate
- Uncertainty in the forecasting environment
- There is a high cost for large forecast errors
Conclusions
Seems to me that baseball statistics are ripe for this.
We have a lot of different forecasting methods available to us. They use a variety of different inputs. We can easily combine them using a simple mechanical approach. We are uncertain which method is the most accurate. There is great uncertainty in projecting baseball stats. And bragging rights (and maybe a few dollars) in a fantasy baseball league are surely a “high cost”.
What’s Next?
Coming soon, I’ll give you a sneak peek at an Excel tool I’m developing that will simplify the process of averaging multiple projections.
Great article yet again. Looking forward to the next in the series. Unfortunately it’s hard to find the actual time for some of this legwork needed!
Thanks for the compliment, Sean. I agree. Time is at a premium these days. It’s taken me into July to get something done that I was planning on completing in the preseason. If all goes according to plan, the tool I’m working on will allow you to drop in a handful of different projections (like those at Fangraphs) and instantly get averages from the different sets. It may take adjusting a few settings and a few mouse-clicks, but hopefully it will be easy to use.
Has this happened yet? Would love to mess around with it during the winter.
Hi Mark, I am still working on some finishing touches to this, but I’ll e-mail you a version of the “older” model (that’s still going to be a very close approximation to the finished product).
Unless your finished product is coming soon, I’d also like a version of the “older” model, if you’re so inclined. Thanks so much for everything in your blog; “helpful” is a vast understatement.
Rocky, thanks for your interest. I’ve e-mailed the file to you. Please let me know if you have any questions.
Hi Tanner! I’d love to see the older model as well, if that would be okay. jedi_116@yahoo.com
Would love an email with the older model as well. Keep up the good work!
Thanks for all the feedback and interest in this. After having many of the SFBB readers test and use this, I’m finally confident enough to formally release the Projection Aggregator into the wild! You can read more about the finished product and how to get your hands on it here.