Now that we’ve established that we can benefit from combining multiple projection models into one, let’s take a look at the challenges this presents.
I’ll also give brief explanations of how you can work around these challenges in Excel. At the end I’ll discuss an Excel template I’m working on that will do these calculations for you automatically and how you can get your hands on it.
I love Your Feedback
If you’re a SFBB Insider you might recall that after you sign up, the very first e-mail I send you asks you to reply with any fantasy baseball topics you’d like to know more about or difficulties you’re having (if you’re not, you can register here. I like to think it’s worth your while).
I Read All Of Those Responses
I’ve been fortunate enough to have nearly 500 people register, and I read every single response that comes in from that question. One of the most frequent areas of interest is how to average, or aggregate, multiple sets of projections into one usable set of information.
More Difficult Than I Originally Thought
These requests started to roll in during the off-season, and I even replied to several people saying that I thought this was going to be easy and that I’d have guidance coming out soon on how to do this.
… And here I sit months later having never written on the topic yet.
In theory, averaging a set of three numbers in Excel is easy. If one system says 25 HR, one says 30 HR, and another says 35 HR, Excel’s AVERAGE formula can easily respond with the average of 30.
But I quickly ran into some big problems that greatly complicated things.
Problem 1 – Lining Projections Up To Do The Averages
In order to aggregate multiple projection systems into one, we need a method of “lining up” the projections from one system with those of another system. Perhaps Giancarlo Stanton is projected to hit 20 HR the rest of the season by Steamer and 22 HR by PECOTA.
We can use formulas in Excel (e.g. VLOOKUP) to pull Stanton’s Steamer projection and place it next to his PECOTA projection. But you can run into some complications in doing this. What if one projection system lists him as “Stanton, Giancarlo” and the other as “Giancarlo Stanton”.
Using names to pull data also opens you up to inconsistencies in the name being used. Is it Jonathan Singleton or Jon Singleton? AJ Burnett or A.J. Burnett?
If you have taken on the challenge of creating your own rankings, you know that we’ve dealt with this problem before, but on a smaller scale. In my rankings spreadsheets I use a consistent playerID to pull information between the different tabs. I prefer to use the Baseball-Reference playerIDs because you can tell who a player is (Stanton is “stantmi03” because there were two other Mike Stanton’s before him).
But seemingly every major baseball site has their own player ID system. Fangraphs says Stanton is “4949”, Baseball Prospectus uses “57556”, ESPN says “30583”, etc.
This is why I maintain the SFBB player ID map Excel file. The map allows for this translation or “lining up” to happen. It’s the bridge that can easily help you take Stanton’s projection from one system and place it next to his projection from another.