Welcome to the second segment in my series documenting my own journey of getting started with daily fantasy baseball. In the first segment I did some introductory background reading, started to identify some analysts I can trust, took notes about research topics to focus on, and started planning features to be included in an Excel spreadsheet.
In this next segment we will take a closer look at some of the biggest myths that I’m seeing perpetuated out there.
I Can’t Believe We Even Have to Talk About This
I didn’t think I’d have to say anything like that until my daughters were teenagers… But here we are.
Just because you read something on the internet doesn’t make it true. We’ve all heard that before, right? I got up on my soapbox at the end of Part 1 and warned you to be careful out there… And here’s an example of why:
I would argue that BvPs are the most reliable out of all the sets we commonly use because they have the most amount of similar variables.
While it is wrong to accept BvPs as fact, it is equally wrong to discount their merit entirely.
BvPs are just one of the many things you should consider when building a roster but they are not the only thing.
Those are all from the same article talking about the merits of BvPs, or “Batter versus Pitcher” statistics. I promise you the article exists but I am not going to link to it. I’m not here to start fights. I merely want to point out that this misinformation exists, both as online advice and, presumably, as a strategy some DFS players actually rely on.
Why Are You So Confident That BvPs Don’t Matter?
I can definitively say they don’t because it’s been studied by people much smarter than me. Studies using years and years of MLB data have been completed on this topic multiple times, and they have found that batter-versus-pitcher (“BvP”) information is not predictive unless the two players have a history of more than 100 plate appearances against each other.
So on one side we have credible studies based on oodles of data performed by very smart people. On the other side we have baseless claims, anecdotes of individual examples, and Paul Goldschmidt’s numbers against Tim Lincecum.
If you find a study that suggests BvPs are meaningful, please let me know. But from what I have seen so far, it appears to be very irresponsible to suggest BvPs have value.
What Research Are You Referring To?
The main and best-known study I’m referring to is from “The Book” by Tom Tango, Mitchel Lichtman, and Andrew Dolphin. They studied “batters owning pitchers,” then flipped it and studied “pitchers owning batters,” over the 1999-2001 seasons. They identified players that “owned” other players during that three-year span. Then they looked at what happened for those same players in 2002. In a very nice coincidence, in 2002 the hitters had 361 plate appearances against the pitchers they “owned”. And in that same season the pitchers had 361 plate appearances against hitters they “owned”.
The results were that the hitters had a wOBA of .349 against the pitchers they owned. The 30 hitters studied had wOBAs against those same pitchers ranging from .500 to .800 over the three prior seasons. Then in 2002 they hit a combined .349, which is essentially league average.
And the pitchers that had dominated a group of batters allowed a wOBA of .343. Those 30 pitchers had wOBAs allowed ranging from below .100 to .210 from 1999-2001. Then in 2002 they too allowed only a league-average wOBA.
The results in The Book are clear:
There is no evidence to support the thinking that a hitter with an incredible record against a specific pitcher will continue that level of performance. There is strong evidence that he’ll just return to the normal talent level he has displayed against the rest of the league.
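To see why a result like this is what you would expect from small samples, here is a minimal simulation sketch. It is not the method used in The Book. It is a toy model I made up in which every matchup has exactly the same true success rate per plate appearance (the 0.340 rate, the 30 PA first period, and the 12 PA follow-up are all assumptions chosen for illustration). We pick the 30 matchups that look most "owned" and see what happens next.

```python
import random


def simulate_owned_matchups(n_matchups=5000, pa_first=30, pa_next=12,
                            true_rate=0.340, top_n=30, seed=1):
    """Toy model: every matchup shares the same true success rate per PA."""
    rng = random.Random(seed)

    def observed_rate(pa):
        # Fraction of plate appearances that end in a "success".
        return sum(rng.random() < true_rate for _ in range(pa)) / pa

    # First period: a small BvP sample for every matchup.
    first = [observed_rate(pa_first) for _ in range(n_matchups)]

    # Select the matchups where the hitter appeared to "own" the pitcher.
    owners = sorted(range(n_matchups), key=lambda i: first[i], reverse=True)[:top_n]

    # Follow-up period: new, independent plate appearances for those matchups.
    follow_up = [observed_rate(pa_next) for _ in owners]

    avg_first = sum(first[i] for i in owners) / top_n
    avg_next = sum(follow_up) / top_n
    return avg_first, avg_next


before, after = simulate_owned_matchups()
print(f"'Owned' period: {before:.3f}   follow-up: {after:.3f}")
```

Even though no hitter in this toy model has any real edge, the selected group looks dominant in the first period and then lands back near the shared true rate in the follow-up, purely because of how the group was picked.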
Hogwash
If you’re a believer in BvP you might be thinking,
“That study was from 1999. Get outta here with that weak sauce. That was so long ago, baseball was in an offensive era. Who even knows if that’s still credible. Paul Goldschmidt owns Tim Lincecum and he will continue to own him.”
Did I ever mention that I have mind-reading skills?
Alright, fine. You want more? What if I had another study to show you, one that is even more comprehensive than the study conducted in The Book? This study was conducted in 2011 and covered all batter vs. pitcher matchups since 2000. And it didn’t just compare hitters that already “owned” pitchers. It looked at EVERY hitter with at least three appearances against a single pitcher. Would that satisfy you?
The study I’m talking about was conducted by Derek Carty and is posted here on Fanduel.com. He came at it from a slightly different angle. Instead of just looking at how the batters fared after their “ownership” of a pitcher was established (hitter X had a wOBA of .700 for three seasons against pitcher Y, then his performance fell to .340 wOBA in the fourth season), Carty tested whether a hitter’s projection or his previous history against the pitcher was more accurate at predicting future performance in that matchup. He also studied AVG, OBP, SLG, and FanDuel scoring, not just wOBA.
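The shape of that comparison can be sketched in a few lines. This is only an illustration: I am assuming mean absolute error as the accuracy measure (I don't know which metric Carty actually used), and the four matchups below are made-up numbers, not data from his study.

```python
def mean_abs_error(predictions, actuals):
    """Average absolute gap between predicted and actual values."""
    return sum(abs(p - a) for p, a in zip(predictions, actuals)) / len(actuals)


# Hypothetical matchups, invented for illustration only:
# (projected wOBA, prior BvP wOBA, wOBA in the later meetings)
matchups = [
    (0.340, 0.700, 0.330),
    (0.320, 0.150, 0.310),
    (0.365, 0.450, 0.380),
    (0.310, 0.000, 0.290),
]

projections = [m[0] for m in matchups]
bvp_history = [m[1] for m in matchups]
actuals = [m[2] for m in matchups]

projection_error = mean_abs_error(projections, actuals)
bvp_error = mean_abs_error(bvp_history, actuals)

# Whichever predictor has the smaller error "wins" for this group of matchups.
winner = "projection" if projection_error < bvp_error else "BvP history"
print(f"projection error {projection_error:.4f}, BvP error {bvp_error:.4f}, winner: {winner}")
```

In a real version you would bucket the matchups by how many plate appearances the BvP history covers and repeat the comparison for each bucket.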
The results actually give a little credence to BvPs having some value… AFTER 100 PLATE APPEARANCES in a BvP matchup.
How many current players have even accumulated 100 PAs against individual pitchers?
I don’t know the exact answer, but I looked up Freddy Garcia and Randy Johnson, two recent pitchers that crossed my mind as having long careers. Neither of them ever had 100 matchups against a single hitter. Jamie Moyer, who pitched for 26 years (wow!), only faced three hitters over 100 times. And Roger Clemens had over 100 PAs against Harold Baines, Cal Ripken, Paul Molitor, Rafael Palmeiro, BJ Surhoff, Edgar Martinez, Joe Carter, John Olerud, Roberto Alomar, Ken Griffey, and Julio Franco.
So if you want to use BvP stats on hitters who have played every day in the major leagues for over 15 years when they face pitchers who have 20 seasons under their belt, you are justified in doing so!
I should point out that the results of PvB (flip it to pitcher versus batter) do suggest you don’t need 100 plate appearances before PvB starts to surpass normal projections. PvB can start to show benefits at around 30 PA for OBP and wOBA or around 70 PA for AVG. Still, the advantage of PvB is minuscule at that point and it’s much more difficult to apply PvB when the pitcher has to face seven or eight other hitters on that team.
But This One Time I Saw a Study on a Message Board
If you’re going to cling to this, then we’re done here. That’s not a study. It lasted for 10 days. It’s not comprehensive. I don’t know what else to tell you.
What About ‘Hot’ and ‘Cold’ Streaks?
Prices on the daily contest sites fluctuate frequently. Some of that is because of changes in circumstances (going into Coors Field for a series versus going into Petco Park). But part of it is to reflect recent hot and cold streaks for hitters and pitchers.
How much merit should we put in such streaks? Are they predictive?
Fortunately for us there is a lot of research in this area too. The Book, Carty, and Rudy Gamble of Razzball have studied this in various forms.
I’ll try to summarize their findings so you don’t have to read for three hours… But before I go into that, it’s important to clarify something. No one is saying that hot and cold streaks don’t exist. We all observe them. They do exist. What we’re trying to study is if a streak has predictive value. If a hitter is hot for a week, what should we expect his performance on the eighth day to be? Will it be elevated? Or is his career baseline (or true talent level) a better predictor?
The Book on Streaks
The authors of The Book identified hitters and pitchers that experienced five-game hot or cold streaks. They then looked at the player’s stats in the one game immediately after the streak and the five-game period after the streak. The authors had to arbitrarily identify what constituted a hot and cold streak. They selected a measure that resulted in 5% of all five-game periods qualifying as hot, 5% as cold, and 90% as neither.
The study grouped all hot and cold hitters together to determine their collective wOBA during the streak and for the one and five game periods afterwards. The authors also calculated an expected wOBA by using weighted averages of the three previous seasons (this sounds similar to Marcel).
For pitchers, four game streaks were used to determine hot and cold streaks and the authors only looked at the one game following the streak.
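The 5% / 90% / 5% split described above can be sketched as a simple percentile cutoff. This is my own minimal illustration, not the authors' actual procedure: the function name `label_streaks`, the `tail` parameter, and the evenly spaced demo values are all assumptions.

```python
def label_streaks(window_wobas, tail=0.05):
    """Label each multi-game window 'hot', 'cold', or 'neither' by percentile."""
    ranked = sorted(window_wobas)
    n = len(ranked)
    k = max(1, round(n * tail))    # number of windows in each tail
    cold_cutoff = ranked[k - 1]    # highest wOBA still counted as cold
    hot_cutoff = ranked[n - k]     # lowest wOBA still counted as hot

    labels = []
    for woba in window_wobas:
        if woba >= hot_cutoff:
            labels.append("hot")
        elif woba <= cold_cutoff:
            labels.append("cold")
        else:
            labels.append("neither")
    return labels


# Synthetic demo: 100 evenly spaced window wOBAs from .000 to .990.
demo_windows = [i / 100 for i in range(100)]
labels = label_streaks(demo_windows)
print(labels.count("hot"), labels.count("cold"), labels.count("neither"))
```

Note that ties at a cutoff all get the same label, so with real data the tails can end up slightly larger than 5%.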
The results of the study are as follows:
- During their streaks, the hot hitters had a wOBA of 0.587. The expected wOBA for hot hitters was 0.365. The actual wOBA for hot hitters in the one and five games after their streak was 0.369.
- During their streaks, the cold hitters had a wOBA of 0.151. The expected wOBA for cold hitters was 0.336. The cold hitters hit 0.330 in the first game after the streak and 0.332 in the five games after.
- During their streaks, the hot pitchers allowed a wOBA of 0.217. The expected wOBA allowed for hot pitchers was 0.310. The actual wOBA allowed by the hot pitchers in the game after their hot streaks was 0.299.
- During their streaks, the cold pitchers allowed a wOBA of 0.453. The expected wOBA allowed for cold pitchers was 0.348. The actual wOBA allowed by the cold pitchers in the game after their cold streaks was 0.354.
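To put those four results on the same scale, here is a quick bit of arithmetic using only the numbers quoted above. The "carryover" measure is my own framing, not something from The Book: it is the share of the streak's gap from the expected wOBA that was still there in the next game.

```python
# (group, wOBA during streak, expected wOBA, wOBA in the next game), as quoted above
results = [
    ("hot hitters", 0.587, 0.365, 0.369),
    ("cold hitters", 0.151, 0.336, 0.330),
    ("hot pitchers", 0.217, 0.310, 0.299),
    ("cold pitchers", 0.453, 0.348, 0.354),
]


def carryover(streak, expected, next_game):
    """Share of the streak's gap from expectation that persisted into the next game."""
    return (next_game - expected) / (streak - expected)


for group, streak, expected, next_game in results:
    gap = next_game - expected
    share = carryover(streak, expected, next_game)
    print(f"{group:13s} next-game gap {gap:+.3f}   carryover {share:.1%}")
```

By this measure about 2% of the hot hitters' streak carried over, while the hot pitchers kept about 12% of theirs, the largest of the four groups.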
Derek Carty on Streaks
Carty looked separately at hot and cold streaks for hitters and hot and cold streaks for pitchers.
In his research for hitter hot streaks, he looked at all hitter data going back to 1993 (he performed the study in 2011) and identified players that had 7-day, 14-day, and 30-day hot streaks. He then took each player’s 8th day, 15th day, and 31st day stats and separately compared them to a projection and the hitter’s own stats during the streak. The study specifically focused on BA, OBP, SLG, OPS, wOBA, and FanDuel point scoring. He found that the projections are much more accurate for the shorter streaks but that the gap narrows as a hitter’s hot streak gets longer. Still, in the end the hitter’s projections are much more accurate.
He did find something very interesting for hitter cold streaks. His methodology for cold streaks was the same as for his hot hitters study. The stats from a player’s 7-day cold streak were not as predictive as the hitter’s projection, but the 14-day cold streak data was more predictive for SLG and FanDuel scoring! The projections won out for BA, OBP, OPS, and wOBA.
Carty hypothesizes that this makes sense, stating that a hot streak is likely some kind of random fluctuation in the player’s performance, while cold streaks could be both random fluctuation as well as some other outside influence like an injury or a mechanical problem.
When he got to pitcher hot streaks, Carty looked at 3-game, 5-game, and 7-game hot streaks. He found that preseason projections were more accurate at projecting ERA, WHIP, and strikeouts than the streak itself was. Just like with hitters, the longer the hot streak progressed, the more accurate the streak became, but it never got more accurate than the preseason projection.
Carty’s study on pitcher cold streaks yielded some similar results to the cold hitters study. As early as a 3-game cold streak, the streak’s ability to project strikeouts is nearly as predictive as the preseason projection (the preseason projections had an advantage in ERA, WHIP, and FanDuel scoring). For the 5-game streak, the streak becomes slightly more accurate than the projections for strikeouts and FanDuel scoring. And for the 7-game streak it’s much the same story, except the streak and projections are about equal on FanDuel scoring.
Rudy Gamble on Streaks
Razzball performed a study on hitters and another on pitchers. The hitter study involved calculating the correlation between a hitter’s previous three games, their Hittertron projections, and the player’s stats in the fourth game. They also ran a variation of this over five game stretches.
Razzball creates their own projections using Steamer as a starting point. They then adjust those projections based on ballpark, opposing pitcher, and home/road splits.
The results of the test showed that the Hittertron stats were much more predictive than any recent measure of performance (3 or 5 game histories).
Rudy also ran a similar study on pitchers, taking a pitcher’s previous three starts, their Streamonator projections, and the pitcher’s stats in the fourth game. The results of this test showed that looking at the last three games for a pitcher by itself is not as predictive as Razzball’s pitcher projections, but if he incorporates the last three games into the projections there is a slight increase in the accuracy of IP and K/9 projections.
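Both Razzball comparisons come down to asking which input correlates better with the next game. Here is a minimal sketch of those mechanics for the hitter case. Everything in it is synthetic: the talent spread and noise levels are assumptions I picked, and they build in the idea that a projection tracks true talent more closely than a three-game sample does. It shows how the comparison is run, not what Razzball's data says.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


rng = random.Random(7)
projection, last_three, fourth_game = [], [], []
for _ in range(20000):
    talent = rng.gauss(0.320, 0.030)                   # assumed spread of true wOBA talent
    projection.append(talent + rng.gauss(0, 0.010))    # assumed: projection has small error
    last_three.append(talent + rng.gauss(0, 0.150))    # assumed: three games are very noisy
    fourth_game.append(talent + rng.gauss(0, 0.250))   # assumed: a single game is noisier still

r_projection = pearson(projection, fourth_game)
r_recent = pearson(last_three, fourth_game)
print(f"projection vs. game 4: {r_projection:.3f}   last three vs. game 4: {r_recent:.3f}")
```

Notice that both correlations come out low under these assumptions. A single game is so noisy that nothing predicts it well, which is worth remembering when a recent hot stretch looks convincing.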
Putting All Of These Studies Together
There is a lot going on in the studies above. Some of the information conflicts slightly. The Book found that there were very slight increases in performance after hitter hot streaks and slight decreases after hitter cold streaks. It found more meaningful increases in performance after pitcher hot streaks. Meanwhile, Carty and Gamble found very little evidence for the predictive value of streaks.
There are a few caveats to be aware of in this research. The authors of The Book and Carty had to set arbitrary cutoff points for what defined a hot or cold streak. This is not easy to do and could affect the results somewhat. It appears both also used Marcel (or some close variation) as the projection system of reference. This is noteworthy because some of the complex systems like Steamer now have daily in-season updates. While we don’t know exactly what goes into the in-season updating, we can be assured that it must include some measure of current season performance. This would incorporate recent hot and cold performances (but I’m sure they’re weighted very lightly and don’t have much of an effect).
Rudy Gamble did not filter out “regular performances” from hot and cold streaks. He simply broke everything into five game chunks and used that in his analysis whether the hitter was hot or cold. He wasn’t necessarily studying “hotness” or “coldness”. Just “can we gain anything by looking at a player’s last five games”.
After reading all of these studies, my conclusion is that there is nothing to be gained by looking at a hitter’s recent performance. None of the studies found anything meaningful on the hitter side. With regard to pitchers, even though Carty and Gamble didn’t find anything, I’m inclined to believe The Book’s findings that there is something small to pitcher hot and cold streaks. But again, it’s not a LARGE EFFECT. The authors of The Book converted wOBA into an expected ERA measure and found that a pitcher coming off a hot streak beat the expected ERA measure by about 0.30 runs while a pitcher coming off a cold streak under-performed the expected ERA measure by about 0.20 runs. That is something worth noting.
How to Apply This
Even though I’m inclined to believe in the pitcher streakiness findings of The Book, I think it’s safer to just ignore the findings. I say this for a few reasons:
- The research is a bit inconsistent between the studies.
- What The Book did find does not appear as though it will greatly affect a daily fantasy performance.
- Carty found no indication that streaks affect other fantasy-relevant measures like strikeouts.
- I’m likely to misapply the findings! I don’t think the human brain could apply them with the appropriate measure of scale. If I allow myself any room to grant extra credit to hot pitchers, I’ll be very likely to overrate any such pitcher.
This reminds me of a book I read about a year ago, “Thinking, Fast and Slow”. The book examines the two thought patterns our brains typically follow and the topics of decision making and cognitive biases. Our quick-thinking, instinctual thought center can’t comprehend probabilities, so it is likely to overrate any “hot” players if we allow it to. So because of the inconsistent findings between all the studies and the minimal effects the authors of The Book identified, it seems like making no explicit adjustments for hot or cold streaks makes the most sense.
Takeaways
The bottom line is that there is no reason to use batter versus pitcher statistics and there is very little reason to put stock in the recent performances of hitters and pitchers.
If you have not read The Book, you need to.
I say you “need” to read it because it will challenge a lot of the beliefs you have about baseball and it does an excellent job of scientifically breaking down a lot of other conventional beliefs held by baseball fans. You’ll find out when it makes sense to bunt, if clutch hitting exists, where to locate base stealers in the lineup, and the power of the platoon advantage.
What Are Your Thoughts?
Have you seen any other studies on the topics of BvPs or hot/cold players? I’d love to look over the results of other studies. Please let me know in the comments area below.
Great article…I know not to put much weight in BvP and streaks but I catch myself doing it all the time especially when I am stuck on two players and one spot left!
Really looking forward to this DFS series by you.
Thanks, Kevin. I’m slowly working through the DFS world. I think one or two more “conceptual” articles and then I’ll start breaking into Excel. I think it’s probably fine to use BvP as a tie breaker for two players that are otherwise close in your mind.
Great article. I am enjoying your information on DFS. I just recently started playing myself and after doing research, I have found your site to be the most helpful. Thanks for all the work you do.
Nice article. I have to disagree with the use of arbitrary cutoffs, though. It’s not like the predictiveness of a BvP history goes from 0% to 100% at the 100 PA mark. It’s going to be fairly linear. So if a veteran pitcher is facing a lineup, most of whom he has historically owned (vs. expected perf.) over 30-50 PA per batter, he almost definitely has a positive expectation over a neutral projection. I.e., a bunch of small, otherwise negligible edges matter when they happen to line up in one direction. It’s how I have succeeded in MLB betting for 10 years.
Hi Evo,
Yes, I’d have to agree. I worded things a bit too simplistically, and it may suggest the usefulness goes from 0% to 100% at a certain cutoff point. The further along that path you go, the more predictive value you could draw from the stats. And I think your mention of 30-50 PA seems like a fair spot to start. Below that, just because a player has shown an advantage against another, I think there’s too much noise to try to claim there’s truly something to it.