In our last post we took a close look at a handful of different web query options available in Microsoft Excel.
By the end of that post we had an Excel file that was able to automatically go out and pull in a raw listing of player names and salaries. All you had to do was figure out the ID for the slate of games you’re entering on FanDuel and type in that five digit number.
Then voila… Current salary information (or you can even run it for tomorrow to start planning the night before)!
The Next Step
So now we have player names and salaries. And we know from doing research at how to succeed playing DFS that we also need to bring in other information like batting and pitching splits, Vegas over/unders, and weather data.
We also know that using Player IDs is a more reliable way of matching players up with all of this data.
New Column On The Player ID Map
To bring all this data together, I recognized that we need a way to match a player’s name according to FanDuel to the player ID systems shown on the Player ID Map. So I made an update to the map over the weekend and added “FANDUELNAME” in column AF (I also added over 30 new players that have been called up during this season and/or are starting to have a “fantasy impact”. Guys like Carson Smith, Nate Karns, Carlos Frias, Lance McCullers).
If you’re using the Player ID Map and you want instructions on how to drop in an updated version, check this out.
But We Had a Problem
Look closely at the image below.
Who the heck is “Clayton KershawP”?
If you look through the list a little more you will see “Jose FernandezDL”, “Yordano VenturaDTD”, and “Ervin SantanaNA”.
You can probably see that some player names are reflecting health or availability information. This is great, but it poses a challenge for our ability to match to the new column in the Player ID Map. Kershaw is going to be listed as “Clayton Kershaw” on the list. Not “KershawP”.
Enter Excel Formulas “RIGHT”, “LEFT”, “FIND”, “LEN”, and “IF”
We are going to use a series of formulas to do the following things:
- Identify if a player name has “DL”, “DTD”, “NA”, or “P”. I’ve scanned the list of player names and those are the only injury/availability classifications I see.
- If a player has one of those health indicators on their name we will strip it off
- If a player does not have health indicators, we’ll just use their name as it is shown
RIGHT
The RIGHT function will give you the rightmost characters in a text string. For example, we could use the RIGHT function to look at the string “Yu DarvishDL” to determine if the two rightmost characters are “DL”.
This formula uses two inputs:
- Text – The main string of text you want to pull the rightmost characters from (this can be a cell reference)
- Num_chars – The number of characters to pull out
We will use the RIGHT function to look at only the last three characters of each player’s name. We only need the last three because our longest health indicator is “DTD” (three characters in length). And we don’t want to look at full names, because one of our health indicators is just “P”. We can’t just tell Excel to look for a capital P in an entire player name because player’s like Michael Pineda or David Price might cause us trouble. So by limiting our search to just the last three letters in a player name, we should be able to located “DL”, “DTD”, “NA”, and “P” without issue.
FIND
The FIND function searches for a specific string of text within another string of text. If the string you are searching for is located, it returns a number that indicates the location where the string starts. FIND is also case-sensitive (thank goodness, otherwise looking for “NA” at the end of player names could be a problem with guys named “Santana”).
This formula requires two inputs and has one optional input:
- Find_text – The string of text you are searching for. You can enter your text in double quotes or use a cell reference.
- Within_text – The string of text you want to search within. This can be a cell reference.
- Start_num (optional) – What character within the string you want to start searching at. For example, if you wanted to start by searching only after the fifth character in a name, you could enter a five for this argument.
We will use the FIND function in conjunction with the RIGHT formula mentioned above. We will use RIGHT to first pull out only the last three characters from a player’s name. We will then run a FIND on those three characters to look for “P”, “DTD”, “DL”, or “N/A”.
LEFT
Similar to the RIGHT function, LEFT will give you the leftmost characters in a text string.
This formula uses two inputs:
- Text – The main string of text you want to pull the leftmost characters from (this can be a cell reference)
- Num_chars – The number of characters to pull out
We will use LEFT on players that have health information in their name. For example, for “Clayton KershawP”, we will want to pull out the first 15 characters (out of the total 16 characters in the full string of text).
LEN
The LEN function will tell you the number of characters in a string of text.
This formula has just one input, Text, which represents the text to want to count the characters from. This can be a cell reference.
We will use LEN with the LEFT function mentioned above. Remember the “Clayton KershawP” example where I mentioned that string of text has 16 characters and we want to take the leftmost 15 characters? We will use LEN to easily get that 16. We’ll use LEN to evaluate how many characters are in every player’s name.
IF
The IF function allows you to evaluate a cell to see if a condition is true or false. If the condition is true, we can give one result or a specific formula. If the condition is false, we can give a second result or an alternative formula.
This formula uses three inputs:
- Logical_test – The condition you want to evaluate for being true or false.
- Value_if_true – The value or formula you want to run if the Logical_test is true.
- Value_if_false – The value or formula you want to run if the Logical_test is false.
Nesting Formulas
By default, the IF function only allows you two options. One if true. One if false.
This might be a problem for us. Here are some of the tests we need to run…
If we find “P” in the last three characters of the player name, cut off the last character from the name (or if the player’s name is 16 characters in length, give me the leftmost 15). If we find a “DL” in the last three characters of the player name, cut off the last two characters from the name (or give me the leftmost 14 from a 16 character name). If we find a “DTD” in the last three characters, cut off those three letters from the name (or give me the leftmost 13 from a 16 character name). If we find an “NA”, cut off the last two characters (14 from 16). And if none of those are true, then just give me the full length of the player’s name and remove nothing.
Thankfully you can also “nest” IF functions to give you more options. In a weird combination of Excel language and plain English, our formula will look something like this:
=IF(player is status "P", give me LEFT(player_name, LEN(player_name)-1), if their status is not "P" then IF(player status is "DL", give me LEFT(player_name, LEN(player_name)-2),if their status is not "DL" then IF(player status is "DTD", give me LEFT(player_name, LEN(player_name)-3, if their status is not "DL" then IF(player is "NA", give me LEFT(player_name, LEN(player_name)-2), if their status is not "NA" just give me player_name))))
This is basically saying, if a player is marked as “P”, cut off one character from their name (that’s what the “-1” is doing in the LEFT(player_name, LEN(player_name)-1)
. If the player is not marked as “P”, go to the next IF statement. If the player is marked as “DL”, cut off two characters from their name LEFT(player_name, LEN(player_name)-2)
. If the player is not marked as “DL, go to the next IF statement.
And so on. Until we get to the very end of all the IF statements. The final player_name
just indicates that if all the previous IF conditions are false, this is what will show.
One More Web Query Option I Didn’t Mention
We looked at quite a few web query options in the last post, but I neglected to mention one that will now be helpful to use. We are going to be placing the formulas from above next to the FanDuel web query results (you can see my vision of how this will look below, they’re shaded a yellowish color).
The potential problem with doing this is that the web query results and FanDuel’s list of player names will be changing each day and for each contest. The length of the player list is going to be a lot longer for a full slate of games on a Wednesday night than it will be for the afternoon slate on a Thursday.
Fortunately, Excel offers a setting that will attempt to fill the formulas down as far as the data in the web query reaches. To activate this setting, first select a cell in the middle of the web query results (for example, select the cell with Clayton Kerhsaw’s name).
Then click on the “Data” tab. And then click the “Properties” button in the “Connection” section of the Data tab.
This will bring up the “External Data Range Properties” dialog. Check the bottom check box for “Fill down formulas in columns adjacent to data”. This will attempt to fill our formulas to the exact size of the web query results. I also like to uncheck the “Adjust column width” box because each time the web query runs, it resizes my columns and I find it annoying.
Step-By-Step Instructions
If you want to make sure you’re at the same starting point as me, here’s an Excel file (more…)