NOTE: It appears the information below is no longer relevant. You are welcome to read the article because it still contains a valuable approach that can be used in other scenarios. But the specific act of web querying FanDuel for salary information as shown below no longer seems to work.
Does this sound familiar?
You’re doing prep work for setting a daily lineup or even preparing for a season-long rotiserrie league. You’re trying to set up a spreadsheet to help you prepare, but you are sick and tired of having to copy information from the web and paste it into Excel.
There has to be a better way!
In this post I’ll show you an efficient way of grabbing player salaries directly from FanDuel’s website using Excel’s web query function. In fact, I’ll show you three different variations of Excel web queries:
Before I begin, I need to be honest with you about something.
I Don’t Know Where We’re Going
This is the first post of what I hope will be a series documenting how to build a spreadsheet for DFS. But the thing is, I don’t know exactly where I’m headed on this journey. I can’t promise you a panacea to cure all your DFS aches and pains. I don’t have a master plan that will lead us to a perfect functioning spreadsheet that will fit everyone’s exact desires. But my plan is to just start moving the ball in the right direction.
I don’t have much DFS experience. I don’t know exactly what you want. If you are looking for DFS lineup advice, I can tell you I won’t be giving that. But what I do have are a very particular set of skills. Skills I have acquired over a very long career… OK, I’ll end my Liam Neeson joke.
I hope that by just starting to build something, starting to share techniques you can use on your own, and by seeking feedback, we will eventually end up with something special.
Some of the techniques I’ll show you may seem silly. Or pointless. Or way too involved.
But I have a purpose in mind. You might wonder why I pull data from Site Y when Site X has the same information in an easier to use format. Or you may want to incorporate additional data that I don’t want to pursue.
That is why I want to stress it is not the “WHAT” in the instructions that is the important part. It is the “HOW”. I’m going for that whole “teach a man to fish” proverb. So let’s start learning…
Not All Versions of Excel are Created Equal
I’ll be using Office 2013 installed on a Windows 7 machine. I believe the web querying experience will be similar for Office 2007 and Office 2010 (while using Windows). Unfortunately, I have had poor results with web querying in Excel for Mac, but it can be done.
What Is a Web Query?
A web query is an automated way to copy information from a table on a web page into Excel. It removes the need for you to manually copy and paste data from websites into Excel. On top of that, you can set the query up to run each time you open an Excel file or change a particular value, so you can escape the cycle of continuously needing to go to a web site, copy the data, and paste it into Excel each day you want to set lineups. It can refresh automatically!
As you’ll soon see, setting up a basic web query is fairly easy. But there are a couple of more advanced settings that should make your life a little bit easier.
Step-By-Step Instructions, Simple Web Query
For this example, let’s just go with a simple and straight-forward web query. Later in this post we will build on this.
Step | Description |
---|---|
1. | Log into your FanDuel account using Internet Explorer.
If you don’t have one, you can sign up through this affiliate link (what’s an affiliate link?). The reason we have to do this is because Excel uses a browser (Internet Explorer) to perform the web query, but won’t be able to access the FanDuel web site if it doesn’t have your log in information. So logging into Internet Explorer will provide that and allow Excel to get through FanDuel’s authentication. If you have Google Chrome or another browser installed on your computer, it’s still important to do this log in through Internet Explorer because that’s the browser Excel uses for the web query. |
2. | Once you’ve logged in, locate a contest you want to enter. Don’t click on anything yet. Just identify one. You will be able to change your mind later on the exact contest, but you do want to pick one for the slate of games you want to play with (if you want to play the late game slate, don’t pick a contest that includes only early games).
It is important that you choose a “contest” and not one of the public head-to-head competitions that are available. For example, you could choose the third or fourth contest shown in the image below but not the “vs. csuram88” H2H matchups. After you have identified a contest, hover your mouse over the “Enter” button. Then note the URL that button will take you to in the lower right hand corner of Internet Explorer (this is a good tactic to use if you think a link on a website is going to take you somewhere fishy, hover over a link on the web and see the website you’ll be taken to). |
3. | Right-click on that same “Enter” button and choose the option to copy the shortcut. |
4. | Open Excel and start a blank file. Eventually you can do this on a blank tab in an existing spreadsheet you’ve started, but I’d recommend using a test file for now until you decide exactly how you want to set up your query. On the “Data” tab, choose the “From Web” option.
Once the “New Web Query” window pops up, paste the link you just copied into the “Address:” field and hit the “Go” button. I get a variety of script errors every time I perform a new web query. It’s not unusual, and you can probably answer so that you stop running scripts on the page. |
5. | Once the web page fully loads in the web query preview, look for small yellow squares containing an arrow pointing to the right.
These squares are intended to allow you to choose specific pieces or components of the web page to pull in. Or you could elect to import the whole entire page. You will likely see one of the yellow boxes in the top left of the web site preview. Choosing this yellow square will import the entire page. Click your mouse on the yellow square. You should see the square turn green. |
6. | Now click the “Import” button
A few things to keep in mind about this “Import Data” popup… The “Where do you want to put the data?” is asking where do you want the data from the website to go. You can also choose to start a new worksheet so the query doesn’t interfere with any other calculations. Click “OK” to start the import (we’ll talk about the “Properties…” later. You should see a status message on the bottom of Excel stating “Running Background Query”. |
7. | Because we imported the entire page, there will be some “junk” pulling into the top of the sheet. This is the various text, links, and other information at the top of the contest page.
If you scroll through the other information to about row 90 or so, you should see the player names and salaries pulling in. |
That’s It, You’ve Completed the First Example Web Query
This example is best suited for a one-time capture of data, so it isn’t perfect. But it illustrates how quickly this can be done. When we layer on the ability to easily update the query and to focus on specific tables in a page (keep reading), the power of the web query will increase greatly.
What Do I Do With All This Garbage Info That Pulled In With the Salaries?
Go easy, big fella. Let’s make it through the next couple of web query examples to see some improvements to this process. But also know that I eventually plan to show you how to take a dump of information like that and use various formulas to hone in on what you truly need.
We can move this eyesore of a tab to the back of your Excel file and use formulas to pull the valuable information to other areas of the file.
What Good Does It Do Me To Have This Query On a Random Tab in My Spreadsheet?
You might be asking yourself this question. Or maybe you’re wondering how you will later be able to combine this basic salary information with several other sources of data like historical stats for players, weather information, or betting lines.
When I’m designing a reusable spreadsheet, the initial hurdle is just to get all the information into the file. We can then use VLOOKUPS and other formulas to pull the different pieces of information into a central location. I find it better to keep the different data sources compartmentalized (or independent) so they can easily be updated later (via web queries).
Step-By-Step Instructions, Dynamic and Updating Web Query
You know how that FanDuel contest URL we identified above:
With a little experimenting, I’ve been able to determine that the only part of that long URL we really need is the https://www.fanduel.com/e/Game/#####. It appears to me that the ##### is a five digit number that FanDuel uses for all contests beginning at a particular start time. Or you might refer to this as the “slate of games” (the early slate would have a five digit ID that is slightly different than the late game slate).
If you type the https://www.fanduel.com/e/Game/##### URL into your browser, you should be taken to a page that displays all the salaries for the contest (you can’t look at past days or contests that have already started, unfortunately).
Of that shortened URL, the first part of that address, “https://www.fanduel.com/e/Game” stays the same. It’s just the five digit numeric ID (12452 in the image above) that changes. Well, instead of having to go through the tedious process of creating a web query from scratch each day, we can set one up that knows to look for the URL in a cell in the spreadsheet.
We can also tell Excel to update the query anytime we change that five digit number. So tomorrow when a new set of FanDuel salaries come out, we just get that five digit number, type it in, and all the salaries in the spreadsheet will update.
NOTE: The first three steps below are identical to the first three steps in the example above, but after that things begin to differ.
Step | Description |
---|---|
1. | Just like in our first example, log into your FanDuel account using Internet Explorer.
And if you don’t already have an account, you can sign up through this affiliate link. Remember that it is important to use Internet Explorer to log in because that is the browser Excel will be running the query through. |
2. | Once you’ve logged in, locate any contest running for the day. Make sure you choose a “contest” and not one of the public head-to-head competitions that are available. For example, you could choose the third or fourth contest shown in the image below but not the “vs. csuram88” H2H matchups.
After you have identified a contest, hover your mouse over the “Enter” button. Then note the URL that button will take you to in the lower right hand corner of Internet Explorer. |
3. | Right-click on that same “Enter” button and choose the option to copy the shortcut. |
4. | In the Excel file you started earlier, create a new blank tab/worksheet.
On the “Data” tab, choose the “From Web” option. |
5. | When the “New Web Query” dialog opens, paste the URL you just copied into the “Address:” field. But remember that the full web address is not needed for the query to work, so delete everything that comes after the five digit Game ID.
Then click the “Go” button. |
6. |
Click the “Save Query” button.
You will then be prompted to save the query (“.iqy” file) I would recommend storing all your .iqy files in one location or near the Excel files you use for setting your DFS lineups. Excel defaults to saving them in a buried location on the C: drive. I would change that to somewhere you can remember because we’ll be editing these .iqy files. |
7. | Using Windows Explorer (not Internet Explorer!), browse through your computer and locate that .iqy file.
Right-click on the file and choose the “Open with>Notepad” menu option. Most computers should come with Notepad, but if you don’t have it you could try WordPad or any other simple text file editor. |
8. | When the file opens in Notepad, you should see something like this:
We’ll start by changing the third line of the file. Delete the FanDuel web address and edit the third line to be: |
9. | Under the File menu, click “Save”. Then close Notepad. |
10. | Return to your blank Excel sheet you created above. Type the following in cells A1 and A2: |
11. | Type the FanDuel contest ID from the URL you previously identified. For example, in my screenshots above the ID is 12471.
Type the following formula in cell B2: Then hit Enter and you should see this (adjust for your own FanDuel contest ID): That formula we just used basically just attaches two text strings into one. We took the “https://www.fanduel.com/e/Game/” string of text and attached the value from cell B1. When Excel strings them together you get the URL we are hoping for! |
12. | Before we go any further, I should explain that to this point in the instructions we have not actually added the query into our Excel file. The separate act we did of creating and saving the .iqy file did not actually start the query in this particular Excel file.
To this point we set up a set of query instructions. Those instructions are going to prompt Excel to ask us to “Enter Desired Web URL to Query Here” (go look above, that’s what we typed into the iqy file). Then we separately set up cell B2 in our Excel tab that contains the URL to run the web query on. So let’s go ahead and link all of this together. Use your mouse to select cell A4. Then click on the Data tab in Excel and then click the “Existing Connections” button. |
13. | When the “Existing Connections” dialog opens, you should see the connection we created in our first web query. Click “Browse for More…” to start a new connection.
Then click the “Browse for More…” button. |
14. | Browse through your computer and locate the .iqy file we just finished editing. Then hit the “Open” button.
You should now be prompted about where do you want to put the imported data. We previously selected cell A4, but choose another spot, if you wish. Click “OK”. |
15. | This is the prompt we put in earlier! Excel is asking (because we asked it to in the .iqy file) us which URL we want to query.
Use your mouse to select cell B2. Then check both of the available check boxes. By checking these boxes, we will be able to type in a new contest ID in cell B1 and then Excel will immediately query the new URL. No more going to FanDuel, copying the list of players and pasting into Excel! Just type in a five digit number and get the salaries for the selected contest. |
16. | Click “OK” to start the web query.
Just like our first example, you will see quite a bit of extra “junk” pulling into the query, because at this point we are bringing in the entire web page. Keep reading for information on how to isolate in on certain sections of a web page. |
A Couple More Tips
I’m a big fan of using color or formatting to indicate meaning in your spreadsheets. I’d recommend shading or somehow formatting the cells where you need to type (input cells). For example, you could shade the Contest ID cell yellow:
This way if a month goes by before you come back to the spreadsheet, you’ll have some visual queues about where you can type.
You do need to be careful about typing information on a tab containing a web query. When the query comes in, if you have information to the side or below the web query and the size of the data coming in increases (wider or longer), your other information could get deleted. I think it’s safest for the query data to flow into a tab and then pull that information elsewhere to other tabs, rather than building complex formulas and analysis right onto the same tab.
Let’s Step This Up One More Notch
In this next part we will try querying specific tables from a web page. But before we start, I’ll give a quick lesson on HTML.
HTML, or Hyper Text Markup Language, is the standardized language for coding web pages. You don’t need to know much about HTML when it comes to web querying, but it helps to know the basic format of a table in HTML. At a very high level, any table you see on a web page will be coded like this:
<table>
<tbody>
<tr>
<td></td>
<td></td>
</tr>
</tbody>
</table>
Again, that’s a VERY simplified look at a table. There will be lots of content and some other code mixed in and around the table. But for web querying, it can be helpful to know that everything related to the table starts at the <table>
tag.
Do you remember the .iqy file we created and edited in the last example? You might have noticed that there was a line in the code for “Selection”. The default entry for that line is set to “EntirePage”, but we can change that.
Instead of “EntirePage”, you can specify the number of the table on a web page to be imported. For example, to import the third table on a web page you would use the following in your .iqy file:
Some sites (Razzball, Baseball-Reference) even give their HTML tables specific names inside the code. If you’re lucky enough to come across this, you can enter the name of the table for the “Selection” argument.
Let’s go through an example.
Step-By-Step Instructions, Table-Specific Web Query
Step | Description |
---|---|
1. | Open your internet browser of choice (Google Chrome, Firefox, Internet Explorer, etc.). I will be using Google Chrome which has a couple of nice features that make this step a little easier than when I perform the same steps on Internet Explorer. You’ll still be able to use Internet Explorer, but note that there may be a few differences in screenshots and features I mention.
Go to a web page (URL) containing specific data you wish you query. To keep consistent with this FanDuel example, I’ll attempt to query the following site: |
2. | Once that web page has fully loaded, locate the table you want to query. Right click somewhere toward the top of that table and choose the menu item for “Inspect element…” (it appears to me that the menu item is named “Inspect element” for both Google Chrome and Internet Explorer, the exact phrasing in other browsers may be slightly different). |
3. | This should open up a look at the HTML code that was used to generate the web page. This view will look different in each browser. I generated the following screenshots using Google Chrome. So just be aware things may look slightly different on your screen. The HTML code itself should be the same regardless of the browser you are using, but the appearance may not.
If you were close to the top of the table when you right clicked, you might see the start of the “table” HTML. |
4. | You can see from the image above that this “table” element does not have an “id=”. If there were an “id” we could use that in the web query to pull this specific table.
So because of that we will have to determine which number of table this is on the page (e.g. is it the first, second, third). There is a lot of information here and it would be easy to get overwhelmed by manually searching for and counting tables. But we can use the “Find” feature to locate things. Scroll to the top of the HTML element window so you’re at the beginning of the page. Hit CTRL + F to open the Find… dialog. Type the string “<table” (no quotes, but use the less than symbol before the word table and with no greater than symbol at the end). Hit Enter to start the search. |
5. | Your browser will locate the first HTML table element and highlight it.
One of the nice features of Google Chrome is that as you hover over an element in the HTML window, that same element will be highlighted in the actual browser window. This is extremely helpful for allowing you to identify which table the search just turned up. You can see from the image above that this is not the exact table we are hoping to query. We want the table with the full list of salaries, not just the table containing the headers. |
6. | Now that we’ve identified that the first table is not what we want, let’s check the second table. Return to the “Find” search box and note that two table elements were found. To confirm that the second table is the one we want, click the down arrow to view the second table instance.
Once the second instance is found, hover over the line in the HTML code. Then verify that the full listing of players becomes highlighted. |
7. | Now that we have identified that the second table on the web page is what we want to import into Excel, browse through your computer again and locate the .iqy file from our previous examples.
Again, right-click on the file and choose to “Open with>Notepad”. |
8. | When the file opens in Notepad, edit the “Selection” line value to simply be a “2”. |
9. | Save and close the .iqy file.
Return to Excel and start a new tab. Before starting the new query, return to the worksheet/tab created in the second example above. Copy cells A1 through B2 (the FanDuel Contest ID and FanDuel URL cells) and paste them into the top of the new tab you just created. Once you have this completed, select cell A4, then click on the Data tab in Excel and hit the “Existing Connections” button. |
11. | You should see the queries completed earlier. Click the “Browse for More…” button.
Browse to the location of the .iqy file, select the file, and then hit the Open button. |
12. | You should be prompted on where to place the query results. Cell A4 of the current worksheet is where I would prefer to put my data. Choose your desired location and click the OK button.
When prompted to “Enter Desired Web URL to Query Here”, select cell B2. Check both available options. Then hit the OK button. |
13. | Because we isolated the query to only the second table, the output should be much cleaner. You should have a nice clean table that only includes the FanDuel player salaries. |
14. | Nice! You did it! |
A Little Disclaimer About Web Querying
Use care and responsibility while web querying. It’s important to think about what you’re doing when you perform a query. You are “taking” data from a web site without really visiting the site. A typical exchange on the web involves you having to view a few ads in order to consume the information. But we are bypassing that in this scenario.
I doubt FanDuel mind that we do this. But to regularly take information from a Razzball or Baseball-Reference.com could be frowned upon. So keep this in mind. Support those sites by physically visiting them or purchasing some of their services.
Where Do We Go From Here?
I have a couple ideas. But I’d like to hear from you. Now that we have a list of player names and salaries, what should we tackle next? Let me know your thoughts in the comment area below. I’ll put my thoughts there to start the conversation.
Stay smart.
In my mind, the elephant in the room is now how to we take these salaries and join them up with other information about the player. And if you look closely at the player names, you’ll see some challenges. For one, we don’t have player IDs! Ugh.
Another issue is that some players have health information in their name. For example, “Jake ArrietaP” or “Yu DarvishDL”.
If we don’t have player IDs, we have to use names to join data. And that means we’ll have to strip off the health information.
So I’m thinking we need to work on some text string manipulation formulas.
Or we could just cheat and web query Fantasy Pros! That’s a hell of a starting point.
Like I said above, I think it’s best to teach a man to fish. But there’s also something to be said for working smarter and not harder.
What do you think? Where do we go from here?
Hey Tanner!
This is very good stuff! I’m extremely interested in the process of assembling all the information needed for potential DFS success, just as you are. It seems to me, as you alluded to already, we’re going to need to chop the health information from the names. Then, match the player names with their IDs.
I’m not sure I completely understand how Fantasy Pros can help. Especially if you plan to research your lineups the night before each contest. I know Fantasy Pros has a FanDuel and DraftKings cheat sheet. But you have to wait for it to update each night. Many times, the update takes place close to or after midnight EST. However, with this said, something tells me you have a different idea in mind, in terms of how Fantasy Pros can help.
One step I know we’ll need to address is how to match up the batters to the pitchers they will face and vice versa. That way, as you can imagine, we can use the batters wOBA (and/or other stats) vs. the handedness of the particular pitcher he is facing. But this is probably the next step, after we match each player name with their ID and stats.
In conclusion, I’m not sure I’ve said anything useful yet. But I’m with you. I’m ready to follow and possibly contribute along with journey!
Thanks, Joe. You’re right. The ability to access the next day’s salaries the night before is a huge benefit. I think for that reason alone I’ll go about this the “hard way” and not skip right to using another site like FantasyPros.
I think I have an idea, Tanner. Tell me if this sounds crazy. We’ll chop up the FanDuel names, removing the health information. Next, we’ll match those names with their FanDuel names from Roto Guru (SCSV exported list: http://rotoguru1.com/cgi-bin/stats.cgi?04z). Once matched with the Roto Guru names, the FanDuel names will have the MLB ID, ESPN ID and Yahoo ID associated with them.
We can then use the Player ID Map you provided for season long fantasy baseball to match the names with their FanGraphs ID. Using the FanGraphs ID, we will be able to associate the FanDuel names with their FanGraphs statistics.
What do you think about this idea?
Hi Tanner, I know I’m very late to the party, but this info is incredible. To summarize – I’d love to be able to scrape fanduel costs directly to excel rather than exporting a csv, and your post on creating the dynamic queries would be perfect – but I’m coming up short. I’m wondering if you’re still able to Web query directly from Fanduel. Im on a mac and using excel 2011, and I’m able to do a basic web query from other sites with no issues. But whenever I try to query fanduel, it’s just blank. I’ve tried in Google sheets as well, and can get other sites imported, and even lists from fanduel (I. e. Contact us, sponsors, other small lists or tables), but it almost seems like the player info is no longer built into a table? Afraid I don’t have an IT background, and there is a shockingly small amount of info on dfs Web querying, so I just wanted to see if it was still a possibility. Would love any info you could provide!
Ugh. This is frustrating. It does seem like the information above is no longer relevant. I’m getting an error message when I run the instructions above on a Windows machine using Excel 2013. I don’t see a viable alternative right now other than downloading the CSV and importing that. I know it’s not seamless. But you could still automate much of the process inside the spreadsheet using an approach like this: https://www.smartfantasybaseball.com/2015/08/importing-a-csv-file-as-an-excel-table/
You do still have to manually go get the data though.
I am going to e-mail you because I can’t really get any info into Google Sheets. So I am curious how you’re getting even that. Then maybe I can come up with an approach.