We’re about to take web querying to the next level. If you’re not familiar with web queries or if you missed our first discussion, you can brush up on Excel web query basics in Part 1 the Web Queries Series here (ha, that rhymes).
This will also make more sense if you’re familiar with the concepts of player IDs and website URLs discussed in this post.
The Weakness In Our First Query
You might recall that our first web query was specifically linked to Jose Fernandez’s player page at BrooksBaseball.net. This is a great way to pull his specific pitch mix into an Excel file. But it’s also pretty limiting that we can only pull in Jose Fernandez’s statistics and would have to create a new query to look up another player. At that point it’s faster to just go to BrooksBaseball.net and start searching for players manually.
What is a Dynamic Web Query?
A dynamic query would be designed in such a way so that we only have to initially set up one query. That query is not static, linked to one specific page or player, it’s dynamic, meaning it is linked to an input that can change.
Said another way, instead of linking the web query to player ID “11530“, link the web query to a specific cell in Excel. Then whatever player ID is typed into that cell will be used in the web query.
A New Example
Instead of linking to BrooksBaseball.net, we’ll link to the player pages at Razzball.com. If you haven’t checked them out, these player pages over are an incredible resource. Take a look at Miguel Cabrera’s page using this link:
http://razzball.com/player/1744/Miguel+Cabrera/
And just to illustrate something about the URLs at Razzball, try this link:
http://razzball.com/player/1744/
They load to the same page. The player’s name is not necessary for the page to load.
Stop Blabbering And Asking Me To Read Things. Show Me How To Do This.
Sorry about that. Video is more conducive to understanding this dynamic web query example. Here you go:
Have Any Questions?
Don’t just watch the video. Give this a try. You could use this to more easily do projections, evaluate trades, set the ideal daily fantasy lineup, and much more.
If you like getting your hands dirty in Excel and doing player analysis like this, sign up for the Smart Fantasy Baseball newsletter where you’ll get more great ideas like this.
Be smart.
Hey,
I tried to do this with Excel 2007 and I could not choose a different worksheet. Is there a way you did that or am I just using too old of a version?
Hi JP. Good question. I no longer have access to Excel 2007. It sounds like you’re talking about the part where the data is pulled into the sheet, starting at around the 9:18 mark. What if you create a new sheet before this, and then select the cell you want the information to populate in before you click the “Existing Connections” button? My guess is the information would pull into this new tab at that point.
Hi Tanner,
I’m having some difficult with running a query from tables from Fangraphs. Maybe you can help with the coding aspect.
I’m basically trying to capture gamelogs of my team’s players throughout the season. I want to be able to pull each players fangraphs “standard” gamelog (not advanced, dashboard, etc.).
I have followed your instructions, using the above video as a base. However, when I get to editing the query in Notepad, the web address for Dustin Pedroia, for example is this:
http://www.fangraphs.com/statsd.aspx?playerid=8370&position=2B&type=1&gds=&gde=&season=2014
How can I edit the web query to read the playerid, position, and (I think) type of format of the game log (i.e. standard view vs. dashboard (type=0))? This may be unclear but happy to clarify or if you know of a better way to pull down daily game logs of players, I would be greatly appreciated.
Also, is there any reason that Baseball Reference doesn’t allow you to pull their web tables in anymore? Just curious. I could do it for about 2/3 of last season
Hi Sean,
In regards to your Baseball Reference question, the tables on some sites don’t seem to get identified in Excel, so you don’t see the neat little yellow arrow. But in the iqy file, the “Selection=” option might still allow you to pull in the information. I believe “Selection=2” would pull in the second table on the page. And this is what let me pull in a game log from a player’s game log page.
For example, to pull in Miguel Cabrera’s 2014 game log my iqy file looked like this:
WEB
1
http://www.baseball-reference.com/players/gl.cgi?id=cabremi01&t=b&year=2014
Selection=2
Formatting=None
PreFormattedTextToColumns=True
ConsecutiveDelimitersAsOne=True
SingleBlockTextImport=False
DisableDateRecognition=False
DisableRedirections=False
I just keep changing “Selection=” to different numbers until the information I want comes in.
Hope that helps,
Tanner
Tanner, Thanks for the quick response and apologies on my delayed response. For some reason BR isn’t working even when toggling the “Selection=” to any given number. The Fangraphs does work for me even with the added nuisance of adding type and position. It’s further complicated by players with multiple position eligibility. But I do appreciate the help and it’s working well for me now.
What I’m basically trying to do is capture each team’s daily performances (mine in particular) but I haven’t found a good way to automate that process yet completely due to start/sits on any given day.
It does seem silly that you need the position to hit a player’s game log. Sounds like you’re pretty advanced at this, so I presume you thought of adding a column to the Player ID Map for Fangraphs position?
I wish I was a little better at scripting and/or web scraping, so I could help you out. One day…
Have you read Baseball Hacks? It’s several years old now, but it does have a section on how to create a database of current year stats and games. It’s more database oriented, but you may need that kind of heavy lifting to get the job done.