I like hockey. Watching, playing, chatting about the NHL. Like so many sports enthusiasts, my interest in professional sports turned into an obsession with fantasy sports. Ten years later, I’ve been involved in multiple leagues with multiple formats, often several at a time. I’m even a league commissioner. The draft is an exciting way to tee off the season, and the competitive aspects keep interest in the game throughout the year. I think it’s a great way to get to know about the entire NHL, and not just the hometown team (GO OILERS!).
For some, success in fantasy sports comes through in-depth knowledge of each player and team. Hours watching games, listening to sports talk shows, and reading scouting reports. I’m not one of those guys. I take a numbers approach to fantasy hockey.
The draft presents an obvious opportunity to use a numerical approach to achieve success. I’d like to craft a number of posts about draft preparation as I get ready for the 2017-2018 season. But since we’re in-season now, I’ll focus on the analysis I use week-to-week to understand the teams in my fantasy league, and ultimately gain an edge.
I apply this analysis to a Head-to-Head (H2H) league. The league is administered by Yahoo Fantasy Sports, and they have an explanation of the format, but I’ll add my own take. In H2H, your team of players competes with one other team for one week. Statistics are accumulated over that week in a number of categories (my league uses Goals (G), Assists (A), Plus/Minus (+/-), Hits (Hits), Blocked Shots (Blk), Goalie Wins (W), Goalie Save % (SV%)). The winner of the match is the team who has the higher score in the most categories (ties are also possible). Over the course of the season, a fantasy team accumulates a record of wins, losses, and ties. Best record wins a little cash and, more importantly, bragging rights for the next year.
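To make the scoring rule concrete, here is a minimal sketch of how one H2H match could be decided from two teams' weekly totals. The function names and the numbers are made up for illustration; this is not Yahoo's implementation, and it assumes a higher value is better in every category listed above.

```python
import numpy as np

# Categories from the league described above; assumed "higher is better" for all.
CATS = ["G", "A", "+/-", "Hits", "Blk", "W", "SV%"]

def score_matchup(team_a, team_b):
    """Return (categories won by A, categories won by B, categories tied)."""
    a = np.asarray(team_a, dtype=float)
    b = np.asarray(team_b, dtype=float)
    return int(np.sum(a > b)), int(np.sum(b > a)), int(np.sum(a == b))

def matchup_result(team_a, team_b):
    """Return "A", "B", or "tie" depending on who won more categories."""
    a_wins, b_wins, _ = score_matchup(team_a, team_b)
    if a_wins > b_wins:
        return "A"
    if b_wins > a_wins:
        return "B"
    return "tie"

# Made-up weekly totals, in the order given by CATS.
week_a = [12, 20, 3, 85, 40, 3, 0.915]
week_b = [10, 22, -1, 85, 44, 2, 0.910]

print(score_matchup(week_a, week_b))   # (4, 2, 1)
print(matchup_result(week_a, week_b))  # A
```

In this example team A takes G, +/-, W and SV%, team B takes A and Blk, and Hits is tied, so A wins the match 4-2-1.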
The players chosen at the draft have a large impact on the success of the team, but there are things that can be done in-season to improve a team. Managers can trade players with other managers or with the pool of un-drafted players. I always struggled to identify the right way to improve my team. I might identify several candidates that I’d like to add to my team: a goal-scorer, a defensive specialist, an improving goalie, and a hitting forward. Each would help my team in one of the H2H categories, but which should I pursue? Then there’s the other half of the trade: which player to drop? To ensure that a trade affects the performance of my team in a positive way, any move I make should improve my team’s chance to win a category, and therefore, a match.
To do this, I need to know how my team stacks up against other teams in the league. What are my team’s strengths? Weaknesses? How do they compare to the teams at the top of the standings? Yahoo Fantasy tracks this to a limited extent, but I wanted more. Time to flex my Python muscles.
Before I can perform any kind of analysis on the league, I need to get the data into Python. I’m interested in every team’s performance in each of the H2H categories: G, A, +/-, Hits, Blk, W, SV%. As of now, I put each week’s results into a separate .csv file (from web: copy – paste – export via spreadsheet). The challenge is then to import data from a series of .csv files. It turns out this can be pretty easy. I’ve uploaded each week’s results to a GitHub repo. (Aside: I’ve looked into scraping that data straight from the web, but realized that’s a little over my head for now. That option is definitely on my radar though.)
Unleash the Python
You can visit my Jupyter Notebook on GitHub to see exactly how I’ve tackled this.
[Note that the snippets of code I post here on the blog may look strange, especially on the narrow screens of mobile devices… the automatic line breaks can mess things up!]
To load a week’s results, I use:
import numpy as np
import pandas as pd
BHLresults = np.array(pd.read_csv('https://raw.githubusercontent.com/scibbatical/fan_hockey/master/w1.csv'))
Import data: check. Let’s have a look at the way the data is structured:
Each week’s results are a two-dimensional matrix, with each row representing a team, and each column representing a category. Let’s add time as the third dimension: each week as a 2D layer of a 3D matrix. Compiling results will look something like this (you can imagine how you might append the other weeks as well…):
# assumes each weekly array is 2D (teams x categories); np.stack adds the week axis
BHLresults = np.stack([BHLresultsW1, BHLresultsW2], axis=0)
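As a rough sketch of how several weeks could be compiled in a loop, the example below reads two small in-memory tables instead of the real files. The CSV contents, column names, and variable names are made up for illustration, and the actual weekly files may be laid out differently.

```python
import io

import numpy as np
import pandas as pd

# Made-up stand-ins for two weekly .csv files (3 teams, 3 categories each).
week1_csv = "G,A,Hits\n10,15,80\n8,20,95\n12,11,70\n"
week2_csv = "G,A,Hits\n9,18,88\n11,14,90\n7,16,75\n"

weekly = []
for text in (week1_csv, week2_csv):
    # pd.read_csv accepts a URL, a file path, or a file-like object.
    frame = pd.read_csv(io.StringIO(text))
    weekly.append(frame.to_numpy())

# np.stack joins equally shaped 2D arrays along a new leading axis,
# giving an array indexed as [week, team, category].
results = np.stack(weekly, axis=0)
print(results.shape)  # (2, 3, 3)
```

Collecting the weekly arrays in a list and stacking once at the end avoids growing the array inside the loop, and adding another week only means adding another source to the loop.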
The results now reside in a three-dimensional NumPy array with the structure BHLresults[(week),(team),(category)]. From here, I can grab data to analyze it by team, category, and week. Extracting the goals scored (category index 0) by my team (team index 9) every week is done by:
BHLresults[:, 9, 0]
The result is a one-dimensional array holding one goal total per week.
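Here is a small self-contained sketch of that kind of indexing on a [week, team, category] array. The numbers and the tiny array shape are made up for illustration and are not league data.

```python
import numpy as np

# Made-up results: 3 weeks x 2 teams x 2 categories (goals, assists).
results = np.array([
    [[10, 15], [8, 20]],   # week 1
    [[9, 18], [11, 14]],   # week 2
    [[12, 13], [7, 16]],   # week 3
])

team, goals = 1, 0

# ":" keeps every week; the two integers pick one team and one category.
goals_by_week = results[:, team, goals]
print(goals_by_week)          # goals for team 1 in weeks 1-3: 8, 11, 7

# Fixing only the week instead returns that week's full team-by-category table.
week_one = results[0]
print(week_one.shape)         # (2, 2)
```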
Now the fun can start. Let’s do some basic analysis: calculating the mean category performance for each team.
# initialize an array into which we can populate the mean values
means = np.zeros(np.shape(BHLresults)[-2:])
for team in range(np.size(BHLresults, 1)):
    for cat in range(np.size(BHLresults, 2)):
        means[team, cat] = np.mean(BHLresults[:, team, cat])
# display using Pandas
pd.DataFrame(means, index=names, columns=cats)
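For what it's worth, the same per-team, per-category means can also be computed without explicit loops by averaging over the week axis. The sketch below uses randomly generated numbers (not league data) and hypothetical variable names to check that the two approaches agree.

```python
import numpy as np

# Made-up data: 4 weeks x 3 teams x 2 categories.
rng = np.random.default_rng(0)
results = rng.integers(0, 30, size=(4, 3, 2)).astype(float)

# Loop version: one mean per (team, category) pair, taken across weeks.
means_loop = np.zeros(results.shape[-2:])
for team in range(results.shape[1]):
    for cat in range(results.shape[2]):
        means_loop[team, cat] = np.mean(results[:, team, cat])

# Vectorized version: average along axis 0, the week axis.
means_vec = results.mean(axis=0)

print(np.allclose(means_loop, means_vec))  # True
```

The loop makes the indexing explicit, which is handy while exploring; the one-liner is shorter and tends to be faster on larger arrays.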
This is interesting so far, but not revolutionary. We have to start somewhere though, right? If you want to use this as a basis for your own data exploration, all the Python code used for this post is available on the Jupyter Notebook on GitHub.
I’m already working on a follow-up post. From the array of weekly results, I’ll determine matchup winners, an accurate ranking of teams, and a system to rate a team’s performance in each category. With all those things in hand, I’ll delve into how any of this is useful.
Feature image is public domain.