One of the People Who Matter is kmac6. And in a recent post about hockey stats, kmac6 stated the following:
I think a series of posts breaking down some of the advanced stats into terms that are easier to understand for “classic” Hockey fans could definitely help. Maybe even using Devils centered examples of each in those posts (after all the Devils are something we all understand). I know personally I haven’t fully grasped all of the advanced stats because they have a tendency to be presented in a very dense manner and I haven’t put as much time into understanding them as I should. But I am getting better and I think these types of posts can help.
Other members of the People Who Matter agreed with kmac6. I stated I could make it happen. And so it is happening. This is the first of a series of posts about hockey stats I and others on this site have been using for the last decade or so. It is meant to explain the basics of what the stats are, what they mean and what they lead to, what they show about the game, and what they do not mean. I will try to use examples related to the New Jersey Devils where applicable.
Let us begin with one of the building blocks for what we call advanced stats: Corsi.
Corsi: What is it?
Corsi refers to the number of shooting attempts taken. Officially, NHL scorers record shots on net (shots), missed shots, and blocked shots. Sum them up and that’s Corsi. Just Corsi. There is no acronym. It does not need to be written as CORSI. Corsi is just the sum of three types of shooting results: either the player put the puck on (or in) the net, the puck was blocked, or the puck missed the net. The formulation is simple:
Corsi = Shots on net (this includes goals) + Missed shots + Blocked shots
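For those who like to see the arithmetic spelled out, here is a minimal Python sketch of that formula. The function name and the game totals are made up for illustration; they do not come from a real game or from any stats site.

```python
def corsi(shots_on_net: int, missed_shots: int, blocked_shots: int) -> int:
    """Return total shooting attempts.

    shots_on_net includes goals, matching the formula above.
    """
    return shots_on_net + missed_shots + blocked_shots


# Hypothetical single-game totals for one team (not real data).
print(corsi(shots_on_net=30, missed_shots=12, blocked_shots=15))  # 57
```

Nothing more is going on than a sum of three counts, which is a big part of why the stat caught on.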
There have been synonyms for this term, so you may have seen it called something else. At NHL.com, they use Shot Attempts Taken (SAT). Some refer to them as events, since a shot, block, or miss is its own separate entry on an NHL play-by-play log. A few in the scene tried (and failed) to just call them shots in the same way that shots in soccer are all attempts instead of shots on target. The most common and accurate synonym would be shooting attempts since that is what is being counted. Either works. For the purposes of this post, I will largely stick to Corsi.
How is it used? What does it mean?
The original use of Corsi was to measure the difference between the shots, blocks, and misses taken by a player’s team and the shots, blocks, and misses taken by their opponents. This is still the basic template for how Corsi has been commonly used for the last decade-plus. The shots, blocks, and misses taken by your team are commonly referred to as Corsi For (CF), which means Corsi for the player’s team. Those same shots, blocks, and misses taken by the opponent are commonly referred to as Corsi Against (CA), which means Corsi against the player’s team. The Corsi differential would just be the Corsi For minus the Corsi Against.
You can use that number to determine whether a team or a player had to play more offense or defense in a game. If the number was zero, then you know it was evenly split. Each side took the same number of attempts. Generally, you do not want to be forced to play defense as that puts you at risk of giving up a goal and it means you’re not in control of the play. That would be represented by a negative number. It is much more preferable to be in control of the puck and be looking to attack during a game. Even if you do not score, forcing the opposition to defend means they cannot either. Thus, a higher number is more favorable.
Over the past decade, there has been a shift to use a percentage of Corsi For or CF%. (Aside: I think this was first started by David Johnson at Hockey Analytics, but it didn’t catch on until the Extra Skater/War on Ice days.) This is because it gets to what we want to know about Corsi. Is the team or player taking more attempts than they are allowing? CF% directly gives us this answer. And if it is over 50%, we immediately know the team or player is taking more attempts than their opposition. If it is below 50%, then we know the team or player is giving up more attempts. Again, a higher number is better.
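The differential and the percentage can be sketched in a few lines of Python. This is only an illustration: the function names are my own, and the sample numbers are hypothetical rather than taken from a real game.

```python
def corsi_differential(cf: int, ca: int) -> int:
    """Corsi For minus Corsi Against. Positive means more attempts taken than allowed."""
    return cf - ca


def cf_percent(cf: int, ca: int) -> float:
    """Share of all shooting attempts that belonged to the team, as a percentage."""
    total = cf + ca
    if total == 0:
        raise ValueError("no shooting attempts recorded")
    return 100.0 * cf / total


# Hypothetical game: 57 attempts for, 43 attempts against.
print(corsi_differential(57, 43))  # 14
print(cf_percent(57, 43))          # 57.0, so above the 50% break-even mark
```

A differential of zero and a CF% of 50% describe the same thing: both sides took the same number of attempts. The percentage is easier to compare across games because it does not depend on how many total attempts there were.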
One of the immediate benefits from Corsi is that it turns out to be a good proxy, or approximation, for possession. Specifically, offensive possession. While in theory a player can fire a puck from anywhere on the ice, the vast majority of shot attempts are taken in the offensive zone. Even if the shot hits an opposition player for a block or is too wide or too high of the net, that there was an attempt at all tells us quite a bit. One, the team was on side and on offense. Two, the player who attempted the shot had the puck as well as the time and space to take a shot. Three, the player thought they had a chance to do something with the puck. While it does not capture anything about obtaining the puck in the first place or how it was moved around, it does capture the attempt itself. This is how Corsi has been interpreted to represent possession.
While it is not explicitly tracked, all of those “little things that don’t show up on the scoresheet” can contribute to a team’s or a player’s Corsi For percentage. That is a reason why it is a stat that represents how effective a team is playing or how a player is performing. After all, who wants their favorite team to be out-attempted by their opposition? Can we say a player is good if their team often gets out-attempted when they are on the ice? Corsi can answer those questions.
Related Developments with Corsi
Corsi has also led to the development of many concepts still used in hockey analytics as a whole. It is essentially a building block for what has been called Advanced Stats for hockey. As this is a primer to explain Corsi, it is worth noting these concepts and considerations. In the later part of the last decade, bloggers did a lot of work publicly discussing and debating Corsi and what influences it. It turned out that a lot can impact Corsi, and that has led to a number of other considerations we use today for other hockey stats. To best use Corsi (and other hockey stats) today, the following are commonly used:
- Game situation. Special teams skew Corsi so much that it is common to use it primarily for even strength or 5-on-5 situations. A team on a power play will be on offense given that they have a man advantage. A team on a penalty kill will be forced to play defense. Whoever pulls their goalie is either going to get a power play soon or is desperately looking for a goal. Rather than trying to adjust out the special teams, we can filter out those situations and look at Corsi to represent a team’s performance when both teams have the same number of players. In the beginning of a season, it may not mean a whole lot. But as a season goes on, it will better reflect the team’s performance as there will be much more data from 5-on-5 hockey.
Devils Example: The 2019-20 Devils played 69 games and 3,256:07 of their total ice time of 4,212:42 was in 5-on-5 situations. That is roughly 77% of their season. In that portion of the season, the Devils had a CF% of 46.08%. This meant the Devils took just about 46% of all shooting attempts in their games. This also meant the opposition out-attempted the Devils. This also ranked 30th out of 31 teams last season. This is evidence that the Devils’ performance was bad last season – beyond their record.
- Rates over raw numbers. It was quickly discovered that teams and players do not play the same amount of ice time. Games can go into overtime, some games are loaded with penalties, and players are used very differently across many teams. To keep things consistent, rates for Corsi For and Corsi Against are used instead of the actual number of Corsi For and Corsi Against. Typically, the rates are per 60 minutes to reflect a regulation game.
Devils Example: Which Devil saw the most offense on the team last season? By a raw count of all Corsi For, that would be Damon Severson who was on the ice for 1,060 attempts by the Devils – narrowly edging out P.K. Subban. However, Severson is a defenseman and so he would be on the ice for more minutes than the forwards. He was even the team’s leader in total ice time in 5-on-5 play. The raw count of Corsi does not account for the fact that by his position, Severson would have the chance to see more opportunities by his team than others. Using CF/60 instead of CF can provide a more fair comparison between different players in different positions. By CF/60, the team leader was Nikita Gusev with a CF/60 of 59. Severson ended up 13th on the team at 53.36. Severson may have seen more Corsi as a raw total, but Corsi by the Devils took place more often when Gusev was on the ice. By rate, the answer would be Gusev. The rate of Corsi For does not punish Gusev (or 12 other Devils) for not playing as much as Severson.
- Score effects. The scoreboard definitely impacts how a team performs. A team trailing is usually going to attack more in the hopes of tying it up. A team that is leading usually does not need to attack as much since they are already ahead. This has been adjusted on some statistical sites and you can filter out situations where a team is leading or trailing. If you want to truly neutralize score effects, you can look at Corsi for only score tied situations. Although that represents fewer minutes as a whole.
Devils Example: The Devils in 5-on-5 hockey had a CF% of 46.08%. When they were trailing, their CF% improved to 48.23%. That is still very bad and it was the worst in the league in trailing situations. But even they sought to tilt the ice more in their favor when losing.
- Scorer bias. This is worth its own post, but the harsh reality is that the scorer at each arena may have their own interpretations and counts of events. Mistakes can also be made. This can directly influence a team’s or player’s Corsi count, although the errors may be minimized over many games and as the team or player plays in more arenas.
Devils Example: The Devils undercounted shots on net for years. This consideration will get its own post in this series, by the way.
- Context is critical. Corsi by itself is misleading. Who the player has played with and who they played against absolutely matters. That is worth its own post in the future, though. Another one of the most important contexts for a player’s stats is their team. If their team has been bad as a whole at Corsi, then their players are likely not going to have good Corsi values either. The best player on a bad team may still end up being good from a Corsi perspective even if his own CF% is bad. Likewise, a team that has been quite good in terms of Corsi will have a lot of players with positive values and it may be trickier to identify who has been really making all of that happen.
Devils Example: Blake Coleman led the Devils in CF% last season with a value of 49.61%. That still meant the opposition out-attempted the Devils when he was on the ice. Does this mean that Coleman was bad in 5-on-5 hockey and that happened again in Tampa Bay? No, and we have proof of that. In his nine regular season games with Tampa Bay, Coleman had a CF% of 50.27%. He ranked low on the Lightning in that stat, but it was still an improvement over his time in New Jersey. In the playoffs, Coleman’s CF% over 25 games was 55.93% – and ninth best on the Lightning. Tampa Bay presumably found a great fit for Coleman in the lineup, used him in a reasonable role for his talents, and let him do his job – which he did quite well. Knowing that Coleman was the leader on an otherwise bad Devils team in terms of CF% gives us confidence that, in a different situation, he could thrive.
- Relative stats. Another important context for players is relative stats. Typically, this is the difference in rates of Corsi when the player is on the ice and when the player is off the ice. When the player takes a shift and their Corsi For rate improves, it means that the team’s rate of shot attempts went up when that player is on the ice. That suggests the player is helping. Likewise, if a player takes a shift and their Corsi Against rate rises, it means the team is allowing a higher rate of attempts against when they are on the ice. That suggests the player is not helping.
Devils Example: Going back to Coleman, relative rate stats showed Coleman’s value when he stepped on the ice for New Jersey. When Coleman took a shift, the Devils’ Corsi For rate improved by 4.83. We can say the Devils attacked more when Coleman was on the ice. Also, the Devils’ Corsi Against rate fell by 4.29 when Coleman took a shift. We can also say the Devils allowed fewer attempts when he was on the ice too. While Coleman’s total CF% was below 50%, the relative Corsi shows that Coleman provided a positive impact on the team’s Corsi. If anything, it suggests Coleman tried to make things better for a team bad at 5-on-5.
- Fenwick. Corsi is the sum of shots on net, blocked shots, and missed shots. Blogger Matt Fenwick argued that blocked shots are not as valuable since a shot that is blocked is likely not an actual chance to score and likely not from an advantageous position. (Steve Burtch cited the actual comment by Fenwick back in 2007 in this Pension Plan Puppets post.) If you’re looking for actual scoring chances, then take blocks out of the equation. And so they did and called the stat Fenwick. It is called unblocked shot attempts, or USAT at the NHL site. It turned out to be an approximation for scoring chances and is another way to represent how effective a team’s or player’s process is on the ice. It also turned out that blocked shots are not recorded by location, which impacts another stat to be discussed later in this series.
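As a quick sketch of how Fenwick differs from Corsi, the Python below drops blocked shots from the count. The function names and the totals are hypothetical and only meant to show the arithmetic.

```python
def corsi(shots_on_net: int, missed_shots: int, blocked_shots: int) -> int:
    """All shooting attempts: shots on net (goals included) + misses + blocks."""
    return shots_on_net + missed_shots + blocked_shots


def fenwick(shots_on_net: int, missed_shots: int) -> int:
    """Unblocked shooting attempts: the Corsi count with blocked shots left out."""
    return shots_on_net + missed_shots


# Hypothetical single-game totals for one team (not real data).
shots, misses, blocks = 30, 12, 15
print(corsi(shots, misses, blocks))  # 57
print(fenwick(shots, misses))        # 42, which is Corsi minus the 15 blocks
```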
This is not the entire list of what came out of the development of Corsi and the subsequent lessons from 2007 and beyond about it. But these are the major ones that are still with us today, easily represented through filters and other sites like Natural Stat Trick.
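To make a few of these considerations concrete, here is a rough Python sketch of the ice-time share, the per-60 rate, and the relative rate. The helper names are my own invention. The two ice-time strings are the Devils’ 2019-20 totals quoted above; every other number is hypothetical.

```python
def toi_minutes(toi: str) -> float:
    """Convert an ice-time string such as '3,256:07' (minutes:seconds) to minutes."""
    minutes, seconds = toi.replace(",", "").split(":")
    return int(minutes) + int(seconds) / 60.0


def per_60(count: float, minutes: float) -> float:
    """Scale a raw count to a rate per 60 minutes of ice time."""
    if minutes <= 0:
        raise ValueError("ice time must be positive")
    return count * 60.0 / minutes


def relative_rate(on_count: float, on_minutes: float,
                  off_count: float, off_minutes: float) -> float:
    """On-ice rate per 60 minus the team's rate per 60 with the player off the ice."""
    return per_60(on_count, on_minutes) - per_60(off_count, off_minutes)


# Share of the 2019-20 Devils' ice time played at 5-on-5 (figures from the post).
share = 100.0 * toi_minutes("3,256:07") / toi_minutes("4,212:42")
print(round(share))  # 77

# Hypothetical skater: 500 attempts for in 600 minutes on the ice,
# while the team took 1,200 attempts in the 1,600 minutes he was off the ice.
print(per_60(500, 600))                     # 50.0
print(relative_rate(500, 600, 1200, 1600))  # 5.0
```

For Corsi For, a positive relative rate suggests the team attacked more with the player on the ice. For Corsi Against, the sign flips: a negative relative rate is the good one.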
What does Corsi not do (and other drawbacks)?
Corsi is just the count of shots on net, blocked shots, and missed shots. Players get credit for an attempt being made by their team if they are on the ice when it happened. Likewise, they get marked down if they are on the ice when the opposition makes an attempt. Whether or not the player was directly involved in the attempt is not considered or accounted for. That is one of the big concerns with a stat like plus-minus for goals. This is mitigated by the fact that there are many more shot attempts in a game than there are goals. It is hard to have a very high or low CF% and not be somewhat involved in the run of play. Still, it does not speak to whether the player directly contributed to those values. It works on the logic that if it happens a lot while you’re there, then you probably had something to do with it.
Corsi does not take into account where the shooting attempts are made. There is a world of difference between missing a 60-footer from the blueline and missing a shot from the slot. A shot taken at the crease is more likely to go in than a shot taken from the halfwall. Corsi does not care about that. Just that an attempt is made.
Corsi is also calculated for skaters and not really for goaltenders. There is value in seeing how a rate of Corsi For or Corsi Against changes when different forwards and defensemen play on a team. Goaltenders generally play the whole game, so there would be few changes. More importantly, the goaltender’s primary job is to stop pucks. They can distribute pucks and freeze them to force zone starts. But they rarely create situations for shooting attempts for or against them. This is a stat that is in practice for the forwards and the defensemen. This means this stat cannot really be used to describe anything of value for the goaltender, determine if a goaltender is better or worse than another, or determine if a goaltender is more valuable than a forward.
Corsi also needs time for it to mean something with respect to a team’s performance in general or a player’s. In a single game, a player finishing with a Corsi differential of -1 or +1 (or a CF% of around 50%) does not mean much. Now, the Devils had a lot of games last season where a defenseman pairing or a forward line was out-attempted by a large number. In those cases, it is more meaningful; a player finishing a game with a -14 in Corsi is evidence that he likely had a bad night. And if he had a Corsi of +14, then he likely had a great night. But in terms of representing whether the team is playing well as a whole in 5-on-5 hockey, Corsi has to be collected over a number of games before making any meaningful conclusions. This is a drawback since teams have to make decisions on a game-to-game basis and cannot afford to wait to make any adjustments (or not make any).
Corsi is also a way to represent a team’s performance. It assumes that offensive possession is the goal. It is best interpreted as part of the process. That does not always match the team’s results. Teams can and do win despite being outperformed in Corsi. It does not make their wins count any less. It just means that how they are winning may not continue. Alternatively, a team that is doing great by Corsi in games may end up losing for a whole bunch of other reasons. Likewise, a player can still contribute outside of whether they have a good CF% or not in a game or in a season. (See the example of Blake Coleman.) It is not a be-all, end-all stat. No stat is. It only represents what happened on the ice by way of shooting attempts. It is typically used in the most common situation of hockey – 5-on-5 hockey in regulation – and it is an important part of a team’s process. That is the major point: it is a part of the process.
Lastly, Corsi can only go as far back as 2006-07. As far as I can tell, the league did not record events of shots, missed shots, and blocks in their box scores in the way they do now. Therefore, short of re-watching and counting all of those events individually, we do not have much in the way of history to look back and see if those great players/teams of the past were also great in terms of Corsi. Which is sad because I think it would elevate several great players from the past as being much more dominant than what their point totals would suggest. For example, legendary Devils forward Patrik Elias would likely be absolutely astounding by this metric had this stat been around in his prime years. From his mid-20s until the end of his career, it’s still astoundingly good. But imagine if we had access to Corsi for Elias’ 2000-01 season when he put up 96 points. That Devils team was excellent and Elias was the most productive forward on their dominant top line, The A Line. His CF% then could have been as high or even higher than what it was six or seven seasons later. Alas, we do not know. We are limited in its history.
Do Good Teams Always Have Good Corsi Values?
Not always, but it is common that the teams that play well in the regular season, where there are more games and everyone plays, tend to do well by this stat. For example, nine out of the top ten teams by point percentage last season had a CF% over 50%. The tenth team was Dallas, who had the tenth best point percentage and a CF% of 49.6%, so they were not far off from breaking even. A team serious about contending does not necessarily need to have the league’s best CF%, but it absolutely can help. Just ask Vegas, who did lead the league in CF% at 54.76%.
However, a team cannot live on Corsi alone. The 2013-14 Devils are a fantastic example of this statement. They were fantastic in the run of play with a CF% of 54.4% that season, which was the third best in the whole league. However, they struggled to finish their shots (they had the fifth lowest shooting percentage), their goaltending was below the league average (25th in team save percentage), and they faltered in opportunities to pick up needed points (this was the 0-13 shootout season). They were a difficult team to play against, but they were not successful because of these other factors. As a result, they earned just 88 points and missed the playoffs. (To make matters worse, the teams around them in CF% all earned over 100 points each and made the postseason.)
This is similar to why Carolina failed to make the postseason during most of this past decade. Issues in net and a general lack of talent in other areas also contributed heavily to Montreal and Los Angeles having such poor seasons despite some very good CF% values last season. Having a high CF% in 5-on-5 hockey is valuable. At the least, it means your team is tough to play against. However, it alone will not lead to a season full of wins and legitimate playoff aspirations.
As important as it is to be good in the run of play and be able to out-attempt your opposition as a whole, a good team needs to have at least decent goaltending, special teams, talent levels, and even some puck luck with the finishing of shots. Likewise, if the team struggles in Corsi, then their success is going to lean more on being exceptional in these other areas like goaltending and special teams. Is it possible to be a playoff team or be in the mix to make it to the postseason with a sub-50% CF%? Yes. However, it needs to be overcome and so it makes success harder to achieve.
Why is it called Corsi?
The stat’s name comes from Tim Barnes, formerly known as Vic Ferrari of the blog Irreverent Oilers Fans. He named it after former Buffalo Sabres goaltending coach Jim Corsi because he liked his mustache. Really. Bob McKenzie had the whole story at TSN in this post back in 2015. It turned out the naming decision ended up being more accurate than Barnes expected. As former Buffalo GM Darcy Regier told McKenzie, Corsi would use shot differentials to count a goaltender’s workload. While a goalie’s job is to stop shots, they do react to misses and blocks and that does take energy. Silly as Barnes’ actual reason was, who he named the stat after turned out to be wholly appropriate – even though Barnes used it for an entirely different purpose. By the way, Barnes has been with the Washington Capitals since 2014, he is listed as their Director of Hockey Analytics on the team’s site, and, yes, he has a ring.
Hopefully this post helped you have a better understanding of what Corsi is, how it is useful, and even some of its drawbacks. Now I would like to read any further questions you have about the stat, and I will do my best to answer them. Thanks to CJ for his feedback and suggestions for this article as well as the other writers for reviewing this piece. Thank you for reading.
Next time in this series: Shooting percentages, save percentages, and PDO.