What are we rating when we are rating goalkeepers?

Goalkeeper ratings often have far more to do with the ten players in front of the netminder.


In baseball, a pitcher's statistics take a hit any time a runner safely reaches base, regardless of whether it was a well-placed hit into the gap or a clumsy outfielder dropping a routine fly ball. Recognizing that a pitcher's ERA was being skewed by things beyond his control, Voros McCracken developed the defense-independent pitching research that led to Fielding Independent Pitching, or FIP, which tries to determine how much blame (or credit) a pitcher actually deserves.

Soccer has a similar, but more complicated, conundrum. How can we strip away a team's possession rate and its defensive effort and reasonably compare and judge MLS goaltenders? It's certainly not a new question--I know we wrestled with that in promoting Dan Kennedy for last year's All-Star game.

Jimmy Nielsen, of course, got the nod, but Kennedy earned the backup spot. Mercifully, the bottom had yet to completely fall out for Chivas USA before Chelsea came stateside. Still, many weren't sold on Kennedy's selection. Nielsen had the stats that counted: wins and goals allowed.

Before 2009, it's likely Kennedy would have spent his All-Star break at home. In 2009, Zack Greinke won the Cy Young Award despite playing for the Kansas City Royals. Cy Young Award voters overturned years of precedent, looking past traditional statistics like "wins" and instead using newer statistics like FIP that were able to pinpoint Greinke's contributions and minimize the variables beyond his control.

Comparing pitchers and goalkeepers only works at a very shallow level. For one thing, every event in baseball begins with the pitcher. In contrast, a goalkeeper, even a busy keeper, has the fewest events as measured by Opta. For another, both strikeouts and home runs are directly influenced by the pitcher whereas goalkeepers have very little influence on the opposing strikers.

Unlike baseball, with its treasure trove of easily applicable statistics, soccer doesn't have the comparable events (e.g. home runs, strikeouts, hit by pitch) needed to make a statistic like FIP work. However, an astute blogger named Steve Holroyd found a comp: hockey.

In a 2008 post to BigSoccer, "A New Method For Evaluating Goalkeeper Performance," Holroyd applied a statistic called "perseverance rating."

Hockey has a similar issue. Imagine a goalie whose team controlled the puck for nearly the entire game. This goalie came away with a clean sheet but faced only one shot. Contrast that with the opposing keeper, who gave up three goals but saved 27 shots.

To account for the way a goalkeeper's Save Percentage can be out of whack with the number of shots he faces, Jeff Klein and Karl-Eric Reif created a statistic called Perseverance Rating. In his post at BigSoccer, Holroyd adapted Klein and Reif's work to soccer.

From Holroyd's post:
To find Efficiency: (Saves divided by Total Shots Faced) X 100
To find Shots Faced per Game: Total Shots Faced divided by (Minutes Played divided by 60)
To find Perseverance Rating: ((Efficiency X 6) + Shots Faced per Game) divided by .6

I tweaked the Shots Faced per Game calculation, dividing minutes played by 90 instead of 60, to compensate for soccer being a 90-minute game.

Perseverance Rating = ((Efficiency X 6) + Shots Faced per Game) / 0.6, where Shots Faced per Game = Total Shots Faced / (Minutes Played / 90)
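
For anyone who wants to play along at home, here is a minimal Python sketch of the 90-minute version of the formula. The function and argument names are my own invention, not Holroyd's, and the sample stat line is made up rather than taken from any real keeper. I'm also assuming "Total Shots Faced" means shots on goal (saves plus goals allowed).

```python
def perseverance_rating(saves, shots_faced, minutes, game_length=90):
    """Perseverance Rating with shots per game scaled to a 90-minute match.

    Argument names and the game_length default are illustrative assumptions.
    """
    if shots_faced <= 0 or minutes <= 0:
        raise ValueError("shots_faced and minutes must be positive")
    efficiency = saves / shots_faced * 100                # save percentage, 0-100
    shots_per_game = shots_faced / (minutes / game_length)
    return (efficiency * 6 + shots_per_game) / 0.6


# Hypothetical keeper: 75 saves on 100 shots over 1,800 minutes (20 full games).
print(round(perseverance_rating(75, 100, 1800), 1))  # 758.3
```

Passing game_length=60 would reproduce the hockey-style version quoted from Holroyd's post.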

In comparing MLS keepers from 2004-07, Holroyd noticed results tended to swing widely between years and leagues (he also compared MLS results to 1970s-era NASL franchises). To account for this, he cleverly created a baseline PR of all keepers and then subtracted that value from each individual keeper's PR.
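
A rough sketch of that baseline adjustment is below. One loud caveat: I don't know exactly how Holroyd built his baseline, so this version assumes it is the PR of every keeper's combined totals (an average of individual PRs would be another reasonable reading). The function names and stat lines are hypothetical.

```python
def perseverance_rating(saves, shots_faced, minutes, game_length=90):
    """Soccer-adjusted Perseverance Rating (90-minute games)."""
    efficiency = saves / shots_faced * 100
    shots_per_game = shots_faced / (minutes / game_length)
    return (efficiency * 6 + shots_per_game) / 0.6


def baseline_adjusted(keepers):
    """Subtract a pooled baseline PR from each keeper's PR.

    keepers maps a name to (saves, shots_faced, minutes).
    ASSUMPTION: the baseline is the PR of all keepers' combined totals.
    """
    total_saves = sum(s for s, _, _ in keepers.values())
    total_shots = sum(sh for _, sh, _ in keepers.values())
    total_minutes = sum(m for _, _, m in keepers.values())
    baseline = perseverance_rating(total_saves, total_shots, total_minutes)
    return {
        name: perseverance_rating(*line) - baseline
        for name, line in keepers.items()
    }


# Invented stat lines, not real MLS data.
sample = {
    "Keeper A": (80, 100, 1800),
    "Keeper B": (60, 90, 1800),
}
for name, value in baseline_adjusted(sample).items():
    print(f"{name}: {value:+.1f}")
```

A positive number means the keeper graded out above the pooled baseline, a negative number below it.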

My first observation is the importance of the number six. I'm not sold that Save Percentage is six times more valuable than Shots Faced. It seems both arbitrary and a little high. I'm pretty sure the number was chosen because hockey is a sixty-minute game, the same way a law firm charges clients in increments of six minutes.

There is also the matter of inflation. Taking a number in the 70s and multiplying it by six will net a high value. I suppose you could pare it down by dividing by 10, but then you're just adding another step, not to mention violating Occam's Razor.
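
To put rough numbers on both complaints, the sketch below splits the formula's numerator into its two parts for a made-up keeper. The function name and inputs are hypothetical; only the formula comes from the discussion above.

```python
def pr_components(saves, shots_faced, minutes, game_length=90):
    """Return (efficiency term, shots-per-game term, final rating)."""
    efficiency_term = saves / shots_faced * 100 * 6       # save percentage x 6
    shots_term = shots_faced / (minutes / game_length)    # shots faced per game
    rating = (efficiency_term + shots_term) / 0.6
    return efficiency_term, shots_term, rating


# Hypothetical keeper: 75 percent save rate, 5 shots faced per 90 minutes.
eff, shots, rating = pr_components(saves=75, shots_faced=100, minutes=1800)
share = eff / (eff + shots)
print(f"efficiency term: {eff:.0f}, shots term: {shots:.0f}, rating: {rating:.0f}")
print(f"share of numerator from save percentage: {share:.1%}")
```

For this invented stat line the save percentage term is 450 and the shots term is 5, so nearly all of the rating comes from save percentage, and the final value lands in the 700s.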

What I really learned is that PR is most useful in a context other than its intended one. Instead of comparing starting goalkeepers with one another, the statistic is incredibly useful for comparing starters with their backups or replacements. Below are the PR values of four starting keepers and their backups from 2012. It's not perfect, but you can see why Michael Gspurning was the top-rated MLS goalkeeper on the Castrol Index. When he was gone, the Seattle Sounders' defense suffered despite his backups facing fewer shots.

If you scroll down, you can also see that the Chicago Fire was in good hands with Paolo Tornaghi in goal while Sean Johnson was away with the U.S. U23 team. Of course, what a difference a year makes: Tornaghi allowed four goals in his only appearance this year.

A second use comes from summing the values by team, which gives another way of evaluating a team's defensive efficiency. We'll track that sum and, at season's end, see how closely it resembles DER. After all, one component of DER is how much work the keeper is put under.
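
Summing by team is simple enough to sketch in a few lines of Python. The function name, team labels, and values below are placeholders I made up; the sketch takes each keeper's PR value as given and doesn't assume whether it is raw or baseline-adjusted.

```python
from collections import defaultdict


def sum_by_team(rows):
    """Total the PR values for each team.

    rows: iterable of (team, keeper, pr_value) tuples (hypothetical layout).
    """
    totals = defaultdict(float)
    for team, _keeper, pr_value in rows:
        totals[team] += pr_value
    return dict(totals)


# Placeholder teams and values, invented for illustration.
rows = [
    ("Team X", "Starter", 40.0),
    ("Team X", "Backup", -15.0),
    ("Team Y", "Starter", 10.0),
]
print(sum_by_team(rows))
```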

The spreadsheet above provided the context for our Kennedy-over-Nielsen argument: Chivas USA defenders were allowing just under six shots on target per game, twice what Nielsen faced. Let's use Tom Hanks movies as metaphors. A day in the life of Kennedy is like the Normandy scene in the opening of Saving Private Ryan. Nielsen is closer to the path of the fallen feather in Forrest Gump. Are these references completely dated? (Ed. note: yes).

In the end, instead of being a soccer equivalent of FIP, it appears PR actually has more utility as a VORP or DER substitute. Now if you'll excuse me, I'm off to enjoy some alphabet soup.

What do you think? Leave a comment below!