Common terms in modern NFL analytics

Learn about the statistical categories and terms used in modern football analytics

Tampa Bay Buccaneers v Los Angeles Rams Photo by John McCoy/Getty Images

This is a reference to terms used in my Stacking the Box Score series and other statistical analyses.


Average depth of target (ADOT)

This metric quantifies how far a quarterback’s average pass is thrown down the field. It is measured vertically (straight north/south) from the line of scrimmage to where the receiver catches (or doesn’t catch) the ball. Obvious throwaways are recorded as 0 air yards, to prevent a QB from gaining credit for just throwing the ball out of bounds 20 yards downfield. This metric is a useful way to quantify and compare the gunslingers against the check-down artists.
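
As a rough illustration of the definition above, here is a minimal Python sketch. The PassAttempt class, its field names and the sample passes are all hypothetical; real play-by-play data will label these columns differently.

```python
from dataclasses import dataclass


@dataclass
class PassAttempt:
    # Hypothetical record: vertical distance from the line of scrimmage to the target.
    air_yards: float
    is_throwaway: bool = False


def average_depth_of_target(attempts):
    """Mean air yards per attempt, counting obvious throwaways as 0 air yards."""
    if not attempts:
        raise ValueError("need at least one pass attempt")
    depths = [0.0 if a.is_throwaway else a.air_yards for a in attempts]
    return sum(depths) / len(depths)


# Made-up sample: two normal throws, one throwaway, one screen behind the line.
attempts = [
    PassAttempt(air_yards=3.0),
    PassAttempt(air_yards=12.0),
    PassAttempt(air_yards=20.0, is_throwaway=True),  # counted as 0, not 20
    PassAttempt(air_yards=-1.0),
]
print(average_depth_of_target(attempts))  # 3.5
```

Without the throwaway rule, the same four passes would average 8.5 yards, which is exactly the kind of inflation the rule is meant to prevent.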


Completion percentage over expectation (CPOE)

CPOE is perhaps best known as a statistic calculated by Next Gen Stats. Their version uses proprietary tracking data to estimate the probability that each pass is completed at the moment it is thrown, averages those expected probabilities, and then subtracts that average from the player's actual completion percentage for each game.

Hence, completion percentage over expectation.
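
The arithmetic is simple once each pass has an expected completion probability attached. Here is a minimal sketch in Python; the function name and the four sample passes are made up for illustration, and the probabilities would come from whatever model you trust.

```python
def cpoe(passes):
    """Completion percentage over expectation, in percentage points.

    passes: list of (completed, expected_completion_probability) pairs.
    """
    if not passes:
        raise ValueError("need at least one pass")
    n = len(passes)
    actual = sum(1 for completed, _ in passes if completed) / n
    expected = sum(prob for _, prob in passes) / n
    return 100.0 * (actual - expected)


# Hypothetical game: three completions on four attempts.
sample = [(True, 0.80), (True, 0.40), (False, 0.60), (True, 0.70)]
print(f"{cpoe(sample):+.1f}")  # +12.5
```

Here the quarterback completed 75% of his passes when the model expected 62.5%, so his CPOE is +12.5 percentage points.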

But Next Gen Stats’ data only goes back a few years — and the public has access to their data only on a week-by-week basis, rather than play-by-play. That makes it impossible for us to find the expected completion percentage on a specific play.

But in an article on the popular data/statistics journalism site FiveThirtyEight.com, writer and analyst Josh Hermsmeyer showed that by just using the depth of target of a pass and its location (left, middle, and right — which is available publicly in the play-by-play data set), we can pretty accurately predict whether a particular pass will be completed.
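
To give a feel for how depth and location alone can produce an expected completion probability, here is a deliberately crude sketch. It is not Hermsmeyer's model: it simply groups historical passes into depth-by-location buckets and uses each bucket's completion rate. The bucket boundaries, function names and sample history are all my own assumptions.

```python
from collections import defaultdict


def depth_bucket(air_yards):
    # Hypothetical cut points; a real model would likely treat depth continuously.
    if air_yards < 0:
        return "behind_los"
    if air_yards < 10:
        return "short"
    if air_yards < 20:
        return "intermediate"
    return "deep"


def fit_expected_completion(history):
    """history: iterable of (air_yards, location, completed) tuples."""
    counts = defaultdict(lambda: [0, 0])  # bucket -> [completions, attempts]
    for air_yards, location, completed in history:
        cell = counts[(depth_bucket(air_yards), location)]
        cell[0] += int(completed)
        cell[1] += 1
    return {key: comp / att for key, (comp, att) in counts.items()}


def expected_completion(model, air_yards, location):
    # Raises KeyError for a bucket that never appeared in the history.
    return model[(depth_bucket(air_yards), location)]


# Made-up history, far too small for real use.
history = [
    (5, "left", True), (7, "left", True), (3, "left", False), (8, "left", True),
    (25, "middle", False), (30, "middle", True),
]
model = fit_expected_completion(history)
print(expected_completion(model, 6, "left"))     # 0.75
print(expected_completion(model, 28, "middle"))  # 0.5
```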

Hermsmeyer argued that this is the best metric for predicting which quarterbacks will succeed in the NFL. Since then, economist and writer Ben Baldwin has popularized using the metric to evaluate quarterback performance — even finding that the metric using only publicly available play-by-play data correlates very strongly (r^2=0.88, for those curious) with the Next Gen Stats metric that uses their private tracking data.

So this public CPOE metric is a very valuable tool that can show whether a quarterback is performing well in a given game — or even on a given pass.


Expected points added (EPA)

EPA is a metric showing how many expected points a team added on a given play.

Here’s the short version: it captures how much a play improved a team’s scoring chances (the true value of the play) by accounting for the context in which it occurred.

The longer version?

EPA is calculated by first estimating how likely each possible next score of the game is (a touchdown, field goal or safety, by either team) based on a variety of factors that include field position, down, yards-to-go, score differential and time remaining in the game.
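
One common way to collapse those next-score probabilities into a single expected points number is a probability-weighted sum of point values. The sketch below assumes that approach; the point values (7 for a touchdown, which folds in the extra point) and the sample probabilities are illustrative assumptions, not the output of any real model.

```python
# Assumed point values; scores by the opponent count against the offense.
POINT_VALUES = {
    "touchdown": 7.0,
    "field_goal": 3.0,
    "safety": 2.0,
    "opp_touchdown": -7.0,
    "opp_field_goal": -3.0,
    "opp_safety": -2.0,
    "no_score": 0.0,
}


def expected_points(next_score_probs):
    """Probability-weighted value of the next score, from the offense's view."""
    total = sum(next_score_probs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(POINT_VALUES[outcome] * p for outcome, p in next_score_probs.items())


# Hypothetical probabilities for a drive starting near midfield.
probs = {
    "touchdown": 0.40,
    "field_goal": 0.25,
    "opp_touchdown": 0.15,
    "opp_field_goal": 0.10,
    "no_score": 0.10,
}
print(round(expected_points(probs), 2))  # 2.2
```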

For example, if a team has first-and-goal at the opposing team’s 1-yard line, the model will predict an expected points value near 7; it’s very likely that the team will score a touchdown. So if a player in that situation then runs the ball in for a touchdown, the expected points added (EPA) won’t be very large.

On the other hand, if a team is facing fourth-and-19 at its own 20-yard line, the expected points could be negative in that situation; it’s more likely the other team will score next. So if a quarterback throws a deep bomb for a touchdown, that play will have a very high EPA value.
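
Put as arithmetic, EPA is just the expected points after the play minus the expected points before it. The sketch below reruns the two scenarios above with hypothetical expected points values (6.5 at the goal line, -1.5 on fourth-and-19) and assumes a touchdown is worth 7.

```python
def expected_points_added(ep_before, ep_after):
    """EPA: change in expected points caused by a single play."""
    return ep_after - ep_before


TOUCHDOWN_VALUE = 7.0  # assumption: touchdown plus extra point

# Hypothetical pre-play expected points for the two situations described above.
goal_line_run = expected_points_added(ep_before=6.5, ep_after=TOUCHDOWN_VALUE)
deep_bomb = expected_points_added(ep_before=-1.5, ep_after=TOUCHDOWN_VALUE)

print(goal_line_run)  # 0.5
print(deep_bomb)      # 8.5
```

Both plays put the same points on the scoreboard, but the deep bomb is credited with 17 times as much EPA because of where the offense started.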


Play-calling tendencies

When analyzing a team’s play-calling tendencies, it is important to separate early-down, neutral game-script plays from the rest of the game.

Why?

Any team that is winning by a large margin is going to run the clock out, and any team that is way behind is going to pass. Furthermore, third- and fourth-down play-calling is largely dictated by the yards needed to gain the first down.

So instead of looking at run versus pass percentages over the course of the whole game, we only look at first and second down when a team’s win probability is between 20% and 80%.
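
In code, that filter is a couple of conditions. This is a minimal sketch assuming each play is a dictionary with hypothetical keys "down", "wp" (the offense's win probability before the snap) and "play_type"; the sample plays are invented.

```python
def neutral_early_down_pass_rate(plays, wp_low=0.20, wp_high=0.80):
    """Share of passes among first- and second-down plays in a neutral game script."""
    sample = [
        p for p in plays
        if p["down"] in (1, 2)
        and wp_low <= p["wp"] <= wp_high
        and p["play_type"] in ("run", "pass")
    ]
    if not sample:
        return None  # no qualifying plays
    return sum(p["play_type"] == "pass" for p in sample) / len(sample)


plays = [
    {"down": 1, "wp": 0.55, "play_type": "pass"},
    {"down": 2, "wp": 0.60, "play_type": "run"},
    {"down": 1, "wp": 0.45, "play_type": "pass"},
    {"down": 3, "wp": 0.50, "play_type": "pass"},  # excluded: third down
    {"down": 1, "wp": 0.95, "play_type": "run"},   # excluded: game is out of reach
    {"down": 2, "wp": 0.30, "play_type": "pass"},
]
print(neutral_early_down_pass_rate(plays))  # 0.75
```

Over all six plays the pass rate would be about 67%, but in the four plays that actually reflect the coach's preferences it is 75%.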


Success rate

A play is a success if it has a positive EPA value. So if a team has a 60% success rate, that means 60% of its plays had a positive EPA. For example, a run of two yards on first-and-10 is often not a success; a team tends to be less likely to score after such a play.
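
As a quick sketch of that definition (the function name and the five EPA values are made up):

```python
def success_rate(epa_values):
    """Fraction of plays with a strictly positive EPA."""
    if not epa_values:
        raise ValueError("need at least one play")
    return sum(1 for epa in epa_values if epa > 0) / len(epa_values)


# Three of these five hypothetical plays gained expected points.
print(success_rate([0.8, -0.3, 1.2, -0.1, 0.4]))  # 0.6
```

Note that success rate ignores how big each gain or loss was, so it is best read alongside average EPA rather than in place of it.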


Win probability

There’s something very important to note about the win probability model I will be using for this series. It does not take into account the strength of either team in the match-up. Both teams begin the game with a 50% chance to win. All the model is doing is looking back over the past decade and asking how often, historically, a team in the same situation (current score, possession, time remaining, home field, etc.) has ended up winning. If you want a true estimate of how likely the Chiefs are to win the game, just check the Vegas odds before the game and at halftime.
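
To make the "look back at history" idea concrete, here is a toy lookup-table version in Python. It is only a stand-in for a real win probability model, which would be fit statistically on far more data and more inputs; the bucketing scheme, function names and sample games are all hypothetical.

```python
from collections import defaultdict


def state_key(score_diff, seconds_left, has_ball):
    # Hypothetical coarse buckets: cap the margin at 21 points either way
    # and group the clock into 5-minute bins.
    capped = max(-21, min(21, score_diff))
    return (capped, seconds_left // 300, has_ball)


def fit_win_probability(history):
    """history: iterable of (score_diff, seconds_left, has_ball, won) tuples."""
    tallies = defaultdict(lambda: [0, 0])  # state -> [wins, observations]
    for score_diff, seconds_left, has_ball, won in history:
        tally = tallies[state_key(score_diff, seconds_left, has_ball)]
        tally[0] += int(won)
        tally[1] += 1
    return {key: wins / seen for key, (wins, seen) in tallies.items()}


# Made-up history: four teams led by 3 with the ball and 10 to 15 minutes left.
history = [
    (3, 600, True, True),
    (3, 650, True, True),
    (3, 700, True, False),
    (3, 890, True, True),
    (-7, 100, False, False),
]
model = fit_win_probability(history)
print(model[state_key(3, 720, True)])  # 0.75
```

Nothing in the lookup knows which team is on the field, which is why the model treats every team identically.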

So, if this model does not perfectly predict who will win, what good is it? Well, it’s an excellent way to contextualize just how important a play was. We can say that before Mahomes did _________, the Chiefs had a 30% chance of winning, and after, they had a 70% chance. Thus, we can say that the amazing thing Mahomes just did added 40 percentage points to the Chiefs’ win probability. This change is known as Win Probability Added (WPA). This makes our model very useful because it treats all teams equally and thus lets us compare plays across games to see which had the biggest influence on the game.
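
Here is that calculation as a short Python sketch, along with one way to rank plays by how much they swung the game. The play labels and win probabilities are hypothetical.

```python
def win_probability_added(wp_before, wp_after):
    """WPA: change in a team's win probability caused by a single play."""
    return wp_after - wp_before


# Hypothetical plays: (label, win probability before, win probability after).
plays = [
    ("Play A", 0.30, 0.70),
    ("Play B", 0.70, 0.74),
    ("Play C", 0.74, 0.52),
]

# Rank by the size of the swing, whether it helped or hurt.
ranked = sorted(
    plays,
    key=lambda p: abs(win_probability_added(p[1], p[2])),
    reverse=True,
)
for label, before, after in ranked:
    print(f"{label}: {win_probability_added(before, after):+.2f}")
```

Sorting by absolute value keeps disastrous plays (like Play C here) near the top of the list alongside the heroic ones.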