Authors: Judah Fortgang and Tej Seth

One of the more difficult tasks every offseason is projecting a team’s outlook for the upcoming year. One approach, we think, is to establish a baseline of key metrics from last season, which lets us compare teams at each stage of the coming season to teams from last year. Offensive tendencies, often rooted in a coach’s philosophy, are relatively stable year-over-year and within a season, barring key coaching and personnel changes, of course. By tendencies, we mean how a team tends to structure its offensive attack. Combining our own priors with an understanding of how offenses operate could be a powerful tool both for projections and, during the season, for separating signal from noise in team performance.

To accomplish this, we ran a cluster analysis on 2020 data to learn both how offenses attacked and how well they executed. Here are the variables:

  • PROE = Pass Rate Over Expected
  • EXPLP = Explosive Play Rate (rate of 20-yard gains)
  • EXPA = Explosive Play Attempt Rate (air_yards + xyac >= 20)
  • QBSR = QB Scramble Rate
  • EDSR = Early Down Success Rate
  • LDSR = Late Down Success Rate
  • TAL = Third and Long Rate
  • FDR = First Down Rate
  • QBSKR = QB Sack Rate
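
The article does not say which clustering algorithm or preprocessing was used. As a minimal sketch, assuming k-means on z-scored versions of the nine variables above with six clusters, the grouping step might look like the following. The team names and feature values are random placeholders, not real 2020 data, and the helper names (`standardize`, `kmeans`) are hypothetical.

```python
import random
from statistics import mean, pstdev

FEATURES = ["PROE", "EXPLP", "EXPA", "QBSR", "EDSR", "LDSR", "TAL", "FDR", "QBSKR"]


def standardize(rows):
    """Z-score each column so no single rate dominates the distance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against zero variance
    return [[(x - m) / s for x, m, s in zip(r, mus, sds)] for r in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100, seed=0):
    """Plain k-means (assumed method): returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen teams
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [mean(c) for c in zip(*members)]
    return labels, centroids


# Placeholder input: 32 "teams" with random feature values (NOT real data).
data_rng = random.Random(42)
teams = [f"TEAM_{i:02d}" for i in range(32)]
raw = [[data_rng.random() for _ in FEATURES] for _ in teams]

z = standardize(raw)
labels, centroids = kmeans(z, k=6)
clusters = {}
for team, lab in zip(teams, labels):
    clusters.setdefault(lab, []).append(team)
```

Standardizing first matters here because the variables live on different scales; without it, the rate with the largest spread would drive most of the distance calculation.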

Our goal with these variables is twofold:

  1. We want to capture each team’s approach without including results-based metrics, i.e., the process behind “how” they attack. These variables include EXPA, PROE, QBSR, etc.
  2. We also want to include metrics that capture the results: how efficient teams were given their choice of attack. These include EXPLP, ED/LD SR, etc.

Here are each cluster’s features:

Cluster 1: “Maybe Next Year”

  • High 3rd & Long Rate
  • Low Success Rates
  • Example Team: Denver Broncos

Cluster 2: “Balanced Attack”

  • Average Explosive Plays 
  • Average Success Rates
  • Example Team: Arizona Cardinals

Cluster 3: “Embrace the Variance”

  • High Sack Rates
  • High Late Down Success Rate
  • Example Team: Seattle Seahawks

Cluster 4:  “March and Avoid Negative Plays”

  • Low Pass Rate Over Expected
  • High 1st Down Rate
  • Example Team: Tennessee Titans

Cluster 5: “The Model Offenses”

  • High Pass Rate Over Expected
  • High Explosive Play Attempt Rate and Explosive Play Rate
  • Example Team: Kansas City Chiefs

Cluster 6: “Inefficient Offenses”

  • Low Success Rates
  • High 3rd & Long Rate
  • Example Team: Pittsburgh Steelers

And here are the teams in their clusters: 

Cluster 1 contains the teams that achieved little offensive success last season, with poor QB play across the board. Teams in cluster 2 did everything at about a league-average clip in both approach and execution. In cluster 3 we have solid, aggressive offenses whose offensive lines held them back ever so slightly. In cluster 4 we have run-heavy teams whose philosophy was to avoid negative plays by remaining conservative, efficiently marching up the field and stockpiling first downs. In cluster 5 we have the juggernauts, passing often and aggressively and doing so efficiently. With such strong passing games, it is no coincidence these teams were the conference finalists. In cluster 6 we have the wannabes of cluster 5, passing often and downfield but with poor efficiency.

Here is the EPA per play of the teams in each cluster:

With this predictive mindset, we think these clusters can be seen as both archetypes and ranges of outcomes for a given team. For example, the 2021 Rams with Stafford are likely to be a more efficient offense. And early in the season, we can likely tell which group of 2020 teams their offensive approach most resembles. Will McVay remain balanced, as he was last year? Or will he entertain a cluster 5 type offense with his new QB? It shouldn’t take much time to know.

The analysis can also help show where a team with a wide range of outcomes, like the Jets, will fall. Will they try to remain balanced with their rookie QB and end up like a cluster 2 (or 1) team? Or will they let Zach Wilson loose and look more like a cluster 6 or 3 team? With these clusters, the applications of such predictions are nearly endless.
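
One way to make the in-season comparison described above concrete is a nearest-centroid lookup: scale a team’s early-season numbers with the 2020 means and standard deviations, then find the closest 2020 cluster center. This is only a sketch under assumptions. It presumes the 2020 scaling parameters and centroids were saved, it uses two features for brevity, and every number below is invented for illustration. The function name `assign_to_cluster` is hypothetical.

```python
import math


def assign_to_cluster(team_vec, mus, sds, centroids):
    """Return (cluster_id, distance) for the nearest 2020 centroid.

    team_vec  : raw feature values for one team, in training order
    mus, sds  : per-feature mean and std from the 2020 fit (assumed saved)
    centroids : dict mapping cluster id -> centroid in standardized space
    """
    # Scale with the 2020 parameters so the new team is on the same footing.
    z = [(x - m) / s for x, m, s in zip(team_vec, mus, sds)]
    best = min(centroids, key=lambda c: math.dist(z, centroids[c]))
    return best, math.dist(z, centroids[best])


# Invented toy values for two features (think PROE and EXPA); not real data.
mus = [0.0, 0.10]
sds = [0.04, 0.02]
centroids = {2: [0.0, 0.0], 5: [1.5, 1.5], 6: [1.2, 0.8]}

early_season_team = [0.05, 0.13]  # placeholder for a few weeks of 2021 play
cluster, dist = assign_to_cluster(early_season_team, mus, sds, centroids)
print(f"closest 2020 archetype: cluster {cluster} (distance {dist:.2f})")
```

The distance is worth reporting alongside the label: a team sitting far from every centroid does not fit any 2020 archetype well, and a small sample early in the season can move the assignment from week to week.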