Pigskin Principia, Vol. II: Evidence and Expertise in Football

This post is an updated version of an article by the same author published on Football Outsiders.

There is a great debate in the football world over which manner of analysis best captures performance and allows teams to exploit key edges in their pursuit of victory. At the heart of the issue are strong disagreements between those analyzing football traditionally via tape and those using modern analytical methods. These disagreements are especially present in the public sphere as there are fewer incentives for cooperation. They stem mostly from a failure to communicate, the source of which is not immediately obvious.

Discussions on the nature of football, besides the disagreements based on opinion, fall within multiple camps. Whether you're a philosopher of tape or an acolyte of the empirical, it's easy to see there is a communications gulf that requires resolution. Arguments about the value of certain positions in the NFL, establishing the run, or any other topic debated ad nauseam are not addressed here, though they are borrowed from to illustrate a larger point. There are many ways to maximize football performance, but the specific problem of resolving disagreements between the different camps is a bit abstract. First consider football through the eyes of the General Manager.

General Manager

The General Manager is the conductor of the entire orchestra that is an NFL team. The GM’s day can start out and end with coordinating with the head coach; they spend their time checking, rechecking, and checking again the depth chart, asking the cap and contract staff ‘what ifs,’ going back to player personnel for more player lists, and increasingly asking for information from analytics.

But how exactly does one balance these very different sources of information in their decision-making process? When scouting and analytics give conflicting predictions on a player, what is the balance between these two projections?
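One common statistical answer to that question, offered here only as a sketch, is inverse-variance weighting: each projection counts in proportion to how reliable it has historically been. The function name, the grade scale, and every number below are hypothetical, and the approach assumes the two projections are roughly independent and unbiased, which real scouting and model grades may not be.

```python
def blend_projections(scout_grade, scout_sd, model_grade, model_sd):
    """Blend two projections of the same player by inverse-variance weighting.

    scout_sd and model_sd are assumed estimates of each source's typical
    error (standard deviation) on the same grading scale.
    """
    if scout_sd <= 0 or model_sd <= 0:
        raise ValueError("standard deviations must be positive")
    # A source with a smaller error gets a larger weight.
    w_scout = 1.0 / scout_sd ** 2
    w_model = 1.0 / model_sd ** 2
    blended = (w_scout * scout_grade + w_model * model_grade) / (w_scout + w_model)
    # Uncertainty of the blended estimate under the independence assumption.
    blended_sd = (w_scout + w_model) ** -0.5
    return blended, blended_sd


# Hypothetical example: the scout grades a player 7.0 (error 1.0),
# the model grades him 5.0 (error 2.0).
blended, blended_sd = blend_projections(7.0, 1.0, 5.0, 2.0)
print(round(blended, 2), round(blended_sd, 2))
```

The blended grade lands closer to whichever source has the smaller historical error, so the hard organizational work is estimating those error sizes honestly for each department.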

Traditionally, GMs come up from the scouting departments and have a deep familiarity with the evaluation practices of their team even if they came from a different organization. Even those who arrive via alternative routes tend to come from cap and contract and have primarily coordinated with Player Personnel rather than Analytics departments.

The center of the GM’s focus in many cases is the depth chart. How are our players performing? Does the head coach have any concerns? What holes or weaknesses can we address (Pro Personnel)? When and how should we address them (AGM)? Can we afford to address them at all (Cap and Contract)? Are our players healthy physically and mentally?

The GM is managing an inventory system of players who have all of the complications and nuances that all humans do, with the addition of being asked to perform superhuman feats of speed and strength on a weekly basis. Selecting these players at a high level takes an enormous amount of skill, and perhaps the reason we struggle to identify GMs who are consistently better than average is the high level of skill involved in the first place.

The Coach

While GMs are team maestros, head coaches in football are much like philosopher kings: each rules over his team's scheme and personnel usage with a level of authority seldom seen in the modern world. Their jobs demand not just an intense, personal, and deep love of football, but also the leadership skills required to mold their players into the best versions of themselves that fit within the coach's vision. This is no easy task, so head coaches tend to develop a philosopher's worldview with respect to football. They are seekers of truth and the deep causal roots of all aspects of football itself. To many coaches, the "why" holds a deep philosophical meaning. To know that something works is not sufficient, because it's the "why" that unlocks the type of creative genius needed to innovate at the play level.

Consider some of the aspects that go into the design or implementation of a particular play: the skill set of the current roster, the likely opponent actions, the desired movement of all 11 players on the team. Even body position and foot placement can disguise intent. Coaches do not simply accept that something works. They attempt to provide a deep and powerful explanatory framework from which they derive inspiration, meaning, and vision.

Why a play works, and the context in which it works, can help shape a coach's worldview and intuition, which are both key to ultimate success. In some cases, the vision itself is powerful, but the ability to generate buy-in and direct a team towards that vision is both difficult to do and difficult to measure. Proponents of a coach such as former Seahawks head coach Pete Carroll would argue this pursuit of a team vision is part of what makes him a great coach. Scouts view and evaluate players similarly, as their work molds the view of coaches and decision-makers on everything from future opponents to prospects in upcoming drafts who fit the current team schema.

When viewing the conduct of play in football through a philosopher's lens, the obsession with the deep causal roots of the performance of a particular player, a play type, or a position requires an incredible amount of in-depth knowledge. Modern coaches are no less methodical than quantitative analysts, and they certainly offer a depth and breadth of knowledge of football far beyond what someone from the analytics community can offer.

The Scout

Scouts, unlike coaches, are generally less concerned with scheme, and in some scouting circles deeper discussions of scheme are met with mockery and derision. Personnel people pride themselves on not only observing what a player has done but divining the inherent traits of a player and their potential for the future. This focus on traits attempts to answer the question of what a player can bring when their surroundings change, an obsessive pursuit of context to explain what a player is and what they could become.

This is especially true when scouting college players, whose games can display disparities in talent not seen in the NFL during the time that most fans have been alive. This leads scouts to keep production in mind, but at arm’s length. Production, often described as “box score stats,” can lie to you. This foundational distrust is why many scouts (and GMs) can be hesitant when it comes to analytics. In some cases, analytic measures contextualize production rather than measure a trait, which is met with skepticism. In others, analytics may aggregate data from sources that scouts don’t trust, another source of apprehension and a barrier to adoption.

Yet scouting grades are often foundationally based on at least assessing the minimum amount of athleticism a player possesses and contextualizing their performance by the level of difficulty of their competition. These are two tasks that are particularly well suited to analytics. In many ways, the keys to adoption lie not just in providing an alternative way of grading players but in making traditional scouting easier, more accurate, and less susceptible to biases.
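As a rough illustration of contextualizing performance by level of competition, the sketch below compares what a player produced with what his opponents typically allow. The function name, the choice of yards as the unit, and all figures are hypothetical; a real adjustment would also have to account for scheme, teammates, and sample size.

```python
def opponent_adjusted_ratio(games):
    """Return actual production divided by opponent-expected production.

    games: list of (player_yards, opponent_avg_yards_allowed) pairs, where
    the second value is an assumed estimate of what that opponent usually
    allows to a player in the same role.
    A ratio above 1.0 means the player outproduced what his schedule
    would predict; below 1.0 means he underproduced.
    """
    if not games:
        raise ValueError("at least one game is required")
    total_actual = sum(yards for yards, _ in games)
    total_expected = sum(allowed for _, allowed in games)
    if total_expected <= 0:
        raise ValueError("expected production must be positive")
    return total_actual / total_expected


# Hypothetical three-game sample against a weak, an average, and a strong defense.
sample = [(120, 80.0), (60, 60.0), (90, 100.0)]
ratio = opponent_adjusted_ratio(sample)
print(ratio)
```

A measure like this does not replace a scouting grade; it gives the scout a consistent baseline for the "who was it against" question across hundreds of prospects.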

What a player was asked to do, who they were asked to do it against, and how consistently they can do it are key questions for a scout because they view traits as more inherent to the player than measures of production. So, they laser focus on the traits, estimate their importance, refer to historical examples and comparisons, and paint a picture of who the player is for the GM.

Scouting grades are typically a snapshot in time for NFL players or a projection of those coming up in the draft. But more than that, they’re an attempt to boil an ocean of information into a small enough sip for the GM and head coach to understand who and what they are getting out of the more than 2100 NFL players or 250+ draftable prospects per year.

Scouting grades are their own model, no less precise than any mathematical one, but one that is a black box in many ways. Production, character, mental processing, traits, size requirements, and more are synthesized, compared, processed and boiled down into a bin, color, or number. The term “blue-chip player” is the output of a model relying on thousands of hours of expertise passed down through generations of scouts over 100 years.

The Analyst's View

While coaches, scouts, and GMs are indeed the foremost experts of their craft, it is also true that some transformational changes in football are driven not by experts, but by "outsiders." For example, fourth-down decision-making is just now reaching a point where risk aversion is the exception and acceptance is the norm even though David Romer's seminal paper on fourth downs was published over 15 years ago. Some have argued that football analytics has a tone problem, but the conflict over tone is simply a symptom of the true disconnect, the perception of evidence itself.
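To make the fourth-down example concrete, the sketch below compares the expected value of going for it with the value of kicking and solves for the break-even conversion probability. This is a simplified expected-points comparison with made-up inputs, not Romer's actual estimation method or data; the function names and every number are assumptions for illustration.

```python
def go_for_it_value(p_convert, ep_success, ep_fail):
    """Expected points from attempting the conversion."""
    return p_convert * ep_success + (1.0 - p_convert) * ep_fail


def break_even_probability(ep_success, ep_fail, ep_kick):
    """Conversion probability at which going for it equals kicking."""
    if ep_success <= ep_fail:
        raise ValueError("success must be worth more than failure")
    return (ep_kick - ep_fail) / (ep_success - ep_fail)


# Hypothetical expected-point values for one game situation:
# converting is worth 2.0, failing is worth -1.0, kicking is worth 0.2.
ev_go = go_for_it_value(0.5, 2.0, -1.0)
threshold = break_even_probability(2.0, -1.0, 0.2)
print(ev_go, threshold)
```

Under these assumed values, a team that converts half the time gains more by going for it than by kicking, and the attempt stays worthwhile for any conversion rate above the threshold.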

Those who work with sports data tend to have a much different view on the nature of football, which things are important, and why they are important. Instead of a search for universal football truth, analysts are, in my opinion, more generally motivated by how Michael Strevens describes scientists in his 2020 book The Knowledge Machine: How Irrationality Created Modern Science. In his work, Strevens argues that what makes science work is what he calls "the iron rule of explanation." The iron rule boils down to two simple precepts. First, that scientists must uncover and generate evidence to argue with and second, that the only thing that "counts" in a scientific debate is that which we have evidence for.

This stands in sharp contrast to how a GM approaches the evaluation of a football player. Shallow causal inference certainly produces results, but only from things we know how to measure, and it doesn't necessarily provide us with a deep understanding of what we're observing. Isaac Newton, while harangued by his contemporaries on this lack of deep understanding about gravity, had this to say: "It is enough that gravity really exists and acts according to the laws that we have set forth and is sufficient to explain all the motions of the heavenly bodies and our sea."

The key distinction opens up much more of a gap than it would first appear. For a coach seeking to maximize his player's performance or design a new play, accounting for only what we can directly measure seems absurd. Likewise, an analyst may feel as though a scout is focusing on tendencies, patterns, and behaviors in players with no discernible benefit. Which of these two viewpoints is correct? The answer is, unfortunately, both.

Evidence is a matter of perspective, and our very perception of what is and is not evidence is what holds back a fuller, deeper understanding of the beautiful and grotesque game of football. One of the difficulties with an analytical approach is that, unlike the motion of heavenly bodies, the rules and "laws" of football are subject to the whims and machinations of man.

Historical Examples

Consider the play action pass, which has a history going back all the way to the 1930s. Play action is not a new concept. Yet, if you analyze play action from when it became a staple in the 1960s to today, you might end up with vastly different estimations of its usefulness even if its effectiveness has remained largely unchanged in the modern era. This provides a look at both sides of our evidence argument.

Before the proliferation of play action, there were numerous theories about why play action worked, many of which eventually proved unfounded. So, while play action was born out of coaching creativity, the theories that rushing production or rushing attempts drive play action success turned out not to be supported by evidence.

Alternatively, while analytics has shed light on various parts of play action, it would be difficult for the analytics community to conceive a new approach to play action on its own. Even more compelling is the possibility of some adaptation in play calling or design that fundamentally changes the balance of power with regards to play action or some other aspect of football. If the linebackers' first intuition through sufficient training was to key in on stopping the pass first rather than defaulting towards their run fits, would we not then see a fundamental change in the effectiveness of not just play action but the running game itself?

As Lamar Jackson took the league by storm in 2019, the rules and preconceptions about reliance on quarterback rushing were turned on their head. As more spread offense concepts work their way from college to the NFL and mobile quarterbacks such as Jackson and Josh Allen find success, the rules shift under our feet a bit.

It's not that data couldn't tell us quarterback rushing was valuable before (going all the way back to Fran Tarkenton), but extrapolation from one style or archetype to another is dangerous, and anyone comparing Jackson's skill set to Cam Newton's would rightly be strongly questioned, if not flatly ignored.

Paradigm shifts within the game are often the result of coaching vision in the face of opposition, yet that same coaching vision can lead a team to languish behind antiquated ideas and disproven theories. Additionally, there is no global optimum to solve for; one cannot simply draft a Patrick Mahomes every year. The trends are mutable, the extrapolation is perilous, and conventional football wisdom can produce explanations which fail simple scrutiny.

Working Together

This is not meant to be football nihilism, but it's important to note that neither analytical prowess nor coaching genius alone is likely to dominate the NFL. This makes communication between the camps driven by shallow causal explanation and deep philosophical understanding all the more important. The edges may be small, but they are exploitable, and they accumulate.

There is a benefit to closing the communications gap. There are certainly organizational designs and schemes that help bring the two worlds together through pairing analysts with scouts, rapid feedback between sides, keeping theories falsifiable when possible, and starting analysts off with measuring things normally measured by hand. Setting expectations for realistic feedback is the only means for improving our collective football knowledge.

It would be unreasonable to expect a practitioner of analytics to be able to identify, let alone conceptually explain, Drop-8 Inverted-Cover-1 Double-Rat, just as it would be unreasonable to ask an offensive coordinator to describe how a 2D convolutional neural network can be used to process Next Gen Stats data and produce predictions of future performance.

They are radically different ways of looking at the same problems and it's very likely impossible to be an expert in both fields. People spend their entire lives studying football or data science never to truly master either discipline. But most important is that your film or data contemporary likely thinks in a way structurally different from you, and that difference alters the very nature of what you and they consider valid evidence, feedback, and criticism.

As previously mentioned, one of the difficulties from the GM perspective is not just balancing new sources of information but managing disagreements between data sources. When scouts and analytics disagree, that disagreement needs to be managed. Have the quantitative analysts really modeled the true behavior? Are the scouts focused on something that either doesn't matter or perhaps used to matter more than it does now? Every source of information is subject to its own biases, but knowing where those biases exist and how they interact is key to managing not just the interaction between departments but the very synthesis of the information provided.

Shared understanding will allow the most forward-thinking coaches, scouts, general managers, and organizations to leverage analysts from outside their field to discover unexploited edges not yet conceived. While in some cases analysts may be measuring the wrong thing, it's also possible that we can enhance our coach or general manager's deep causal framework of football by breaking down these relationships into smaller, more testable pieces with a focus on explainability. In doing so, we can marry our football philosophy with specific testing through a systems-based approach.

For GMs, coaches, scouts, and other practitioners relying on subject matter expertise, an openness to unlearn can be critical. Realizing that conventional wisdom can be wrong, or that a correlation is much weaker than we believed, is often harder to accept than the opposite finding that a relationship is stronger than expected.

Team building is more than just the selection of players in the draft or management of the roster during the season. Team building relies on a foundational trust that everyone involved in a process is bought in and working together in good faith. There can be no room for undermining one another if a team wishes to achieve the highest levels of success on the field; the same is true of the members of the front office.

Intellectual openness is the only path forward to exploiting the full strength of analytics while at the same time empowering the creativity of team builders. Openness sounds simple enough until a disagreement occurs and the ties of teamwork are tested. It should be enough to know that performance edges are possible as any sufficiently competitive person would do whatever it takes to win.

As we do not live in such a world, several edges remain for those who are willing to have their beliefs challenged and are brave enough to change their minds when presented with new evidence. It can be hard work to translate narrow findings to an overall football framework, just as it's difficult to change preconceptions rooted in decades of conventional wisdom.

Real change in public discourse will necessarily lag behind the league in this area, because inside a team analysts and scouts are teammates rather than adversaries, but strong leadership is still required at the team level to drive change. Real leadership demands the humility to accept the possibility that what we think we know is wrong, and the boldness to act on it.

That work is hard. It is worth it for those who want to win.

Jonathan Casillas