When Data Lies: Finding Optimal Strategies for Penalty Kicks with Game Theory

Penalties are among the most decisive and high-pressure moments in football. A single kick, with only the goalkeeper to beat, can determine the outcome of an entire match or even a championship. From a data science perspective, they offer something even more interesting: a uniquely controlled environment for studying decision-making under strategic uncertainty. Unlike open play, penalty kicks feature a fixed distance, a single kicker, one goalkeeper, and a limited set of clearly defined actions. This simplicity makes them an ideal setting for understanding how data and strategy interact.

Suppose we want to answer a seemingly simple question: Where should a kicker shoot to maximize the probability of scoring? At first glance, looking at historical data seems to be sufficient to answer this question. As we will see, however, relying solely on raw statistics can lead to misleading conclusions. When outcomes depend on strategic interactions, optimal decisions cannot be inferred from averages alone.

By the end of this article, we will see why the most successful strategy to kick a penalty is not the one suggested by raw data, how game theory explains this apparent paradox, and how similar reasoning applies to many real-world problems involving competition and strategic behavior.

Contents

The Pitfall of Raw Conversion Rates
Formalizing Penalties as a Zero-Sum Game
A Toy Model
Equilibrium Strategies
Learning from Real-World Data
Are Players Actually Optimal?
Beyond Football: A Data Science Perspective
Conclusions

The Pitfall of Raw Conversion Rates

Imagine having access to a dataset containing many historical observations of penalty kicks. A natural first quantity we might think of measuring is the scoring rate associated with each shooting direction.

Suppose we discover that penalties aimed at the center are converted more often than those aimed at the sides. The conclusion might seem obvious: kickers should always aim at the center.

The hidden assumption behind this reasoning is that the goalkeeper’s behavior remains unchanged. In reality, however, penalties are not independent decisions. They are strategic interactions in which both players continuously adapt to each other.

If kickers suddenly started aiming centrally every time, goalkeepers would quickly respond by staying in the middle more often.
The historical success rate of center shots therefore reflects past strategic behavior rather than the intrinsic superiority of that choice.

Hence, the problem is not about identifying the best action in isolation, but about finding a balance in which neither player can improve their outcome by changing their strategy. In game theory, this balance is known as a Nash equilibrium.

Formalizing Penalties as a Zero-Sum Game

Penalty kicks can naturally be modeled as a two-player zero-sum game. Both the kicker and the goalkeeper need to simultaneously choose a direction. To keep things simple, let us assume they just have three possible choices:

Left (L)
Center (C)
Right (R)

In making their choice, kickers aim to maximize their probability of scoring, while goalkeepers aim to minimize it.

If $P$ denotes the probability of scoring, then the kicker’s payoff is $P$ , while the goalkeeper’s payoff is $-P$ . The payoff, however, is not a fixed constant, as it depends on the combined choice of both players. We can represent the payoff as a matrix:

$P= \begin{bmatrix} P_{LL} & P_{LC} & P_{LR}\\ P_{CL} & P_{CC} & P_{CR}\\ P_{RL} & P_{RC} & P_{RR}\\ \end{bmatrix}$ ,

where each element $P_{ij}$ represents the probability of scoring if the kicker chooses direction $i$ and the goalkeeper chooses direction $j$ .

Later we will estimate these probabilities from past data, but first let us build some intuition on the problem using a simplified model.

A Toy Model

To define a simple yet reasonable model for the payoff matrix, we assume that:

If the kicker and the goalkeeper choose different directions, the result is always a goal ( $P_{ij}=1$ for $i\ne j$ ).
If both choose center, the shot is always saved by the goalkeeper ( $P_{CC}=0$ ).
If both choose the same side, a goal is scored $60\%$ of the time ( $P_{LL}=P_{RR}=0.6$ ).

This yields the following payoff matrix:

$P= \begin{bmatrix} 0.6 & 1 & 1\\ 1 & 0 & 1\\ 1 & 1 & 0.6\\ \end{bmatrix}$ .

Equilibrium Strategies

How can we find the optimal strategies for the kicker knowing the payoff matrix?

It is easy to understand that having a fixed strategy, i.e. always making the same choice, cannot be optimal. If a kicker always aimed in the same direction, the goalkeeper could exploit this predictability immediately. Likewise, a goalkeeper who always dives the same way would be easy to defeat.
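The toy payoff matrix makes this concrete: for each fixed choice of the kicker, we can ask what scoring probability remains once the goalkeeper anticipates it. The snippet below is only an illustrative sketch, and its variable names are our own.

```python
# Toy payoff matrix from the text: rows = kicker's choice, columns = goalkeeper's choice.
DIRECTIONS = ["L", "C", "R"]
P = [
    [0.6, 1.0, 1.0],  # kicker shoots left
    [1.0, 0.0, 1.0],  # kicker shoots center
    [1.0, 1.0, 0.6],  # kicker shoots right
]

# A goalkeeper who knows the kicker's fixed choice picks the column that
# minimizes the scoring probability in that row.
guaranteed = {d: min(row) for d, row in zip(DIRECTIONS, P)}

for direction, prob in guaranteed.items():
    print(f"Always {direction}: scoring probability {prob:.2f} against a best response")
```

In this model no fixed choice guarantees more than $0.6$, and always shooting centrally guarantees nothing at all, which leaves room for a randomized strategy to do better.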

To reach equilibrium and remain unexploitable, players must randomize their choices. In game theory, this is called playing a mixed strategy.

A mixed strategy is described by a vector, whose elements are the probabilities of making a particular choice. Let’s denote the kicker’s mixed strategy as

$p = (p_L, p_C, p_R)$ ,

and the goalkeeper’s mixed strategy as

$q = (q_L, q_C, q_R)$ .
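Assuming the two players randomize independently of each other, these two vectors combine with the payoff matrix into an overall scoring probability. The symbols $V(p, q)$ and $V_j$ are shorthand introduced here for convenience:

$V(p, q) = \sum_{i} \sum_{j} p_i \, P_{ij} \, q_j = \sum_{j} q_j \, V_j$ , with $V_j = \sum_{i} p_i \, P_{ij}$ ,

where $i$ and $j$ run over $L$, $C$, $R$, and $V_j$ is the scoring probability when the goalkeeper chooses direction $j$ against the kicker's mixed strategy $p$ .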

Equilibrium is reached when neither player can improve their outcome by unilaterally changing their strategy. In this context, it implies that kickers must randomize their shots in a way that makes goalkeepers indifferent between diving left, diving right, and staying in the center. If one direction offered a higher expected save rate, goalkeepers would exploit it, forcing kickers to adjust.

Using the payoff matrix defined earlier, we can compute the expected scoring probability for every possible choice of the goalkeeper:

if the goalkeeper dives left, the expected scoring probability is:

$V_L = 0.6 p_L + p_C +p_R$

if the goalkeeper stays in the center:

$V_C = p_L +p_R$

if the goalkeeper dives right:

$V_R = p_L + p_C + 0.6 p_R$

For the strategy of the kicker to be an equilibrium strategy, we need to find $p_L$ , $p_C$ , $p_R$ such that for goalkeepers the probability of conceding a goal doesn’t change with their choice, i.e. we need that

$V_L = V_C = V_R$ ,

which, together with the normalization condition of the strategy

$p_L+p_C+p_R=1$ ,

gives a linear system of three equations. By solving this system, we find that the equilibrium strategy for the kicker is

$p^* \simeq (0.417, 0.167, 0.417)$ .

Interestingly, even though central shots are the easiest to save when anticipated, shooting centrally about $16.7\%$ of the time makes all options equally effective. Center shots work precisely because they are rare.
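For readers who want to reproduce the toy-model result, here is a minimal sketch that solves the indifference conditions exactly with rational arithmetic. It assumes the equilibrium uses all three directions with positive probability, which is what makes the indifference conditions valid. The helper `solve_linear_system` is our own small Gauss-Jordan routine, not a library function.

```python
from fractions import Fraction

def solve_linear_system(A, b):
    """Solve A x = b exactly with Gauss-Jordan elimination (small dense systems).

    Raises StopIteration if the system is singular.
    """
    n = len(A)
    # Augmented matrix [A | b] with exact rational entries.
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Move a row with a non-zero entry in this column into pivot position.
        pivot_row = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot_row] = M[pivot_row], M[col]
        pivot = M[col][col]
        M[col] = [v / pivot for v in M[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [v - factor * w for v, w in zip(M[r], M[col])]
    return [row[-1] for row in M]

# Toy payoff matrix: rows = kicker (L, C, R), columns = goalkeeper (L, C, R).
six_tenths = Fraction(3, 5)
P = [
    [six_tenths, 1, 1],
    [1, 0, 1],
    [1, 1, six_tenths],
]
L, C, R = 0, 1, 2

A = [
    [P[i][L] - P[i][C] for i in range(3)],  # V_L - V_C = 0
    [P[i][R] - P[i][C] for i in range(3)],  # V_R - V_C = 0
    [1, 1, 1],                              # p_L + p_C + p_R = 1
]
b = [0, 0, 1]

p_star = solve_linear_system(A, b)
value = sum(p_star[i] * P[i][C] for i in range(3))  # common value V_L = V_C = V_R
print([str(x) for x in p_star], float(value))
```

The exact solution is $p^* = (5/12, 1/6, 5/12)$, and every goalkeeper choice then concedes a goal with probability $5/6 \simeq 0.833$.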

Now that we are armed with the knowledge of game theory and Nash equilibrium, we can finally turn to real-world data and test whether professional players behave optimally.

Learning from Real-World Data

We analyze an open dataset (CC0 license) containing 103 penalty kicks from the 2016-2017 English Premier League season. For each penalty, the dataset records the direction of the shot, the direction chosen by the goalkeeper, and the final outcome.

By exploring the data, we find that the overall scoring rate of a penalty is approximately $77.7\%$ , and that center shots appear to be the most effective. In particular, we find the following scoring rates for different shot directions:

Left: $78.7\%$ ;
Center: $88.2\%$ ;
Right: $71.2\%$ .
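Rates like these are simple grouped averages. The sketch below shows the computation on a handful of made-up records. The records and their layout are hypothetical and do not come from the dataset used in this article.

```python
from collections import Counter

# Hypothetical records, NOT taken from the real dataset:
# each entry is (shot direction, whether a goal was scored).
penalties = [
    ("L", True), ("L", True), ("L", False),
    ("C", True), ("C", True),
    ("R", True), ("R", False), ("R", False),
]

# Count attempts and goals per shot direction.
attempts = Counter(direction for direction, _ in penalties)
goals = Counter(direction for direction, scored in penalties if scored)

# Raw conversion rate per direction, plus the overall scoring rate.
conversion = {d: goals[d] / attempts[d] for d in attempts}
overall = sum(goals.values()) / len(penalties)

for d in ("L", "C", "R"):
    print(f"{d}: {conversion[d]:.1%} ({goals[d]}/{attempts[d]})")
print(f"Overall: {overall:.1%}")
```

Note that this calculation ignores what the goalkeeper did on each kick, which is exactly why it cannot reveal the optimal strategy on its own.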

In order to derive the optimal strategies, however, we need to reconstruct the payoff matrix, which requires estimating nine conversion rates — one for each possible combination of the kicker’s and goalkeeper’s choices.

With only 103 observations in our dataset, certain combinations occur quite rarely. Estimating these probabilities directly from raw counts would therefore introduce significant noise.

Since there is no strong reason to believe that the left and right sides of the goal are fundamentally different, we can improve the robustness of our model by imposing symmetry between the two sides and aggregating equivalent situations.

This effectively reduces the number of parameters to estimate, thus lowering the variance of our probability estimates and increasing the robustness of the resulting payoff matrix.

Under these assumptions, the empirical payoff matrix becomes:

$P\simeq \begin{bmatrix} 0.6 & 1 & 0.86\\ 0.94 & 0 & 0.94\\ 0.86 & 1 & 0.6\\ \end{bmatrix}$ .

We can see that the measured payoff matrix is quite similar to the toy model we defined earlier, with the main difference being that in reality kickers can miss the goal even if the goalkeeper picks the wrong direction.
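The left-right pooling described above can be sketched as follows: each combination is merged with its mirror image before the rate is computed. The counts below are invented purely for illustration and are not the counts from the dataset; the function name `symmetrized_rates` is our own.

```python
DIRECTIONS = ["L", "C", "R"]
MIRROR = {"L": "R", "C": "C", "R": "L"}  # reflect the goal left <-> right

def symmetrized_rates(goals, shots):
    """Pool each (kicker, keeper) cell with its mirror image and return scoring rates."""
    rates = {}
    for i in DIRECTIONS:
        for j in DIRECTIONS:
            mi, mj = MIRROR[i], MIRROR[j]
            # The (C, C) cell is its own mirror image, so it is simply counted
            # twice in numerator and denominator, leaving its rate unchanged.
            pooled_goals = goals[i, j] + goals[mi, mj]
            pooled_shots = shots[i, j] + shots[mi, mj]
            rates[i, j] = pooled_goals / pooled_shots if pooled_shots else float("nan")
    return rates

# Hypothetical counts keyed by (kicker direction, goalkeeper direction).
shots = {("L", "L"): 10, ("L", "C"): 2, ("L", "R"): 12,
         ("C", "L"): 5,  ("C", "C"): 1, ("C", "R"): 4,
         ("R", "L"): 9,  ("R", "C"): 1, ("R", "R"): 8}
goals = {("L", "L"): 7,  ("L", "C"): 2, ("L", "R"): 11,
         ("C", "L"): 5,  ("C", "C"): 0, ("C", "R"): 3,
         ("R", "L"): 8,  ("R", "C"): 1, ("R", "R"): 4}

rates = symmetrized_rates(goals, shots)
for i in DIRECTIONS:
    print(i, [round(rates[i, j], 2) for j in DIRECTIONS])
```

Pooling leaves five free parameters instead of nine, since mirrored cells are forced to share the same estimate.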

Solving for equilibrium strategies, we find:

$\begin{aligned} p^* &\simeq (0.39, 0.22, 0.39) \\ q^* &\simeq (0.415, 0.17, 0.415) \end{aligned}$ .
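As a sanity check, one can plug the reported strategies back into the rounded payoff matrix and verify that each player's options are roughly equally good against the other's mix. This is a sketch based only on the rounded figures quoted above, so agreement is expected only to within about one percentage point.

```python
# Rounded empirical payoff matrix from the text:
# rows = kicker (L, C, R), columns = goalkeeper (L, C, R).
P = [
    [0.60, 1.00, 0.86],
    [0.94, 0.00, 0.94],
    [0.86, 1.00, 0.60],
]
p_star = [0.39, 0.22, 0.39]    # kicker's reported equilibrium mix
q_star = [0.415, 0.17, 0.415]  # goalkeeper's reported equilibrium mix

# Scoring probability for each goalkeeper action when the kicker plays p*.
keeper_view = [sum(p_star[i] * P[i][j] for i in range(3)) for j in range(3)]

# Scoring probability for each kicker action when the goalkeeper plays q*.
kicker_view = [sum(P[i][j] * q_star[j] for j in range(3)) for i in range(3)]

print("Goalkeeper L/C/R concedes:", [round(v, 3) for v in keeper_view])
print("Kicker L/C/R scores:      ", [round(v, 3) for v in kicker_view])
```

All six numbers fall close to $0.78$, so neither player has a clearly better single option, which is the defining property of the equilibrium.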

Are Players Actually Optimal?

Comparing equilibrium strategies with observed behavior reveals an interesting pattern.

Comparison between equilibrium and observed strategies for kickers and goalkeepers. Image by author.

Kickers behave close to optimally, although they aim at the center slightly less often than they should ( $16.5\%$ of the time instead of $22\%$ ).

On the other hand, goalkeepers deviate significantly from their optimal strategy, remaining in the center only $6\%$ of the time instead of the optimal $17\%$ .

This explains why center shots appear unusually successful in historical data. Their high conversion rate does not indicate an intrinsic superiority, but rather a systematic inefficiency in the goalkeepers' behavior.

If both kickers and goalkeepers followed their equilibrium strategies perfectly, center shots would be scored roughly $77.8\%$ of the time, which is close to the global average.
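The effect of the goalkeepers' behavior on center shots can be illustrated with a short calculation using the rounded payoff matrix entries quoted earlier ( $P_{CL}=P_{CR}=0.94$ , $P_{CC}=0$ ). This is a back-of-the-envelope sketch; the function name is our own, and small differences from the figures in the text come from rounding.

```python
# Rounded payoff matrix entries for a center shot (from the empirical matrix).
P_CENTER_VS_DIVE = 0.94  # goalkeeper dives left or right
P_CENTER_VS_STAY = 0.0   # goalkeeper stays in the center

def center_shot_conversion(q_center):
    """Scoring probability of a center shot when the keeper stays central with probability q_center."""
    return P_CENTER_VS_DIVE * (1 - q_center) + P_CENTER_VS_STAY * q_center

observed = center_shot_conversion(0.06)     # goalkeepers' observed behavior
equilibrium = center_shot_conversion(0.17)  # goalkeepers' equilibrium behavior

print(f"Observed keepers:    {observed:.1%}")
print(f"Equilibrium keepers: {equilibrium:.1%}")
```

With goalkeepers staying central only $6\%$ of the time, the model gives roughly $88\%$, in line with the raw conversion rate of center shots. With the equilibrium frequency of $17\%$, it drops to about $78\%$.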

Beyond Football: A Data Science Perspective

Although penalty kicks provide an intuitive example, the same phenomenon appears in many real-world data science applications.

Online pricing systems, financial markets, recommendation algorithms, and cybersecurity defenses all involve agents adapting to each other’s behavior. In such environments, historical data reflects strategic equilibrium rather than passive outcomes. A pricing strategy that appears optimal in past data may stop working once competitors react. Likewise, fraud detection systems change user behavior as soon as they are deployed.

In competitive environments, learning from data requires modeling interaction, not just correlation.

Conclusions

Penalty kicks illustrate a broader lesson for data-driven decision-making.

Historical averages do not always reveal optimal decisions. When outcomes emerge from strategic interactions, observed data reflects an equilibrium between competing agents rather than the intrinsic quality of individual actions.

Understanding the mechanism that generates the data is therefore essential. Without modeling strategic behavior, descriptive statistics can easily be mistaken for prescriptive guidance.

The real challenge for data scientists is therefore not only analyzing what happened, but understanding why rational agents made it happen in the first place.