Introduction to Lean for Programmers

Editor
23 Min Read


Intro to proof assistants

who transitioned into data science, and I work daily with machine learning algorithms. I’m fascinated both by their apparent magic and by the mathematics that underlies them. Pry open any machine learning library and you’ll find mathematical tricks involving matrix decompositions, convolutions, Gaussian curves, and more. These, in turn, are built on even more fundamental axioms and rules, such as function application and logic.

During my journey to learn these primitives, I collected a whole shelf of mathematics books, many of which now gather dust. I also enrolled at the Open University, where I attended classes over Zoom and learned about syntax and axioms, then how to combine them to build theorems.

The topic was fascinating: axioms are like game pieces and legal moves on a board, while theorems are legal plays, some more interesting than others. Proving a theorem means unwinding the gameplay just enough to convince the players that it is reachable, or disproving it by pointing out an impossibility. For example: “The two bishops are on white squares, and a bishop can never switch to a square of a different color.”

But on a practical level, the courses were tedious. I was given PDFs to solve by hand and then submit scans of my work online. While scribbling with pencil on paper, struggling with the course assignments I had two insights:

  1. Building a proof is like programming.
  2. There’s no way I can engage with mathematics in these early century conditions. I need an IDE, with type hints, syntax highlighting, and descriptive error messages, and engage with it interactively.

My thoughts weren’t new. The first idea is called the Curry-Howard correspondence, explained below. The latter is the goal of proof assistants such as Lean and their IDE extensions (usually a VS Code extension).

Interactive proof assistants vs automatic theorem provers

Theorem checkers like Lean’s kernel can determine whether expressions are well-formed, if the steps are legal, and if the final conclusion is valid. Interactive proof assistants build on them and will also help you build your proofs and suggest steps. Automatic theorem provers (ATPs) attempt to find the proof on their own using AI techniques. Lean works as a proof checker and assistant, but it pairs well with generative AI to function effectively as an ATP.

Terence Tao streams AI-assisted proofs on his YouTube channel using Lean 4 and GitHub Copilot. Most of the headlines in the form of “AI just solved an x-years-old open problem” were likely formalized with Lean (example 1, example 2, and 3). AI projects and benchmarks such as LeanDojo, miniF2F, ProofNet, PutnamBench, FormalMATH, use Lean. OpenAI and Meta have trained models to use Lean, and DeepMind’s AlphaProof used it to earn a silver medalist at the International Mathematical Olympiad.

DeepMind’s AlphaProof’s core reasoning components. a: A prover agent observes an initial Lean tactic state, b: The proof network (a 3b transformer model inspired by AlphaZero) produces a list of promising tactics, c: The tree search uses the proof network’s outputs to guide its exploration of potential proof paths. A neuro-symbolic system with a lot of the architecture carried over from AlphaZero playing Go. From Nature article, under Creative Commons License.

There is a great existing corpus of guides and tutorials for Lean, as well as a good community, yet the target audience doesn’t seem to be developers experienced in other programming languages. I was struggling to parse Lean’s syntax in my head, and as I learned, I noticed I was walking on unpaved territory. Hopefully the following sections will guide you on this path.

The Curry-Howard correspondence

Let’s start with a simple logic theorem, something you might have to prove in an undergraduate logic class:

(P → Q → R) → (P ∧ Q → R)

The operator → denotes logical implication, but in the Curry-Howard correspondence between proofs and computer programs it can be read as a function. ∧ is the logical “and.” In plain English, we’re asking whether this statement is true: “If I have a function that, given P, produces a function from Q to R, then if I already have both P and Q already, I can produce R.”

Note: (P → Q → R) is a “curried function,” (Haskell Curry is here to blame as well), and the syntax is equivalent to (P → (Q → R)). In programming terms, it’s a function that returns another function, which can later be invoked as foo(bar)(baz). It’s logically equivalent to foo receiving both bar and baz as a tuple foo((bar, baz)), but that is what we’re trying to prove here. Likewise, the entire sentence could be written as (P → Q → R) → P ∧ Q → R.

Let’s imagine this with a concrete example: If I have a vending machine that given a coin (P) returns a cup of instant noodles (Q → R), which given hot water (Q) returns noodles (R), than means that I can make a system that given both a coin and hot water (P ∧ Q) as inputs, I can return noodles.

Cup Noodle vending machines. These machines already provide the hot water for you. Wikimedia Foundation.
Cup Noodle vending machines. These machines already provide the hot water for you. Wikimedia Foundation.

If you were told to prove that this is possible as a programmer, you might be tempted to do so by writing a function that conforms to this scenario. In TypeScript we might write:

Try it in the TypeScript playground, modifying anything will cause type errors.

Written generically and simplified (playground):

We defined a function — our theorem — that receives a curried function (P → Q → R) as a parameter and outputs a function that receives a product type P ∧ Q and outputs R, that is, (P ∧ Q → R). This function acts as a “witness” that the logic theorem is true. The proof looks sound and it even satisfies the types that spell the original theorem in TypeScript:

Let’s state this even stronger: The proof is sound because it satisfied the type. Propositions are types, and proofs are terms of those types. We also saw that conjunctions are product types / tuples, and that logical implications Q → R are functions (q: Q) => R. The arrow is equivalent both syntactically and semantically.

Let’s implement this in Lean:

You can play with it in Lean’s playground.

example is a way to define anonymous declarations only to check the types (unlike the keyword theorem), and its type (what follows the colon : up to the definition :=) is our theorem. The proof is achieved when a definition “inhabits” this type. As is fashionable in functional programming, function application is just whitespace: the most fundamental operation is mapped to the most basic operator in the language. P ∧ Q is a product type / tuple, an its members are accessed via .left and .right .

Notice that the type and the implementation are nearly identical — the type alone is enough to determine the code. In fact, given a type signature, you can use Loogle to search for matching definitions in Lean’s library.

Other Curry-Howard Correspondences

Logic/Lean Type form TypeScript Approximation
P ∧ Q product type [P, Q] or { left: P; right: Q }
P ∨ Q sum type P | Q *
P → Q function type (p: P) => Q
True unit type undefined, or null
False empty type never
¬P function to empty type (p: P) => never
P ↔ Q pair of functions [(p: P) => Q, (q: Q) => P]
(P ∧ Q) → R function from product (pq: [P, Q]) => R
P → Q → R curried function (p: P) => (q: Q) => R

* A better sum type would be a discriminated union: { tag: “left”; value: P } | { tag: “right”; value: Q }

What you haven’t seen yet

Dependent Types

Let’s look at the following, more complex, type:

It means: “For all values () of x that are natural numbers, 0 is less or equal to x.” the first thing we notice is the Unicode mathematical symbols; these can either be auto-completed in the IDE with \forall or simply substituted with the ASCII keyword forall.

This strikes us as not looking like a type. The type depends on the value of x — something that we usually get from complex validation libraries like Zod, not from static type checkers like TypeScript. It’s a type that “depends” on values. Yet Lean’s types don’t rely on runtime checks the way Zod does; it’s still a type, albeit a complex one. This is a source of confusion when reading Lean’s syntax.

Artist's logarithmic scale conception of the observable universe with the Solar System at the center. Wikimedia Foundation.
Artist’s logarithmic scale conception of the observable universe with the Solar System at the center. Wikimedia Foundation.

Universes

In the playground or IDE, we can check the type of terms with #check:

Now let’s check the type of Nat:

Nat is of type Type. But why stop there?

Type is a type of Type 1, and so forth. This shows that there isn’t a hard syntactic distinction between types and terms in Lean. Types are terms that have their own types, stacked like onion layers — these layers are called ‘universes.’ 3 is a term of type Nat and Nat is a term of type Type, Type n has type Type (n+1), and so on.

Lean must keep track of universe levels to avoid the barber’s paradox: “A barber shaves everyone who does not shave themselves, does the barber shave himself?” With type universes, the barber is a procedure whose type can only operate on elements of a type from a universe below it, it cannot operate elements on its own level, including himself. This avoids the self-referential paradoxes that could cause inconsistency. Inconsistency would mean we could prove False to be true, and through it prove all propositions.

Propositions

Let’s try an even harder type:

In plain English: “Is this true? For every natural number x, y, z, and n, given a proof that n > 2and x * y * z ≠ 0, we can produce a proof that xⁿ + yⁿ ≠ zⁿ

This is Fermat’s Last Theorem, which dumbfounded mathematicians for 358 years after Fermat claimed his own proof didn’t fit the margins of the page. The definition uses sorry which is a placeholder for an incomplete proof. It will compile but will tell you: declaration uses `sorry`. It feels like using TypeScript’s any to silence its type errors, but a reviewer might notice it and ask you to fix the lint error. Providing this proof is your “goal.”

Let’s walk back into something easier:

This is ridiculous. How can I define a term whose type is 2 + 2 = 4?

Let’s start by looking at the type:

The type is a Prop, short for proposition. A proposition is a statement that can be true or false. As we stated above, in the Curry-Howard correspondence a proposition is a type, and a proof is a term or program of that type. The proposition is true if the type has “an inhabitant” and false if it doesn’t. Lean doesn’t “resolve” a proposition into a Boolean true or false in a computational sense; we just say it’s “true” or “false.” Using Lean as a proof assistant means working mostly with propositions.

Behind the scenes, Lean’s type checker will normalize this type for you into 4 = 4, but no further. We just need to provide a term that inhabits this normalized type. We do this with rfl, an entity constructor automatically imported from the Prelude package into your namespace.

This works. It’s declarative syntax for asserting that 2 + 2 = 4 is true by reflexivity. To understand why this works, let’s check the type:

These are generics, which by convention use Greek letters. The curly brackets mean that Lean will infer them for you.

rfl.{u} is the name of the function and its universe level, (your layer of the types onion). {α : Sort u} means α is a type at universe level u, nothing to do with “sortable,” it’s just of the “sort” u. {a : α} means a is a term of type α, and the type of rfl is ultimately the proposition a = a. In our case, α is Nat and a is 4 , rfl has the type 4 = 4 , matching the type Lean normalized above. We have found an inhabitant for 2 + 2 = 4 proving the proposition by reflexivity.

Coach discussing tactics with team. DoD photo by Mass Communication Specialist 1st Class Daniel Hinton. DVIDS. Public Domain.
Coach discussing tactics with team. DoD photo by Mass Communication Specialist 1st Class Daniel Hinton. DVIDS. Public Domain.

Tactics

Earlier we solved our vending machine example with function application alone, but there are many more interesting primitives in Lean. In Lean you can start your proof with by which activates “tactic” mode:

This lets you build your proof step by step, while the right pane of your IDE (the InfoView) progressively shows your local assumptions and your goal, marked by . Tactics are a combination of macros and syntactic sugar that resolve the proof state for the type-checking language kernel. They help you think analytically and in a declarative way while tackling your goals. It’s an imperative-style solution inside of a functional language, running inside a monad.

You will have to familiarize yourself with tactics just as someone learning SQL does with analytical constructs like GROUP BY, PIVOT, and LEFT JOIN. Removing sorry will show the state and goals in the InfoView:

We have one goal to prove. Coin and HotWater are propositions. Noodles is an unknown of sort ?u.29, a random ID for a yet-unassigned universe level. Our goal is to produce a proof of (Coin → HotWater → Noodles) → Coin ∧ HotWater → Noodles.

Let’s use the tactic intro to introduce an assumption into scope. It will automatically capture the input type (Coin → HotWater → Noodles) — one of the proofs I already “have” to work with:

The InfoView will refresh and show you the updated local assumptions and goal:

In addition to Coin, HotWater, and Noodles, we now have a local hypothesis (or assumption): noodleVendingMachine, a constant of type Coin → HotWater → Noodles (a curried function). Our goal is now to produce Coin ∧ HotWater → Noodles.

Let’s introduce another variable that will automatically capture the next assumption, of type Coin ∧ HotWater:

InfoView now shows:

Now we only need to return Noodles using our toolbox of tactics and the assumptions in local scope. We have a curried function noodleVendingMachine and a product type (tuple) coinAndHotWater. Let’s apply the Coin and the HotWater from coinAndHotWater to noodleVendingMachine:

The tactic exact serves as a return keyword when we have the exact type to return. .left and .right are the elements of the product type / tuple coinAndHotWater, which we applied in order to the curried function noodleVendingMachine. Remember that whitespace is function application.

InfoView shows:

This can be written generically as:

Satisfying our original theorem (P → Q → R) → (P ∧ Q → R) above.

Some tactics help decompose a goal into smaller obligations, such as intro, cases, constructor, apply, or refine. Others rewrite expression using assumptions of equivalences, such as rw or simp. Other tactics normalize expression by computation or algebraic manipulation, such as rfl, and ring. And some provide targeted automation: linearinth, omega, aesop. Consult the tactic reference.

Thinking in tactics is an elegant skill that can only come from experience. To reuse a previous simile, it’s like a data analyst pulling up the right analytic functions and effortlessly cleaning up, enriching, and summarizing tables — or a senior software engineer piping together the right Bash commands or recognizing the cleanest design pattern for the job.

Overarching Vision of Theorem Provers

Leibniz' mechanical calculator (1671). The first calculator able to perform addition, subtraction, multiplication, and division. Wikimedia Foundation.
Leibniz’ mechanical calculator (1671). The first calculator able to perform addition, subtraction, multiplication, and division. Wikimedia Foundation.

Leibniz dreamed of a universal formal language of thought (Characteristica universalis) paired with a mechanical calculator (Calculus ratiocinator) that could prove all theorems and philosophical arguments by turning a crank. Hilbert asked, “is this possible?” — convinced until death that “there is nothing that we shall not know.” This was regarded as “the main problem of mathematical logic” and “the most general problem of mathematics.” Gödel though about this and proved that some truths can’t be proven, while Church and Turing both independently proved that “often you’ll be turning that crank forever,” arguably starting the field of computer science in the process. To use the chess analogy from earlier, automated theorem proving is like searching the game tree for winning positions, but mathematics has an infinite board, so the search may never end — some plays are “undecidable.” However, we can still check whether a move is legal, and this is where we are now with proof assistants.

In 1956, Bertrand Russell wrote a letter regarding Logic Theorist, that had automatically proved theorems from Principia Mathematica, and which is sometimes described as the first AI program:

“I wish Whitehead and I had known of this possibility before we wasted ten years doing it by hand. I am quite willing to believe that everything in deductive logic can be done by machinery.”

Conclusion

Lean has a steep learning curve, partly due to its syntax and its use of tactics, which you must learn. Struggling with Lean comes, however, with a reward: you’re learning mathematics. Types, universes, tactics, theorems, and proofs are not quirks of yet another programming language but concepts that mathematicians and logicians struggled with for generations. They are not useless facts to memorize but thought patterns as ancient as Euclid and as eternal as the theorems we proved.

Once you’re familiar with the fundamentals, try moving on to more advanced topics with Mathematics in Lean, where you’ll tackle problems in discrete mathematics, linear algebra, differential calculus, and topology. Or delve deep into the foundations starting from zero (literally) through Tao’s Lean companion to “Analysis I” Book. Mathlib, Lean’s collaborative library, already has 1.6 million lines of formalized mathematics that you can use to bridge to all branches of mathematics.

Share this Article
Please enter CoinGecko Free Api Key to get this plugin works.