Lean 4.11.0

David Thrane Christiansen

2024-09-01

Release Announcements

We're delighted to announce that a new release of Lean, version 4.11.0, is now available. As usual, we've made a number of improvements to Lean's internals, and Lean 4.11 is more predictable, more responsive, and more reliable than prior versions. The improvements range from getting the details of hovers on parentheses just right to better debugging tools for proof automation developers to large performance increases to certain subsystems in Lean's internals. There is no way for this post to capture all of the hard work that's gone into the more than 250 separate commits that have been made to Lean's since the last release—please refer to the official release notes for a complete accounting, or read on for a selection of highly-visible improvements.

New `HashMap` and `HashSet` Types

Lean 4.11 features a brand-new implementation of hash tables, concretely usable as both dictionaries (Std.HashMap) and sets (Std.HashSet). This new implementation has been designed from the ground up for both efficiency and correctness, and it features an extensible, carefully-engineered framework for specifying and proving properties of its API. Additionally, the API is a result of a careful comparison with both Lean's other data structures and with the hash table APIs of other languages, including both pure functional languages and imperative languages.

Updated `variable` Command

The variable command has been made more predictable in Lean 4.11. This command is used to provide signatures in a single place for parameters that multiple declarations have in common, which reduces both the mental overhead of reading repetitive code and the risk of subtle inconsistencies.

In previous versions of Lean, when a variable has been declared, any declaration in the scope of the variable command may freely mention the declared variables. Each mentioned variable, along with the variables that it depends on and any instance implicits that mention any included variables, are added as formal parameters to the declaration. In Lean 4.11, this rule has been modified for theorems, so that only mentions of variables in theorem statements result in added parameters.

After this command, the names α, β, f, and g are variables:

variable
  {α : Type u} {β : Type v}
  (f : α → β) (g : β → α)

This definition of listMap references the variable f:

def listMap (xs : List α) : List β :=
  match xs with
  | [] => []
  | x :: xs => f x :: listMap xs

Thus, its signature is:

listMap {α : Type u} {β : Type v}
      (f : α → β)
      (xs : List α) : List β

The variable g is not part of the signature because it is not mentioned in the definition of listMap.

Applying this rule to theorems has a drawback: if a variable is used in the body of a proof, then it becomes an assumption of the theorem. This means that the content of a proof can, in principle, affect the theorem statement. While Lean would always accurately report on the theorem statement, and there is no risk of inconsistency due to this behavior, it could be confusing. As an example, in Lean 4.10, one could write:

variable (n : Nat)

and then, after some intervening code:

variable (n_eq_zero : n = 0)

A user who did not notice the assumption that n = 0 might then attempt the following proof:

theorem declaration uses 'sorry'n_plus_k_eq_k : n + k = k := byn:Natn_eq_zero:n = 0k:Nat⊢ n + k = k
  sorryAll goals completed! 🐙

The theorem's type is:

n_plus_k_eq_k (n : Nat) {k : Nat} :
    n + k = k

However, a second attempt to prove the theorem might make use of the variable and succeed. This use may not be easy to see. In n_plus_k_eq_k', simp_all uses n_eq_zero behind the scenes:

theorem n_plus_k_eq_k' : n + k = k := byn:Natn_eq_zero:n = 0k:Nat⊢ n + k = k
  simp_allAll goals completed! 🐙

Even though its statement appears identical to n_plus_k_eq_k, the type of n_plus_k_eq_k' is:

n_plus_k_eq_k' (n : Nat) (n_eq_zero : n = 0) {k : Nat} :
    n + k = k

In Lean 4.11, the variable rule's interaction with proofs has been refined so that variables mentioned in the body of a proof no longer cause them to be inserted as parameters. Only variables mentioned in the theorem statement itself are included. The interaction between variable and ordinary definitions has not been changed.

It can sometimes be convenient, however, to unconditionally insert a parameter across a swathe of definitions and theorems. The new include command causes a variable or variables to be unconditionally included as assumptions in theorems, regardless of whether they are mentioned. Given this code in Lean 4.11:

variable (n : Nat)

variable (n_eq_zero : n = 0)

theorem declaration uses 'sorry'n_plus_k_eq_k : n + k = k := byn:Natk:Nat⊢ n + k = k
  sorryAll goals completed! 🐙

include n_eq_zero

theorem declaration uses 'sorry'n_plus_k_eq_k' : n + k = k := byn:Natn_eq_zero:n = 0k:Nat⊢ n + k = k
  sorryAll goals completed! 🐙

the theorems have the following signatures, even though neither proof mentions n_eq_zero:

n_plus_k_eq_k (n : Nat) {k : Nat} :
    n + k = k

n_plus_k_eq_k' (n : Nat) (n_eq_zero : n = 0) {k : Nat} :
    n + k = k

The effects of include can be reversed using the omit command. Both include and omit can be used with in to limit their effects to a single theorem. To help limit the risk of unintended assumptions, a new linter issues a warning when an include command results in probably-spurious assumptions:

automatically included section variable(s) unused in theorem 'add_comm':
  n_eq_zero
consider restructuring your `variable` declarations so that the variables are not in scope or explicitly omit them:
  omit n_eq_zero in theorem ...
note: this linter can be disabled with `set_option linter.unusedSectionVars false`theorem add_comm {x y : Nat} : x + y = y + x := byn:Natn_eq_zero:n = 0x:Naty:Nat⊢ x + y = y + x
  simp_arithAll goals completed! 🐙

automatically included section variable(s) unused in theorem 'add_comm':
  n_eq_zero
consider restructuring your `variable` declarations so that the variables are not in scope or explicitly omit them:
  omit n_eq_zero in theorem ...
note: this linter can be disabled with `set_option linter.unusedSectionVars false`

The updated rules for variable inclusion makes large Lean libraries easier to maintain over time by reducing the number of unpredictable changes, without sacrificing the convenience of variable. To ease porting proofs to Lean 4.11, a new option deprecated.oldSectionVars can be used to locally switch back to the old variable behavior.

Structural Recursion

In Lean, all functions must either terminate with a return value for all possible inputs or be declared partial or unsafe. Termination is justified to Lean's logical kernel by translating the function into primitive recursors, which keeps the trusted code base small by not requiring a trusted termination checker. Broadly speaking, this translation can either use the recursor for one of the function's parameters (structural recursion) or the recursor for a proof that some measure of the inputs decreases down to a minimal value (well-founded recursion). Structural recursion is simpler, leading to smaller and better-behaved proof terms, while well-founded recursion is much more flexible. Lean is capable of automatically choosing a strategy for many programs, but because the problem is fundamentally undecidable, some functions must be annotated with a termination_by clause that explicitly specifies why the function terminates.

Declaring Structural Recursion

In prior versions of Lean, termination_by was to provide a decreasing measure for use with well-founded recursion. While this is useful when Lean otherwise would not accept a recursive definition, it is also useful when Lean successfully finds a decreasing measure, because it can speed up re-checking the function and document to readers why the function terminates. However, there has not previously been a way to specify that Lean should use structural recursion to check termination, and the absence of a termination_by clause could indicate either structural recursion or automatically-determined well-founded recursion. Furthermore, the termination_by? syntax that caused Lean to offer to insert the measure that was found for well-founded recursion into the code had the additional effect of requiring that the function use well-founded, rather than structural, recursion.

Lean 4.11 removes these limitations by introducing the termination_by structural clause, which allows structural recursion to be explicitly indicated. For example, in the following code, the recursive call in the function findList is on an immediately-smaller list, while the recursive call in the helper function findArray.loop is on a larger natural number:

def findList (p : α → Prop) [DecidablePred p] :
    List α → Option {x : α // p x}
  | [] => none
  | x :: xs =>
    if ok : p x then some ⟨x, ok⟩
    else findList p xs
termination_by structural x => x

def findArray (p : α → Prop ) [DecidablePred p]
    (xs : Array α) :
    Option {x : α // p x} :=
  let rec loop i :=
    if h : i < xs.size then
      if ok : p xs[i] then some ⟨xs[i], ok⟩
      else loop (i + 1)
    else none
  termination_by xs.size - i
  loop 0

Lean's termination analysis is capable of finding both termination arguments automatically, but it may be desirable to annotate the program nonetheless.

The prototypical example of a function that cannot be written with only structural recursion is Ackermann's function:

def ack : Nat → Nat → Nat
  | 0, n => n + 1
  | m + 1, 0 => ack m 1
  | m + 1, n + 1 => ack m (ack (m + 1) n)
termination_by x1 x2 => (x1, x2)

However, this is only true in languages without first-class functions. With higher-order functions, Ackermann's function can be made structurally recursive:

def ack' : Nat → Nat → Nat
  | 0, n => n + 1
  | m + 1, n => (n + 1).repeat (ack' m) 1
termination_by structural m => m

Documenting this in the termination_by clause makes the two definitions' purposes clear to readers of the code. Additionally, termination_by? no longer forces Lean to use well-founded recursion, and it can suggest a termination_by structural clause if appropriate.

Mutual Structural Recursion

Mutually recursive datatypes naturally give rise to mutually recursive functions. In prior versions of Lean, mutually recursive functions always used well-founded recursion in the core language. In Lean 4.11, however, mutual structural recursion is available.

Beginning in Lean 4.11, structural recursion can be used for mutually recursive functions, potentially even over mutually recursive datatypes. This brings the advantages of structural recursion to more functions.

For example, given the mutually-recursive types Tree and Forest:

mutual
  inductive Tree (α : Type u) : Type u where
    | leaf
    | branch : α → Forest α → Tree α
  inductive Forest (α : Type u) : Type u where
    | empty
    | trees : Tree α → Forest α → Forest α
end

Lean 4.11 is able to use structural recursion to count their elements or map a function over them:

mutual
  def Tree.count : Tree α → Nat
    | .leaf => 0
    | .branch _ xs => 1 + xs.count
  termination_by structural t => t

  def Forest.count : Forest α → Nat
    | .empty => 0
    | .trees t ts => t.count + ts.count
  termination_by structural ts => ts
end

mutual
  def Tree.map (f : α → β) : Tree α → Tree β
    | .leaf => .leaf
    | .branch x xs => .branch (f x) (xs.map f)
  termination_by structural t => t

  def Forest.map (f : α → β) : Forest α → Forest β
    | .empty => .empty
    | .trees t ts => .trees (t.map f) (ts.map f)
  termination_by structural ts => ts
end

Structural mutual recursion is also supported over non-mutually-recursive data:

mutual
  def isEven : Nat → Bool
    | 0 => true
    | n + 1 => not (isOdd n)
  termination_by structural n => n

  def isOdd : Nat → Bool
    | 0 => false
    | n + 1 => not (isEven n)
  termination_by structural n => n
end

Recursion Through Lists and Arrays

Many datatypes contain lists or arrays of themselves. When writing recursive functions over these types, it is very convenient to use Array.map, List.map, or other higher-order functions from the standard library. However, this pattern of recursion does not directly apply the recursive function to a smaller argument, as the recursive occurrence of the function is not syntactically applied to any argument whatsoever! The function does terminate, however, because every argument that will ever be passed to the function is indeed smaller, but this reasoning is beyond what Lean's automation can derive on its own.

Lean 4.11 incorporates a solution by Mario Carneiro, originally part of the community-run library batteries, in which a helper function “attaches” to each element a proof that it is, in fact, a member of the list or array. This is sufficient to enable Lean's automation to construct a termination proof using well-founded recursion. These helpers are called List.attach and Array.attach:

List.attach (l : List α) : List {x : α // x ∈ l}Array.attach (xs : Array α) : Array {x : α // x ∈ xs}

As an example, the table of contents of a book can be represented with the datatype ToC:

inductive ToC where
  | content (title : String) (subContent : Array ToC)

The elements of the array represent the chapters in a book, the sections in a chapter, and so forth, as in this cookbook:

def cookbook : ToC :=
  .content "Pea and Lentil Recipes" #[
    .content "Introduction" #[],
    .content "Soups" #[
      .content "Palouse-Style Brown Lentil Soup" #[],
      .content "Split Pea Soup" #[],
      .content "Creamy Red Lentils" #[]
    ],
    .content "Sandwiches" #[
      .content "Lentil Sloppy Joes" #[],
      .content "Chickpea Patty" #[]
    ],
    .content "Salads" #[
      .content "Lentil and Rice Confetti Salad" #[],
      .content "Le Puy Lentils with Mint and Cucumber" #[]
    ]
  ]

Converting a table of contents to a string can be done with a recursive function that adds section numbers:

def prefixLines (pref content : String) : String :=
  "\n".intercalate ((content.splitOn "\n").map (pref ++ ·))

partial def ToC.asString (toc : ToC) : String :=
  let ⟨title, subContent⟩ := toc
  let under := subContent.mapIdx fun i toc' =>
    prefixLines s!"{i.val + 1}." (" " ++ toc'.asString)
  s!"{title}{under.foldl (· ++ "\n" ++ ·) ""}"

Pea and Lentil Recipes
1. Introduction
2. Soups
2.1. Palouse-Style Brown Lentil Soup
2.2. Split Pea Soup
2.3. Creamy Red Lentils
3. Sandwiches
3.1. Lentil Sloppy Joes
3.2. Chickpea Patty
4. Salads
4.1. Lentil and Rice Confetti Salad
4.2. Le Puy Lentils with Mint and Cucumber
#eval IO.println <| cookbook.asString

Pea and Lentil Recipes
1. Introduction
2. Soups
2.1. Palouse-Style Brown Lentil Soup
2.2. Split Pea Soup
2.3. Creamy Red Lentils
3. Sandwiches
3.1. Lentil Sloppy Joes
3.2. Chickpea Patty
4. Salads
4.1. Lentil and Rice Confetti Salad
4.2. Le Puy Lentils with Mint and Cucumber

The function must be partial because Lean is unable to automatically conclude that argument in the recursive call is smaller:

def ToC.asString (toc : ToC) : String :=
  let ⟨title, subContent⟩ := toc
  let under := subContent.mapIdx fun i toc' =>
    prefixLines s!"{i.val + 1}." (" " ++ failed to prove termination, possible solutions:
  - Use `have`-expressions to prove the remaining goals
  - Use `termination_by` to specify a different well-founded relation
  - Use `decreasing_by` to specify your own tactic for discharging this kind of goal
title : String
subContent : Array ToC
toc' : ToC
⊢ sizeOf toc' < 1 + sizeOf title + sizeOf subContenttoc'.asString)
  s!"{title}{under.foldl (· ++ "\n" ++ ·) ""}"
termination_by toc

failed to prove termination, possible solutions:
  - Use `have`-expressions to prove the remaining goals
  - Use `termination_by` to specify a different well-founded relation
  - Use `decreasing_by` to specify your own tactic for discharging this kind of goal
title : String
subContent : Array ToC
toc' : ToC
⊢ sizeOf toc' < 1 + sizeOf title + sizeOf subContent

However, using Array.attach to add a proof that toc' is an element of subContent provides sufficient information for Lean to conclude that it terminates:

def ToC.asString (toc : ToC) : String :=
  let ⟨title, subContent⟩ := toc
  let under := subContent.attach.mapIdx fun i ⟨toc', elem⟩ =>
    prefixLines s!"{i.val + 1}." (" " ++ toc'.asString)
  s!"{title}{under.foldl (· ++ "\n" ++ ·) ""}"

List.attach and Array.attach have no run-time overhead in compiled code. The compiler erases proofs, and they are implemented with a special optimization that removes the traversal of the list.

User-Defined Options

Lean's options system allows users to configure various properties of Lean, such as the maximum recursion depth during tactic execution, whether to display tracing output, and how many columns to use when displaying output. The set of options is extensible, so user-defined tactics and other extensions can provide a consistent interface through the set_option command.

In prior versions of Lean, built-in options could additionally be configured by passing the -D flag to Lean on the command line or by setting the leanOptions field in lakefile.toml or lakefile.lean. In Lean 4.11, user-defined options can also be set this way.

By default, Lean validates option names from the command line, ensuring that they are defined. This helps avoid typos. To turn off this checking, prefix an option name with weak..

For example, if a user option for a custom tactic squish is named theorem_squisher.attempts, then

[[lean_lib]]
name = "AmbitiousProof"
leanOptions = { theorem_squisher.attempts = 40 }

will fail if theorem_squisher.attempts has not yet been defined, while

[[lean_lib]]
name = "AmbitiousProof"
leanOptions = { weak.theorem_squisher.attempts = 40 }

will not.

Safer Evaluation

Unlike many programming languages, Lean does not have a separate read-eval-print loop for interactive exploration. Instead, Lean expressions can be evaluated directly in the program text, without having to switch contexts, using the #eval command.

In previous versions of Lean, #eval had a difficult interaction with another Lean feature: to provide better feedback on a whole program, Lean continues to check a file even after errors have been encountered. In order for the rest of the file to make as much sense as possible, the errors are replaced with sorry.

The Lean compiler will not compile code that contains errors. However, in prior versions of Lean, #eval would run erroneous code. Because the sorry can occur as a replacement for a proof of some important property, such as array access being in-bounds or a function terminating, #eval could cause the compiler to crash or loop.

In Lean 4.11, #eval no longer runs code that contains sorry. In this example, there are two errors:

def arr := #[1, 2, 3]

cannot evaluate expression that depends on the `sorry` axiom.
Use `#eval!` to evaluate nevertheless (which may cause lean to crash).#eval failed to prove index is valid, possible solutions:
  - Use `have`-expressions to prove the index is valid
  - Use `a[i]!` notation instead, runtime check is perfomed, and 'Panic' error message is produced if index is not valid
  - Use `a[i]?` notation instead, result is an `Option` type
  - Use `a[i]'h` notation instead, where `h` is a proof that index is valid
⊢ 3 < arr.sizearr[3]

The first comes from the array lookup, which fails due to the index being out of bounds. The second comes from #eval:

cannot evaluate expression that depends on the `sorry` axiom.
Use `#eval!` to evaluate nevertheless (which may cause lean to crash).

A new command #eval! can be used to evaluate expressions that depend on sorry.

Lake

Lake, the Lean build tool, now includes a basic GitHub Actions configuration when it generates a new project. This configuration uses lean-action to build, test, and lint the project. lean-action is developed and maintained by Austin Letson.

Standard Library

In addition to the brand-new Std.HashMap and Std.HashSet, Lean 4.11 includes a number of additions to the standard library, many of which originate in the batteries community library. In particular, many lemmas about lists, arrays, bitvectors, and bitwise operators are either new or have been upstreamed from batteries in Lean 4.11.

Tactics

Extensionality Lemmas

Extensionality is the principle that two objects are equal if all possible observations that can be made on them have equal results. For functions, this means that applying them to equal arguments results in equal values. For structures, it means that all fields have equal values. The ext tactic selects an appropriate extensionality lemma from a table and applies it.

There is an attribute called @[ext] with two uses:

Applied to extensionality lemmas, it registers them for use with the ext tactic.
Applied to structure declarations, it generates and registers extensionality lemmas for the tactic.

In Lean 4.10, the attribute would also generate a bidirectional extensionality lemma for structures that strengthens "equal observations imply equal values" to "equal observations is logically equivalent to equal values", named ext_iff. In Lean 4.11, the attribute now arranges for lemmas to be generated on-demand, rather than doing it up-front. When used to register a user-written extensionality lemma, it also automates the construction of the corresponding ext_iff lemma. In both Lean itself and in Mathlib, this has made it possible to delete many manual proofs, and the resulting generated lemmas are more consistent with each other.

Additionally, in Lean 4.11 the @[ext] attribute now supports use as local or scoped attribute, allowing more fine-grained control over the lemmas. Local attributes only have an effect in the current section or namespace declaration, while scoped attributes have an effect in the current namespace or any other context that opens it.

Improved Error Messages

The decide tactic, which can solve a proof goal using a decision procedure (expressed as a Decidable instance), now features significantly improved error messages. This tactic can fail for a number of distinct reasons:

The goal may contain free variables that prevent the decision procedure from running to completion.
The goal may in fact be false.
The decision procedure may have been defined non-computationally, either using classical reasoning or tactics that don't generate executable code.

In Lean 4.7, decide's error messages were made explicit about the reason for failure. Lean 4.11 builds on that work, further improving the error message. Starting in version 4.11, the error message includes information about which specific Decidable instances were unfolded. It recognizes a number of common reasons that an instance may not be computable and provides specific guidance on remedying the situation. It can even pinpoint the specific instance that is at fault.

Breaking Changes

Signatures and visibility of generated extensionality lemmas: In ext and ext_iff lemmas that are generated by the @[ext] attribute, the x and y term arguments are now implicit. Additionally, these two lemmas are now protected, so opening their namespace will not make them visible without a prefix. Please refer to pull request #4543 for more details and context.
Typo fix in option name: The name of trace.profiler.useHearbeats has been corrected to trace.profiler.useHeartbeats.
Structural recursion: A bug in the elaboration of structurally-recursive functions allowed incorrect programs to be accepted, leading to their rejection later by the kernel due to function parameters not being in the order required for the elaboration to always work. However, it is possible that some mistakenly-accepted programs could work nonetheless, because the translation rules are conservative. These programs are now rejected prior to reaching the kernel. Affected programs can be fixed by re-ordering their arguments as described in pull request #4672.
List.filterMapM sequences monadic actions left-to-right: Previously, List.filterMapM sequenced monadic actions from right to left, even though the result values were returned from left to right. This bug has been fixed, but code that relied on the prior behavior may need updating.
The effect of the variable command on proofs of theorems has been changed.: Please see the dedicated section in this post for more details. The option deprecated.oldSectionVars can be used to temporarily switch back to the old behavior while updating code.