Lean 4.9.0

David Thrane Christiansen

2024-07-01

Release Announcements

Lean 4.9.0 has been released! This release features a number of significant improvements to the user experience, better tools for diagnosing performance issues, and many more theorems in the standard library. Like every release, 4.9 also brings improvements to performance, predictability, and robustness to Lean's internals. There have been many more improvements to Lean since the release of 4.8.0 than there is space to discuss here—please refer to the release notes for a complete account.

Incrementality

In prior versions of Lean, editing a top-level declaration or theorem would result in Lean restoring a saved internal state from just prior to the declaration. It would then re-check the declaration along with the rest of the file. Now, Lean's incremental checking is much more fine-grained, and it is able to re-use much more of the prior state. In particular, progress is checkpointed after checking type signatures and theorem statements and before checking their bodies, even in mutual blocks. Even better, Lean is also able to rewind its progress to the most recent tactic step in a proof, so time-consuming automation steps will be re-run much less frequently. Some tactic control structures are also supported, including induction, cases, and case-selection operators such as case, next, and bullets (·).

When modifying a proof, only the step that was modified and subsequent steps need to be re-checked. The yellow progress bar on the left side shows the region that Lean is currently checking.

Incrementality required significant re-architecting of Lean's internals. There's no need to worry that these changes have made Lean less trustworthy, however, because Lean's separate trusted kernel did not require modification, and it still checks every result just as before.

Because incrementality is still new, not all of Lean's interactive features support incrementality in Lean 4.9. In particular, incremental feedback is limited to the progress bar and to messages, warnings, and errors. Hovers, outlines, and other similar information must still wait until the current declaration has been fully checked.

Lean's incrementality API is also available to embedded languages and user-written commands that have the @[incremental] attribute. While it has not yet been thoroughly documented, its use in Verso can serve as an example until documentation is written.

Reducibility

Local Control via `seal` and `unseal`

While checking proofs and programs, Lean takes reducibility, also known as transparency, into account. There are three levels:

Reducible definitions are unfolded essentially everywhere (on demand). Type class instance synthesis, definitional equality checks, and the rest of the language treat the definition as being essentially an abbreviation, and this is the setting applied by the abbrev command.
Semireducible definitions are not unfolded by potentially expensive automation such as type class instance synthesis or simp. The def command creates semireducible definitions unless a different reducibility level is specified. Even with the definition def Name := String, the Repr String instance will not be found when searching for Repr Name instances. However, outside such automation, semi-reducible definitions do get unfolded, so def lean : Name := "Lean" is still accepted.
Irreducible definitions are not unfolded at all. They can be created using the @[irreducible] attribute. With the definition @[irreducible] def Version := String, the type Version can't be used for strings because it doesn't get unfolded while type checking, so definitions like this one are not accepted: def version : Version := type mismatch "4.9.0" has type String : Type but is expected to have type Version : Type"4.9.0".

Controlling reducibility is an important part of using Lean effectively. Unfolding too many definitions leads to slowdowns and increased memory usage, but unfolding too few definitions can lead to boilerplate proof steps or even type errors. In most cases, the defaults are appropriate, but from time to time it is necessary to unfold a definition that usually is best left folded. Some tactics allow this to be specified, and the with_reducible, with_reducible_and_instances, and with_unfolding_all tactic combinators cause reducible definitions only, reducible definitions and the bodies of type class instances, or all definitions to be unfolded. However, general control over reducibility was a blunt instrument, and arranging for one important definition to be unfolded could result in wasted work unfolding others as well.

Lean 4.9 introduces the seal and unseal commands, which allow the reducibility of a single definition to be locally overridden. If a definition is sealed, then it becomes irreducible until the end of the current namespace, section or file. Likewise, unsealing a definition makes it semireducible until the end of the namespace, section or file. These can also be used with in, which restricts the change in reducibility to a single command.

Given the above examples, sealing and unsealing Name and Version results in the opposite behavior:

seal Name
unseal Version

def lean' : Name := type mismatch
  "Lean"
has type
  String : Type
but is expected to have type
  Name : Type"Lean"

def version' : Version := "4.9.0"

type mismatch
  "Lean"
has type
  String : Type
but is expected to have type
  Name : Type

Well-Founded Recursion is Irreducible

Functions defined by well-founded recursion are now irreducible by default. This eliminates performance gotchas: to reduce a definition that uses well-founded recursion, Lean must actually reduce the proofs that the measure reduces, and not just the arguments themselves. This can require much more computational work than might be expected, leading to slowdowns. This is not a concern in compiled code, from which proofs are erased, but it crops up while checking Lean proofs and while translating Lean to the core language understood by the trusted kernel.

For such definitions, it's usually much better to explicitly rewrite using the equational lemmas that Lean generates, and the simplifier does this whenever the definition's name is specified as a simp rule. In Lean 4.9, these definitions are irreducible, which means that Lean will not unfold them unless explicitly asked to and the proofs no longer slow down computation.

More Theorems

Lean 4.9 ships with 686 more theorems than Lean 4.8, which represents a 19% increase in the number of theorems. Many of these theorems have been collated and upstreamed from Batteries and Mathlib, and many are new. The vast majority of the added theorems are about basic types, particularly List (268 new theorems), Array (121 new theorems), and BitVec (54 new theorems). The fixed-width integer types UInt8, UInt16, UInt32, and UInt64 each gained 29 theorems.

The top ten namespaces by added theorem count are List (268), Array (121), BitVec (54), UInt8—UInt64 (all 29), String (26), String.Pos (25), and USize (21).

Together with Lean's simplifier, these added theorems provide more than an enriched library to choose from. They work synergistically to provide Lean with more opportunities for automation, as improved simplification can allow other tactics to become applicable as well.

Bitvector Improvements

Bitvectors occur frequently in program verification. We have been focusing on making Lean's bitvector type more ergonomic to work with and fleshing out the standard library with all the expected operations and theorems. Starting in Lean 4.9, the bitvector literal notation is always available without requiring any imports. The simplifier's support for left and right shifts, where the amount shifted by is a literal, has also been greatly improved by the addition of a custom simproc. Our growing theory of bitvectors also makes it easier to interface with lightning-fast external solvers, because tools like LeanSAT need a sufficiently large library of provide verified translations from Lean expressions to the formats expected by solvers, as well as to reconstruct the solvers' proofs in Lean.

Improved Diagnostics

Kernel Performance Diagnostics

Lean's trusted kernel now reports the number of times it unfolds each definition using the diagnostic counters framework that was introduced in Lean 4.8. This makes it easier to diagnose performance issues that arise in the final proof verification step. Turning on the counters and setting a threshold of 5 in the following proof reveals the definitions that the kernel unfolds at least five times:

set_option diagnostics true in
set_option diagnostics.threshold 5 in
[reduction] unfolded reducible declarations (max: 38, num: 1):
    outParam ↦ 38[kernel] unfolded declarations (max: 56, num: 8):
    List.rec ↦ 56
  
  PProd.fst ↦ 48
  
  List.casesOn ↦ 29
  
  List.reverseAux.match_1 ↦ 29
  
  Append.append ↦ 8
  
  HAppend.hAppend ↦ 8
  
  String.casesOn ↦ 8
   String.rec ↦ 8use `set_option diagnostics.threshold <num>` to control threshold for reporting counterstheorem hello_world : s!"{hello}, {world}" = "Hello, world!" := by⊢ toString "" ++ toString hello ++ toString ", " ++ toString world ++ toString "" = "Hello, world!"
  rflAll goals completed! 🐙

The top three are List.rec (56 unfoldings), PProd.fst (48 unfoldings), and List.casesOn (29 unfoldings), most of which are internal details of the implementations of String and List.

In addition to diagnostic counters, the kernel's timeout mechanism for long-running computations is better-integrated with the rest of Lean. Lean uses a machine-independent measurement of time called heartbeats to terminate long-running computations deterministically. Heartbeats correspond roughly to memory allocations. The setting for the maximum number of heartbeats to attempt is now propagated to the kernel, making it easier to control Lean's overall resource use.

Improved Error Messages

Lean's parser features error recovery, and even declarations that contain syntax errors will frequently result in partial syntax trees. In Lean 4.9, when a declaration's signature is complete, but its body contains a syntax error, Lean is capable of reporting errors that occur while checking that the signature makes sense.

The error message that results from using a natural number literal for a type that does not implement the necessary instance has been improved, explaining to new users what must be done. For example,

example : String := failed to synthesize
  OfNat String 29
numerals are polymorphic in Lean, but the numeral `29` cannot be used in a context where the expected type is
  String
due to the absence of the instance above
use `set_option diagnostics true` to get diagnostic information29

now emits:

failed to synthesize
  OfNat String 29
numerals are polymorphic in Lean, but the numeral `29` cannot be used in a context where the expected type is
  String
due to the absence of the instance above
use `set_option diagnostics true` to get diagnostic information

Lake

Lake's new dependency tracking and monitoring infrastructure, which debuted in Lean 4.8, has received a number of refinements in Lean 4.9. Build product caching has been made more robust, the internal documentation has been extended, and package templates now include default README.md files. Additionally, support for test drivers in lake test has been extended (in particular, test drivers can now be called from dependencies and provided with arguments), and a new lake lint command provides a standardized place to run linters in Lean projects.

Build Artifact Archives

Lean 4.9 introduces the lake pack and lake unpack commands that pack and unpack Lake build artifacts from an archive. These archives can be distributed and re-used. Today, this feature enables organizations to share build products between team members more easily, and in the long term, it will enable cloud builds for packages.

Language Frontend and Tools

Line Ending Normalization

Lean and its tooling now consider all lines in source code to end with LF ('\n'), regardless of their actual line endings. This makes file hashes used in platform-neutral build caches work more reliably, even with version control systems that rewrite line endings in text files to platform-native conventions. It also improves the robustness of test suites for features that are sensitive to the precise contents of files.

Widget Messages

Wojciech Nawrocki contributed support for user widgets as part of messages. This allows rich representations of data to be shown in errors and warnings, or a fallback text in contexts that don't support widgets.

Substitution, Both Ways

If p is a proof of equality, then the syntax p ▸ e is used to substitute in the type of e using the equality. In prior versions of Lean, the expected type of the expression would be used to determine whether the left-hand side of the equality should be rewritten to the right-hand side, or vice versa. If no expected type was available, the left side would be rewritten to the right. In Lean 4.9, both directions are attempted even when there is no expected type available.

As a consequence, the following examples without type annotations both work in Lean 4.9.0, with the type Fin k inferred for both:

example (p : n = k) (x : Fin n) := p ▸ x
example (p : k = n) (x : Fin n) := p ▸ x

In Lean 4.8, the second example resulted in an error:

example (p : n = k) (x : Fin n) := p ▸ x
example (p : k = n) (x : Fin n) := invalid `▸` notation, the equality
  p
has type 
  k = n
but its left hand side is not mentioned in the type
  Fin np ▸ x

invalid `▸` notation, the equality
  p
has type
  k = n
but its left hand side is not mentioned in the type
  Fin n

Memory Layout Documentation

Mario Carneiro contributed documentation that describes the specific memory layout used for constructors of inductive types. This describes how to use Lean's inductive datatypes from C code, which is often necessary when writing bindings to C libraries or using Lean in a context that requires close integration with existing systems. In particular, accessing unboxed constructor arguments requires type-specific computation of offsets; the improved documentation describes the algorithm by which these offsets are computed.

Tactics

This release features improvements to the apply, split, omega, and simp tactics, including improved error messages and greater flexibility. The simp tactic now assists in diagnosing cases where its automatic indexing is too aggressive, which can aid in diagnosing tricky cases where simp lemmas are not being applied or when porting code from Lean 3, which used a simpler indexing strategy. Please refer to the release notes for more details.

Breaking changes

Nat.zero_or and Nat.or_zero have been swapped

These lemmas did not previously conform to Lean's naming conventions.

IsLawfulSingleton is now LawfulSingleton

This makes its name consistent with the other Lawful* classes.

BitVec.rotateLeft and BitVec.rotateRight now rotate modulo the bitwidth

In the past, specifying a rotation amount that was greater than the width of the bitvector could lead to unpredictable results. In Lean 4.9, the rotation amount is first taken modulo the width.

List.length_pos and Option.bind_eq_some are no longer simp lemmas.

These lemmas could lead to surprising results, so they are no longer automatically applied by the simplifier. You may need to add them to simp's arguments in proofs that rely on them.

Stricter elaboration of type annotations in let and have

Type annotations in let and have (both as expressions and as tactics), are now less affected by the inferred type of their assigned value. This improves performance, and can in particular lead to much faster production of type error messages. In particular, tactics embedded in TYPE will no longer make use of the type of VALUE in expressions such as let x : TYPE := VALUE; BODY.

Functions defined by well-founded recursion are irreducible by default

Existing proofs that hold by definitional equality (e.g., rfl) can be updated in a variety of ways:

They can explicitly unfold the function definition (using simp, unfold, rw).
The recursive function can be temporarily made semireducible (using unseal f in before the theorem).
The function definition itself can be marked as @[semireducible] to get the previous behavior.

The MessageData.ofPPFormat constructor has been removed

Its functionality has been split into two orthogonal operations:

To delay constructing structured messages until they are needed, please use MessageData.lazy
To embed Format or FormatWithInfos, use MessageData.ofFormatWithInfos.

An example migration can be found in Lean pull request #3929.

The MessageData.ofFormat constructor has been turned into a function

If you need to inspect MessageData, you can pattern-match on MessageData.ofFormatWithInfos.