## Diagram: Compiler-Guided Process Annotation for Formal Theorem Proving
### Overview
The image is a technical flowchart illustrating a system for translating informal mathematical statements into formally verified proofs using the Lean 4 proof assistant. The diagram depicts a multi-stage pipeline involving natural language processing, formalization, compilation, and verification, with feedback loops for error correction.
### Components/Axes
The diagram is organized into several interconnected blocks, each representing a component or data state. The primary components are:
1. **Informal / Natural Language** (Top-left box)
2. **Lean 4** (Top-center box)
3. **Compiled Feedback** (Top-right box)
4. **Informalizer** (Middle-left, with OpenAI logo)
5. **Autoformalizer** (Middle-center, with neural network icon)
6. **Lean 4 Compiler** (Middle-right, with Lean logo)
7. **Mathlib4 Pool** (Bottom-left box)
8. **Verified Lean 4** (Bottom-center box)
9. **Compiler-Guided Process Annotation** (Right-side box)
10. **Process-Supervised Verifier** (Bottom-right, with verification and network icons)
Arrows (solid and dashed) indicate the flow of data and control between these components.
### Detailed Analysis / Content Details
**1. Informal / Natural Language Box (Top-left)**
- **[Question]:** "The theorem states that when converting the minimum of two non-negative real numbers (notated as `R≥0`) to a real number (notated as `R`), it is the same as taking the minimum of those two numbers after each has been individually converted to a real number."
- **[Answer]:** "The proof uses a property of the function `NNReal.coe_mono`, specifically its ability to `map_min`, which ensures that the operation of finding the minimum between two numbers is preserved under the function that converts non-negative real numbers to real numbers."
**2. Lean 4 Box (Top-center)**
- **[Theorem Env]:**
```
lemma coe_mono : Monone ((↑) : R≥0 → R) :=
fun _ _ => NNReal.coe_le_coe.2
```
- **[Statement]:**
```
theorem coe_min :
((min (a : R≥0) b : R) : R) = min (a : R) b :=
```
- **[Proof]:** `NNReal.coe_mono.map_min`
**3. Compiled Feedback Box (Top-right)**
- Contains a structured error message:
- `'severity': 'error'`
- `'pos': {'line': 613, 'column': 3}`
- `'endPos': {'line': 613, 'column': 26}`
- `'data': 'type mismatch\n Monotone.map_min coe_mono has type\n ↑(min ?m.78355 ?m.78356) = min ↑?m.78355 ↑?m.78356 : Prop\n but is expected to have type\n min (↑a) b = min (↑a) b : Prop'`
**4. Process Flow (Middle Row)**
- **Informalizer:** Takes input from the "Mathlib4 Pool" and outputs to the "Informal / Natural Language" box. Represented by an OpenAI logo.
- **Autoformalizer:** Takes input from the "Informal / Natural Language" box and outputs to the "Lean 4" box. Represented by a neural network icon.
- **Lean 4 Compiler:** Takes input from the "Lean 4" box and outputs to the "Compiled Feedback" box. Represented by the Lean logo.
**5. Mathlib4 Pool Box (Bottom-left)**
- **[Statement]:**
```
theorem coe_min (x y : R≥0) :
((min x y : R≥0) : R) = min (x : R) (y : R) :=
```
- **[Proof]:** `NNReal.coe_mono.map_min`
**6. Verified Lean 4 Box (Bottom-center)**
- **[Theorem Env]:** (Identical to the one in the top-center Lean 4 box)
```
lemma coe_mono : Monone ((↑) : R≥0 → R) :=
fun _ _ => NNReal.coe_le_coe.2
```
- **[Statement]:**
```
theorem coe_min (x y : R≥0) :
((min x y : R≥0) : R) = min (x : R) (y : R) :=
```
- **[Proof]:** `NNReal.coe_mono.map_min`
**7. Compiler-Guided Process Annotation Box (Right)**
- **[Question]:** "The theorem states that when converting the minimum of two ..." (Truncated version of the full question).
- **[Answer]:** "The proof uses a property of the function `NNReal.coe_mono`, specifically ..." (Truncated version of the full answer).
- **[Statement]:**
```
theorem coe_min :
((min (a : R≥0) b : R) : R) = min (a : R) b :=
```
- The line `((min (a : R≥0) b : R) : R) = min (a : R) b :=` is highlighted in **green**.
- **[Proof]:** `NNReal.coe_mono.map_min`
- The line `NNReal.coe_mono.map_min` is highlighted in **red**.
- **Legend:** A green box labeled "Correct steps" and a red box labeled "Incorrect steps".
**8. Process-Supervised Verifier (Bottom-right)**
- Represented by an icon combining a verification checklist (✓) and a neural network. It receives input from the "Compiler-Guided Process Annotation" and outputs to the "Verified Lean 4" box.
### Key Observations
1. **Two Representations of the Same Theorem:** The theorem `coe_min` appears in three forms:
- **Informal:** Natural language description.
- **Formal (Initial):** In the "Lean 4" and "Mathlib4 Pool" boxes, stated with explicit type annotations `(x y : R≥0)`.
- **Formal (Annotated):** In the "Compiler-Guided Process Annotation", stated with implicit variables `(a b)`.
2. **Error Identification:** The "Compiled Feedback" shows a `type mismatch` error. The "Compiler-Guided Process Annotation" visually identifies the proof line (`NNReal.coe_mono.map_min`) as an **incorrect step** (red highlight), while the theorem statement line is marked as a **correct step** (green highlight).
3. **Feedback Loop:** The diagram shows a closed-loop system: `Mathlib4 Pool -> Informalizer -> Informal Language -> Autoformalizer -> Lean 4 -> Compiler -> Compiled Feedback -> Compiler-Guided Annotation -> Process-Supervised Verifier -> Verified Lean 4 -> (back to Mathlib4 Pool?)`. This suggests an iterative process for refining formal proofs.
4. **Component Roles:** The "Autoformalizer" translates natural language to formal code. The "Lean 4 Compiler" checks the code and generates errors. The "Process-Supervised Verifier" uses the annotated feedback to produce a verified version.
### Interpretation
This diagram illustrates a sophisticated **human-in-the-loop or AI-in-the-loop system for automated theorem proving**. The core challenge is bridging the gap between intuitive, informal mathematical reasoning and the strict, syntactic requirements of a formal proof assistant like Lean 4.
The system works as follows:
1. A mathematical statement exists in a library (`Mathlib4 Pool`) and is described informally (`Informalizer`).
2. An AI model (`Autoformalizer`) attempts to convert this informal description back into formal Lean 4 code.
3. The Lean 4 compiler checks this code. In the example, it fails with a type mismatch error.
4. The error feedback is not just a raw message but is used to create a **process annotation**. This annotation breaks down the proof attempt into steps (statement, proof) and labels them as correct or incorrect based on the compiler's feedback.
5. A `Process-Supervised Verifier` (likely another AI model trained on verification steps) uses this granular, step-level feedback to understand *why* the proof failed and to guide the generation of a corrected, verified proof (`Verified Lean 4`).
The key innovation highlighted is moving beyond simple pass/fail compilation to **fine-grained, step-level supervision**. By annotating which specific part of the proof script is incorrect, the system provides a much richer learning signal for the verifier model, enabling it to correct errors more effectively. This represents a significant step towards more robust and reliable automated reasoning systems. The presence of the OpenAI logo on the "Informalizer" suggests the use of large language models as a component within this larger formal verification pipeline.