## Diagram: APOLLO Framework Workflow
### Overview
This image is a complex technical flowchart illustrating the "APOLLO Framework," a system designed to translate natural language mathematical problems into formally verified proofs using the Lean 4 theorem prover. The diagram demonstrates a multi-agent, iterative pipeline where an initial Large Language Model (LLM) output is systematically refined, stripped of informalities, segmented into logical gaps (using Lean's `sorry` macro), and solved by automated tools. When the automated tools fail, the system recursively generates sub-lemmas to bridge the gaps, ultimately compiling a complete, verified formal proof.
### Components/Axes
The diagram is spatially organized into a main processing area (enclosed in a large dashed rectangle) and input/output areas at the bottom.
**Columns (Processing Stages):**
1. **LLM (Red Boxes, Left Column):** Generates initial Lean code mixed with natural language comments and informal proof steps.
2. **Syntax Refiner (Brown Boxes, Second Column):** Strips natural language comments, leaving only Lean syntax. Highlights unverified or potentially faulty tactics in red text.
3. **Sorrifier (Yellow Boxes, Third Column):** Replaces the unverified/faulty tactics (red text) with numbered `sorry` placeholders (e.g., `sorry #1`), effectively isolating the logical steps that need formal verification.
4. **Auto Solver (Blue Boxes, Right Column):** Attempts to replace the `sorry` placeholders with valid Lean tactics (highlighted in blue).
**Rows (Logical Progression):**
* **Top Row:** The main theorem proof attempt.
* **Middle Row:** Generation and processing of "Lemma #5" (triggered by a failure in the main proof).
* **Bottom Row:** Generation and processing of "Lemma #6" (triggered by a failure in the main proof).
**Input/Output (Bottom Section):**
* **Formal Statement (Grey Box, Bottom Left):** The initial problem statement and setup.
* **LLM + APOLLO (Green Box, Bottom Right):** The final, successfully compiled and verified Lean 4 proof.
**Flow Indicators:**
* **Solid Black Arrows:** Indicate the forward progression of code through the pipeline (Left to Right).
* **Dashed Black Arrows:** Indicate feedback loops or structural dependencies. Notably, dashed arrows point from `sorry #5` and `sorry #6` in the top Auto Solver box down to the LLM boxes for the respective lemmas. Another dashed arrow points from the combined lemmas to the final output.
---
### Content Details (Transcription & Analysis)
#### 1. Input: Formal Statement (Bottom Left)
This box sets up the Lean environment and states the mathematical problem: Find $x^2 + y^2$ given the arithmetic mean is 7 and the geometric mean is $\sqrt{19}$.
```lean
-- Real numbers x and y have an arithmetic mean of 7 and a geometric mean of √19. Find x²+y².
Show that it is 158.-/
import Mathlib
import Aesop
set_option maxHeartbeats 0
set_option pp.numericTypes true
set_option pp.coercions.types true
open BigOperators Real Nat Topology Rat
theorem mathd_algebra_332 (x y : ℝ) (h₀ : (x + y) / 2 = (7 : ℝ)) (h₁ : Real.sqrt (x * y) = Real.sqrt (19 : ℝ)) :
x ^ 2 + y ^ 2 = (158 : ℝ) := by
```
#### 2. Main Proof Flow (Top Row)
**LLM (Red Box):**
Generates the initial proof attempt. It includes extensive natural language comments (highlighted in red in the image) explaining the math, interspersed with Lean code.
*Key extractions:*
* `have sum : x + y = (14 : ℝ) from by` ... `rw [← mul_div_cancel_left (two_ne_zero : (2 : ℝ) ≠ 0)]`
* `have prod_pos : 0 < x * y from by` ... `apply (sqrt_pos (19 : ℝ)).2`
* `have prod : x * y = (19 : ℝ) := by` ... `apply sqrt_inj`
* Ends with `rw [sum, prod]` and `ring`.
**Syntax Refiner (Brown Box):**
The comments are removed. The informal or potentially incorrect tactic blocks are highlighted in red text.
*Key extractions (Red text):*
* `from by`
* `rw [← mul_div_cancel_left (two_ne_zero : (2 : ℝ) ≠ 0)]`
* `from by`
* `apply (sqrt_pos (19 : ℝ)).2`
* `by`
* `apply sqrt_inj`
* `by`
* `rw [sum, prod]`
* `ring`
**Sorrifier (Yellow Box):**
The red text from the previous step is crossed out (strikethrough) and replaced with numbered `sorry` tags highlighted in orange.
*Key extractions:*
* `have sum : x + y = (14 : ℝ) := by` -> `sorry #1`
* `have prod_pos : 0 < x * y := by` -> `sorry #2`
* `have h_pos_xy : 0 < sqrt (x * y) := by` -> `sorry #3`
* `have prod : x * y = (19 : ℝ) := by` -> `sorry #5`
* Final step -> `sorry #6`
**Auto Solver (Blue Box):**
Attempts to solve the `sorry` tags. Successful tactics are highlighted in blue.
*Key extractions:*
* `sorry #1` becomes `linarith #1`
* `sorry #2` becomes `simp_all only [Real.sqrt_pos, ofNat_pos] #2`
* `sorry #3` becomes `linarith #3`
* `sorry #4` becomes `exact Real.sqrt_pos.mp h_pos_xy #4`
* **Crucial Detail:** `sorry #5` and `sorry #6` remain unsolved (highlighted in blue text, but still say `sorry`). Dashed arrows point from these to the rows below.
#### 3. Lemma #5 Flow (Middle Row)
Triggered by the failure to solve `sorry #5` (proving $x * y = 19$).
**LLM (Red Box):**
Defines `lemma mathd_algebra_332_1`. Includes comments explaining how to prove that if $\sqrt{x*y} = \sqrt{19}$ and $x*y$ is positive, then $x*y = 19$.
*Key code:*
```lean
lemma mathd_algebra_332_1
(x y : ℝ)
(h₁ : √(x * y) = √(19 : ℝ))
(sum : x + y = (14 : ℝ))
(prod_pos : (0 : ℝ) < x * y)
(h₀ : True) :
x * y = (19 : ℝ) := by
```
**Syntax Refiner -> Sorrifier -> Auto Solver:**
Follows the same pattern. The Auto Solver successfully finds tactics for this lemma.
*Auto Solver (Blue Box) Output for Lemma 5:*
* `have h_nonneg : 0 ≤ x * y := by exact le_of_lt prod_pos`
* `have hs1 : (√(x * y)) * (√(x * y)) = x * y := by exact mul_self_sqrt h_nonneg #1`
* `have h_n19 : 0 ≤ (19 : ℝ) := by linarith`
* `have hs2 : (√(19 : ℝ)) * (√(19 : ℝ)) = 19 := by exact mul_self_sqrt h_n19 #2`
* `rw [←hs1]`
* `rw [h₁]`
* `rw [hs2]`
#### 4. Lemma #6 Flow (Bottom Row)
Triggered by the failure to solve `sorry #6` (the final algebraic expansion).
**LLM (Red Box):**
Defines `lemma mathd_algebra_332_2`. Proves $x^2 + y^2 = 158$ given $x+y=14$ and $x*y=19$.
*Key code:*
```lean
lemma mathd_algebra_332_2
(x y : ℝ)
(sum : x + y = (14 : ℝ))
(prod : x * y = (19 : ℝ))
(h₀ : True) :
x ^ 2 + y ^ 2 = (158 : ℝ) := by
```
**Syntax Refiner -> Sorrifier -> Auto Solver:**
The Auto Solver successfully resolves the algebraic steps.
*Auto Solver (Blue Box) Output for Lemma 6:*
* `have h1 : (x + y) ^ (2 : ℕ) = x ^ (2 : ℕ) + 2 * x * y + y ^ (2 : ℕ) := by ring`
* `rw [sum] at h1`
* `have h2 : x ^ (2 : ℕ) + y ^ (2 : ℕ) = (14 : ℝ) * (14 : ℝ) - 2 * (19 : ℝ) := by linarith`
* `norm_num at h2`
* `exact h2`
Below the Auto Solver for Lemma 6 is a grey box stating: **"Base Proof + Lemma #5 + Lemma #6"**, with a dashed arrow pointing to the final output.
#### 5. Output: LLM + APOLLO (Bottom Right Green Box)
This is the final, synthesized Lean 4 code. It combines the initial setup, the successfully solved steps from the main proof, and inlines the logic discovered during the lemma generation phases into a single, cohesive, verified proof.
```lean
import Mathlib
import Aesop
set_option maxHeartbeats 0
set_option pp.numericTypes true
set_option pp.coercions.types true
open BigOperators Real Nat Topology Rat
theorem mathd_algebra_332 (x y : ℝ) (h₀ : (x + y) / 2 = (7 : ℝ)) (h₁ : Real.sqrt (x * y) = Real.sqrt (19 : ℝ)) :
x ^ 2 + y ^ 2 = (158 : ℝ) := by
have sum : x + y = (14 : ℝ) := by
linarith
have prod_pos : 0 < x * y := by
have h_pos19 : 0 < sqrt (19 : ℝ) := by
simp_all only [Real.sqrt_pos, ofNat_pos]
have h_pos_xy : 0 < sqrt (x * y) := by
linarith
exact Real.sqrt_pos.mp h_pos_xy
have prod : x * y = (19 : ℝ) := by
have h_nonneg : 0 ≤ x * y := by
exact le_of_lt prod_pos
have hs1 : (√(x * y)) * (√(x * y)) = x * y := by
exact mul_self_sqrt h_nonneg
have h_n19 : 0 ≤ (19 : ℝ) := by
linarith
have hs2 : (√(19 : ℝ)) * (√(19 : ℝ)) = 19 := by
exact mul_self_sqrt h_n19
rw [←hs1]
rw [h₁]
rw [hs2]
have h1 : (x + y) ^ (2 : ℕ) = x ^ (2 : ℕ) + 2 * x * y + y ^ (2 : ℕ) := by ring
rw [sum] at h1
have h2 : x ^ (2 : ℕ) + y ^ (2 : ℕ) = (14 : ℝ) * (14 : ℝ) - 2 * (19 : ℝ) := by linarith
norm_num at h2
exact h2
```
---
### Key Observations
* **Iterative Refinement:** The system does not expect the LLM to write perfect Lean code on the first try. Instead, it uses the LLM for high-level logical structuring and relies on deterministic tools (Syntax Refiner, Sorrifier, Auto Solver) to handle the rigorous formalization.
* **The Power of `sorry`:** The framework ingeniously uses Lean's native `sorry` macro (which tells the compiler "assume this is true for now") as a boundary marker. By "sorrifying" dubious LLM code, the system isolates discrete sub-problems that an automated theorem prover can tackle independently.
* **Dynamic Lemma Generation:** The dashed arrows from `sorry #5` and `sorry #6` reveal a dynamic fallback mechanism. When the Auto Solver cannot bridge a logical gap in the main proof, the framework prompts the LLM to treat that specific gap as a new, standalone theorem (a lemma), restarting the refinement pipeline for that specific chunk of logic.
### Interpretation
This diagram illustrates a sophisticated neuro-symbolic architecture. It addresses a fundamental limitation of Large Language Models in formal mathematics: LLMs are good at intuitive leaps and natural language reasoning but struggle with the exact, unforgiving syntax and step-by-step rigor required by formal verification languages like Lean 4.
By breaking the process into specialized agents, the APOLLO Framework acts as a translator and verifier. The LLM provides the "skeleton" of the proof. The Syntax Refiner and Sorrifier act as a "compiler," identifying where the LLM's logic is informal or unverified. The Auto Solver acts as the "muscle," using brute-force or specialized tactics to fill in the formal steps.
The most powerful feature demonstrated here is the recursive lemma generation. By recognizing failure (`sorry #5` and `#6` remaining unsolved) and branching those failures into new LLM prompts, the system can dynamically break down complex mathematical leaps into smaller, digestible proofs, eventually synthesizing them into the flawless final output shown in the green box. This represents a significant step toward fully automated formal mathematical reasoning.