## Diagram Type: Code Comparison Screenshot (Lean 4 Theorem Proving)
### Overview
This image displays a side-by-side comparison of two code snippets written in the Lean 4 theorem-proving language. The left panel, outlined in red and titled "LLM", shows a proof attempt generated by a standard Large Language Model. The right panel, outlined in green and titled "LLM + APOLLO", shows a corrected, much longer, and more rigorous proof. Specific lines of code in the left panel that likely failed or were inadequate are highlighted in red and numbered 1 through 9. Corresponding corrected blocks of code in the right panel are highlighted in green and carry the same numbers (1 through 9), demonstrating how the "APOLLO" system expanded or fixed the LLM's initial output.
### Components/Layout
* **Left Panel (Red):** Positioned on the left side, taking up about one-third of the image width. It has a red header reading "LLM". It contains a single column of code. Nine specific tactic calls (mostly `linarith` and `nlinarith`) are highlighted with a light red background and tagged with dark red numbered labels (`#1` to `#9`).
* **Right Panel (Green):** Positioned on the right side, taking up about two-thirds of the image width. It has a green header reading "LLM + APOLLO". Because the code is significantly longer, it is split into two vertical columns within the green box. It contains nine corresponding highlighted blocks with a light green background and dark green numbered labels (`#1` to `#9`).
### Content Details (Transcription)
*Note: The programming language is Lean 4. The text is transcribed exactly as it appears, with formatting added to denote the highlighted and numbered sections.*
#### Left Panel: LLM (Red)
```lean
import Mathlib
import Aesop
set_option maxHeartbeats 0
open BigOperators Real Nat Topology Rat
theorem mathd_algebra_184_llm
(a b : NNReal)
(h₀ : (0 : NNReal) < a ∧ (0 : NNReal) < b)
(h₁ : a ^ (2 : ℕ) = (6 : NNReal) * b)
(h₂ : a ^ (2 : ℕ) = (54 : NNReal) / b) :
a = (3 : NNReal) * (NNReal.sqrt : NNReal → NNReal) (2 : NNReal) := by
have h3 : (6 : NNReal) * b = (54 : NNReal) / b := by
rw [←h₁, ←h₂]
have hb : b = (3 : NNReal) := by
[RED HIGHLIGHT #1 START]
have h4 : b ≠ 0 := by #1
linarith [h₀.right]
[RED HIGHLIGHT #1 END]
have h5 : (6 : NNReal) * b ^ 2 = (54 : NNReal) := by
have h6 : b ≠ 0 := h4
field_simp at h3
[RED HIGHLIGHT #2 START]
nlinarith #2
[RED HIGHLIGHT #2 END]
[RED HIGHLIGHT #3 START]
have h7 : b ^ 2 = (9 : NNReal) := by #3
nlinarith
[RED HIGHLIGHT #3 END]
have h8 : b = (3 : NNReal) := by
have h9 : b > (0 : NNReal) := h₀.right
[RED HIGHLIGHT #4 START]
have h10 : b ^ 2 - (9 : NNReal) = (0 : NNReal) := by linarith #4
[RED HIGHLIGHT #4 END]
have h11 : (b - (3 : NNReal)) * (b + (3 : NNReal)) = (0 : NNReal) := by
ring_nf at h10 ⊢
[RED HIGHLIGHT #5 START]
linarith #5
[RED HIGHLIGHT #5 END]
cases' (mul_eq_zero.mp h11) with h12 h13
[RED HIGHLIGHT #6 START]
· -- b - 3 = 0 #6
linarith
[RED HIGHLIGHT #6 END]
[RED HIGHLIGHT #7 START]
· -- b + 3 = 0, but b > 0, contradiction #7
linarith [h9]
[RED HIGHLIGHT #7 END]
exact h8
have ha : a = (3 : NNReal) * (NNReal.sqrt : NNReal → NNReal) (2 : NNReal) := by
rw [hb] at h₁
have h14 : a ^ 2 = (18 : NNReal) := by
norm_num at h₁
[RED HIGHLIGHT #8 START]
linarith #8
[RED HIGHLIGHT #8 END]
have h15 : a = Real.sqrt (18 : NNReal) := by
rw [←h14]
field_simp [h₀.left]
have h16 : Real.sqrt (18 : NNReal) = (3 : NNReal) * Real.sqrt (2 : NNReal) := by
rw [Real.sqrt_eq_iff_sq_eq] <;> norm_num
<;> ring_nf <;> norm_num
<;> ring_nf <;> norm_num
[RED HIGHLIGHT #9 START]
rw [h15, h16] #9
[RED HIGHLIGHT #9 END]
exact ha
```
#### Right Panel: LLM + APOLLO (Green)
*(Note: Transcribed linearly, combining the two columns from top-left to bottom-left, then top-right to bottom-right)*
```lean
import Mathlib
import Aesop
set_option maxHeartbeats 0
open BigOperators Real Nat Topology Rat
theorem mathd_algebra_184_apollo
(a b : NNReal)
(h₀ : (0 : NNReal) < a ∧ (0 : NNReal) < b)
(h₁ : a ^ (2 : ℕ) = (6 : NNReal) * b)
(h₂ : a ^ (2 : ℕ) = (54 : NNReal) / b) :
a = (3 : NNReal) * (NNReal.sqrt : NNReal → NNReal) (2 : NNReal) := by
have h3 : (6 : NNReal) * b = (54 : NNReal) / b := by
rw [←h₁, ←h₂]
have hb : b = (3 : NNReal) := by
[GREEN HIGHLIGHT #1 START]
have h4 : b ≠ 0 := by #1
have hb : (0 : NNReal) < b := h₀.right
exact ne_of_gt hb
[GREEN HIGHLIGHT #1 END]
have h5 : (6 : NNReal) * b ^ 2 = (54 : NNReal) := by
have h6 : b ≠ 0 := by
exact h4
field_simp at h3
[GREEN HIGHLIGHT #2 START]
have h5 : (6 : NNReal) * b ^ (2 : ℕ) = (54 : NNReal) := by #2
have h6 : b ^ (2 : ℕ) = b * b := by simp [pow_two]
rw [h6]
try norm_cast ; try norm_num ; try simp_all ; try ring_nf at * ; try native_decide ; try linarith ; try nlinarith
exact h3
exact h5
[GREEN HIGHLIGHT #2 END]
[GREEN HIGHLIGHT #3 START]
have h7 : b ^ 2 = (9 : NNReal) := by
have h6 : (6 : NNReal) * b ^ (2 : ℕ) = (54 : NNReal) := by
gcongr
have h7 : b ^ (2 : ℕ) = (9 : NNReal) := by #3
have h8 : (6 : NNReal) ≠ (0 : NNReal) := by
norm_num
have h9 : b ^ (2 : ℕ) = (54 : NNReal) / (6 : NNReal) := by
field_simp [h8]
try norm_cast ; try norm_num ; try simp_all ; try ring_nf at * ; try native_decide ; try linarith ; try nlinarith
exact h6
rw [h9]
norm_num
try norm_cast ; try norm_num ; try simp_all ; try ring_nf at * ; try native_decide ; try linarith ; try nlinarith
apply NNReal.eq
norm_num
exact h7
[GREEN HIGHLIGHT #3 END]
[GREEN HIGHLIGHT #5 START]
have h_real : (↑b : ℝ) ^ 2 = (3 : ℝ) ^ 2 := by
rw [← NNReal.coe_pow, h7]
norm_num
have hb_real : (↑b : ℝ) = 3 := by
simp_all only [ne_eq, NNReal.zero_le_coe, ofNat_nonneg, OfNat.ofNat_ne_zero, not_false_eq_true, pow_left_inj₀]
have hb : b = 3 := by
apply NNReal.eq
exact hb_real
have h8 : b = (3 : NNReal) := by
have h9 : b > (0 : NNReal) := by
exact pos_of_ne_zero h4
[GREEN HIGHLIGHT #5 END]
[GREEN HIGHLIGHT #4 START]
have h10 : b ^ 2 - (9 : NNReal) = (0 : NNReal) := by #4
simp_all only [and_true, ne_eq, gt_iff_lt, tsub_self]
[GREEN HIGHLIGHT #4 END]
have h11 : (b - (3 : NNReal)) * (b + (3 : NNReal)) = (0 : NNReal) := by
ring_nf at h10 ⊢
[GREEN HIGHLIGHT #5 CONTINUES]
have hb : b = (3 : NNReal) := by #5
have h7' : b ^ 2 = (9 : NNReal) := by
simpa using h7
have h9' : b > (0 : NNReal) := by gcongr
have h11 : b = (3 : NNReal) := by
have h12 : b ^ 2 - (9 : NNReal) = (0 : NNReal) := by
gcongr
have h13 : (b - (3 : NNReal)) * (b + (3 : NNReal)) = (0 : NNReal) := by
ring_nf at h12 ⊢
try norm_cast ; try norm_num ; try simp_all ; try ring_nf at * ; try native_decide ; try linarith ; try nlinarith
...
[GREEN HIGHLIGHT #5 ENDS]
*(Moving to Column 2)*
cases' (mul_eq_zero.mp h13) with h14 h15
· try norm_cast ; try norm_num ; try simp_all ; try ring_nf at * ; try native_decide ; try linarith ; try nlinarith
· have h17 : b ≤ (0 : NNReal) := by
simp_all only [and_true, ne_eq, gt_iff_lt, tsub_self, mul_zero, AddLeftCancelMonoid.add_eq_zero, OfNat.ofNat_ne_zero, and_self]
simp_all only [and_true, ne_eq, gt_iff_lt, tsub_self, mul_zero, AddLeftCancelMonoid.add_eq_zero, OfNat.ofNat_ne_zero, and_self]
exact h11
rw [hb]
norm_num
[GREEN HIGHLIGHT #6 START]
cases' (mul_eq_zero.mp h11) with h12 h13
· have h13 : b ^ 2 = 9 := by #6
simpa using h7
have h14 : b = 3 := by
have h15 : b ^ 2 - 9 = 0 := by
gcongr
have h16 : (b - 3) * (b + 3) = 0 := by
ring_nf at h15 ⊢
simp_all only [and_true, ne_eq, gt_iff_lt, tsub_self, zero_mul, mul_zero, add_zero]
cases' (mul_eq_zero.mp h16) with h17 h18
· try norm_cast ; try norm_num ; try simp_all ; try ring_nf at * ; try native_decide ; try linarith ; try nlinarith
· simp_all only [and_true, ne_eq, gt_iff_lt, tsub_self, mul_zero, AddLeftCancelMonoid.add_eq_zero, OfNat.ofNat_ne_zero, and_self]
exact h14
[GREEN HIGHLIGHT #6 END]
[GREEN HIGHLIGHT #7 START]
· simp_all only [and_true, ne_eq, gt_iff_lt, tsub_self, mul_zero, AddLeftCancelMonoid.add_eq_zero, OfNat.ofNat_ne_zero, and_self] #7
[GREEN HIGHLIGHT #7 END]
exact h8
have ha : a = (3 : NNReal) * (NNReal.sqrt : NNReal → NNReal) (2 : NNReal) := by
rw [hb] at h₁
have h14 : a ^ 2 = (18 : NNReal) := by
norm_num at h₁
[GREEN HIGHLIGHT #8 START]
gcongr #8
[GREEN HIGHLIGHT #8 END]
have h15 : a = Real.sqrt (18 : NNReal) := by
rw [←h14]
field_simp [h₀.left]
have h16 : Real.sqrt (18 : NNReal) = (3 : NNReal) * Real.sqrt (2 : NNReal) := by
rw [Real.sqrt_eq_iff_sq_eq] <;> norm_num
<;> ring_nf <;> norm_num
<;> ring_nf <;> norm_num
[GREEN HIGHLIGHT #9 START]
have h17 : (a : ℝ) = √18 := by #9
have h18 : (a : ℝ) = √(18 : ℝ) := by
rw [h15]
norm_num
rw [h18]
have h19 : √18 = 3 * √2 := by
calc
√18 = √(3^2 * 2) := by norm_num
_ = √(3^2) * √2 := by rw [Real.sqrt_mul (by norm_num)]
_ = 3 * √2 := by rw [Real.sqrt_sq (by norm_num)]
have h20 : (a : ℝ) = 3 * √2 := by
rw [h17, h19]
have h21 : a = (3 : NNReal) * (NNReal.sqrt : NNReal → NNReal) (2 : NNReal) := by
have h22 : (a : ℝ) = (3 : NNReal) * (NNReal.sqrt : NNReal → NNReal) (2 : NNReal) := by
simp [h20]
<;> rw [Real.sqrt_eq_iff_sq_eq] <;> norm_num
<;> ring_nf <;> norm_num
<;> nlinarith [Real.sq_sqrt (show (0 : ℝ) ≤ 2 by norm_num)]
exact_mod_cast h22
<;> nlinarith [h₀.left, show 0 ≤ (3 : NNReal) * √2 by
apply mul_nonneg
· norm_num
· apply Real.sqrt_nonneg]
exact h21
[GREEN HIGHLIGHT #9 END]
exact ha
```
### Key Observations
1. **Tactic Failure in LLM:** The base LLM (left) relies heavily on high-level, automated tactics like `linarith` (linear arithmetic) and `nlinarith` (non-linear arithmetic) to skip steps. For example, in `#2` and `#3`, it assumes `nlinarith` can solve equations involving squares and division directly.
2. **Explicit Expansion in APOLLO:** The APOLLO system (right) replaces these single-word tactic calls with extensive, step-by-step logical deductions. Where the LLM used 1 line (`nlinarith`), APOLLO uses 10-20 lines of explicit algebraic manipulation (`field_simp`, `gcongr`, `norm_num`).
3. **Shotgun Tactic Strings:** APOLLO frequently utilizes a robust fallback string of tactics to force a proof step through: `try norm_cast ; try norm_num ; try simp_all ; try ring_nf at * ; try native_decide ; try linarith ; try nlinarith`. This appears multiple times in the green code.
4. **Type Coercion Handling:** In highlight `#9`, the LLM attempts a simple rewrite (`rw [h15, h16]`) to handle square roots of Non-Negative Reals (`NNReal`). This likely fails due to type mismatch or missing lemmas. APOLLO replaces this with a massive block that explicitly coerces the variables to standard Reals (`ℝ`), performs the square root calculations (`√18 = 3 * √2`), and then casts the result back to `NNReal` using `exact_mod_cast`.
### Interpretation
This image serves as a technical demonstration of a neuro-symbolic AI system (APOLLO) designed to improve the theorem-proving capabilities of Large Language Models in formal verification environments like Lean 4.
Standard LLMs often "hallucinate" proof steps by assuming that powerful automated tactics (like `nlinarith`) will magically bridge the gap between two mathematical statements. However, Lean is a strict formal environment; if the context isn't perfectly prepared (e.g., types don't match exactly, or the non-linear step is too complex), the tactic fails, and the proof breaks.
The image demonstrates that APOLLO acts as an agentic wrapper or corrector. It takes the flawed, high-level "sketch" provided by the LLM (the red side) and systematically expands the failing steps into rigorous, atomic logical deductions (the green side). It handles the tedious, pedantic requirements of formal provers—such as explicit type coercions between `NNReal` and `ℝ`, handling both branches of a quadratic equation (`cases`), and applying specific algebraic normalizers (`ring_nf`, `gcongr`)—which the base LLM glossed over. The repeated use of the long `try...` tactic string suggests APOLLO uses a search-based or brute-force approach at the micro-level to ensure each small step compiles successfully.