\n
## Code Comparison Diagram: LLM vs. LLM + APOLLO Proof Scripts
### Overview
The image displays a side-by-side comparison of two Lean 4 code blocks. The left block, outlined in red, is labeled "LLM". The right block, outlined in green, is labeled "LLM + APOLLO". Both blocks contain code to prove the same mathematical theorem (`imo_1983_p6`), but they employ different proof tactics, suggesting a comparison of proof strategies or the effect of an additional tool (APOLLO).
### Components/Axes
The image is divided into two primary, vertically-aligned rectangular regions:
1. **Left Region (LLM):** Bordered in red. Contains a complete Lean 4 proof script.
2. **Right Region (LLM + APOLLO):** Bordered in green. Contains a complete Lean 4 proof script.
Each region contains the following structural components:
* **Header Label:** A colored box at the top center of each region ("LLM" in red, "LLM + APOLLO" in green).
* **Code Block:** Monospaced text representing Lean 4 code.
* **Highlighted Proof Tactic:** The final `nlinarith` tactic call is highlighted with a background color matching the region's border (light red on the left, light green on the right). A small tag `#1` appears at the end of the first line of this tactic in both blocks.
### Detailed Analysis / Content Details
**1. Common Code Structure (Both Blocks):**
Both scripts share identical setup and theorem statements:
```lean
import Mathlib
import Aesop
set_option maxHeartbeats 0
open BigOperators Real Nat Topology Rat
theorem imo_1983_p6_... (a b c : ℝ) (h₀ : 0 < a ∧ 0 < b ∧ 0 < c) (h₁ : c < a + b) (h₂ : b < a + c) (h₃ : a < b + c) : 0 ≤ a ^ 2 * b * (a - b) + b ^ 2 * c * (b - c) + c ^ 2 * a * (c - a) := by
```
* **Imports:** `Mathlib`, `Aesop`.
* **Option:** `maxHeartbeats` set to 0 (disabling the timeout).
* **Opened Namespaces:** `BigOperators`, `Real`, `Nat`, `Topology`, `Rat`.
* **Theorem Hypotheses:** `a, b, c` are positive real numbers (`h₀`), and they satisfy the triangle inequalities (`h₁`, `h₂`, `h₃`).
* **Theorem Conclusion:** A specific cyclic inequality involving squares and products of `a, b, c`.
**2. Divergent Proof Tactics (The Key Difference):**
The proof scripts differ entirely in the tactic used after `:= by`.
* **Left Block (LLM) Proof Tactic:**
```lean
nlinarith [sq_nonneg (a - b), sq_nonneg (b - c),
sq_nonneg (c - a), sq_nonneg (a - b),
sq_nonneg (b - c), sq_nonneg (c - a),
mul_pos h₀.left h₀.right.left,
mul_pos h₀.left h₀.right.right,
mul_pos h₀.right.left h₀.right.right]
```
* **Tactic:** `nlinarith` (non-linear arithmetic).
* **Provided Facts:** A list of 9 facts:
1. `sq_nonneg (a - b)` (repeated twice)
2. `sq_nonneg (b - c)` (repeated twice)
3. `sq_nonneg (c - a)` (repeated twice)
4. `mul_pos h₀.left h₀.right.left` (product of `a>0` and `b>0` is positive)
5. `mul_pos h₀.left h₀.right.right` (product of `a>0` and `c>0` is positive)
6. `mul_pos h₀.right.left h₀.right.right` (product of `b>0` and `c>0` is positive)
* **Right Block (LLM + APOLLO) Proof Tactic:**
```lean
nlinarith [sq_nonneg (a - b), sq_nonneg (b - c),
sq_nonneg (c - a),
mul_pos (sub_pos.mpr h₁)
(sub_pos.mpr h₂), mul_pos
(sub_pos.mpr h₂)
(sub_pos.mpr h₃), mul_pos (sub_pos.mpr h₃)
(sub_pos.mpr h₁)]
```
* **Tactic:** `nlinarith` (non-linear arithmetic).
* **Provided Facts:** A list of 6 facts:
1. `sq_nonneg (a - b)`
2. `sq_nonneg (b - c)`
3. `sq_nonneg (c - a)`
4. `mul_pos (sub_pos.mpr h₁) (sub_pos.mpr h₂)` (product of `(a+b-c)>0` and `(a+c-b)>0` is positive)
5. `mul_pos (sub_pos.mpr h₂) (sub_pos.mpr h₃)` (product of `(a+c-b)>0` and `(b+c-a)>0` is positive)
6. `mul_pos (sub_pos.mpr h₃) (sub_pos.mpr h₁)` (product of `(b+c-a)>0` and `(a+b-c)>0` is positive)
### Key Observations
1. **Identical Goal:** Both proofs aim to establish the same inequality under the same hypotheses.
2. **Proof Strategy Divergence:** The core difference lies in the auxiliary facts supplied to the `nlinarith` automation tactic.
3. **Fact Selection - LLM:** The LLM proof uses a redundant set of non-negativity facts for squares (`sq_nonneg`) and basic positivity facts derived directly from `h₀` (`a,b,c > 0`).
4. **Fact Selection - LLM + APOLLO:** The APOLLO-assisted proof uses a more targeted set. It includes the three necessary `sq_nonneg` facts without repetition. Crucially, it derives positivity facts from the *triangle inequalities* (`h₁, h₂, h₃`) using `sub_pos.mpr`, which transforms `c < a + b` into `0 < a + b - c`. The supplied `mul_pos` facts are products of these derived positive terms.
5. **Conciseness:** The APOLLO version provides fewer total facts (6 vs. 9) and avoids redundancy.
### Interpretation
This diagram illustrates a comparison between two automated theorem proving approaches for a classic IMO problem. The "LLM" label suggests a baseline proof generated by a Large Language Model. The "LLM + APOLLO" label indicates a proof where the LLM's output is augmented or guided by a system named APOLLO.
The key insight is in the **quality and relevance of the auxiliary facts** fed to the `nlinarith` tactic. The LLM-generated proof uses a somewhat brute-force approach, supplying many basic, loosely related facts (including duplicates). In contrast, the APOLLO-assisted proof demonstrates a more sophisticated understanding of the problem structure. It correctly identifies that the triangle inequalities (`h₁, h₂, h₃`) are the critical source of positivity for the terms `(a+b-c)`, `(a+c-b)`, and `(b+c-a)`, which are central to the inequality's proof. By providing these derived facts, the APOLLO system likely makes the `nlinarith` tactic's job more efficient and direct.
Therefore, the image suggests that APOLLO enhances the LLM's capability by improving its **strategic selection of lemmas** or its ability to **perform intermediate reasoning steps** (like applying `sub_pos.mpr`) before invoking automation. This leads to a more elegant, concise, and logically focused proof script. The comparison highlights the difference between generating syntactically correct code and generating optimally informative code for automated provers.