# Technical Document: Comparison of LLM Search and Verification Strategies
This document describes a diagram that compares three strategies for generating and verifying Large Language Model (LLM) outputs: **Best-of-N**, **Beam Search**, and **Lookahead Search**.
---
## 1. Global Legend (Footer)
The following key defines the visual components used across all three diagrams.
| Symbol | Description |
| :--- | :--- |
| **Purple Dashed Box** | **Apply Verifier**: Indicates a point in the process where a verification model evaluates the generated content. |
| **Solid Orange Circle** | **Full Solution**: Represents a completed response or path. |
| **Hollow Orange Circle** | **Intermediate solution step**: Represents a partial generation or a node in a search tree. |
| **Solid Green Circle** | **Selected by verifier**: A solution or step deemed high-quality by the reward model/verifier. |
| **Solid Red Circle** | **Rejected by verifier**: A solution or step deemed low-quality or incorrect by the verifier. |
---
## 2. Component Analysis
### Panel A: Best-of-N
**Header Text:** Best-of-N
**Instruction Box:** "Generate N full solutions, selecting the best one with the verifier"
* **Process Flow:**
1. Starts with a single **Question** (Blue box).
2. Four independent full solutions (N = 4 in the diagram) are generated in parallel, each running from the question to a final answer.
3. **Visual Trend:** The paths are long, continuous lines. Three lines are colored red (rejected) and one is colored green (selected).
4. **Verification:** The verifier (purple dashed box) is applied only at the very end of the generation process for each of the N solutions.
* **Footer Caption:** "Select the best final answer using the verifier"
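The Best-of-N flow above can be sketched in a few lines of Python. This is a minimal illustration, not the diagram authors' implementation: `best_of_n`, `make_toy_generator`, and `toy_verify` are hypothetical names, and the toy generator and verifier stand in for a real LLM sampler and a real verifier model.

```python
import random
from typing import Callable, List, Tuple


def best_of_n(
    question: str,
    generate: Callable[[str], str],
    verify: Callable[[str, str], float],
    n: int,
) -> Tuple[str, List[float]]:
    """Generate n full solutions, score each one once, return the top-scoring one."""
    solutions = [generate(question) for _ in range(n)]
    # The verifier only ever sees finished solutions, never partial steps.
    scores = [verify(question, solution) for solution in solutions]
    best_index = max(range(n), key=lambda i: scores[i])
    return solutions[best_index], scores


def make_toy_generator(seed: int) -> Callable[[str], str]:
    """Hypothetical stand-in for an LLM: returns a random labelled candidate."""
    rng = random.Random(seed)

    def generate(question: str) -> str:
        return f"candidate-{rng.randint(0, 9)}"

    return generate


def toy_verify(question: str, solution: str) -> float:
    """Hypothetical stand-in for a verifier: the trailing digit is the score."""
    return float(solution.rsplit("-", 1)[1])


if __name__ == "__main__":
    best, scores = best_of_n("What is 2 + 2?", make_toy_generator(0), toy_verify, n=4)
    print(best, scores)
```

Because each solution is generated independently, the `n` calls to `generate` could run in parallel; the sketch keeps them sequential for readability.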
### Panel B: Beam Search
**Header Text:** Beam Search
**Instruction Box:** "Select the top-N samples at each step using the PRM" (Process Reward Model)
* **Process Flow:**
1. Starts with a single **Question** (Blue box).
2. The generation is broken into discrete "Intermediate solution steps" (hollow circles).
3. **Visual Trend:** A tree-like structure where branching occurs at every level. At each horizontal level, multiple nodes are evaluated.
4. **Verification:** The verifier (purple dashed box) is applied at *every* intermediate step.
5. **Selection Logic:** Only the green nodes (selected) serve as the base for the next step of generation. Red nodes (rejected) are pruned and do not continue.
* **Footer Caption:** "Select the best final answer using the verifier"
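The step-level selection described above might look like the following sketch. It assumes a `propose` function that returns candidate next steps and a `prm` function that scores a partial path; both are hypothetical placeholders, as are the toy implementations used to exercise the loop.

```python
from typing import Callable, List, Tuple

Path = Tuple[str, ...]


def beam_search(
    question: str,
    propose: Callable[[str, Path], List[str]],
    prm: Callable[[str, Path], float],
    beam_width: int,
    max_steps: int,
) -> Path:
    """Keep the beam_width highest-scoring partial paths at every step."""
    beams: List[Path] = [()]
    for _ in range(max_steps):
        # Expand every surviving path by each proposed next step.
        candidates = [path + (step,) for path in beams for step in propose(question, path)]
        if not candidates:
            break
        # Score every intermediate candidate with the PRM and prune the rest.
        candidates.sort(key=lambda path: prm(question, path), reverse=True)
        beams = candidates[:beam_width]
    # Pick the best surviving path with the same scorer.
    return max(beams, key=lambda path: prm(question, path))


def toy_propose(question: str, path: Path) -> List[str]:
    """Hypothetical proposer: always offers the same three next steps."""
    return ["a", "b", "c"]


def toy_prm(question: str, path: Path) -> float:
    """Hypothetical PRM: rewards 'a' steps and penalises 'c' steps."""
    return float(path.count("a") - path.count("c"))


if __name__ == "__main__":
    print(beam_search("q", toy_propose, toy_prm, beam_width=2, max_steps=3))
```

Unlike Best-of-N, the scorer here is called on partial paths, so low-scoring branches stop consuming generation budget as soon as they are pruned.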
### Panel C: Lookahead Search
**Header Text:** Lookahead Search
**Instruction Box:** "Beam search, but at each step rollout k-steps in advance, using the PRM value at the end of the rollout to represent the value for the current step"
* **Process Flow:**
1. Starts with a single **Question** (Blue box).
2. **Rollout Mechanism:** From a candidate intermediate step, the system performs a "rollout" (indicated by a dashed black curved line) k steps ahead.
3. **Annotation 1:** "Rollout k-steps"
4. **Annotation 2:** "Propagate PRM value back to step" (indicated by a solid black arrow pointing from the future state back to the current decision node).
5. **Verification:** The verifier scores the *future* state (the end of the rollout), and that score is used as the value of the *current* step.
6. **Search Continuation:** "Continue Search from the top-N options" (indicated by solid green nodes at a lower level).
* **Visual Trend:** This is the most complex of the three structures: each current choice is scored by the simulated outcome of its rollout rather than by the step itself.
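A minimal sketch of the rollout-and-propagate idea follows. The names `rollout`, `lookahead_search`, `toy_propose`, and `toy_prm` are hypothetical, and the rollout policy (always take the first proposed step) is a simplifying assumption chosen to keep the example deterministic. The toy problem is built so that the step that looks best immediately leads to a poor outcome.

```python
from typing import Callable, Dict, List, Tuple

Path = Tuple[str, ...]


def rollout(question: str, path: Path, propose: Callable[[str, Path], List[str]], k: int) -> Path:
    """Extend path by up to k steps, taking the first proposed step each time (an assumption)."""
    for _ in range(k):
        options = propose(question, path)
        if not options:
            break
        path = path + (options[0],)
    return path


def lookahead_search(
    question: str,
    propose: Callable[[str, Path], List[str]],
    prm: Callable[[str, Path], float],
    beam_width: int,
    max_steps: int,
    k: int,
) -> Path:
    """Beam search in which each candidate is valued by the PRM score at the end of its rollout."""
    beams: List[Path] = [()]
    for _ in range(max_steps):
        candidates = [path + (step,) for path in beams for step in propose(question, path)]
        if not candidates:
            break
        # Score the rolled-out future state, then attribute that value to the current candidate.
        scored = [(prm(question, rollout(question, c, propose, k)), c) for c in candidates]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        beams = [candidate for _, candidate in scored[:beam_width]]
    return max(beams, key=lambda path: prm(question, path))


# Hypothetical toy problem: "trap" scores well now but ends badly.
TOY_NEXT: Dict[str, List[str]] = {"": ["trap", "slow"], "trap": ["dead"], "slow": ["win"]}
TOY_SCORES: Dict[Path, float] = {
    ("trap",): 0.9,
    ("slow",): 0.5,
    ("trap", "dead"): 0.1,
    ("slow", "win"): 1.0,
}


def toy_propose(question: str, path: Path) -> List[str]:
    return TOY_NEXT.get(path[-1] if path else "", [])


def toy_prm(question: str, path: Path) -> float:
    return TOY_SCORES.get(path, 0.0)


if __name__ == "__main__":
    print(lookahead_search("q", toy_propose, toy_prm, beam_width=1, max_steps=2, k=1))
    print(lookahead_search("q", toy_propose, toy_prm, beam_width=1, max_steps=2, k=0))
```

With `k=0` the rollout returns the candidate unchanged, so this sketch reduces to plain beam search and follows the "trap" branch; with `k=1` the poor future score is propagated back and the search keeps the "slow" branch instead.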
---
## 3. Summary of Differences
| Feature | Best-of-N | Beam Search | Lookahead Search |
| :--- | :--- | :--- | :--- |
| **Verification Frequency** | Once per solution (at the end) | At every intermediate step | At every step, applied to the end of a k-step rollout |
| **Granularity** | Full solution level | Step-by-step level | Step-by-step with future rollouts |
| **Computational Cost** | Low/Medium (N full generations, N verifier calls) | High (a verifier call for every candidate at every step) | Very High (a k-step rollout plus a verifier call for every candidate at every step) |
| **Pruning** | No pruning during generation | Immediate pruning of bad steps | Pruning based on predicted future value |
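To make the relative costs concrete, the sketch below counts generated steps and verifier calls under a deliberately simple accounting model. The model is an assumption, not something stated in the diagram: every solution has the same number of steps, beam and lookahead search score `n` candidates per step, and every rollout runs its full `k` steps (so the lookahead figure is an upper bound). The names `Cost`, `best_of_n_cost`, `beam_search_cost`, and `lookahead_cost` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cost:
    generated_steps: int
    verifier_calls: int


def best_of_n_cost(n: int, steps: int) -> Cost:
    # n full solutions, each scored once at the end.
    return Cost(generated_steps=n * steps, verifier_calls=n)


def beam_search_cost(n: int, steps: int) -> Cost:
    # n candidates generated and scored at every step.
    return Cost(generated_steps=n * steps, verifier_calls=n * steps)


def lookahead_cost(n: int, steps: int, k: int) -> Cost:
    # Each candidate also pays for a k-step rollout before it is scored.
    return Cost(generated_steps=n * steps * (1 + k), verifier_calls=n * steps)


if __name__ == "__main__":
    n, steps, k = 4, 5, 2
    print("best-of-n:", best_of_n_cost(n, steps))
    print("beam:     ", beam_search_cost(n, steps))
    print("lookahead:", lookahead_cost(n, steps, k))
```

Under these assumptions, beam search spends its extra budget on verifier calls, while lookahead search additionally multiplies generation cost by `1 + k`.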