## Diagram: Comparison of PRM Frameworks
### Overview
The image compares three problem-solving frameworks: **Traditional PRM**, **Two-stage Retrieval-enhanced Mechanism**, and **Retrieval PRM Framework**. Each framework is structured with system prompts, target questions, solution steps, and feedback mechanisms.
### Components
1. **Traditional PRM**
- **System Prompt**: "I want you to act as a math teacher..."
- **Target Question**: "How many seconds are in 5.5 minutes?"
- **Solution Steps**:
- Step 1: 5.5 minutes = 5 minutes + 0.5 minutes.
- Step 2: 5 minutes = 300 seconds.
- Step 3: 0.5 minutes = 30 seconds.
- **Target Step**: "Is that step correct?" with feedback (Yes: 0.9, No: 0.1).
2. **Two-stage Retrieval-enhanced Mechanism**
- **Question Pool**: Contains reference questions (e.g., "What is the equivalent number of seconds in 7.8 minutes?").
- **Reference Questions**:
- Reference Question 1: Process involves converting 7.8 minutes to seconds (7.8 × 60 = 468 seconds).
- Reference Question 2: Placeholder for additional examples.
- **New Step Pool**: Stores validated steps (e.g., "0.3 hours = 18 minutes").
3. **Retrieval PRM Framework**
- **Reference Questions**:
- Reference Question 1: "How many seconds are in 5.5 minutes?"
- Reference Question 2: Placeholder for additional examples.
- **Reference Steps**:
- Reference Step 1: "0.3 hours = 18 minutes" (correct).
- Reference Step 2: Placeholder for additional steps.
- **Feedback**: Emojis (one marking a correct step, one marking an incorrect step) and probabilities (Yes: 0.2, No: 0.8).
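
The conversions quoted in the diagram can be checked mechanically. The following is a minimal Python sketch, not something shown in the diagram: `minutes_to_seconds` is a hypothetical helper name, and `Fraction` is used only to keep the decimal inputs exact.

```python
from fractions import Fraction

SECONDS_PER_MINUTE = 60
MINUTES_PER_HOUR = 60


def minutes_to_seconds(minutes: str) -> Fraction:
    """Convert a decimal number of minutes (given as text) to seconds, exactly."""
    return Fraction(minutes) * SECONDS_PER_MINUTE


# Traditional PRM solution steps for "How many seconds are in 5.5 minutes?"
whole_part = minutes_to_seconds("5")       # Step 2: 5 minutes
fraction_part = minutes_to_seconds("0.5")  # Step 3: 0.5 minutes
total_seconds = whole_part + fraction_part

# Reference question and reference step quoted in the other panels
reference_seconds = minutes_to_seconds("7.8")
reference_minutes = Fraction("0.3") * MINUTES_PER_HOUR

print(whole_part, fraction_part, total_seconds)
print(reference_seconds, reference_minutes)
```

Splitting 5.5 minutes into 5 + 0.5 mirrors Step 1 of the diagram; converting 5.5 minutes directly gives the same total.
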
### Detailed Analysis
- **Traditional PRM**:
- The target question is solved via direct calculation (5.5 minutes = 330 seconds).
- Feedback shows high confidence in correctness (Yes: 0.9).
- **Two-stage Retrieval-enhanced Mechanism**:
- Uses a database of reference questions to guide problem-solving.
- Example process: 7.8 minutes × 60 = 468 seconds.
- **Retrieval PRM Framework**:
- Integrates reference steps (e.g., "0.3 hours = 18 minutes") to validate target steps.
- Feedback uses emojis and probabilities to indicate correctness.
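
One way to read the two-stage flow (Question Pool, then Reference Questions, then New Step Pool) is sketched below. This is an illustration under stated assumptions, not the method behind the diagram: the token-overlap (Jaccard) similarity, the names `tokens`, `jaccard`, and `retrieve`, and every pool entry other than the 7.8-minute question and the "0.3 hours = 18 minutes" step are hypothetical.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase word and number tokens; punctuation such as '?' and '=' is dropped."""
    return set(re.findall(r"[a-z0-9.]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity in [0, 1]; a stand-in for a real retriever."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def retrieve(query: str, pool: list[str], k: int) -> list[str]:
    """Return the k pool entries most similar to the query."""
    return sorted(pool, key=lambda item: jaccard(query, item), reverse=True)[:k]


# Hypothetical question pool: each question maps to its solution steps.
question_pool = {
    "What is the equivalent number of seconds in 7.8 minutes?": [
        "7.8 minutes = 7.8 * 60 seconds",
        "7.8 * 60 = 468 seconds",
    ],
    "How many minutes are in 0.3 hours?": ["0.3 hours = 18 minutes"],
    "What is the area of a circle with radius 2?": ["area = pi * 2 * 2"],
}

target_question = "How many seconds are in 5.5 minutes?"
target_step = "0.5 minutes = 30 seconds"

# Stage 1: retrieve reference questions similar to the target question.
reference_questions = retrieve(target_question, list(question_pool), k=2)

# Stage 2: build the new step pool from those questions, then retrieve steps.
new_step_pool = [step for q in reference_questions for step in question_pool[q]]
reference_steps = retrieve(target_step, new_step_pool, k=1)

print(reference_questions)
print(reference_steps)
```

Restricting the second stage to steps from the retrieved questions keeps the step pool small and on-topic, which is the main point of doing retrieval in two stages.
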
### Key Observations
1. **Feedback Mechanisms**:
- Traditional PRM uses binary feedback (Yes/No) with probabilities.
- Retrieval PRM Framework adds emojis alongside the Yes/No probabilities for more intuitive feedback.
2. **Reference Utilization**:
- Two-stage and Retrieval frameworks leverage reference questions/steps to enhance accuracy.
3. **Step Validation**:
- The Retrieval PRM Framework explicitly cross-references steps (e.g., "Is the target step correct?").
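
The Yes/No probabilities in the diagram can be turned into a step verdict as sketched below. This is a hedged illustration: the function name `step_verdict` and the 0.5 threshold are assumptions, since the diagram shows only the probabilities themselves.

```python
def step_verdict(p_yes: float, p_no: float, threshold: float = 0.5) -> tuple[str, float]:
    """Normalize the Yes/No probabilities and label the step.

    The threshold is an assumed cut-off, not a value taken from the diagram.
    """
    total = p_yes + p_no
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    score = p_yes / total
    label = "correct" if score >= threshold else "incorrect"
    return label, score


# Feedback values quoted in the diagram
traditional = step_verdict(0.9, 0.1)  # Traditional PRM panel
retrieval = step_verdict(0.2, 0.8)    # Retrieval PRM Framework panel

print(traditional[0], retrieval[0])
```
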
### Interpretation
- **Traditional PRM** relies on direct computation without external references, which may limit adaptability.
- **Retrieval-enhanced frameworks** integrate reference data, which may improve robustness and reduce errors in conversions (e.g., 7.8 minutes → 468 seconds).
- The use of emojis in the Retrieval PRM Framework suggests a user-friendly approach to feedback, potentially improving interpretability.
- The Two-stage mechanism's question pool and new step pool indicate a modular design for scalable problem-solving.
## Additional Notes
- **Language**: All text is in English.
- **Missing Data**: No numerical trends or charts are present; the diagram focuses on structural and procedural comparisons.
- **Spatial Grounding**:
- **Traditional PRM**: Left section with linear flow (System Prompt → Target Question → Solution Steps → Feedback).
- **Two-stage Retrieval**: Central section with bidirectional flow (Question Pool ↔ Reference Questions ↔ New Step Pool).
- **Retrieval PRM**: Right section with reference-question-driven feedback loop.