## Bar Chart: Tool Call Ratio Comparison Before and After Fine-tuning
### Overview
The chart compares tool call ratios (%) for four tools (Base Generator, Google Search, Web Search, Wikipedia Search) at two stages: "Step 0" (initial state) and "Step 32" (after fine-tuning). It also reports accuracy at each stage (19.2% at Step 0, 25.2% at Step 32, annotated as +6.21%) and the change in call ratio for each tool.
### Components/Axes
- **X-axis**: Labeled "Step 0" (left) and "Step 32" (right), representing pre- and post-fine-tuning states.
- **Y-axis**: Labeled "Tool Call Ratio (%)" with a range from 0 to 60.
- **Legend**: Top-left corner, color-coded:
  - Red: Base Generator
  - Green: Google Search
  - Blue: Web Search
  - Purple: Wikipedia Search
- **Annotations**:
  - "After Fine-tuning" arrow pointing from Step 0 to Step 32.
  - Accuracy metrics: "Acc:19.2%" (Step 0) and "Acc:25.2% (+6.21%)" (Step 32).
### Detailed Analysis
#### Step 0 (Pre-fine-tuning)
- **Base Generator**: 3.1% (red bar, bottom-left).
- **Google Search**: 38.7% (green bar, tallest in Step 0).
- **Web Search**: 18.4% (blue bar, mid-height).
- **Wikipedia Search**: 38.5% (purple bar, second-tallest).
#### Step 32 (Post-fine-tuning)
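- **Base Generator**: 0.9% (red bar, down 2.2 percentage points).
- **Google Search**: 38.6% (green bar, annotated change of -1.5; note that the listed Step 0 and Step 32 values differ by only 0.1 points, so one of these figures may be mistranscribed).
- **Web Search**: 23.6% (blue bar, up 5.2 percentage points).
- **Wikipedia Search**: 33.8% (purple bar, down 4.7 percentage points).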
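The per-tool changes can be recomputed from the bar values listed above as a quick consistency check. The following is a minimal sketch, assuming the transcribed values are the ones shown on the chart; the dictionary and function names are illustrative and not taken from the source.

```python
# Tool call ratios (%) as transcribed in this description (assumed accurate).
step_0 = {
    "Base Generator": 3.1,
    "Google Search": 38.7,
    "Web Search": 18.4,
    "Wikipedia Search": 38.5,
}
step_32 = {
    "Base Generator": 0.9,
    "Google Search": 38.6,
    "Web Search": 23.6,
    "Wikipedia Search": 33.8,
}


def point_changes(before, after):
    """Return the change in percentage points per tool, rounded to one decimal."""
    return {tool: round(after[tool] - before[tool], 1) for tool in before}


changes = point_changes(step_0, step_32)
for tool, delta in changes.items():
    print(f"{tool}: {delta:+.1f} percentage points")

# The ratios at each step would be expected to sum to roughly 100%.
print(f"Step 0 total: {sum(step_0.values()):.1f}%")
print(f"Step 32 total: {sum(step_32.values()):.1f}%")
```

With these inputs, the computed changes for Base Generator, Web Search, and Wikipedia Search match the annotated values, while Google Search comes out at -0.1 rather than -1.5. The two steps total 98.7% and 96.9%, so at least one transcribed bar value is probably off.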
### Key Observations
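1. **Accuracy Improvement**: Accuracy rose from 19.2% to 25.2% after fine-tuning. The chart annotates the gain as +6.21%, which reads as percentage points (the rounded endpoints differ by 6.0).
2. **Dominant Method**: Google Search is the most frequently called tool at both stages (38.7% → 38.6%), although at Step 0 it leads Wikipedia Search by only 0.2 points.
3. **Web Search Growth**: Web Search is the only tool whose share increased (+5.2 percentage points, from 18.4% to 23.6%).
4. **Declines**: Base Generator (-2.2 points, from 3.1% to 0.9%) and Wikipedia Search (-4.7 points, from 38.5% to 33.8%) both lost share after fine-tuning.
5. **Minor Fluctuations**: Google Search is annotated with a change of -1.5, although its listed bar values differ by only 0.1 points. Either way, it is the smallest shift of the four tools.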
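The changes above are absolute differences in percentage points. The same numbers look quite different as relative changes measured against the Step 0 value, which matters most for small shares such as the Base Generator. A short sketch of the distinction, using values from this description (the helper name is hypothetical):

```python
def relative_change(before, after):
    """Relative change in percent, measured against the starting value."""
    return (after - before) / before * 100.0


# (Step 0, Step 32) pairs as listed in this description.
pairs = {
    "Base Generator": (3.1, 0.9),
    "Web Search": (18.4, 23.6),
    "Wikipedia Search": (38.5, 33.8),
    "Accuracy": (19.2, 25.2),
}

relative = {name: relative_change(b, a) for name, (b, a) in pairs.items()}
for name, (b, a) in pairs.items():
    print(f"{name}: {a - b:+.1f} points absolute, {relative[name]:+.1f}% relative")
```

On these figures the Base Generator loses about 71% of its original share despite moving only 2.2 points, while Web Search gains about 28% and Wikipedia Search loses about 12%.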
### Interpretation
- **Fine-tuning Impact**: Accuracy rose by the annotated +6.21% after fine-tuning, and tool usage was redistributed at the same time. Web Search is the only tool that gained share, which may indicate that the fine-tuned model finds it more useful, though the chart alone does not establish why.
- **Declining Tools**: The drop in Base Generator and Wikipedia Search usage suggests the fine-tuned model relies on them less, possibly because it prefers alternatives such as Web Search. The chart does not show whether these tools became less effective or were simply called less often.
- **Google Search Stability**: Its share changes the least of the four tools, so it remains the most-called tool after fine-tuning. The annotated -1.5 does not match the listed bar values (38.7% and 38.6%), so the exact size of its decline is uncertain.
- **Anomaly**: The Base Generator falls by only 2.2 points in absolute terms, but that is roughly 70% of its already small share (3.1% → 0.9%). This is worth investigating: the chart does not indicate whether the model learned to avoid it or whether it was deliberately de-emphasized during fine-tuning.
This analysis highlights how fine-tuning reshaped tool prioritization, emphasizing Web Search’s growing role while maintaining Google Search’s dominance.