## Bar Chart: Tool Call Ratio After Finetuning
### Overview
The image is a bar chart comparing the tool call ratio (%) of four tools (Base Generator, Google Search, Web Search, and Wikipedia Search) at two training steps: Step 0 and Step 32, after finetuning. Each step is annotated with its accuracy (Acc), and each Step 32 bar is annotated with its change in tool call ratio relative to Step 0.
### Components/Axes
* **Title:** Tool Call Ratio (%)
* **X-axis:** Step (Step 0, Step 32)
* **Y-axis:** Tool Call Ratio (%) - Scale from 0 to 60
* **Legend:** Located at the top-left of the chart.
    * Red: Base Generator
    * Green: Google Search
    * Blue: Web Search
    * Purple: Wikipedia Search
* **Annotations:**
    * "After Finetuning" with an arrow pointing from Step 0 to Step 32.
    * Accuracy (Acc) at Step 0: 19.2%
    * Accuracy (Acc) at Step 32: 25.2% (+6.21%)
### Detailed Analysis
**Step 0:**
* **Base Generator (Red):** 3.1%
* **Google Search (Green):** 38.7%
* **Web Search (Blue):** 18.4%
* **Wikipedia Search (Purple):** 38.5%
**Step 32:**
* **Base Generator (Red):** 0.9% (-2.2%)
* **Google Search (Green):** 13.6% (-1.5%)
* **Web Search (Blue):** 13.6% (+5.2%)
* **Wikipedia Search (Purple):** 13.6% (-4.7%)
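
The listed Step 32 values can be cross-checked against the annotated changes with simple arithmetic. The sketch below is only an illustration: the dictionaries transcribe the numbers listed above, and the helper name `check_deltas` and its tolerance are hypothetical choices, not part of the chart.

```python
# Values transcribed from the lists above, in percent.
step0 = {
    "Base Generator": 3.1,
    "Google Search": 38.7,
    "Web Search": 18.4,
    "Wikipedia Search": 38.5,
}
step32 = {
    "Base Generator": 0.9,
    "Google Search": 13.6,
    "Web Search": 13.6,
    "Wikipedia Search": 13.6,
}
annotated_delta = {
    "Base Generator": -2.2,
    "Google Search": -1.5,
    "Web Search": 5.2,
    "Wikipedia Search": -4.7,
}


def check_deltas(before, after, delta, tol=0.05):
    """Map each tool to (value implied by before + delta, whether it matches after)."""
    report = {}
    for tool, start in before.items():
        implied = round(start + delta[tool], 1)
        report[tool] = (implied, abs(implied - after[tool]) <= tol)
    return report


report = check_deltas(step0, step32, annotated_delta)
for tool, (implied, matches) in report.items():
    status = "consistent" if matches else "inconsistent"
    print(f"{tool}: listed {step32[tool]}%, implied {implied}% ({status})")
```

Under this check only the Base Generator entry is consistent. For the three search tools, the Step 0 value plus the annotated change does not reproduce the listed 13.6%, so either the listed Step 32 values or the annotated changes were likely misread from the image.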
### Key Observations
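* Accuracy rose from 19.2% at Step 0 to 25.2% at Step 32. The chart annotates the gain as +6.21%, while the rounded endpoints differ by 6.0 percentage points.
* The listed bar values for Google Search and Wikipedia Search fall sharply, from 38.7% and 38.5% at Step 0 to 13.6% each at Step 32, although the annotated changes are only -1.5 and -4.7 points.
* Web Search is annotated with an increase of +5.2 points, but its listed bar value falls from 18.4% to 13.6%, so the listed values and the annotated changes disagree for all three search tools.
* The Base Generator ratio fell from 3.1% at Step 0 to 0.9% at Step 32 (-2.2 points), the only bar whose listed values match its annotated change.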
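
The accuracy annotation can be read as an absolute or a relative change. The following minimal sketch computes both from the rounded accuracies given in the chart annotations; the variable names are illustrative only.

```python
# Accuracies as annotated on the chart, in percent.
acc_step0 = 19.2
acc_step32 = 25.2

# Absolute change, in percentage points.
absolute_gain = round(acc_step32 - acc_step0, 2)

# Relative change, as a percent of the Step 0 accuracy.
relative_gain = round(100 * (acc_step32 - acc_step0) / acc_step0, 2)

print(f"absolute gain: {absolute_gain} points")
print(f"relative gain: {relative_gain}%")
```

Neither figure equals the annotated +6.21% exactly. The absolute gain is the closer of the two, so the annotation is most plausibly an absolute gain computed from unrounded accuracies, though the chart description does not confirm this.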
### Interpretation
The chart illustrates how finetuning changes the distribution of tool calls across the four tools. Accuracy improved after finetuning, and the mix of tool calls shifted at the same time. By the listed bar values, the three search tools converge on the same ratio (13.6%) at Step 32, which would mean large drops for Google Search and Wikipedia Search and a smaller drop for Web Search. By the annotated changes, Google Search and Wikipedia Search decline modestly while Web Search gains 5.2 points. Either reading suggests that finetuning altered the model's preference among the search tools, potentially toward a more balanced or effective mix. The Base Generator's ratio also fell, from 3.1% to 0.9%, which could indicate a shift toward the more specialized search tools.