## Radar Chart: Scaling Ability
### Overview
The image presents a radar chart (also known as a spider chart or star chart) titled "Scaling ability". It compares four settings (Zero-shot, DreamPRM@2, DreamPRM@4, and DreamPRM@8) on five benchmarks: MathVista, MathVision, WeMath, MMStar, and MMVet. Each setting is drawn as one polygon, and a point farther from the center indicates a higher score on that benchmark.
### Components/Axes
* **Title:** "Scaling ability" (centered at the top)
* **Axis Labels:** MathVista, MathVision, WeMath, MMStar, MMVet (one benchmark per axis, arranged clockwise around the outer edge of the chart).
* **Legend:** Located at the bottom of the chart, with one entry per setting:
    * Zero-shot (Orange)
    * DreamPRM@2 (Red)
    * DreamPRM@4 (Pink)
    * DreamPRM@8 (Dark Grey)
* **Radial Axes:** Five axes, one per benchmark. No numerical scale is printed along the axes; instead, the value is labeled at each plotted point.
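
The layout described above (five radial axes, one polygon per legend entry) can be sketched in a few lines of Python. This is a minimal sketch, not the code used to produce the figure: the helper name `radar_vertices`, the equal angular spacing, the starting angle at the top, the clockwise direction, and the linear radial scale with an outer radius of 100 are all assumptions, since the chart prints no numerical scale. The example values are the approximate Zero-shot readings given in this description.

```python
import math


def radar_vertices(values, r_max, start_deg=90.0, clockwise=True):
    """Map one polygon's values to (x, y) points on equally spaced axes."""
    n = len(values)
    step = 360.0 / n  # angular spacing between neighbouring axes
    points = []
    for i, v in enumerate(values):
        angle_deg = start_deg - i * step if clockwise else start_deg + i * step
        theta = math.radians(angle_deg)
        r = v / r_max  # assumed linear scale: 0 at the center, r_max at the rim
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points


# Approximate Zero-shot readings as transcribed in this description.
labels = ["MathVista", "MathVision", "WeMath", "MMStar", "MMVet"]
zero_shot = [68.9, 20.0, 57.4, 62.3, 61.4]

vertices = radar_vertices(zero_shot, r_max=100.0)
for name, (x, y) in zip(labels, vertices):
    print(f"{name:10s} x={x:+.3f} y={y:+.3f}")
```

Connecting the returned points in order, and closing the loop back to the first point, gives the polygon for one legend entry. Repeating the call with the other three sets of readings gives the remaining polygons.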
### Detailed Analysis
The chart displays a score for each benchmark under each setting. The values are plotted as points on the benchmark axes and connected by lines, forming one polygon per setting. All values below are approximate readings from the chart.
* **MathVista:**
    * Zero-shot: ~68.9
    * DreamPRM@2: ~66.5
    * DreamPRM@4: ~65.3
    * DreamPRM@8: ~60.0
    * Trend: Zero-shot is the highest. The score decreases at every step as k increases in DreamPRM@k, for a total drop of about 8.9 points at DreamPRM@8.
* **MathVision:**
    * Zero-shot: ~20.0
    * DreamPRM@2: ~55.9
    * DreamPRM@4: ~58.0
    * DreamPRM@8: ~60.3
    * Trend: Zero-shot is the lowest. The score jumps by about 35.9 points at DreamPRM@2 and keeps rising to ~60.3 at DreamPRM@8.
* **WeMath:**
    * Zero-shot: ~57.4
    * DreamPRM@2: ~51.7
    * DreamPRM@4: ~53.6
    * DreamPRM@8: ~54.5
    * Trend: Zero-shot is the highest. The score drops at DreamPRM@2 and then partially recovers with larger k, remaining about 2.9 points below Zero-shot at DreamPRM@8.
* **MMStar:**
    * Zero-shot: ~62.3
    * DreamPRM@2: ~59.3
    * DreamPRM@4: ~60.0
    * DreamPRM@8: ~66.5
    * Trend: The score dips below Zero-shot at DreamPRM@2 and DreamPRM@4, then rises to ~66.5 at DreamPRM@8, about 4.2 points above Zero-shot.
* **MMVet:**
    * Zero-shot: ~61.4
    * DreamPRM@2: ~55.9
    * DreamPRM@4: ~51.7
    * DreamPRM@8: ~53.6
    * Trend: Zero-shot is the highest. The score decreases through DreamPRM@4 and recovers slightly at DreamPRM@8, remaining about 7.8 points below Zero-shot.
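
The per-axis trends can be checked mechanically against the readings listed above. The following is a minimal sketch under stated assumptions: the values are the approximate readings from this description (not exact figures from the chart's source), and the names `READINGS` and `classify` are illustrative only.

```python
# Approximate readings transcribed from this description, ordered as
# Zero-shot, DreamPRM@2, DreamPRM@4, DreamPRM@8.
READINGS = {
    "MathVista":  [68.9, 66.5, 65.3, 60.0],
    "MathVision": [20.0, 55.9, 58.0, 60.3],
    "WeMath":     [57.4, 51.7, 53.6, 54.5],
    "MMStar":     [62.3, 59.3, 60.0, 66.5],
    "MMVet":      [61.4, 55.9, 51.7, 53.6],
}


def classify(values):
    """Label a sequence as increasing, decreasing, or non-monotonic."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    if all(d > 0 for d in diffs):
        return "increasing"
    if all(d < 0 for d in diffs):
        return "decreasing"
    return "non-monotonic"


trends = {name: classify(values) for name, values in READINGS.items()}
for name, trend in trends.items():
    print(f"{name:10s} {trend}")
```

On these readings only MathVista and MathVision change in a single direction across all four settings; the other three axes reverse direction at least once.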
### Key Observations
* MathVista scores highest under Zero-shot (~68.9), and its score falls at every DreamPRM@k step.
* MathVision shows the largest improvement under DreamPRM, rising from a very low Zero-shot score (~20.0) to ~60.3 at DreamPRM@8.
* MMStar is the only benchmark where the score first dips below Zero-shot and then ends above it, reaching ~66.5 at DreamPRM@8.
* Zero-shot is the best setting for MathVista, WeMath, and MMVet and the second best for MMStar, but it is by far the worst for MathVision.
* Relative to Zero-shot, DreamPRM@8 gives mixed results: it is higher on MathVision and MMStar but lower on MathVista, WeMath, and MMVet.
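
The observations above reduce to two simple computations on the readings: the change from Zero-shot to DreamPRM@8, and the best setting per axis. The sketch below assumes the approximate values transcribed in this description; `SETTINGS`, `READINGS`, `deltas`, and `best` are illustrative names, not part of any published code.

```python
SETTINGS = ["Zero-shot", "DreamPRM@2", "DreamPRM@4", "DreamPRM@8"]

# Approximate readings transcribed from this description, in SETTINGS order.
READINGS = {
    "MathVista":  [68.9, 66.5, 65.3, 60.0],
    "MathVision": [20.0, 55.9, 58.0, 60.3],
    "WeMath":     [57.4, 51.7, 53.6, 54.5],
    "MMStar":     [62.3, 59.3, 60.0, 66.5],
    "MMVet":      [61.4, 55.9, 51.7, 53.6],
}

# Change from Zero-shot (first entry) to DreamPRM@8 (last entry), in points.
deltas = {name: round(v[-1] - v[0], 1) for name, v in READINGS.items()}

# Setting with the highest reading on each axis.
best = {name: SETTINGS[v.index(max(v))] for name, v in READINGS.items()}

for name in READINGS:
    print(f"{name:10s} delta@8={deltas[name]:+5.1f}  best={best[name]}")
```

A positive delta marks an axis where DreamPRM@8 ends above Zero-shot; on these readings that holds only for MathVision and MMStar.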
### Interpretation
The radar chart shows how scores change across settings, from Zero-shot through DreamPRM@2, DreamPRM@4, and DreamPRM@8. On these readings, the best setting depends on the benchmark: MathVista scores highest under Zero-shot, while MathVision needs DreamPRM to reach a comparable level (from ~20.0 to ~60.3 at DreamPRM@8). MMStar at DreamPRM@8 stands out as the only case other than MathVision where DreamPRM ends above Zero-shot, which may point to a benefit specific to that benchmark and configuration. The differences across axes suggest that the benchmarks vary in how much they gain from DreamPRM, so the setting is better chosen per benchmark than assumed to help uniformly.