## Bar Chart: Number of Real-World Verifiable SWE Instances
### Overview
The chart compares the number of real-world verifiable software engineering (SWE) instances across different datasets/tools, categorized by Python-only and Multilingual implementations. The data is presented as grouped bars, with Python-only in blue and Multilingual in orange. The y-axis represents instance counts, while the x-axis lists specific SWE frameworks/datasets.
### Components/Axes
- **Title**: "Number of Real-World Verifiable SWE Instances"
- **X-Axis Labels**:
  - SWE-Bench
  - SWE-Gym
  - Multi-SWE-RL
  - SWE-rebench
  - DeepSeek-V3.2
  - CWM
  - MiMo-V2-Flash
  - SWE-Universe (Ours)
- **Y-Axis**: Instance counts (logarithmic scale implied by spacing)
- **Legend** (bar colors):
  - Blue = Python-only
  - Orange = Multilingual
### Detailed Analysis
| Dataset/Tool | Python-only | Multilingual |
|-----------------------|-------------|--------------|
| SWE-Bench | 2,294 | 2,438 |
| SWE-Gym | 2,438 | 4,723 |
| Multi-SWE-RL | 4,723 | 21,000 |
| SWE-rebench | 21,000 | 24,667 |
| DeepSeek-V3.2 | 24,667 | 35,000 |
| CWM | 35,000 | 90,000 |
| MiMo-V2-Flash | 90,000 | 807,693 |
| SWE-Universe (Ours) | N/A | 807,693 |
### Key Observations
- **Multilingual vs. Python-only**: In every row that reports both values, the Multilingual count exceeds the Python-only count. The ratio ranges from about 1.06x (SWE-Bench: 2,438 vs. 2,294) to about 8.97x (MiMo-V2-Flash: 807,693 vs. 90,000) and does not rise steadily from row to row. No ratio can be computed for SWE-Universe (Ours), which has no Python-only value.
- **Largest count**: SWE-Universe (Ours) reports 807,693 Multilingual instances, the highest value in the table.
- **Outlier**: At 807,693, SWE-Universe (Ours) is about 8.97x the 90,000 figure listed for MiMo-V2-Flash, the next-largest value in the table.
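
The ratios quoted in these observations can be rechecked directly from the table. The following is a minimal sketch, assuming the table values are transcribed correctly from the chart; the names `counts` and `ratios` are illustrative and do not come from the source.

```python
# Counts copied from the table above as (Python-only, Multilingual).
# None stands for the "N/A" cell. All names here are illustrative.
counts = {
    "SWE-Bench": (2_294, 2_438),
    "SWE-Gym": (2_438, 4_723),
    "Multi-SWE-RL": (4_723, 21_000),
    "SWE-rebench": (21_000, 24_667),
    "DeepSeek-V3.2": (24_667, 35_000),
    "CWM": (35_000, 90_000),
    "MiMo-V2-Flash": (90_000, 807_693),
    "SWE-Universe (Ours)": (None, 807_693),
}


def ratios(table):
    """Return Multilingual / Python-only for each row that reports both values."""
    return {
        name: multi / py
        for name, (py, multi) in table.items()
        if py is not None and multi is not None
    }


row_ratios = ratios(counts)
for name, ratio in row_ratios.items():
    print(f"{name}: {ratio:.2f}x")

# SWE-Universe's 807,693 relative to the 90,000 listed for MiMo-V2-Flash.
jump = counts["SWE-Universe (Ours)"][1] / counts["MiMo-V2-Flash"][0]
print(f"SWE-Universe vs. 90,000: {jump:.2f}x")
```

Rows with an "N/A" cell are skipped rather than treated as zero, so SWE-Universe (Ours) has no entry in `row_ratios`.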
### Interpretation
The data suggests that Multilingual instance sets are larger than Python-only ones in the sources shown. The chart reports instance counts only, so it does not by itself show that Multilingual approaches are more effective. SWE-Universe (Ours) reaches 807,693 Multilingual instances, roughly nine times the 90,000 figure listed for MiMo-V2-Flash, which points to a substantial gain in scale. Here "Multilingual" refers to coverage of multiple programming languages rather than Python alone. The absence of a Python-only value for SWE-Universe could indicate either that no Python-only subset is reported or that the dataset deliberately focuses on Multilingual coverage.