# Model Performance Comparison
## Chart Components
- **Title**: Model Performance Comparison
- **X-Axis**: Model Number (1–22)
- **Y-Axis**: Score (%) (0–100)
- **Legend**: Located in the top-right corner
  - **Red squares**: GPQA Diamond
  - **Blue circles**: MMLU
  - **Cyan diamonds**: Humanity's Last Exam
## Data Series Analysis
### GPQA Diamond (Red Squares)
- **Trend**: Overall upward trajectory from 30% (Model 1) to 90% (Model 22), interrupted by sharp dips at Model 7 (60%) and Model 10 (50%)
- **Key Points**:
  - Model 1: 30%
  - Model 2: 35%
  - Model 3: 48%
  - Model 4: 40%
  - Model 5: 70%
  - Model 6: 78%
  - Model 7: 60%
  - Model 8: 78%
  - Model 9: 79%
  - Model 10: 50%
  - Model 11: 65%
  - Model 12: 67%
  - Model 13: 72%
  - Model 14: 80%
  - Model 15: 81%
  - Model 16: 83%
  - Model 17: 84%
  - Model 18: 81%
  - Model 19: 85%
  - Model 20: 87%
  - Model 21: 89%
  - Model 22: 90%
### MMLU (Blue Circles)
- **Trend**: Sharp initial rise from 70% (Model 1) to 86% (Model 2), then fluctuation between 79% and 92%, with peaks of 92% at Models 6 and 8 and a final value of 89%
- **Key Points**:
  - Model 1: 70%
  - Model 2: 86%
  - Model 3: 86%
  - Model 4: 82%
  - Model 5: 88%
  - Model 6: 92%
  - Model 7: 85%
  - Model 8: 92%
  - Model 9: 79%
  - Model 10: 80%
  - Model 11: 88%
  - Model 12: 90%
  - Model 13: 86%
  - Model 14: 87%
  - Model 15: 88%
  - Model 16: 89%
  - Model 17: 90%
  - Model 18: 81%
  - Model 19: 85%
  - Model 20: 90%
  - Model 21: 88%
  - Model 22: 89%
### Humanity's Last Exam (Cyan Diamonds)
- **Trend**: Gradual rise from 10% (Model 1) to 42% (Model 22), with dips at Models 9, 17, and 21 and the largest single-step gains late in the series (Models 20 and 22)
- **Key Points**:
  - Model 1: 10%
  - Model 2: 12%
  - Model 3: 14%
  - Model 4: 16%
  - Model 5: 18%
  - Model 6: 20%
  - Model 7: 22%
  - Model 8: 25%
  - Model 9: 19%
  - Model 10: 21%
  - Model 11: 23%
  - Model 12: 25%
  - Model 13: 27%
  - Model 14: 29%
  - Model 15: 31%
  - Model 16: 35%
  - Model 17: 30%
  - Model 18: 32%
  - Model 19: 35%
  - Model 20: 41%
  - Model 21: 36%
  - Model 22: 42%
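
The trend descriptions can be checked against the listed values rather than by eye. The following is a minimal sketch, assuming the key points above are transcribed exactly from the chart; the names `SERIES` and `summarize` are illustrative and do not come from the source.

```python
# Scores (%) for Models 1-22, transcribed from the key points above.
SERIES = {
    "GPQA Diamond": [30, 35, 48, 40, 70, 78, 60, 78, 79, 50, 65,
                     67, 72, 80, 81, 83, 84, 81, 85, 87, 89, 90],
    "MMLU": [70, 86, 86, 82, 88, 92, 85, 92, 79, 80, 88,
             90, 86, 87, 88, 89, 90, 81, 85, 90, 88, 89],
    "Humanity's Last Exam": [10, 12, 14, 16, 18, 20, 22, 25, 19, 21, 23,
                             25, 27, 29, 31, 35, 30, 32, 35, 41, 36, 42],
}


def summarize(scores):
    """Return first/last/min/max and the largest single-step drop for one series."""
    # Differences between consecutive models; steps[i] is the change into Model i + 2.
    steps = [b - a for a, b in zip(scores, scores[1:])]
    worst = min(steps)
    return {
        "first": scores[0],
        "last": scores[-1],
        "min": min(scores),
        "max": max(scores),
        # Size of the largest drop in percentage points (0 if the series never falls).
        "largest_drop": -worst if worst < 0 else 0,
        # 1-based model number at which that drop lands (None if there is no drop).
        "drop_at_model": steps.index(worst) + 2 if worst < 0 else None,
    }


if __name__ == "__main__":
    for name, scores in SERIES.items():
        print(name, summarize(scores))
```

With the values as listed, the largest single-step drops are 29 points for GPQA Diamond (at Model 10), 13 points for MMLU (at Model 9), and 6 points for Humanity's Last Exam (at Model 9).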
## Spatial Grounding
- **Legend Position**: Top-right corner
- **Axis Markers**:
  - X-axis: Incremented by 1 (Model Numbers)
  - Y-axis: Incremented by 20 (0, 20, 40, 60, 80, 100)
## Source
- **Footer Text**: AI Model Performance Analysis, 2023
## Validation
- All legend markers match their corresponding data series:
  - Red squares mark the GPQA Diamond data points throughout
  - Blue circles mark the MMLU data points throughout
  - Cyan diamonds mark the Humanity's Last Exam data points throughout
- Trend descriptions checked against the listed values:
  - GPQA Diamond shows overall upward movement (30% to 90%), with dips at Models 7 and 10
  - MMLU rises early, then fluctuates between 79% and 92%
  - Humanity's Last Exam grows gradually (10% to 42%), with its largest gains late in the series
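
Checks like these can also be run mechanically on the transcribed values. The sketch below assumes the scores listed under Data Series Analysis and the 0–100 y-axis range; `check_series` and `coincident_models` are hypothetical helper names. It confirms that each series has one in-range score per model and lists the models where GPQA Diamond and MMLU have the same score, meaning their markers would overlap on the chart.

```python
# Scores (%) for Models 1-22, as listed under Data Series Analysis.
GPQA = [30, 35, 48, 40, 70, 78, 60, 78, 79, 50, 65,
        67, 72, 80, 81, 83, 84, 81, 85, 87, 89, 90]
MMLU = [70, 86, 86, 82, 88, 92, 85, 92, 79, 80, 88,
        90, 86, 87, 88, 89, 90, 81, 85, 90, 88, 89]
HLE = [10, 12, 14, 16, 18, 20, 22, 25, 19, 21, 23,
       25, 27, 29, 31, 35, 30, 32, 35, 41, 36, 42]


def check_series(scores, n_models=22, y_min=0, y_max=100):
    """True if there is one score per model and every score lies on the y-axis."""
    return len(scores) == n_models and all(y_min <= s <= y_max for s in scores)


def coincident_models(a, b):
    """1-based model numbers at which two series have exactly the same score."""
    return [i for i, (x, y) in enumerate(zip(a, b), start=1) if x == y]


if __name__ == "__main__":
    print(all(check_series(s) for s in (GPQA, MMLU, HLE)))
    print(coincident_models(GPQA, MMLU))
```

With the values as listed, GPQA Diamond and MMLU coincide at Models 9, 18, and 19, so the red and blue markers would overlap there; those points are worth rechecking against the original chart.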