## Chart Analysis: PKU Exam Performance Comparison
### Overview
The image compares AI models (GPT-4, o1-2024-12-17, deepseek-r1, o3-mini-2025-01-31, and gemini-2.5-pro) on PKU (Peking University) undergraduate and PhD qualifying exams, using two radar charts and one bar chart. The radar charts show per-subject performance on the undergraduate and PhD qualifying exams; the bar chart shows each model's average score on the undergraduate exam.
### Components/Axes
**Left Radar Chart (PKU Undergraduate Exam):**
* **Title:** PKU Undergraduate Exam
* **Axes:**
    * Theory of Functions of Complex Variables
    * Analytic Geometry
    * Advanced Algebra
    * Mathematical Analysis
    * Set Theory and Graph Theory
    * Abstract Algebra
    * Probability
    * Numerical Analysis
    * Numerical Linear Algebra
    * PDE
    * ODE
* **Scale:** 0 to 100, with increments of 20.
* **Legend (Top-Right):**
    * GPT-4 (Teal)
    * o1-2024-12-17 (Orange)
    * deepseek-r1 (Blue)
    * o3-mini-2025-01-31 (Pink)
    * gemini-2.5-pro (Lime Green)
**Center Bar Chart (PKU Undergraduate Exam - Average Score):**
* **Title:** PKU Undergraduate Exam
* **X-axis:** AI Models (GPT-4, o1-2024-12-17, deepseek-r1, o3-mini-2025-01-31, gemini-2.5-pro)
* **Y-axis:** Average Score, ranging from 0 to 100.
* **Bars:** Each bar represents the average score of a specific AI model. The colors of the bars match the colors in the radar chart legend.
**Right Radar Chart (PKU PhD Qualifying Exam):**
* **Title:** PKU PhD Qualifying Exam
* **Axes:**
    * Analysis
    * Geometry & Topology
    * Algebra
    * Probability
* **Scale:** 0 to 100, with increments of 20.
* **Legend (Top-Right):**
    * o3-mini-2025-01-31 (Orange)
### Detailed Analysis
**Left Radar Chart (PKU Undergraduate Exam):**
* **GPT-4 (Teal):** Generally scores between 20 and 40 across most subjects, with a peak around 60 in Advanced Algebra.
* **o1-2024-12-17 (Orange):** Scores vary more widely, peaking around 80 in Theory of Functions of Complex Variables and Analytic Geometry and dropping to roughly 20-40 in Probability and Numerical Analysis.
* **deepseek-r1 (Blue):** Shows a relatively consistent performance across subjects, mostly between 60 and 80.
* **o3-mini-2025-01-31 (Pink):** Similar to deepseek-r1, with scores generally between 60 and 80.
* **gemini-2.5-pro (Lime Green):** Outperforms the other models in most subjects, with scores generally between 80 and 100.
**Center Bar Chart (PKU Undergraduate Exam - Average Score):**
* **GPT-4 (Teal):** Average score of 59.6.
* **o1-2024-12-17 (Orange):** Average score of 89.7.
* **deepseek-r1 (Blue):** Average score of 85.0.
* **o3-mini-2025-01-31 (Pink):** Average score of 92.2.
* **gemini-2.5-pro (Lime Green):** Average score of 94.2.
**Right Radar Chart (PKU PhD Qualifying Exam):**
* **o3-mini-2025-01-31 (Orange):** Scores approximately 80 in each of the four subjects (Analysis, Geometry & Topology, Algebra, and Probability).
### Key Observations
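* gemini-2.5-pro generally outperforms the other models on the PKU undergraduate exam, with the highest average score (94.2) and the highest scores in most individual subjects.
* GPT-4 has the lowest average score on the undergraduate exam (59.6), 25.4 points below the next-lowest model, deepseek-r1 (85.0).
* o3-mini-2025-01-31 performs consistently across all four subjects of the PhD qualifying exam, at approximately 80 in each.
* The bar chart is consistent with the relative performance seen in the undergraduate radar chart, with gemini-2.5-pro first and GPT-4 last. By average score the order is gemini-2.5-pro (94.2), o3-mini-2025-01-31 (92.2), o1-2024-12-17 (89.7), deepseek-r1 (85.0), GPT-4 (59.6).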
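
The ordering behind these observations can be reproduced from the five averages reported in the bar chart. The following is a minimal Python sketch, assuming the values as transcribed in the Detailed Analysis section; the variable names (`average_scores`, `ranking`, `gap`) are illustrative and not part of the original figure.

```python
# Average undergraduate-exam scores as transcribed from the bar chart.
average_scores = {
    "GPT-4": 59.6,
    "o1-2024-12-17": 89.7,
    "deepseek-r1": 85.0,
    "o3-mini-2025-01-31": 92.2,
    "gemini-2.5-pro": 94.2,
}

# Rank models from highest to lowest average score.
ranking = sorted(average_scores, key=average_scores.get, reverse=True)

# Gap between the lowest-scoring model and the next lowest.
lowest, next_lowest = ranking[-1], ranking[-2]
gap = round(average_scores[next_lowest] - average_scores[lowest], 1)

for rank, model in enumerate(ranking, start=1):
    print(f"{rank}. {model}: {average_scores[model]:.1f}")
print(f"{lowest} trails {next_lowest} by {gap} points")
```

Only the averages are used here; the per-subject radar values are approximate readings and are not precise enough to recompute the averages from.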
### Interpretation
The data suggests that gemini-2.5-pro is the most proficient of the tested models on the PKU undergraduate exam, while GPT-4 lags well behind. The radar charts show each model's strengths and weaknesses by subject, and the bar chart summarizes overall performance. On the PhD qualifying exam, o3-mini-2025-01-31 performs strongly and consistently across subjects, which suggests a well-rounded grasp of the core concepts. The variability of o1-2024-12-17 across undergraduate subjects may indicate specialization in certain areas.