# Technical Document Extraction: Normalized Performance Comparison of Expert Routing Strategies
## 1. Image Overview
This image is a grouped bar chart comparing the "Normalized Performance" of four Mixture-of-Experts (MoE) configurations across six NLP benchmarks, which the chart labels as "Metrics". A color-coded legend distinguishes the configurations, which differ in whether a shared expert is used and in how finely the routed experts are segmented.
## 2. Component Isolation
### A. Header / Legend
**Location:** Top-left quadrant of the chart area.
**Content:** Four categories with corresponding color swatches.
1. **Blue (#0072B2):** `0 shared expert + 2 out of 16 routed experts (GShard)`
2. **Yellow/Orange (#E69F00):** `1 shared expert + 1 out of 15 routed experts (+ shared expert isolation)`
3. **Green (#009E73):** `1 shared expert + 3 out of 31 routed experts (+ fine-grained expert segmentation)`
4. **Dark Orange (#D55E00):** `1 shared expert + 7 out of 63 routed experts (+ finer expert segmentation)`
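The legend labels follow a regular pattern, so they can be turned into structured values for later comparison. The following is a minimal sketch, not part of the source figure: the `ExpertConfig` class, the `parse_legend` function, and the regular expression are hypothetical names and assumptions that fit these four labels only.

```python
import re
from dataclasses import dataclass

# Legend strings copied verbatim from the chart.
LEGEND = [
    "0 shared expert + 2 out of 16 routed experts (GShard)",
    "1 shared expert + 1 out of 15 routed experts (+ shared expert isolation)",
    "1 shared expert + 3 out of 31 routed experts (+ fine-grained expert segmentation)",
    "1 shared expert + 7 out of 63 routed experts (+ finer expert segmentation)",
]

# Assumed pattern: it matches the four labels above, not a general grammar.
PATTERN = re.compile(
    r"(?P<shared>\d+) shared expert \+ "
    r"(?P<top_k>\d+) out of (?P<routed>\d+) routed experts"
)


@dataclass(frozen=True)
class ExpertConfig:
    """Hypothetical container for the three counts named in a legend label."""

    shared: int  # always-active shared experts
    top_k: int   # routed experts selected per token
    routed: int  # size of the routed expert pool

    @property
    def active(self) -> int:
        # Experts used per token: shared plus selected routed experts.
        return self.shared + self.top_k

    @property
    def total(self) -> int:
        # All experts in the layer: shared plus the routed pool.
        return self.shared + self.routed


def parse_legend(label: str) -> ExpertConfig:
    match = PATTERN.match(label)
    if match is None:
        raise ValueError(f"unrecognized legend label: {label!r}")
    return ExpertConfig(
        shared=int(match["shared"]),
        top_k=int(match["top_k"]),
        routed=int(match["routed"]),
    )


configs = [parse_legend(label) for label in LEGEND]
for cfg in configs:
    print(f"{cfg.active} of {cfg.total} experts active per token")
```

Read this way, the first two configurations each activate 2 of 16 experts, while the two segmented configurations activate 4 of 32 and 8 of 64. The chart itself does not state the size of each expert, so the sketch makes no claim about parameter count or compute.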
### B. Main Chart Area (Axes)
* **Y-Axis (Vertical):**
    * **Label:** `Normalized Performance`
    * **Range:** 0.5 to 1.2
    * **Major Tick Marks:** 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2
* **X-Axis (Horizontal):**
    * **Label:** `Metrics`
    * **Categories (Left to Right):** `HellaSwag`, `PIQA`, `ARC-easy`, `ARC-challenge`, `TriviaQA`, `NaturalQuestions`
## 3. Data Extraction (Reconstructed Table)
The following table estimates the numerical values from the alignment of each bar with the Y-axis grid lines. Values prefixed with `~` are visual estimates, not figures printed on the chart.
| Metric | GShard (Blue) | Shared Isolation (Yellow) | Fine-grained (Green) | Finer (Dark Orange) |
| :--- | :---: | :---: | :---: | :---: |
| **HellaSwag** | ~0.92 | ~0.96 | ~0.98 | 1.00 |
| **PIQA** | ~0.98 | ~0.98 | ~0.97 | 1.00 |
| **ARC-easy** | ~0.89 | ~0.97 | ~1.00 | 1.00 |
| **ARC-challenge** | ~0.92 | ~0.91 | ~0.92 | 1.00 |
| **TriviaQA** | ~0.61 | ~0.85 | ~0.93 | 1.00 |
| **NaturalQuestions** | ~0.56 | ~0.79 | ~0.88 | 1.00 |
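To see how much each architectural step contributes, the reconstructed table can be differenced column by column. The sketch below is illustrative only: `TABLE` holds the visual estimates from the table above (not exact measurements), and `step_gains` is a hypothetical helper name.

```python
# Column order follows the table: GShard, Shared Isolation, Fine-grained, Finer.
SERIES = ["GShard", "Shared Isolation", "Fine-grained", "Finer"]

# Visual estimates read off the chart; treat them as approximate.
TABLE = {
    "HellaSwag":        [0.92, 0.96, 0.98, 1.00],
    "PIQA":             [0.98, 0.98, 0.97, 1.00],
    "ARC-easy":         [0.89, 0.97, 1.00, 1.00],
    "ARC-challenge":    [0.92, 0.91, 0.92, 1.00],
    "TriviaQA":         [0.61, 0.85, 0.93, 1.00],
    "NaturalQuestions": [0.56, 0.79, 0.88, 1.00],
}


def step_gains(row):
    """Return the change between each pair of adjacent configurations.

    Rounded to two decimals because the inputs only carry two.
    """
    return [round(after - before, 2) for before, after in zip(row, row[1:])]


for metric, row in TABLE.items():
    gains = step_gains(row)
    steps = ", ".join(
        f"{SERIES[i]} -> {SERIES[i + 1]}: {gain:+.2f}"
        for i, gain in enumerate(gains)
    )
    print(f"{metric}: {steps}")
```

Because the inputs are read from bar heights, differences of one or two hundredths are best treated as indistinguishable rather than as real gains or losses.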
## 4. Trend Verification and Analysis
### General Trend
Across all six metrics, the **"1 shared expert + 7 out of 63 routed experts (+ finer expert segmentation)"** configuration (Dark Orange) reaches the 1.0 mark, and no other series exceeds it. The only tie is the Green series at ~1.00 on `ARC-easy`. A series that sits at exactly 1.0 in every category suggests that scores were normalized against it, so it acts as the reference point (the best result in each category) rather than as a baseline to beat.
### Specific Series Observations
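* **GShard (Blue):** Shows the lowest performance on four of the six metrics (`HellaSwag`, `ARC-easy`, `TriviaQA`, `NaturalQuestions`). The shortfall is largest on the knowledge-heavy tasks `TriviaQA` (~0.61) and `NaturalQuestions` (~0.56), the only bars on the chart that fall below 0.7.
* **Shared Expert Isolation (Yellow):** Improves on GShard in four of the six categories, most notably on `TriviaQA` (~0.61 to ~0.85) and `NaturalQuestions` (~0.56 to ~0.79). It is level with GShard on `PIQA` (~0.98) and marginally lower on `ARC-challenge` (~0.91 vs. ~0.92), a difference that is likely within the reading error of the chart.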
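* **Fine-grained Expert Segmentation (Green):** Generally shows incremental improvements over the "Shared Isolation" strategy, with `PIQA` as the one exception (~0.97 vs. ~0.98). It matches or nearly matches the top performance in `ARC-easy`.
* **Task Sensitivity:** The performance gap between the simplest configuration (GShard) and the most complex (Finer segmentation) is narrowest in `PIQA` (~0.02) and widest in `NaturalQuestions` (~0.44) and `TriviaQA` (~0.39). This suggests that the combination of shared expert isolation and finer expert segmentation matters most for factual retrieval/knowledge tasks. On both of those tasks, the estimated values attribute more than half of the gap to the shared expert isolation step alone.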
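The task-sensitivity ranking can be checked with a few lines of arithmetic. This is a rough sketch under two assumptions: the GShard values are the visual estimates from the reconstructed table, and the finest-segmentation configuration is taken as exactly 1.00 on every metric. The names `GSHARD`, `REFERENCE`, `gaps`, and `ranked` are illustrative.

```python
# Estimated normalized GShard scores, read off the chart (approximate).
GSHARD = {
    "HellaSwag": 0.92,
    "PIQA": 0.98,
    "ARC-easy": 0.89,
    "ARC-challenge": 0.92,
    "TriviaQA": 0.61,
    "NaturalQuestions": 0.56,
}

# Assumption: the finest-segmentation series sits at exactly 1.00 everywhere.
REFERENCE = 1.00

# Gap between the reference configuration and GShard for each metric.
gaps = {
    metric: round(REFERENCE - score, 2) for metric, score in GSHARD.items()
}

# Widest gap first; ties keep the chart's left-to-right order.
ranked = sorted(gaps.items(), key=lambda item: item[1], reverse=True)

for metric, gap in ranked:
    print(f"{metric:<17} gap = {gap:.2f}")
```

The two open-domain question-answering tasks come out on top and `PIQA` at the bottom, which matches the ordering described above.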
## 5. Language Declaration
The text in this image is entirely in **English**. No other languages are present.