EXPERT: gemini-3-flash-free VERSION 1

RUNTIME: nugit/gemini/gemini-3-flash-preview

# Technical Document Extraction: Normalized Performance Comparison of Expert Routing Strategies

## 1. Image Overview
This image is a grouped bar chart comparing the "Normalized Performance" of four different Mixture-of-Experts (MoE) architectural configurations across six distinct NLP evaluation metrics. The chart uses a color-coded legend to differentiate between routing strategies and expert segmentation levels.

## 2. Component Isolation

### A. Header / Legend
**Location:** Top-left quadrant of the chart area.
**Content:** Four categories with corresponding color swatches.
1.  **Blue (#0072B2):** `0 shared expert + 2 out of 16 routed experts (GShard)`
2.  **Yellow/Orange (#E69F00):** `1 shared expert + 1 out of 15 routed experts (+ shared expert isolation)`
3.  **Green (#009E73):** `1 shared expert + 3 out of 31 routed experts (+ fine-grained expert segmentation)`
4.  **Dark Orange (#D55E00):** `1 shared expert + 7 out of 63 routed experts (+ finer expert segmentation)`
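
The legend counts can be sanity-checked with simple arithmetic. The sketch below tallies total and activated experts per configuration from the legend text alone. The names `CONFIGS` and `summarize` are illustrative, and the code assumes that a shared expert is always active; it assumes nothing about expert sizes, which the chart does not state.

```python
# Expert counts transcribed from the legend: (shared, routed_active, routed_total).
# CONFIGS and summarize are hypothetical names used only for this illustration.
CONFIGS = {
    "GShard": (0, 2, 16),
    "+ shared expert isolation": (1, 1, 15),
    "+ fine-grained expert segmentation": (1, 3, 31),
    "+ finer expert segmentation": (1, 7, 63),
}

def summarize(shared, routed_active, routed_total):
    """Return (total experts, activated experts, activated fraction)."""
    total = shared + routed_total
    active = shared + routed_active  # assumption: shared experts are always active
    return total, active, active / total

for name, counts in CONFIGS.items():
    total, active, frac = summarize(*counts)
    print(f"{name}: {active} of {total} experts active ({frac:.3f})")
```

Under that assumption, every configuration activates one eighth of its experts (2 of 16, 2 of 16, 4 of 32, 8 of 64), so the legend appears to vary expert granularity rather than the activated fraction.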

### B. Main Chart Area (Axes)
*   **Y-Axis (Vertical):**
    *   **Label:** `Normalized Performance`
    *   **Range:** 0.5 to 1.2
    *   **Major Tick Marks:** 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2
*   **X-Axis (Horizontal):**
    *   **Label:** `Metrics`
    *   **Categories (Left to Right):** `HellaSwag`, `PIQA`, `ARC-easy`, `ARC-challenge`, `TriviaQA`, `NaturalQuestions`

## 3. Data Extraction (Reconstructed Table)

The following table estimates the numerical values based on the visual alignment with the Y-axis grid lines.

| Metric | GShard (Blue) | Shared Isolation (Yellow) | Fine-grained (Green) | Finer (Dark Orange) |
| :--- | :---: | :---: | :---: | :---: |
| **HellaSwag** | ~0.92 | ~0.96 | ~0.98 | 1.00 |
| **PIQA** | ~0.98 | ~0.98 | ~0.97 | 1.00 |
| **ARC-easy** | ~0.89 | ~0.97 | ~1.00 | 1.00 |
| **ARC-challenge** | ~0.92 | ~0.91 | ~0.92 | 1.00 |
| **TriviaQA** | ~0.61 | ~0.85 | ~0.93 | 1.00 |
| **NaturalQuestions** | ~0.56 | ~0.79 | ~0.88 | 1.00 |

## 4. Trend Verification and Analysis

### General Trend
On all six metrics, the **"1 shared expert + 7 out of 63 routed experts (+ finer expert segmentation)"** configuration (Dark Orange) sits at 1.00 and is never exceeded; only the fine-grained configuration (Green) ties it, on `ARC-easy`. A value of exactly 1.00 on every metric suggests that this configuration is the normalization reference, with the other bars expressed relative to it.

### Specific Series Observations
*   **GShard (Blue):** Has the lowest bar, or one within about 0.01 of the lowest, on every metric. The shortfall is small on `PIQA` (~0.98) and largest on the knowledge-heavy tasks `TriviaQA` (~0.61) and `NaturalQuestions` (~0.56).
*   **Shared Expert Isolation (Yellow):** Improves on GShard on four of the six metrics, most notably `TriviaQA` (about +0.24) and `NaturalQuestions` (about +0.23). It is roughly level with GShard on `PIQA` and `ARC-challenge`.
*   **Fine-grained Expert Segmentation (Green):** Adds smaller gains over the "Shared Isolation" strategy on most metrics (up to about +0.09 on `NaturalQuestions`), dips slightly on `PIQA`, and matches the top performance on `ARC-easy`.
*   **Task Sensitivity:** The performance gap between the simplest configuration (GShard) and the most complex (Finer segmentation) is narrowest on `PIQA` (about 0.02) and widest on `NaturalQuestions` (about 0.44) and `TriviaQA` (about 0.39). This suggests that shared expert isolation and finer expert segmentation together matter most for factual retrieval and knowledge tasks.
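
The task-sensitivity comparison can be reproduced from the estimated values in the Section 3 table. The following is a minimal sketch; `TABLE`, `gaps`, and `ranked` are illustrative names, and the numbers are the visual estimates from that table, not exact measurements.

```python
# Estimated values transcribed from the reconstructed table in Section 3.
# Column order: GShard, shared isolation, fine-grained, finer.
TABLE = {
    "HellaSwag": (0.92, 0.96, 0.98, 1.00),
    "PIQA": (0.98, 0.98, 0.97, 1.00),
    "ARC-easy": (0.89, 0.97, 1.00, 1.00),
    "ARC-challenge": (0.92, 0.91, 0.92, 1.00),
    "TriviaQA": (0.61, 0.85, 0.93, 1.00),
    "NaturalQuestions": (0.56, 0.79, 0.88, 1.00),
}

def gaps(table):
    """Gap between the finer-segmentation and GShard columns, per metric."""
    return {metric: round(row[3] - row[0], 2) for metric, row in table.items()}

# Sort metrics from the narrowest gap to the widest.
ranked = sorted(gaps(TABLE).items(), key=lambda item: item[1])
for metric, gap in ranked:
    print(f"{metric}: {gap:.2f}")
```

Because the inputs are read off a chart, differences of 0.01 to 0.02 between metrics should not be treated as meaningful.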

## 5. Language Declaration
The text in this image is entirely in **English**. No other languages are present.

EXPERT: nemotron-free VERSION 1

RUNTIME: free/nvidia/nemotron-nano-12b-v2-vl:free

# Technical Document Extraction: Expert Configuration Performance Analysis

## Chart Description
This bar chart compares the normalized performance of four expert configurations across six evaluation metrics. The configurations differ in the number of shared experts and routed experts, with expert segmentation becoming progressively finer.

### Axis Labels
- **X-axis (Metrics):**
  - HellaSwag
  - PIQA
  - ARC-easy
  - ARC-challenge
  - TriviaQA
  - NaturalQuestions
- **Y-axis (Normalized Performance):**
  - Scale: 0.5 to 1.2
  - Increment: 0.1

### Legend
Four expert configurations with color-coded bars:
1. **Blue:** 0 shared expert + 2 out of 16 routed experts (GShard)
2. **Orange:** 1 shared expert + 1 out of 15 routed experts (+ shared expert isolation)
3. **Green:** 1 shared expert + 3 out of 31 routed experts (+ fine-grained expert segmentation)
4. **Red:** 1 shared expert + 7 out of 63 routed experts (+ finer expert segmentation)

## Key Data Points & Trends
### Performance by Metric
| Metric           | GShard (Blue) | 1 shared + 15 routed (Orange) | 1 shared + 31 routed (Green) | 1 shared + 63 routed (Red) |
|------------------|---------------|-------------------------------|------------------------------|----------------------------|
| HellaSwag        | ~0.92         | ~0.95                         | ~0.98                        | ~1.00                      |
| PIQA             | ~0.98         | ~0.98                         | ~0.97                        | ~1.00                      |
| ARC-easy         | ~0.89         | ~0.96                         | ~1.00                        | ~1.00                      |
| ARC-challenge    | ~0.92         | ~0.91                         | ~0.92                        | ~1.00                      |
| TriviaQA         | ~0.61         | ~0.85                         | ~0.94                        | ~1.00                      |
| NaturalQuestions | ~0.56         | ~0.79                         | ~0.88                        | ~1.00                      |
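
Since this table and the Section 3 table are independent visual estimates of the same chart, it can be useful to list where they disagree. The sketch below does that cell by cell; `FIRST`, `SECOND`, and `disagreements` are illustrative names, and the values are transcribed from the two tables in this document.

```python
# Estimates transcribed from the two tables in this document.
# Column order in both: GShard, shared isolation, fine-grained, finer.
FIRST = {
    "HellaSwag": (0.92, 0.96, 0.98, 1.00),
    "PIQA": (0.98, 0.98, 0.97, 1.00),
    "ARC-easy": (0.89, 0.97, 1.00, 1.00),
    "ARC-challenge": (0.92, 0.91, 0.92, 1.00),
    "TriviaQA": (0.61, 0.85, 0.93, 1.00),
    "NaturalQuestions": (0.56, 0.79, 0.88, 1.00),
}
SECOND = {
    "HellaSwag": (0.92, 0.95, 0.98, 1.00),
    "PIQA": (0.98, 0.98, 0.97, 1.00),
    "ARC-easy": (0.89, 0.96, 1.00, 1.00),
    "ARC-challenge": (0.92, 0.91, 0.92, 1.00),
    "TriviaQA": (0.61, 0.85, 0.94, 1.00),
    "NaturalQuestions": (0.56, 0.79, 0.88, 1.00),
}

def disagreements(a, b, tol=1e-9):
    """List (metric, column index, value_a, value_b) for cells that differ."""
    out = []
    for metric in a:
        for col, (x, y) in enumerate(zip(a[metric], b[metric])):
            if abs(x - y) > tol:
                out.append((metric, col, x, y))
    return out

diffs = disagreements(FIRST, SECOND)
for metric, col, x, y in diffs:
    print(f"{metric}, column {col}: {x:.2f} vs {y:.2f}")
```

The two extractions differ in only three cells, each by about 0.01, which is within the precision that reading values off a bar chart can support.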

### Observations
- **Performance Scaling:** The 1 shared expert + 63 routed experts configuration (red bars) sits at ~1.00 on every metric, which suggests it is the reference used for normalization.
- **GShard Limitations:** The baseline GShard configuration (0 shared experts) trails the red bars on every metric. The gap is small on PIQA (~0.98) and largest on TriviaQA (~0.61) and NaturalQuestions (~0.56).
- **Expert Segmentation Impact:** Among configurations with 1 shared expert, increasing the number of routed experts from 15 to 31 to 63 generally improves performance, with the largest gains on TriviaQA (~0.85 to ~1.00) and NaturalQuestions (~0.79 to ~1.00). PIQA is the exception: the 31-expert configuration dips slightly (~0.97).
- **Consistency:** The 1 shared expert + 63 routed experts configuration (red) matches or outperforms every other configuration on all six metrics; the 31-expert configuration ties it on ARC-easy.

## Technical Notes
- Normalized performance values suggest relative comparison rather than absolute scores.
- The chart shows cumulative gains rather than a trade-off: shared expert isolation and progressively finer segmentation each add performance, and the finest segmentation (63 routed experts) gives the best results.
- No configuration exceeds 1.00 in the table above. The highest values outside the red bars are ~1.00 (ARC-easy, 31 routed experts) and ~0.98 (PIQA, GShard and 15 routed experts).
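
Normalized values of this kind are commonly obtained by dividing each configuration's raw score by a reference configuration's score on the same metric. The following is a minimal sketch of that calculation. It assumes a single reference configuration per metric, which the chart does not state, and the raw scores and names (`normalize`, `config_a`, `config_ref`) are placeholders, not values from the source.

```python
def normalize(raw_scores, reference):
    """Divide every configuration's raw score by the reference configuration's."""
    ref = raw_scores[reference]
    if ref == 0:
        raise ValueError("reference score must be non-zero")
    return {name: score / ref for name, score in raw_scores.items()}

# Placeholder raw scores for one metric; illustrative only, not from the chart.
raw = {"config_a": 30.0, "config_b": 45.0, "config_ref": 60.0}
normalized = normalize(raw, "config_ref")
print(normalized)
```

With this scheme the reference configuration is always exactly 1.0, which matches the pattern of the red bars in the chart.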
