## Line Charts: Llama-3 Model "I-Don't-Know Rate" Across Layers
### Overview
The image shows two side-by-side line charts of the "I-Don't-Know Rate" across the layers of two language models of different sizes: Llama-3-8B (left) and Llama-3-70B (right). Each chart plots eight series, one for each combination of anchoring method (Q-Anchored vs. A-Anchored) and evaluation dataset (PopQA, TriviaQA, HotpotQA, NQ). Together they show how the rate of "I don't know" responses varies with layer depth.
### Components/Axes
* **Chart Titles:**
* Left Chart: `Llama-3-8B`
* Right Chart: `Llama-3-70B`
* **Y-Axis (Both Charts):**
* Label: `I-Don't-Know Rate`
* Scale: 0 to 100, with major tick marks at 0, 20, 40, 60, 80, 100.
* **X-Axis:**
* Label: `Layer`
* Left Chart Scale: 0 to 30, with major tick marks at 0, 10, 20, 30.
* Right Chart Scale: 0 to 80, with major tick marks at 0, 20, 40, 60, 80.
* **Legend (Bottom of Image, spanning both charts):**
* Contains 8 entries, each with a line sample and text label.
* **Q-Anchored (Solid Lines):**
* Blue solid line: `Q-Anchored (PopQA)`
* Green solid line: `Q-Anchored (TriviaQA)`
* Purple solid line: `Q-Anchored (HotpotQA)`
* Pink solid line: `Q-Anchored (NQ)`
* **A-Anchored (Dashed Lines):**
* Orange dashed line: `A-Anchored (PopQA)`
* Red dashed line: `A-Anchored (TriviaQA)`
* Brown dashed line: `A-Anchored (HotpotQA)`
* Gray dashed line: `A-Anchored (NQ)`
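For anyone re-plotting these series, the legend can be captured as a small lookup table. This is only a sketch: the names `LEGEND` and `series_label` are hypothetical, and the color names are approximate readings of the legend rather than exact values from the figure.

```python
# Legend encoding as read off the figure: (anchoring, dataset) -> (color, linestyle).
# Color names are approximate; the exact shades are an assumption.
LEGEND = {
    ("Q-Anchored", "PopQA"): ("blue", "solid"),
    ("Q-Anchored", "TriviaQA"): ("green", "solid"),
    ("Q-Anchored", "HotpotQA"): ("purple", "solid"),
    ("Q-Anchored", "NQ"): ("pink", "solid"),
    ("A-Anchored", "PopQA"): ("orange", "dashed"),
    ("A-Anchored", "TriviaQA"): ("red", "dashed"),
    ("A-Anchored", "HotpotQA"): ("brown", "dashed"),
    ("A-Anchored", "NQ"): ("gray", "dashed"),
}


def series_label(anchoring, dataset):
    """Build the legend text in the same form the figure uses."""
    return f"{anchoring} ({dataset})"


# Print each legend entry with its visual encoding.
for (anchoring, dataset), (color, linestyle) in LEGEND.items():
    print(f"{series_label(anchoring, dataset)}: {color}, {linestyle}")
```

The line style alone separates the two anchoring methods, so the charts stay readable even when colors are hard to tell apart.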
### Detailed Analysis
**Llama-3-8B Chart (Left):**
* **Q-Anchored Lines (Solid):** These lines show a **downward trend** in the early layers (0-10), indicating a decreasing "I-Don't-Know Rate." After layer 10 they fluctuate noticeably but remain well below their starting points, mostly between 10 and 40.
* `Q-Anchored (PopQA)` (Blue): Starts very high (~90), drops sharply to ~10 by layer 10, then fluctuates between ~10-40.
* `Q-Anchored (TriviaQA)` (Green): Starts high (~80), drops to near 0 by layer 10, then fluctuates at a very low level (0-20).
* `Q-Anchored (HotpotQA)` (Purple): Starts high (~85), drops to ~20 by layer 10, then shows high volatility between ~10-50.
* `Q-Anchored (NQ)` (Pink): Starts moderately high (~60), drops to ~20 by layer 10, then fluctuates between ~10-40.
* **A-Anchored Lines (Dashed):** These lines show a general **upward trend** across layers, indicating an increasing "I-Don't-Know Rate."
* `A-Anchored (PopQA)` (Orange): Starts around 40, rises steadily to ~70 by layer 30.
* `A-Anchored (TriviaQA)` (Red): Starts around 40, rises to the highest level among all lines, reaching ~80 by layer 30.
* `A-Anchored (HotpotQA)` (Brown): Starts around 40, rises to ~60 by layer 30.
* `A-Anchored (NQ)` (Gray): Starts around 40, rises to ~60 by layer 30.
**Llama-3-70B Chart (Right):**
* **Q-Anchored Lines (Solid):** As in the 8B model, these lines show an initial **downward trend** (layers 0-20), followed by more pronounced and sustained volatility across the remaining layers (20-80).
* `Q-Anchored (PopQA)` (Blue): Starts very high (~95), drops steeply to ~10 by layer 20, then fluctuates widely between ~5-40.
* `Q-Anchored (TriviaQA)` (Green): Starts high (~85), drops to near 0 by layer 20, then remains very low (0-10) with minor fluctuations.
* `Q-Anchored (HotpotQA)` (Purple): Starts high (~90), drops to ~20 by layer 20, then exhibits extreme volatility between ~5-60.
* `Q-Anchored (NQ)` (Pink): Starts moderately high (~70), drops to ~20 by layer 20, then fluctuates between ~10-50.
* **A-Anchored Lines (Dashed):** These lines also show a general **upward trend**, but they reach higher peaks and exhibit more noise compared to the 8B model.
* `A-Anchored (PopQA)` (Orange): Starts around 40, rises with high volatility to a peak near 90 around layer 60.
* `A-Anchored (TriviaQA)` (Red): Starts around 40, rises to the highest sustained levels, fluctuating between 70-90 from layer 40 onward.
* `A-Anchored (HotpotQA)` (Brown): Starts around 40, rises to fluctuate between 60-80 from layer 40 onward.
* `A-Anchored (NQ)` (Gray): Starts around 40, rises to fluctuate between 50-70 from layer 40 onward.
### Key Observations
1. **Divergent Anchoring Effects:** There is a stark and consistent contrast between anchoring methods. **Q-Anchored** methods lead to a *decrease* in the "I-Don't-Know Rate" through the layers, while **A-Anchored** methods lead to an *increase*.
2. **Model Size Impact:** The larger Llama-3-70B model shows more extreme values (A-Anchored peaks near 90 versus ~80 in the 8B model, and Q-Anchored troughs near 5 versus ~10 for PopQA and HotpotQA) and noticeably greater layer-to-layer volatility.
3. **Dataset Sensitivity:** The effect magnitude varies by dataset. For Q-Anchored methods, `TriviaQA` (green) consistently results in the lowest "I-Don't-Know Rate." For A-Anchored methods, `TriviaQA` (red) and `PopQA` (orange) often result in the highest rates.
4. **Early Layer Convergence:** For Q-Anchored methods, the sharpest change occurs early (by about layer 10 in the 8B model and layer 20 in the 70B model), after which the rate fluctuates around a new, lower baseline.
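The observations above rest on visual estimates. If the underlying per-layer values were available, the same quantities (early drop, later range, volatility) could be computed directly. The sketch below assumes a plain list of rates indexed by layer. The function name `summarize_series`, the `early_cutoff` parameter, and the example values are all hypothetical and are not read from the figure.

```python
from statistics import pstdev


def summarize_series(rates, early_cutoff):
    """Summarize one layer-wise curve.

    rates: I-Don't-Know Rate per layer (index = layer number).
    early_cutoff: layer index where the "early" phase ends
                  (e.g. 10 for the 8B chart, 20 for the 70B chart).
    """
    if not 0 < early_cutoff < len(rates):
        raise ValueError("early_cutoff must fall strictly inside the series")
    late = rates[early_cutoff:]
    return {
        "start": rates[0],
        # Negative for a falling curve, positive for a rising one.
        "early_change": rates[early_cutoff] - rates[0],
        "late_min": min(late),
        "late_max": max(late),
        # Population standard deviation as a simple volatility measure.
        "late_volatility": pstdev(late),
    }


# Synthetic values for illustration only; not data from the charts.
example = [90, 60, 30, 12, 25, 15, 35, 20]
print(summarize_series(example, early_cutoff=3))
```

Standard deviation is only one possible measure of volatility. The mean absolute difference between adjacent layers would capture layer-to-layer jumps more directly.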
### Interpretation
This data suggests a fundamental difference in how information is processed under the two anchoring paradigms. The **Q-Anchored** approach appears to progressively build confidence or extract answer-related information as data moves through the network layers, reducing uncertainty. Conversely, the **A-Anchored** approach seems to amplify uncertainty or "forget" initial priors, leading to a higher likelihood of a non-answer response in deeper layers.
The increased volatility in the 70B model may reflect more specialized or less stable layer-wise representations in larger models for these tasks. The consistent dataset-specific patterns (e.g., TriviaQA yielding the lowest rates under Q-Anchoring) suggest that the kind of knowledge or question format in each dataset interacts consistently with the model and the anchoring method.
Taken together, the charts suggest that the choice of anchoring method (Q vs. A) strongly shapes the model's layer-wise behavior, as measured by the "I-Don't-Know Rate", and that the contrast between the two methods is more pronounced in the larger model.