## Line Charts: Comparative Probability Analysis of Sentence Types Across Models
### Overview
The image displays four line charts arranged horizontally, comparing the probability trends of four different sentence types ("Teacher Sentence," "Boosted Sentence," "Shared Sentence," "Student Sentence") across two different models or conditions labeled "Ours" and "DeepSeek-Distill-Qwen3-8B." Each chart plots "Probability" on the y-axis against "Sentence Index" on the x-axis. The charts include shaded confidence intervals or variance bands and specific numerical annotations at the end of each data series.
### Components/Axes
* **Legend (Top Center):** A shared legend for all four charts defines the color coding:
    * **Green Line:** Teacher Sentence
    * **Purple Line:** Boosted Sentence
    * **Orange Line:** Shared Sentence
    * **Blue Line:** Student Sentence
* **Chart 1 (Far Left):**
    * **Title/Label:** "Ours" (top-right corner)
    * **Y-axis:** Label "Probability," scale from 0.1 to 0.6.
    * **X-axis:** Label "Sentence Index," scale from 0 to 1500.
* **Chart 2 (Center Left):**
    * **Title/Label:** "Ours" (top-right corner)
    * **Y-axis:** Unlabeled, scale from 0.0 to 1.0.
    * **X-axis:** Label "Sentence Index," scale from 0 to 1500.
* **Chart 3 (Center Right):**
    * **Title/Label:** "DeepSeek-Distill-Qwen3-8B" (top-right corner)
    * **Y-axis:** Unlabeled, scale from 0.4 to 0.9.
    * **X-axis:** Label "Sentence Index," scale from 0 to 1500.
* **Chart 4 (Far Right):**
    * **Title/Label:** "DeepSeek-Distill-Qwen3-8B" (top-right corner)
    * **Y-axis:** Unlabeled, scale from 0.0 to 0.5.
    * **X-axis:** Label "Sentence Index," scale from 0 to 1500.
### Detailed Analysis
**Chart 1 ("Ours"):**
* **Teacher Sentence (Green):** Shows a steep initial decline from a probability of ~0.6 at index 0, followed by a gradual, noisy decrease. The line ends with an annotation **"= 45.08"**. A shaded green band indicates variance around the line.
* **Boosted Sentence (Purple):** Remains relatively flat and low, hovering just above 0.1 probability throughout.
* **Shared Sentence (Orange) & Student Sentence (Blue):** Both remain very low, near the bottom of the chart (probability ~0.05 or lower), with minimal visible fluctuation.
**Chart 2 ("Ours"):**
* **Boosted Sentence (Purple):** Dominates this chart. It rises quickly from near 0 to a plateau around 0.6 probability and remains stable. Annotated with **"= 62.99"**.
* **Shared Sentence (Orange):** Shows a gradual, slight increase from near 0 to approximately 0.15-0.2 probability. Annotated with **"= 28.98"**.
* **Student Sentence (Blue):** Remains very low, near zero, with a slight upward trend at the very end. Annotated with **"= -11.07"**.
* **Teacher Sentence (Green):** Not visibly plotted in this chart.
**Chart 3 ("DeepSeek-Distill-Qwen3-8B"):**
* **Teacher Sentence (Green):** Fluctuates significantly between approximately 0.5 and 0.7 probability, showing a slight overall downward trend. Annotated with **"= 61.26"**.
* **Boosted Sentence (Purple):** Remains relatively flat in the lower portion of the chart, around 0.4 probability.
* **Shared Sentence (Orange) & Student Sentence (Blue):** Both remain very low, near the bottom of the chart's visible range (probability ~0.4 or lower).
**Chart 4 ("DeepSeek-Distill-Qwen3-8B"):**
* **Boosted Sentence (Purple):** Is the highest line, fluctuating between approximately 0.3 and 0.45 probability. Annotated with **"= -75.71"**.
* **Shared Sentence (Orange):** Shows a noisy but generally increasing trend from near 0 to about 0.1 probability. Annotated with **"= 9.61"**.
* **Student Sentence (Blue):** Remains very low, near zero, with a slight increase towards the end. Annotated with **"= 4.48"**.
* **Teacher Sentence (Green):** Not visibly plotted in this chart.
### Key Observations
1. **Model/Condition Comparison:** The "Ours" model (Charts 1 & 2) shows a clear separation: the Teacher Sentence has the highest probability in Chart 1, and the Boosted Sentence has the highest probability in Chart 2. The "DeepSeek" model (Charts 3 & 4) shows a more mixed picture, with the Teacher Sentence holding a moderate, volatile probability in Chart 3 and the Boosted Sentence being the highest line in Chart 4.
2. **Annotation Anomaly:** None of the numerical annotations falls within [0, 1], and two are negative ("-11.07", "-75.71"), which is impossible for a probability. The annotations therefore likely represent a different calculated value (e.g., a score or a difference; a log probability would not be positive) rather than the raw probability shown on the y-axis.
3. **Line Visibility:** The Teacher Sentence line is only prominent in the first and third charts. The Boosted Sentence line is prominent in the second and fourth charts. This may indicate the charts are grouped to highlight different primary comparisons.
4. **Trend Consistency:** The Shared and Student Sentence types consistently show the lowest probabilities across all charts and models.
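
The range argument in observation 2 can be checked mechanically. The following is a minimal sketch that only uses the annotation values transcribed in the Detailed Analysis; the dictionary layout and the helper names `is_valid_probability` and `could_be_percentage` are illustrative assumptions, not part of the figure.

```python
# Annotation values as transcribed in the Detailed Analysis section.
# Keys are (chart number, sentence type); this grouping is an illustrative choice.
annotations = {
    (1, "Teacher"): 45.08,
    (2, "Boosted"): 62.99,
    (2, "Shared"): 28.98,
    (2, "Student"): -11.07,
    (3, "Teacher"): 61.26,
    (4, "Boosted"): -75.71,
    (4, "Shared"): 9.61,
    (4, "Student"): 4.48,
}


def is_valid_probability(value: float) -> bool:
    """A raw probability must lie in the closed interval [0, 1]."""
    return 0.0 <= value <= 1.0


def could_be_percentage(value: float) -> bool:
    """A probability expressed as a percentage must lie in [0, 100]."""
    return 0.0 <= value <= 100.0


# Annotations that cannot be raw probabilities.
outside_unit_interval = {
    key: value for key, value in annotations.items() if not is_valid_probability(value)
}

# Annotations that cannot be probabilities under any scaling.
negative = {key: value for key, value in annotations.items() if value < 0}

print(f"{len(outside_unit_interval)} of {len(annotations)} annotations fall outside [0, 1]")
print("negative annotations:", sorted(negative.values()))
```

Every annotation fails the raw-probability check, and the two negative values also fail the percentage check, which is why the annotations are read here as a derived quantity rather than as end-of-series y-values.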
### Interpretation
The data appears to analyze the probability dynamics of different sentence types within two language model distillation or training frameworks ("Ours" vs. "DeepSeek-Distill-Qwen3-8B"). The "Sentence Index" likely denotes the position of a sentence in a sequence, such as a generated output or a training stream, although the figure itself does not say which.
The contrasting profiles between the "Ours" charts suggest that in this model, the "Teacher" and "Boosted" sentence types dominate under different conditions or metrics. The "DeepSeek" model shows less dominance by a single type, with the Teacher sentence exhibiting high variance, possibly indicating less stability.
The most notable feature is the mismatch between the y-axis ("Probability") and the end-of-series annotations. Every annotated value lies outside [0, 1], and two are negative ("-11.07", "-75.71"), so the annotations cannot be the final y-axis values. They more likely represent a derived performance score, a change relative to a baseline, or a different metric altogether (e.g., a reward signal). Without additional context, their exact meaning is uncertain. Taken together, the charts suggest that the "Boosted Sentence" type reaches high probability in specific views (Charts 2 and 4), while the "Teacher Sentence" is the highest but more volatile series in Charts 1 and 3.