# Technical Data Extraction: Rotary Positional Embedding (RoPE) Decay Analysis
This document extracts the data shown in a two-panel line chart. Both panels plot a metric labeled $B_{m, \theta}$ against "Relative distance" for several rotary positional embedding `base` values. The chart itself does not define $B_{m, \theta}$; it most likely measures how basis functions or attention scores decay with distance in a transformer model.
---
## 1. Global Metadata and Layout
* **Image Type:** Two-panel line plot (Left and Right).
* **Primary Language:** English.
* **Y-Axis Label (Shared):** $B_{m, \theta}$
* **X-Axis Label (Shared):** Relative distance
* **Visual Style:** Scientific plot with LaTeX-style font rendering.
---
## 2. Left Panel Analysis (Short Range)
### Axis Scales
* **X-Axis Range:** 0 to 4000. Major ticks at 0, 2000, 4000.
* **Y-Axis Range:** -0.2 to 0.6. Major ticks at 0.0, 0.2, 0.4, 0.6.
### Legend and Data Series [Spatial Grounding: Top Right Quadrant]
| Series Color | Label | Trend Description |
| :--- | :--- | :--- |
| **Blue** | `base:1e4` | Starts highest (~0.62). Slopes downward with high-frequency oscillations. Remains above other series until ~3500. |
| **Orange** | `base:1e3` | Starts at ~0.5. Rapidly decays to near 0.0 by distance 1000, then oscillates around the 0.0 axis. |
| **Green** | `base:1e2` | Starts lowest (~0.25). Drops sharply into negative values (~ -0.1) before distance 500, then oscillates around 0.0. |
### Key Observations
* Higher base values (1e4) maintain a higher $B_{m, \theta}$ value over longer relative distances.
* Lower base values (1e2, 1e3) exhibit much faster initial decay and settle into a zero-centered oscillation pattern much earlier.
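
The chart does not state how $B_{m, \theta}$ is computed. As a minimal sketch, assume it behaves like the mean of $\cos(m\theta_i)$ over the standard RoPE frequencies $\theta_i = \text{base}^{-2i/d}$. That definition, the head dimension `dim=128`, and the helper names are all assumptions. The Python below compares the three bases over a mid-range window of distances. It is meant to illustrate the qualitative ordering described above, not to reproduce the plotted values.

```python
import math


def rope_frequencies(base, dim=128):
    """Standard RoPE angular frequencies theta_i = base^(-2i/dim), i = 0..dim/2-1."""
    return [base ** (-2.0 * i / dim) for i in range(dim // 2)]


def b_metric(distance, base, dim=128):
    """ASSUMED stand-in for B_{m,theta}: mean of cos(distance * theta_i)."""
    thetas = rope_frequencies(base, dim)
    return sum(math.cos(distance * t) for t in thetas) / len(thetas)


def window_mean(base, start, stop, dim=128):
    """Average the assumed metric over distances start..stop-1 to damp oscillations."""
    values = [b_metric(m, base, dim) for m in range(start, stop)]
    return sum(values) / len(values)


if __name__ == "__main__":
    for base in (1e2, 1e3, 1e4):
        mean = window_mean(base, 500, 1500)
        print(f"base={base:.0e}  mean over distances 500..1499: {mean:+.3f}")
```

Averaging over a window rather than sampling a single distance keeps the comparison from being dominated by the high-frequency oscillations.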
---
## 3. Right Panel Analysis (Long Range)
### Axis Scales
* **X-Axis Range:** 0 to roughly 32,000 (the axis extends slightly past the last tick). Major ticks at 0, 10000, 20000, 30000.
* **Y-Axis Range:** -0.2 to 0.6 (consistent with left panel).
### Legend and Data Series [Spatial Grounding: Top Right Quadrant]
| Series Color | Label | Trend Description |
| :--- | :--- | :--- |
| **Orange** | `base:1e6` | Starts highest (~0.65). Slopes downward gradually, maintaining the highest mean value (~0.25) at distance 30,000. |
| **Green** | `base:1e5` | Starts middle (~0.55). Slopes downward, maintaining a mean value around 0.1 at distance 30,000. |
| **Blue** | `base:1e4` | Starts lowest (~0.45 as read on this coarser scale, versus ~0.62 in the left panel). Slopes downward rapidly and oscillates around 0.0 by distance 10,000. |
### Key Observations
* This panel covers much larger relative distances (up to roughly 32,000).
* The `base:1e4` series, the highest curve in the left panel, is the lowest curve here. This suggests that the base must grow with the intended context window to keep the metric from decaying to zero.
* All series exhibit a "fuzzy" appearance caused by high-frequency oscillations superimposed on the general downward decay curve.
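
The "fuzzy" appearance noted above can be separated from the underlying trend with a simple moving average. The sketch below is illustrative only. `moving_average`, `roughness`, and `synthetic_series` are hypothetical helpers, and the sample series is synthetic (a slow exponential decay plus a fast cosine), not data read from the chart.

```python
import math


def synthetic_series(n):
    """Synthetic stand-in: slow decay plus a fast oscillation (NOT chart data)."""
    return [0.6 * math.exp(-m / 2000.0) + 0.05 * math.cos(1.3 * m) for m in range(n)]


def moving_average(values, window):
    """Trailing moving average; the first window-1 outputs use a shorter window."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the sample that left the window
        out.append(running / min(i + 1, window))
    return out


def roughness(values):
    """Mean absolute difference between consecutive samples."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    raw = synthetic_series(4000)
    smooth = moving_average(raw, 50)
    print(f"roughness raw:      {roughness(raw):.4f}")
    print(f"roughness smoothed: {roughness(smooth):.4f}")
```

The smoothed curve is what the "mean value" readings in the tables approximate by eye. The window length of 50 is an arbitrary choice for this sketch.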
---
## 4. Comparative Summary Table
| Base Value | Initial Value of $B_{m, \theta}$ | Effective Range in relative distance (before mean $\approx$ 0) |
| :--- | :--- | :--- |
| **1e2** | ~0.25 | < 500 |
| **1e3** | ~0.50 | ~1,000 |
| **1e4** | ~0.62 (left panel) | ~8,000 to 10,000 |
| **1e5** | ~0.55 | > 32,000 (mean still ~0.1 at 30,000) |
| **1e6** | ~0.65 | > 32,000 (mean still ~0.25 at 30,000) |
**Conclusion:** Increasing the `base` parameter extends the distance over which $B_{m, \theta}$ stays clearly positive, from under 500 at `base:1e2` to beyond 32,000 at `base:1e5` and `base:1e6`, effectively "stretching" the positional encoding's reach.
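
One rough way to see why a larger base stretches the reach: under the standard RoPE parameterization $\theta_i = \text{base}^{-2i/d}$, each rotary pair has wavelength $2\pi / \theta_i$, so the slowest pair's wavelength grows almost linearly with the base. The sketch below computes it, assuming a head dimension of `dim=128` (not stated in the chart) and hypothetical helper names. This wavelength is only a proxy and is not the "Effective Range" column of the table.

```python
import math


def rope_wavelengths(base, dim=128):
    """Wavelength in positions of each rotary pair, assuming theta_i = base^(-2i/dim)."""
    return [2.0 * math.pi * base ** (2.0 * i / dim) for i in range(dim // 2)]


def longest_wavelength(base, dim=128):
    """Wavelength of the slowest-rotating pair (i = dim/2 - 1)."""
    return rope_wavelengths(base, dim)[-1]


if __name__ == "__main__":
    for base in (1e2, 1e3, 1e4, 1e5, 1e6):
        span = longest_wavelength(base)
        print(f"base={base:.0e}  longest wavelength ~ {span:,.0f} positions")
```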