# Technical Document Extraction: Heatmap Analysis of $\phi(A_{r=64}, A'_{r=64}, i, j)$
## 1. Header Information
* **Main Title:** $\phi(A_{r=64}, A'_{r=64}, i, j)$
* **Description:** This image contains a grid of eight heatmaps organized into two rows and four columns. The visualization represents a mathematical function $\phi$ applied to weight matrices (likely LoRA adapters) across different layers of a neural network.
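
The image itself does not define $\phi$. One plausible reading, stated here as an assumption rather than something the figure confirms, is the normalized subspace similarity used in the LoRA paper, where $U_A^{i}$ holds the top-$i$ right-singular vectors of $A$ as columns:

$$
\phi(A, A', i, j) = \frac{\lVert {U_A^{i}}^{\top} U_{A'}^{j} \rVert_F^2}{\min(i, j)} \in [0, 1]
$$

Under this reading, a value of 1 means the two subspaces overlap completely and a value of 0 means they are orthogonal.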
## 2. Component Isolation
### A. Global Legend (Color Bar)
* **Spatial Placement:** Located on the far right of the image.
* **Scale:** Linear numerical scale ranging from **0.0 to 0.8**.
* **Color Mapping:**
    * **0.0 (Dark Purple/Black):** Low value/correlation.
    * **0.4 (Magenta/Pink):** Mid-range value.
    * **0.8 (Light Peach/White):** High value/correlation.
### B. Axis Definitions
* **Y-Axis (Vertical):** Labeled as **$i$**. Markers are provided at intervals of 6: `1, 7, 13, 19, 25, 31, 37, 43, 49, 55, 61`.
* **X-Axis (Horizontal):** Labeled as **$j$**. Markers are provided at intervals of 5: `1, 6, 11, 16, 21, 26, 31, 36, 41, 46, 51, 56, 61`.
* **Layer Labels (one per pair of adjacent heatmaps):**
    * Left of the grid, top row: **Layer 1** (columns 1 and 2)
    * Left of the grid, bottom row: **Layer 64** (columns 1 and 2)
    * Middle of the grid, top row: **Layer 32** (columns 3 and 4)
    * Middle of the grid, bottom row: **Layer 96** (columns 3 and 4)
* **Sub-headers (Top of each heatmap):**
    * Columns 1 & 3: **$\Delta W_q$** (Query weight delta)
    * Columns 2 & 4: **$\Delta W_v$** (Value weight delta)
## 3. Data Table: Heatmap Grid Organization
| Grid Row | Col 1: $\Delta W_q$ | Col 2: $\Delta W_v$ | Col 3: $\Delta W_q$ | Col 4: $\Delta W_v$ |
| :--- | :--- | :--- | :--- | :--- |
| **Top** | Layer 1, Heatmap (1,1) | Layer 1, Heatmap (1,2) | Layer 32, Heatmap (1,3) | Layer 32, Heatmap (1,4) |
| **Bottom** | Layer 64, Heatmap (2,1) | Layer 64, Heatmap (2,2) | Layer 96, Heatmap (2,3) | Layer 96, Heatmap (2,4) |
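
The grid organization can also be written down as a small lookup, which is convenient when referring to panels by position. This is a minimal sketch assuming the layout implied by the table (top row: Layers 1 and 32; bottom row: Layers 64 and 96); the names `LAYERS`, `MATRICES`, and `panel_label` are hypothetical and chosen only for illustration.

```python
# Hypothetical reconstruction of the 2 x 4 panel layout described in the table.
LAYERS = [[1, 32], [64, 96]]             # layer shown by each (grid row, column pair)
MATRICES = ["Delta W_q", "Delta W_v"]    # sub-header order within each column pair


def panel_label(row: int, col: int) -> str:
    """Return the label of the heatmap at 1-based grid position (row, col)."""
    if not (1 <= row <= 2 and 1 <= col <= 4):
        raise ValueError("the grid has 2 rows and 4 columns")
    layer = LAYERS[row - 1][(col - 1) // 2]   # columns 1-2 share a layer, as do 3-4
    matrix = MATRICES[(col - 1) % 2]          # odd columns are W_q, even are W_v
    return f"Layer {layer}, {matrix}"


# Tick positions as listed in the axis definitions.
Y_TICKS = list(range(1, 62, 6))   # steps of 6, starting at 1
X_TICKS = list(range(1, 62, 5))   # steps of 5, starting at 1
```

For example, `panel_label(2, 3)` refers to the Layer 96 query panel in the bottom row.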
## 4. Trend Verification and Data Extraction
### General Visual Trends
All eight heatmaps share a consistent **"Top-Left Heavy"** pattern. The highest values (lightest colors, ~0.6 to 0.8) are concentrated at the lowest indices (around $i=1, j=1$). As $i$ and $j$ increase toward 61, the last labeled tick, the values decay toward 0.0 (darker colors).
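
To make the meaning of one heatmap concrete, the sketch below computes a full $(i, j)$ grid for a pair of matrices. It assumes that $\phi$ is a normalized subspace similarity of the form $\lVert {U_A^{i}}^{\top} U_{A'}^{j} \rVert_F^2 / \min(i, j)$, as in the LoRA paper; the image does not confirm this definition. The function name `phi_grid`, the dimension 256, and the random stand-in matrices are all hypothetical.

```python
import numpy as np


def phi_grid(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Grid of phi(A, B, i, j) for A of shape (r_a, d) and B of shape (r_b, d).

    Assumed definition (not confirmed by the source):
        phi = ||U_A[:, :i].T @ U_B[:, :j]||_F^2 / min(i, j)
    where the columns of U_A and U_B are right-singular vectors ordered by
    decreasing singular value.
    """
    # np.linalg.svd returns Vh, whose rows are right-singular vectors;
    # transpose so that they become columns.
    U_A = np.linalg.svd(A, full_matrices=False)[2].T   # shape (d, r_a)
    U_B = np.linalg.svd(B, full_matrices=False)[2].T   # shape (d, r_b)
    overlap_sq = (U_A.T @ U_B) ** 2                    # squared inner products
    # A 2-D cumulative sum gives, at entry (i-1, j-1), the squared Frobenius
    # norm of the top-i by top-j block of the overlap matrix.
    block_norms = overlap_sq.cumsum(axis=0).cumsum(axis=1)
    i_idx = np.arange(1, overlap_sq.shape[0] + 1)
    j_idx = np.arange(1, overlap_sq.shape[1] + 1)
    return block_norms / np.minimum.outer(i_idx, j_idx)


# Random stand-ins for A_{r=64} and A'_{r=64}; real adapters would be loaded instead.
rng = np.random.default_rng(0)
A = rng.standard_normal((64, 256))
A_prime = rng.standard_normal((64, 256))
grid = phi_grid(A, A_prime)   # shape (64, 64); rows index i, columns index j
```

Comparing a matrix with itself yields 1 everywhere, which is a quick sanity check on the normalization. Random matrices are not expected to reproduce the top-left-heavy pattern described above.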
### Specific Heatmap Analysis
#### 1. Layer 1 ($\Delta W_q$ and $\Delta W_v$)
* **Trend:** These two panels show the highest overall intensity of the eight heatmaps.
* **$\Delta W_q$:** High values (~0.7) persist along the top edge ($i=1$) and left edge ($j=1$). The "bright" region extends further into the center than in deeper layers.
* **$\Delta W_v$:** Similar to $\Delta W_q$, but the decay toward the bottom-right is slightly more gradual.
#### 2. Layer 32 ($\Delta W_q$ and $\Delta W_v$)
* **Trend:** Significant reduction in intensity compared to Layer 1.
* **$\Delta W_q$:** The high-value region is strictly confined to the very first few indices. Most of the map is dark purple (~0.1 - 0.2).
* **$\Delta W_v$:** Shows a slightly stronger vertical band at $j=1$ compared to its $\Delta W_q$ counterpart.
#### 3. Layer 64 ($\Delta W_q$ and $\Delta W_v$)
* **Trend:** Intensity is higher than Layer 32 but lower than Layer 1.
* **$\Delta W_q$:** Notable horizontal "streaking" at the top ($i=1$ to $i=7$), suggesting higher values of $\phi$ for the leading $i$ indices across a wide range of $j$.
* **$\Delta W_v$:** Displays a very prominent vertical "bright" bar at $j=1$ to $j=6$, suggesting that $\phi$ stays high for these leading $j$ indices across a wide range of $i$.
#### 4. Layer 96 ($\Delta W_q$ and $\Delta W_v$)
* **Trend:** The most "sparse" or "dim" of the layers shown.
* **$\Delta W_q$:** Almost entirely dark (values < 0.2) except for the extreme top-left corner.
* **$\Delta W_v$:** Shows a very sharp, thin line of higher values at $j=1$, with the rest of the matrix approaching 0.0.
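
The per-layer observations above are read off the image by eye. If the underlying grids were available as arrays, the same comparisons could be made numerically. The sketch below is one hedged way to do that; the function `summarize`, the band width `k=6`, and the synthetic grid are assumptions for illustration and do not come from the figure.

```python
import numpy as np


def summarize(grid: np.ndarray, k: int = 6) -> dict:
    """Summary statistics for one heatmap; rows index i, columns index j."""
    return {
        "mean": float(grid.mean()),                         # overall intensity
        "top_left_mean": float(grid[:k, :k].mean()),        # lowest k indices on both axes
        "bottom_right_mean": float(grid[-k:, -k:].mean()),  # highest k indices on both axes
        "first_cols_mean": float(grid[:, :k].mean()),       # vertical band at low j
        "first_rows_mean": float(grid[:k, :].mean()),       # horizontal band at low i
    }


# Synthetic stand-in that decays away from (i, j) = (1, 1); this is not real data.
i, j = np.meshgrid(np.arange(64), np.arange(64), indexing="ij")
synthetic = 0.8 * np.exp(-(i + j) / 20.0)
stats = summarize(synthetic)
```

A "top-left heavy" panel has `top_left_mean` well above `bottom_right_mean`, and a vertical bright bar at low $j$ shows up as `first_cols_mean` exceeding `first_rows_mean`.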
## 5. Summary of Findings
The data indicates that the function $\phi$ (likely measuring similarity or magnitude of weight updates) takes its largest values in the **initial dimensions** (low $i, j$ indices) of the weight matrices. This effect is most pronounced in the **earliest layer shown** (Layer 1) and is weakest and most localized in the deepest (Layer 96). The decline is not strictly monotonic, however: Layer 64 is described as brighter than Layer 32. There is also a visible difference between Query ($\Delta W_q$) and Value ($\Delta W_v$) updates, with Value updates often maintaining a stronger vertical "column" of high values at the first $j$ index.