# Technical Document Extraction: Image Analysis
## Section 1: Synchronized Partial Softmax Update (a)
### Diagram Components
1. **Core Operations**:
   - `Attention N-1` → `mul1` → `max` → `exp` → `sum` → `mul2` → `Attention N+1`
   - **Synchronized Update Path**:
     - `mul1` and `mul2` correspond to the operations labeled ② and ④ in panel (a)
     - `sum` precedes `mul2`, which feeds `Attention N+1`
2. **Key Textual Elements**:
   - **Red Text**: "Synchronized partial softmax update"
   - **Blue Text**: "Asynchronous softmax with unified max value"
   - **Section Label**: "Section 3"
### Workflow
- **Synchronized Path**:
  - Input: `Attention N-1`
  - Operations: `mul1` → `max` → `exp` → `sum` → `mul2`
  - Output: `Attention N+1`
- **Asynchronous Path**:
  - Input: `Attention N-1`
  - Operations: `mul1` → `exp` → `mul2` → `sum`
  - Output: `Attention N+1`
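
The difference between the two paths can be sketched in plain Python. This is a minimal illustration under assumptions, not the kernel the diagram depicts: the function names, the split of scores into chunks, and the fixed value `phi` standing in for the "unified max value" are all hypothetical. The diagram text does not say what `mul1` and `mul2` compute, so they are left out.

```python
import math


def softmax_reference(scores):
    """Ordinary softmax over the full score list, used as ground truth."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_synchronized(chunks):
    """Partial softmax with a running max (assumes non-empty chunks).

    Every chunk can raise the max, which forces the sum accumulated so far
    to be rescaled. That rescale is the synchronization point.
    """
    running_max = float("-inf")
    running_sum = 0.0
    for chunk in chunks:
        new_max = max(running_max, max(chunk))
        # Bring the previously accumulated sum onto the new max.
        running_sum *= math.exp(running_max - new_max)
        running_sum += sum(math.exp(s - new_max) for s in chunk)
        running_max = new_max
    return [math.exp(s - running_max) / running_sum
            for chunk in chunks for s in chunk]


def softmax_unified_max(chunks, phi):
    """Partial softmax with one fixed scaling value phi (hypothetical name).

    No chunk needs to know another chunk's max, so the partial sums are
    independent and only have to be added at the end.
    """
    partial_sums = [sum(math.exp(s - phi) for s in chunk) for chunk in chunks]
    total = sum(partial_sums)
    return [math.exp(s - phi) / total for chunk in chunks for s in chunk]


chunks = [[0.5, 2.0], [-1.0, 3.5], [1.0, 0.0]]
flat = [s for chunk in chunks for s in chunk]
reference = softmax_reference(flat)
synchronized = softmax_synchronized(chunks)
unified = softmax_unified_max(chunks, phi=2.0)
```

All three results agree up to floating-point rounding. The unified-value variant is only numerically safe while `s - phi` stays in a range where `exp` neither overflows nor underflows to zero, so the choice of `phi` matters.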
---
## Section 2: Under-Utilized Computation of Flat GEMM (b)
### Diagram Components
1. **Flat GEMM Configuration**:
   - **Padding Zeros**:
     - Matrix `A` padded with zeros (dashed red box)
   - **Direct Computation**:
     - `A × B` (direct multiplication)
   - **Optimized Path**:
     - `load A` → `A × B` → `load A'` → `A' × B`
2. **Key Textual Elements**:
   - **Red Text**: "Under-utilized computation of flat GEMM"
   - **Blue Text**: "Flat GEMM optimization with double buffering"
   - **Section Label**: "Section 4"
### Workflow
- **Baseline (Under-Utilized)**:
  - `flat-shape GEMM` → `A × B` (with padding zeros)
- **Optimized**:
  - `load A` → `A × B` → `load A'` → `A' × B` (double buffering)
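
A small cost model shows why both ideas in this panel matter. It is a sketch under stated assumptions: the tile heights (64 and 8) and the unitless load and compute costs are illustrative numbers, not values from the diagram, and all function names are hypothetical.

```python
import math


def padded_utilization(m, tile_m):
    """Fraction of computed rows that hold real data when the M dimension
    of A is zero-padded up to the next multiple of tile_m."""
    padded_rows = math.ceil(m / tile_m) * tile_m
    return m / padded_rows


def serial_time(n_tiles, load, compute):
    """Each tile is loaded, then multiplied, with no overlap."""
    return n_tiles * (load + compute)


def double_buffered_time(n_tiles, load, compute):
    """Two buffers: the load of the next tile overlaps the multiplication
    of the current one, so only the first load and the last multiplication
    are fully exposed."""
    if n_tiles == 0:
        return 0
    return load + (n_tiles - 1) * max(load, compute) + compute


# A flat GEMM with 8 rows, padded to an assumed 64-row tile vs. an 8-row tile.
wasteful = padded_utilization(8, 64)   # 0.125
tight = padded_utilization(8, 8)       # 1.0

# Four tiles, load cost 3, compute cost 2 (arbitrary units).
no_overlap = serial_time(4, load=3, compute=2)             # 20
overlapped = double_buffered_time(4, load=3, compute=2)    # 14
```

In this model padding and buffering are separate effects: a smaller tile height raises the share of useful rows, while double buffering hides whichever of load or compute is shorter behind the other.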
---
## Section 3: Performance Loss to Static Dataflow (c)
### Diagram Components
1. **Static Dataflows**:
   - **Static Dataflow 1**:
     - `GEMM√` (enabled)
     - `Flat GEMM×` (disabled)
     - `GEMV×` (disabled)
   - **Static Dataflow 2**:
     - `GEMM×` (disabled)
     - `Flat GEMM√` (enabled)
     - `GEMV√` (enabled)
2. **Heuristic Dataflow**:
   - All components enabled:
     - `GEMM√` (enabled)
     - `Flat GEMM√` (enabled)
     - `GEMV√` (enabled)
3. **Key Textual Elements**:
   - **Red Text**: "Performance loss to static dataflow"
   - **Blue Text**: "Heuristic dataflow with hardware resource adaption"
   - **Section Label**: "Section 5"
### Workflow
- **Static Dataflow 1**:
  - `GEMM` → `static dataflow 1` → `GEMM√`, `Flat GEMM×`, `GEMV×`
- **Static Dataflow 2**:
  - `GEMM` → `static dataflow 2` → `GEMM×`, `Flat GEMM√`, `GEMV√`
- **Heuristic Dataflow**:
  - `GEMM` → `heuristic dataflow` → `GEMM√`, `Flat GEMM√`, `GEMV√`
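
One possible reading of the heuristic path is a dispatcher that routes each workload to a dataflow marked `√` for it. The sketch below encodes the two `√`/`×` tables from the diagram. Everything else is an assumption made for illustration: classifying workloads by row count `m`, the cutoff of 8 rows, and the function names.

```python
# √/× marks copied from panel (c): True means √, False means ×.
STATIC_DATAFLOW_1 = {"GEMM": True, "Flat GEMM": False, "GEMV": False}
STATIC_DATAFLOW_2 = {"GEMM": False, "Flat GEMM": True, "GEMV": True}

DATAFLOWS = (
    ("static dataflow 1", STATIC_DATAFLOW_1),
    ("static dataflow 2", STATIC_DATAFLOW_2),
)


def classify_workload(m):
    """Label a matrix product by the row count m of its left operand.

    The thresholds are hypothetical; the diagram gives no numbers.
    """
    if m == 1:
        return "GEMV"
    if m <= 8:  # assumed cutoff for a "flat" shape
        return "Flat GEMM"
    return "GEMM"


def heuristic_dataflow(m):
    """Return (workload, dataflow name) using the first dataflow marked √."""
    workload = classify_workload(m)
    for name, marks in DATAFLOWS:
        if marks[workload]:
            return workload, name
    raise ValueError(f"no dataflow is marked √ for {workload}")


choices = [heuristic_dataflow(m) for m in (1, 4, 128)]
```

With these tables every workload has a dataflow marked `√`, which matches the all-`√` row shown for the heuristic dataflow, whereas either static dataflow on its own leaves at least one workload marked `×`.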
---
## Metadata
- **Language**: English (primary), with no additional languages detected.
- **Legend Symbols**:
  - `√`: Enabled component
  - `×`: Disabled component
- **Spatial Grounding**:
  - Legend located at the bottom of Section 3 (c), adjacent to dataflow diagrams.
## Key Trends
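1. **Synchronized vs. Asynchronous Softmax**:
   - The synchronized path runs `max` before `exp` and `sum`, so each partial update depends on the current max value.
   - The asynchronous path drops the `max` step (`mul1` → `exp` → `mul2` → `sum`) and relies on a unified max value instead.
2. **Flat GEMM Optimization**:
   - The baseline pads matrix `A` with zeros, which under-utilizes computation; the optimized path applies double buffering, interleaving `load A` and `load A'` with `A × B` and `A' × B`.
3. **Dataflow Configuration**:
   - Each static dataflow carries a mix of `√` and `×` marks across `GEMM`, `Flat GEMM`, and `GEMV`.
   - The heuristic dataflow adapts to hardware resources and is marked `√` for all three.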
## Critical Notes
- **No numerical data points** present; focus on component statuses (`√`/`×`) and workflow logic.
- **No axis titles or numerical scales** in the image; diagrams are symbolic representations.