# Technical Document Extraction: Latency Composition Analysis
## Chart Description
This image presents a stacked bar chart comparing latency composition across three computational configurations. Each bar is split into color-coded segments, one per component, showing that component's share of total latency.
### Key Components
1. **Legend** (Top of chart):
   - GEMMs: Blue
   - Flash: Gray
   - Softmax: Orange
   - DR: Green
   - LN: Yellow
   - Other: Purple
2. **X-Axis Categories**:
   - Small (h=2560, a=20)
   - Large (h=16384, a=128)
   - Large + Flash (h=16384, a=128)
3. **Y-Axis**:
   - Label: "Percentage of Latency (%)"
   - Range: 0-100%
## Data Analysis
### Spatial Grounding Verification
- Legend position: Top-center
- Color consistency confirmed:
  - Blue = GEMMs (dominant in all categories)
  - Gray = Flash (1% in Small and Large, 3% in Large + Flash)
  - Orange = Softmax (largest in Small, shrinking in Large and Large + Flash)
  - Green = DR (6% in Small, 1% elsewhere)
  - Yellow = LN (minimal in all)
  - Purple = Other (consistent 1% across all)
### Trend Verification
1. **Small Configuration**:
   - GEMMs: 68% (blue)
   - Softmax: 12% (orange)
   - DR: 6% (green)
   - LN: 2% (yellow)
   - Flash: 1% (gray)
   - Other: 1% (purple)
2. **Large Configuration**:
   - GEMMs: 94% (blue)
   - Softmax: 3% (orange)
   - DR: 1% (green)
   - LN: 1% (yellow)
   - Flash: 1% (gray)
   - Other: 1% (purple)
3. **Large + Flash Configuration**:
   - GEMMs: 92% (blue)
   - Flash: 3% (gray)
   - Softmax: 2% (orange)
   - DR: 1% (green)
   - LN: 1% (yellow)
   - Other: 1% (purple)
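
Because every bar in a stacked percentage chart should total roughly 100%, the readings above can be sanity-checked by summing each configuration. The following is a minimal sketch, assuming the values exactly as listed; the names `shares`, `bar_totals`, and `off_target`, and the 1-point tolerance, are illustrative choices rather than anything defined by the chart.

```python
# Shares (in %) copied from the per-configuration lists above.
# The dictionary layout and all function names are illustrative assumptions.
shares = {
    "Small": {"GEMMs": 68, "Softmax": 12, "DR": 6, "LN": 2, "Flash": 1, "Other": 1},
    "Large": {"GEMMs": 94, "Softmax": 3, "DR": 1, "LN": 1, "Flash": 1, "Other": 1},
    "Large + Flash": {"GEMMs": 92, "Flash": 3, "Softmax": 2, "DR": 1, "LN": 1, "Other": 1},
}


def bar_totals(data):
    """Return the sum of all segment shares for each configuration."""
    return {config: sum(parts.values()) for config, parts in data.items()}


def off_target(data, target=100, tolerance=1):
    """List configurations whose total differs from `target` by more than `tolerance`."""
    return [
        config
        for config, total in bar_totals(data).items()
        if abs(total - target) > tolerance
    ]


if __name__ == "__main__":
    for config, total in bar_totals(shares).items():
        print(f"{config}: {total}%")
    print("Worth re-reading against the chart:", off_target(shares))
```

With the values as listed, the Small bar totals 90%, the Large bar 101%, and the Large + Flash bar 100%. The Large total is within rounding, but the Small segments are worth re-reading against the original chart.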
## Technical Observations
1. **Dominant Component**: GEMMs account for more than 90% of latency in both Large configurations (94% and 92%), compared with 68% in Small
2. **Flash Impact**: In Large + Flash, the Flash segment grows from 1% to 3% and the GEMMs share falls by 2 percentage points (94% to 92%); because the chart reports shares only, it does not show how total latency changes
3. **Softmax Reduction**: Softmax contribution decreases from 12% (Small) to 3% (Large) and 2% (Large + Flash)
4. **Minor Components**: LN and Other stay at or below 2% in every configuration; DR falls from 6% (Small) to 1% in both Large configurations
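
The comparisons in these observations are differences between shares, so they are most clearly expressed in percentage points. The sketch below shows one way to compute them, assuming the extracted values; `point_change` and `max_share` are hypothetical helper names.

```python
# Shares (in %) as extracted from the chart; names below are illustrative.
shares = {
    "Small": {"GEMMs": 68, "Flash": 1, "Softmax": 12, "DR": 6, "LN": 2, "Other": 1},
    "Large": {"GEMMs": 94, "Flash": 1, "Softmax": 3, "DR": 1, "LN": 1, "Other": 1},
    "Large + Flash": {"GEMMs": 92, "Flash": 3, "Softmax": 2, "DR": 1, "LN": 1, "Other": 1},
}


def point_change(data, component, start, end):
    """Change in a component's share from `start` to `end`, in percentage points."""
    return data[end][component] - data[start][component]


def max_share(data, component):
    """Largest share a component reaches across all configurations."""
    return max(parts[component] for parts in data.values())


if __name__ == "__main__":
    print("GEMMs, Large -> Large + Flash:",
          point_change(shares, "GEMMs", "Large", "Large + Flash"))
    print("Softmax, Small -> Large + Flash:",
          point_change(shares, "Softmax", "Small", "Large + Flash"))
    for component in ("DR", "LN", "Other"):
        print(f"Peak {component} share:", max_share(shares, component))
```

These are changes in relative share only. The chart does not report absolute latency, so a falling share does not by itself mean a component became faster.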
## Data Table Reconstruction
| Configuration | GEMMs (%) | Flash (%) | Softmax (%) | DR (%) | LN (%) | Other (%) |
|--------------------|-----------|-----------|-------------|--------|--------|-----------|
| Small (h=2560) | 68 | 1 | 12 | 6 | 2 | 1 |
| Large (h=16384) | 94 | 1 | 3 | 1 | 1 | 1 |
| Large + Flash | 92 | 3 | 2 | 1 | 1 | 1 |
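
For downstream use, the reconstructed table can be parsed back into records. The sketch below is one possible approach using only the standard library; `parse_markdown_table` is an illustrative helper, not part of any library, and it assumes a simple pipe table with one header row, one separator row, and integer cells.

```python
TABLE = """\
| Configuration | GEMMs (%) | Flash (%) | Softmax (%) | DR (%) | LN (%) | Other (%) |
|--------------------|-----------|-----------|-------------|--------|--------|-----------|
| Small (h=2560) | 68 | 1 | 12 | 6 | 2 | 1 |
| Large (h=16384) | 94 | 1 | 3 | 1 | 1 | 1 |
| Large + Flash | 92 | 3 | 2 | 1 | 1 | 1 |
"""


def parse_markdown_table(text):
    """Parse a simple pipe table into {configuration: {component: share}}."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in text.strip().splitlines()
    ]
    header, body = rows[0], rows[2:]  # rows[1] is the |---| separator row
    # Drop the "(%)" unit suffix so the keys are plain component names.
    components = [name.replace("(%)", "").strip() for name in header[1:]]
    return {
        row[0]: {comp: int(value) for comp, value in zip(components, row[1:])}
        for row in body
    }


if __name__ == "__main__":
    for config, parts in parse_markdown_table(TABLE).items():
        print(config, parts)
```

Keeping the configuration label as the dictionary key preserves the row order of the table and makes lookups such as `parse_markdown_table(TABLE)["Large + Flash"]["Flash"]` straightforward.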
## Language Analysis
- All text appears in English
- No non-English content detected
## Critical Findings
1. **Latency Bottleneck**: GEMMs dominate computational latency in large-scale operations
2. **Flash Contribution**: The Flash segment accounts for only 3% of latency in Large + Flash, where GEMMs still hold 92%; the chart alone does not show whether Flash changes the absolute cost of GEMMs
3. **Component Scaling**: The Softmax and DR shares shrink in the larger configurations (Softmax from 12% to 3% or less, DR from 6% to 1%)