# Technical Document Extraction: Attention Forward Speed Analysis
## Chart Title
Attention forward speed, head dim 128 (H100 80GB SXM5)
## Axes
- **X-axis**: Sequence length (categories: 512, 1k, 2k, 4k, 8k, 16k)
- **Y-axis**: Speed (TFLOPs/s)
## Legend
- **Standard attention**: Blue
- **FlashAttention-2**: Orange
- **Triton**: Green
- **cuDNN**: Red
- **FlashAttention-3**: Purple
## Data Points (by sequence length)
### 512
- Standard attention: 26 TFLOPs/s
- FlashAttention-2: 191 TFLOPs/s
- Triton: 146 TFLOPs/s
- cuDNN: 315 TFLOPs/s
- FlashAttention-3: 292 TFLOPs/s
### 1k
- Standard attention: 31 TFLOPs/s
- FlashAttention-2: 260 TFLOPs/s
- Triton: 273 TFLOPs/s
- cuDNN: 410 TFLOPs/s
- FlashAttention-3: 423 TFLOPs/s
### 2k
- Standard attention: 34 TFLOPs/s
- FlashAttention-2: 298 TFLOPs/s
- Triton: 323 TFLOPs/s
- cuDNN: 484 TFLOPs/s
- FlashAttention-3: 521 TFLOPs/s
### 4k
- Standard attention: 35 TFLOPs/s
- FlashAttention-2: 319 TFLOPs/s
- Triton: 353 TFLOPs/s
- cuDNN: 518 TFLOPs/s
- FlashAttention-3: 579 TFLOPs/s
### 8k
- Standard attention: 35 TFLOPs/s
- FlashAttention-2: 333 TFLOPs/s
- Triton: 369 TFLOPs/s
- cuDNN: 529 TFLOPs/s
- FlashAttention-3: 602 TFLOPs/s
### 16k
- Standard attention: OOM (Out of Memory)
- FlashAttention-2: 335 TFLOPs/s
- Triton: 378 TFLOPs/s
- cuDNN: 539 TFLOPs/s
- FlashAttention-3: 616 TFLOPs/s
## Key Trends
1. **Standard attention** (blue):
- Speed increases linearly from 26 (512) to 35 (8k), then plateaus at 35 (16k) with OOM.
2. **FlashAttention-2** (orange):
- Speed increases steadily from 191 (512) to 335 (16k).
3. **Triton** (green):
- Speed increases from 146 (512) to 378 (16k), with consistent growth across all sequence lengths.
4. **cuDNN** (red):
- Speed increases from 315 (512) to 539 (16k), showing strong scalability.
5. **FlashAttention-3** (purple):
- Speed increases from 292 (512) to 616 (16k), outperforming all other methods at larger sequence lengths.
## Spatial Grounding
- Legend located in the **top-right corner** of the chart.
- Bar colors strictly match legend labels (e.g., red bars = cuDNN).
## Component Isolation
1. **Header**: Chart title and axis labels.
2. **Main Chart**: Bar groups for each sequence length, with color-coded methods.
3. **Footer**: OOM annotation for Standard attention at 16k.
## Validation
- All legend colors cross-verified with bar colors.
- Numerical values extracted directly from bar labels.
- Trends confirmed visually (e.g., FlashAttention-3 consistently highest).