# Technical Document Extraction: Bar Chart Analysis
## Chart Type
Bar chart comparing QA method accuracies across datasets.
## Axis Labels
- **X-axis**: Datasets (`Bamboogle`, `HotpotQA`, `MuSiQue`, `2WikiMultiHopQA`, `Average`)
- **Y-axis**: Accuracy (0–80%)
## Legend
- **Position**: Top of chart
- **Labels & Colors**:
  - Zero-shot QA: Gray
  - Many-shot QA: Orange
  - RAG: Purple
  - DRAG: Blue
  - IterDRAG: Green
## Categories & Sub-Categories
- **Datasets** (X-axis):
  - Bamboogle
  - HotpotQA
  - MuSiQue
  - 2WikiMultiHopQA
  - Average
- **QA Methods** (Legend):
  - Zero-shot QA
  - Many-shot QA
  - RAG
  - DRAG
  - IterDRAG
## Data Points & Trends
### Bamboogle
- **Zero-shot QA**: 19.2 (Gray)
- **Many-shot QA**: 24.8 (Orange)
- **RAG**: 52.8 (Purple)
- **DRAG**: 57.6 (Blue)
- **IterDRAG**: 68.8 (Green)
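**Trend**: Accuracy rises with every bar from gray (Zero-shot QA) to green (IterDRAG); the largest single step is from Many-shot QA to RAG (+28.0 points).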
### HotpotQA
- **Zero-shot QA**: 25.2 (Gray)
- **Many-shot QA**: 26.2 (Orange)
- **RAG**: 50.9 (Purple)
- **DRAG**: 52.2 (Blue)
- **IterDRAG**: 56.4 (Green)
**Trend**: The same upward progression as in Bamboogle, but with a smaller gain from DRAG to IterDRAG (+4.2 points).
### MuSiQue
- **Zero-shot QA**: 6.6 (Gray)
- **Many-shot QA**: 8.5 (Orange)
- **RAG**: 16.8 (Purple)
- **DRAG**: 18.2 (Blue)
- **IterDRAG**: 30.5 (Green)
**Trend**: Lowest accuracy of any dataset for every method; IterDRAG leads DRAG by 12.3 points.
### 2WikiMultiHopQA
- **Zero-shot QA**: 30.7 (Gray)
- **Many-shot QA**: 34.3 (Orange)
- **RAG**: 48.4 (Purple)
- **DRAG**: 53.3 (Blue)
- **IterDRAG**: 76.9 (Green)
**Trend**: Highest single accuracy in the chart (IterDRAG, 76.9); the steepest step is from DRAG to IterDRAG (+23.6 points).
### Average
- **Zero-shot QA**: 20.4 (Gray)
- **Many-shot QA**: 23.5 (Orange)
- **RAG**: 42.2 (Purple)
- **DRAG**: 45.4 (Blue)
- **IterDRAG**: 58.2 (Green)
**Trend**: IterDRAG maintains highest average accuracy.
## Spatial Grounding
- **Legend**: Top-center of chart.
- **Bar Colors**: Match legend labels exactly (e.g., green bars = IterDRAG).
## Trend Verification
- **IterDRAG**: Consistently highest accuracy across all datasets.
- **Zero-shot QA**: Lowest accuracy on every dataset, from 6.6 (MuSiQue) to 30.7 (2WikiMultiHopQA).
- **DRAG**: Second-highest accuracy on every dataset, e.g. Bamboogle (57.6) and 2WikiMultiHopQA (53.3).
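The ordering claims above can be checked mechanically against the transcribed values. The following is a minimal sketch; the names `METHODS`, `ACCURACY`, `ranking`, and `is_strictly_increasing` are illustrative and not part of the source chart.

```python
# Accuracy values (%) transcribed from the chart, listed in legend order.
METHODS = ["Zero-shot QA", "Many-shot QA", "RAG", "DRAG", "IterDRAG"]
ACCURACY = {
    "Bamboogle": [19.2, 24.8, 52.8, 57.6, 68.8],
    "HotpotQA": [25.2, 26.2, 50.9, 52.2, 56.4],
    "MuSiQue": [6.6, 8.5, 16.8, 18.2, 30.5],
    "2WikiMultiHopQA": [30.7, 34.3, 48.4, 53.3, 76.9],
    "Average": [20.4, 23.5, 42.2, 45.4, 58.2],
}


def ranking(values):
    """Return method names sorted from lowest to highest accuracy."""
    return [method for _, method in sorted(zip(values, METHODS))]


def is_strictly_increasing(values):
    """True when each bar is taller than the bar to its left."""
    return all(left < right for left, right in zip(values, values[1:]))


for dataset, values in ACCURACY.items():
    order = ranking(values)
    print(f"{dataset}: increasing={is_strictly_increasing(values)}, "
          f"lowest={order[0]}, second-highest={order[-2]}, highest={order[-1]}")
```

If every group reports `increasing=True`, the legend order and the accuracy ranking coincide on every dataset.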
## Component Isolation
1. **Header**: Chart title (implied) and legend.
2. **Main Chart**: Bar groupings for each dataset.
3. **Footer**: No explicit footer; y-axis label at left.
## Critical Observations
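- **IterDRAG** beats the next-best method (DRAG) by 4.2 to 23.6 percentage points; the margin exceeds 10 points on every dataset except HotpotQA.
- **MuSiQue** shows the largest relative gap between methods (30.5 for IterDRAG vs. 6.6 for Zero-shot QA, about 4.6x) but the smallest absolute gap (23.9 points); Bamboogle has the largest absolute gap (49.6 points).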
- **2WikiMultiHopQA** has the highest absolute accuracy (76.9 for IterDRAG).
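The gaps behind these observations are simple differences and ratios of the transcribed values, so they can be recomputed directly. The sketch below is illustrative only; the helper `gaps` and the dictionary names are hypothetical, and "margin", "spread", and "ratio" are labels chosen here, not terms from the source.

```python
# Per-dataset accuracies (%) for the three methods needed to compute the gaps.
ZERO_SHOT = {"Bamboogle": 19.2, "HotpotQA": 25.2, "MuSiQue": 6.6, "2WikiMultiHopQA": 30.7}
DRAG = {"Bamboogle": 57.6, "HotpotQA": 52.2, "MuSiQue": 18.2, "2WikiMultiHopQA": 53.3}
ITERDRAG = {"Bamboogle": 68.8, "HotpotQA": 56.4, "MuSiQue": 30.5, "2WikiMultiHopQA": 76.9}


def gaps(dataset):
    """Return (margin, spread, ratio) for one dataset.

    margin: IterDRAG minus DRAG, in percentage points
    spread: IterDRAG minus Zero-shot QA, in percentage points
    ratio:  IterDRAG divided by Zero-shot QA
    """
    margin = round(ITERDRAG[dataset] - DRAG[dataset], 1)
    spread = round(ITERDRAG[dataset] - ZERO_SHOT[dataset], 1)
    ratio = round(ITERDRAG[dataset] / ZERO_SHOT[dataset], 2)
    return margin, spread, ratio


for name in ITERDRAG:
    margin, spread, ratio = gaps(name)
    print(f"{name}: margin={margin}, spread={spread}, ratio={ratio}")

# Datasets with the extreme values of each measure.
print("largest spread:", max(ITERDRAG, key=lambda d: gaps(d)[1]))
print("largest ratio:", max(ITERDRAG, key=lambda d: gaps(d)[2]))
print("smallest margin:", min(ITERDRAG, key=lambda d: gaps(d)[0]))
```

Reporting both the absolute spread and the ratio matters here because a dataset with a low baseline can show a large relative gap and a small absolute one.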
## Data Table Reconstruction
| Dataset | Zero-shot QA | Many-shot QA | RAG | DRAG | IterDRAG |
|-----------------------|--------------|--------------|-------|-------|----------|
| Bamboogle | 19.2 | 24.8 | 52.8 | 57.6 | 68.8 |
| HotpotQA | 25.2 | 26.2 | 50.9 | 52.2 | 56.4 |
| MuSiQue | 6.6 | 8.5 | 16.8 | 18.2 | 30.5 |
| 2WikiMultiHopQA | 30.7 | 34.3 | 48.4 | 53.3 | 76.9 |
| **Average** | 20.4 | 23.5 | 42.2 | 45.4 | 58.2 |
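As a consistency check on the reconstructed table, the Average row can be compared with the unweighted mean of the four dataset rows. This is a sketch under the assumption that the chart's average is a plain mean over the four datasets, which the source does not state; `ROWS`, `REPORTED_AVERAGE`, and `deviations` are illustrative names.

```python
from statistics import mean

METHODS = ["Zero-shot QA", "Many-shot QA", "RAG", "DRAG", "IterDRAG"]

# Dataset rows of the reconstructed table (accuracy, %).
ROWS = {
    "Bamboogle": [19.2, 24.8, 52.8, 57.6, 68.8],
    "HotpotQA": [25.2, 26.2, 50.9, 52.2, 56.4],
    "MuSiQue": [6.6, 8.5, 16.8, 18.2, 30.5],
    "2WikiMultiHopQA": [30.7, 34.3, 48.4, 53.3, 76.9],
}

# Average row exactly as read from the chart.
REPORTED_AVERAGE = [20.4, 23.5, 42.2, 45.4, 58.2]

# Column-wise mean over the four datasets (assumed unweighted).
recomputed = [mean(column) for column in zip(*ROWS.values())]
deviations = [abs(r - a) for r, a in zip(recomputed, REPORTED_AVERAGE)]

for method, value, reported, dev in zip(METHODS, recomputed, REPORTED_AVERAGE, deviations):
    print(f"{method}: recomputed={value:.3f}, reported={reported}, |diff|={dev:.3f}")
```

Every recomputed mean lands within 0.1 of the reported value. The small residuals would be consistent with the chart's averages having been computed from unrounded per-dataset scores, though that is an assumption and not something the chart confirms.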
## Language Notes
- All text is in English. No non-English content detected.