## Bar Charts and Heatmaps: User Study and Activation Map Analysis
### Overview
The image contains two primary sections:
1. **User Study** (left side): Three bar charts comparing user preferences, sound localization accuracy, and ablation study results.
2. **Activation Map** (right side): Six heatmaps visualizing spatial activation patterns for "Mono2Binaural" and "Ours" methods.
---
### Components/Axes
#### User Study (Bar Charts)
1. **Stereo Preference**
- **X-axis**: Categories: Mono-Mono (22%), Mono2Binaural (33%), Ours (45%).
- **Y-axis**: Percentage (0%–100%).
- **Legend**: Located at the bottom, with colors:
- Green: Mono-Mono
- Blue: Mono2Binaural
- Yellow: Ours
2. **Sound Localization Accuracy**
- **X-axis**: Categories: Mono-Mono (32%), Mono2Binaural (51%), Ours (60%), Ground Truth (81%).
- **Y-axis**: Percentage (0%–90%).
- **Legend**: Located at the bottom, with colors:
- Green: Mono-Mono
- Blue: Mono2Binaural
- Yellow: Ours
- Orange: Ground Truth
3. **Ablation Study**
- **X-axis**: Categories: HRIR (22%), Ambisonic (23%), Ours (55%).
- **Y-axis**: Percentage (0%–100%).
- **Legend**: Located at the bottom, with colors:
- Light orange: HRIR
- Brown: Ambisonic
- Yellow: Ours
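
The bar values above can be held as plain data for simple consistency checks. This is a minimal sketch: the names `USER_STUDY` and `is_share_chart` are hypothetical, and the numbers are only transcribed from the bar labels. It checks which charts sum to 100%, which would be consistent with bars that are shares of a single vote rather than independent scores.

```python
# Values transcribed from the bar labels; all names here are illustrative.
USER_STUDY = {
    "Stereo Preference": {"Mono-Mono": 22, "Mono2Binaural": 33, "Ours": 45},
    "Sound Localization Accuracy": {
        "Mono-Mono": 32, "Mono2Binaural": 51, "Ours": 60, "Ground Truth": 81,
    },
    "Ablation Study": {"HRIR": 22, "Ambisonic": 23, "Ours": 55},
}


def is_share_chart(bars):
    """Return True when the bar values sum to exactly 100."""
    return sum(bars.values()) == 100


for title, bars in USER_STUDY.items():
    kind = "shares of one vote" if is_share_chart(bars) else "independent scores"
    print(f"{title}: total={sum(bars.values())}% -> {kind}")
```

Under this check, the Stereo Preference and Ablation Study bars behave like shares, while the Sound Localization Accuracy bars do not.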
#### Activation Map (Heatmaps)
- **Labels**:
- Top row: "Mono2Binaural [14]" (left), "Ours" (right).
- Bottom row: "Mono2Binaural [14]" (left), "Ours" (right).
- **Color Scheme**:
- Red: High activation.
- Green: Moderate activation.
- Blue: Low activation.
- Purple: Minimal activation.
---
### Detailed Analysis
#### User Study
1. **Stereo Preference**
- "Ours" (45%) outperforms "Mono2Binaural" (33%) and "Mono-Mono" (22%).
- **Trend**: Increasing preference from left to right.
2. **Sound Localization Accuracy**
- "Ground Truth" (81%) is the highest, followed by "Ours" (60%), "Mono2Binaural" (51%), and "Mono-Mono" (32%).
- **Gap**: "Ours" is 9 percentage points above "Mono2Binaural" and 28 percentage points above "Mono-Mono", but 21 percentage points below "Ground Truth".
3. **Ablation Study**
- "Ours" (55%) significantly outperforms "HRIR" (22%) and "Ambisonic" (23%).
- **Trend**: The "HRIR" and "Ambisonic" variants each score less than half of "Ours" (22% and 23% vs. 55%), a gap of 32 to 33 percentage points.
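
The comparisons above mix two kinds of difference that are easy to confuse: a gap in percentage points and a relative change. The sketch below computes both from the transcribed bar values; the helper names `point_gap` and `relative_change` are hypothetical and not part of any referenced method.

```python
def point_gap(a, b):
    """Absolute difference between two percentages, in percentage points."""
    return a - b


def relative_change(a, b):
    """Change from b to a, expressed as a percentage of b."""
    return 100.0 * (a - b) / b


# Values transcribed from the charts.
accuracy = {"Mono-Mono": 32, "Mono2Binaural": 51, "Ours": 60}
ablation = {"HRIR": 22, "Ambisonic": 23, "Ours": 55}

for name in ("Mono2Binaural", "Mono-Mono"):
    gap = point_gap(accuracy["Ours"], accuracy[name])
    rel = relative_change(accuracy["Ours"], accuracy[name])
    print(f"Ours vs {name}: +{gap} points, {rel:+.1f}% relative")

for name in ("HRIR", "Ambisonic"):
    gap = point_gap(ablation[name], ablation["Ours"])
    rel = relative_change(ablation[name], ablation["Ours"])
    print(f"{name} vs Ours: {gap} points, {rel:+.1f}% relative")
```

For example, the 9-point gap between "Ours" and "Mono2Binaural" is a relative gain of about 17.6%, so the two figures should not be quoted interchangeably.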
#### Activation Map
- **Mono2Binaural [14]**:
- Activation is diffuse, with scattered red/green regions.
- **Ours**:
- Activation is more concentrated, with larger red/yellow regions.
- Suggests stronger spatial focus compared to "Mono2Binaural".
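
The "diffuse" versus "concentrated" distinction is judged by eye here, but it could be quantified if the underlying activation values were available. One possible measure, offered only as an assumption and not taken from the figure, is the normalized Shannon entropy of the map. The function name and the toy 3x3 grids below are invented for illustration.

```python
import math


def normalized_entropy(heatmap):
    """Shannon entropy of a non-negative 2D map, scaled to the range [0, 1].

    Values near 1 mean activation is spread evenly (diffuse);
    values near 0 mean it sits in very few cells (concentrated).
    Assumes at least two cells and a positive total activation.
    """
    cells = [v for row in heatmap for v in row]
    total = sum(cells)
    probs = [v / total for v in cells if v > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(cells))


# Toy maps invented for illustration; NOT values read from the figure.
diffuse = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
focused = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]

print(normalized_entropy(diffuse) > normalized_entropy(focused))
```

A lower score for one method's maps than for the other's would support the visual impression of tighter spatial focus.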
---
### Key Observations
1. **User Preference**: "Ours" is the most preferred method (45% vs. 33% for "Mono2Binaural").
2. **Accuracy**: "Ours" achieves 60% accuracy, 28 percentage points higher than "Mono-Mono" and 9 percentage points higher than "Mono2Binaural".
3. **Ablation**: The "HRIR" (22%) and "Ambisonic" (23%) variants each score less than half of "Ours" (55%), highlighting the contribution of the full method.
4. **Activation Maps**: "Ours" shows more localized activation, aligning with higher user preference and accuracy.
---
### Interpretation
- **User Preference vs. Accuracy**: "Ours" excels in both metrics, suggesting it balances perceptual quality and technical performance.
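- **Ablation Impact**: "Ours" scores 55% against 22–23% for the "HRIR" and "Ambisonic" variants. Because the three bars sum to 100%, these values read as preference shares rather than accuracy, and they indicate that the full method is clearly favored over either variant.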
- **Activation Maps**: The concentrated activation in "Ours" implies better spatial awareness, likely contributing to its superior performance.
- **Ground Truth Benchmark**: "Ground Truth" (81%) serves as the upper reference. "Ours" (60%) comes closest among the generated methods but remains 21 percentage points short.
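
One way to read the Ground Truth comparison is to express each method's accuracy as a fraction of the Ground Truth score. This normalization is an assumption made for illustration, not something shown in the figure, and `share_of_reference` is a hypothetical helper.

```python
# Accuracy values transcribed from the chart; names are illustrative.
accuracy = {"Mono-Mono": 32, "Mono2Binaural": 51, "Ours": 60}
ground_truth = 81


def share_of_reference(score, reference):
    """Score expressed as a percentage of the reference score."""
    return 100.0 * score / reference


for method, score in accuracy.items():
    ratio = share_of_reference(score, ground_truth)
    shortfall = ground_truth - score
    print(f"{method}: {ratio:.1f}% of Ground Truth, {shortfall} points short")
```
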
Taken together, these results suggest that "Ours" is the most effective of the compared methods for spatial audio tasks: it leads in user preference and localization accuracy and shows more focused activation patterns.