## Scatter Plot: Accuracy vs. Compute
### Overview
The image is a scatter plot comparing the accuracy of three series (AC, NLD, and Single Model) against compute, measured in normalized FLOPs on a logarithmic axis. Each series is plotted at several compute levels, and arrows trace the AC series from one point to the next.
### Components/Axes
* **X-axis:** Compute (normalized FLOPs). Scale: 1, 2, 4, 8, 16, 32, 64, 128, 256. Logarithmic scale.
* **Y-axis:** Accuracy (%). Scale: 76, 78, 80, 82, 84, 86.
* **Legend (top-left):**
    * Green: AC (ours)
    * Blue: NLD
    * Orange: Single Model
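
Because the x-axis ticks are successive powers of two, they are evenly spaced on a logarithmic axis. The following minimal sketch illustrates this, assuming a base-2 log axis (the base is an assumption; any base gives even spacing, only the step size changes):

```python
import math

# Tick values as listed for the x-axis (normalized FLOPs).
ticks = [1, 2, 4, 8, 16, 32, 64, 128, 256]

# On a base-2 logarithmic axis, a tick sits at log2 of its value.
positions = [math.log2(t) for t in ticks]

# Distance between neighbouring ticks: one unit each, i.e. one doubling of compute.
spacings = [b - a for a, b in zip(positions, positions[1:])]

print(positions)
print(spacings)
```

Each step to the right on the axis therefore corresponds to doubling the compute, not to adding a fixed amount.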
### Detailed Analysis
* **AC (ours) - Green:**
    * The AC model's accuracy increases at every step as compute increases.
    * Data points:
        * Compute = 4, Accuracy ≈ 81.5%
        * Compute = 8, Accuracy ≈ 84.5%
        * Compute = 64, Accuracy ≈ 85%
    * Arrows connect consecutive AC points in order of increasing compute.
* **NLD - Blue:**
    * The NLD model's accuracy rises slightly between compute 16 and 32, then drops sharply at compute 256.
    * Data points:
        * Compute = 16, Accuracy ≈ 79.5%
        * Compute = 32, Accuracy ≈ 80%
        * Compute = 256, Accuracy ≈ 76%
* **Single Model - Orange:**
    * The Single Model's accuracy is non-monotonic in compute: it dips, peaks at compute 8, then falls again.
    * Data points:
        * Compute = 1, Accuracy ≈ 81% (labeled "3.2-1B")
        * Compute = 4, Accuracy ≈ 79.5% (labeled "3.2-3B")
        * Compute = 8, Accuracy ≈ 84% (labeled "3.1-8B")
        * Compute = 64, Accuracy ≈ 81% (labeled "3.1-70B")
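
The trends described for each series can be checked directly from the listed points. The sketch below is illustrative only: the values are the approximate readings given above, not exact data from the figure, and the helper names `accuracy_steps` and `is_monotonic_increasing` are hypothetical.

```python
# Approximate (compute, accuracy %) pairs as read from the plot.
series = {
    "AC (ours)": [(4, 81.5), (8, 84.5), (64, 85.0)],
    "NLD": [(16, 79.5), (32, 80.0), (256, 76.0)],
    "Single Model": [(1, 81.0), (4, 79.5), (8, 84.0), (64, 81.0)],
}


def accuracy_steps(points):
    """Accuracy change between consecutive points, ordered by compute."""
    ordered = sorted(points)
    return [round(b[1] - a[1], 1) for a, b in zip(ordered, ordered[1:])]


def is_monotonic_increasing(points):
    """True if accuracy never drops as compute grows."""
    return all(step >= 0 for step in accuracy_steps(points))


for name, points in series.items():
    print(name, accuracy_steps(points), is_monotonic_increasing(points))
```

With these readings, only the AC series has no negative step.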
### Key Observations
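* The AC model's accuracy rises at every step, from about 81.5% at compute 4 to about 85% at compute 64; most of the gain (about 3 points) comes between compute 4 and 8.
* The NLD model is nearly flat between compute 16 and 32 (about 79.5% to 80%), then drops about 4 points to roughly 76% at compute 256.
* The Single Model is non-monotonic: it peaks at about 84% for 3.1-8B (compute 8), while the largest model, 3.1-70B (compute 64), reaches only about 81%, the same as 3.2-1B (compute 1).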
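
One way to make the comparison concrete is to look at the compute levels where both AC and Single Model have a point. This is a rough sketch using the approximate readings from the plot; the matched-compute comparison is an illustration, not an analysis taken from the figure itself.

```python
# Approximate accuracy (%) keyed by compute (normalized FLOPs), as read from the plot.
ac = {4: 81.5, 8: 84.5, 64: 85.0}
single = {1: 81.0, 4: 79.5, 8: 84.0, 64: 81.0}

# Compute levels at which both series have a data point.
shared = sorted(ac.keys() & single.keys())

# Accuracy gap (AC minus Single Model) in percentage points at each shared level.
gaps = {c: ac[c] - single[c] for c in shared}

for compute, gap in gaps.items():
    print(f"compute={compute}: AC - Single Model = {gap:+.1f} points")
```

Under these readings, AC sits above Single Model at all three shared compute levels, with the widest gap at compute 64.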
### Interpretation
The plot suggests that the AC model (developed by the authors) benefits from increased compute: its accuracy rises at every measured step, although the gain flattens between compute 8 and 64. The NLD model, on the other hand, does not scale as well with compute; its lowest accuracy occurs at its highest compute level (256). The Single Model's accuracy is not monotonic in compute, so it does not track compute in the same way as the AC model. The arrows connecting the AC model's data points emphasize the improvement in accuracy as compute increases, highlighting the effectiveness of the authors' approach.
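
The notion of accuracy being "correlated with compute" can be made slightly more concrete with a Pearson correlation between log2(compute) and accuracy. This is only a rough sketch: each series has just three or four approximate points, so the coefficients indicate direction rather than statistical significance, and the use of log2(compute) is an assumption that mirrors the logarithmic x-axis.

```python
import math

# Approximate (compute, accuracy %) pairs as read from the plot.
series = {
    "AC (ours)": [(4, 81.5), (8, 84.5), (64, 85.0)],
    "NLD": [(16, 79.5), (32, 80.0), (256, 76.0)],
    "Single Model": [(1, 81.0), (4, 79.5), (8, 84.0), (64, 81.0)],
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


correlations = {}
for name, points in series.items():
    log_compute = [math.log2(c) for c, _ in points]
    accuracy = [a for _, a in points]
    correlations[name] = pearson(log_compute, accuracy)
    print(f"{name}: r = {correlations[name]:.2f}")
```

With these readings the coefficient is positive for AC, negative for NLD, and close to zero for Single Model, which matches the qualitative reading above.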