## Chart: GSM8k Accuracy vs Enforced Token Budget
### Overview
The image is a line chart showing how accuracy on GSM8k (in percent) varies with the enforced token budget. It also marks an "Unconstrained" reference point, which gives the average token count and accuracy achieved with no budget limit.
### Components/Axes
* **Title:** GSM8k Accuracy vs Enforced Token Budget
* **X-axis:** Number of Tokens (Enforced Budget), ranging from 0 to 1400 in increments of 200.
* **Y-axis:** Accuracy (%), ranging from 60 to 100 in increments of 5.
* **Data Series:** A single blue line representing the accuracy at different token budgets.
* **Unconstrained Reference:** Located at the top-right of the chart, indicating "Unconstrained" with "Avg tokens: 1388" and "Accuracy: 96.58%".
* **Percentage Change Labels:** Red and green labels indicating the percentage change in accuracy relative to the unconstrained accuracy.
### Detailed Analysis
The blue line shows accuracy on the GSM8k benchmark at each enforced token budget.
* **Trend:** Accuracy rises with the token budget, steeply up to about 500 tokens, then flattens beyond roughly 800 tokens.
* **Data Points:**
    * At 0 tokens, the accuracy is approximately 57.8%.
    * At 200 tokens, the accuracy is approximately 69.8%.
    * At 500 tokens, the accuracy is approximately 89.4%.
    * At 800 tokens, the accuracy is approximately 93.4%.
    * At 1000 tokens, the accuracy is approximately 95%.
* **Unconstrained Performance:** The unconstrained performance is at 1388 tokens with an accuracy of 96.58%.
* **Percentage Change Labels** (change in accuracy relative to the unconstrained accuracy):
    * At 0 tokens: -40.2%.
    * At 200 tokens: -27.8%.
    * At 500 tokens: -7.1%.
    * At 800 tokens: -3.2%.
    * At 1000 tokens: -1.6%.
    * At 1200 tokens: -26.2% as transcribed. This conflicts with the plateau at 800 and 1000 tokens, and no accuracy value is listed for this budget, so the label is likely misread.
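
The labels appear to be relative changes rather than percentage-point differences. The following is a minimal Python sketch of that arithmetic, assuming the approximate accuracies read off the chart; the helper names `relative_change` and `accuracy_from_label` are hypothetical and introduced only for illustration.

```python
# Unconstrained reference accuracy from the chart (%).
UNCONSTRAINED_ACC = 96.58

# Approximate accuracies read off the chart, keyed by enforced token budget.
accuracy_by_budget = {0: 57.8, 200: 69.8, 500: 89.4, 800: 93.4, 1000: 95.0}


def relative_change(acc, ref=UNCONSTRAINED_ACC):
    """Relative change of acc versus ref, in percent (negative means a loss)."""
    return (acc - ref) / ref * 100.0


def accuracy_from_label(label_pct, ref=UNCONSTRAINED_ACC):
    """Invert a relative-change label back into an absolute accuracy (%)."""
    return ref * (1.0 + label_pct / 100.0)


for budget, acc in accuracy_by_budget.items():
    print(f"{budget:>5} tokens: {acc:5.1f}% -> {relative_change(acc):+.1f}%")
```

Because the accuracies are approximate readings, the computed values can differ from the printed labels by a few tenths of a percent. Inverting the 1200-token label with `accuracy_from_label(-26.2)` gives roughly 71.3%, well below the values at 800 and 1000 tokens.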
### Key Observations
* Accuracy increases sharply from 0 to 500 tokens, from about 57.8% to about 89.4%.
* Accuracy flattens beyond 800 tokens: going from 800 to 1000 tokens adds only about 1.6 percentage points.
* The unconstrained run (96.58% accuracy at an average of 1388 tokens) serves as the reference against which each budget is compared.
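
The diminishing returns can be made concrete by computing the accuracy gain per 100 additional tokens on each segment of the curve. This is a small sketch using the approximate readings from the chart; the function name `gains_per_100_tokens` is an illustrative assumption.

```python
# Approximate (budget, accuracy %) points read off the chart.
points = [(0, 57.8), (200, 69.8), (500, 89.4), (800, 93.4), (1000, 95.0)]


def gains_per_100_tokens(pts):
    """Accuracy gain in percentage points per 100 extra tokens, per segment."""
    gains = []
    for (b0, a0), (b1, a1) in zip(pts, pts[1:]):
        gains.append(((b0, b1), (a1 - a0) / (b1 - b0) * 100.0))
    return gains


for (b0, b1), gain in gains_per_100_tokens(points):
    print(f"{b0:>4} -> {b1:<4} tokens: {gain:.2f} points per 100 tokens")
```

Under these readings, each 100 tokens buys roughly 6 points of accuracy up to 500 tokens and less than 1.5 points beyond that.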
### Interpretation
The chart shows the trade-off between enforced token budget and accuracy on GSM8k. At small budgets, each additional token buys a large accuracy gain; beyond roughly 800 tokens the gains become marginal, so a budget in the 800 to 1000 token range recovers most of the unconstrained accuracy at a lower token cost. The unconstrained result (96.58% at an average of 1388 tokens) is the reference point for this comparison rather than a proven ceiling, and the percentage change labels quantify how much accuracy each budget gives up relative to it.
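
One way to act on this trade-off is to pick the smallest measured budget whose relative accuracy loss stays within a chosen tolerance. The sketch below assumes the approximate chart readings and a hypothetical helper `smallest_budget_within`; it considers only the plotted budgets and does not interpolate between them.

```python
# Unconstrained reference accuracy from the chart (%).
UNCONSTRAINED_ACC = 96.58

# Approximate (budget, accuracy %) points read off the chart.
points = [(0, 57.8), (200, 69.8), (500, 89.4), (800, 93.4), (1000, 95.0)]


def smallest_budget_within(pts, max_relative_loss_pct, ref=UNCONSTRAINED_ACC):
    """Return the smallest budget whose relative loss versus ref is within
    max_relative_loss_pct, or None if no measured budget qualifies."""
    for budget, acc in sorted(pts):
        loss_pct = (ref - acc) / ref * 100.0
        if loss_pct <= max_relative_loss_pct:
            return budget
    return None


for tolerance in (10.0, 5.0, 2.0):
    print(f"within {tolerance}% of unconstrained: {smallest_budget_within(points, tolerance)}")
```

The tolerance is a design choice: a looser bound favors token savings, while a tighter one pushes the budget toward the unconstrained average.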