## Line Chart: Cost per Sequence vs. Sequence Number (Thousands)
### Overview
The chart compares the cost per sequence (in bits) across three models: LSTM, NTM with LSTM Controller, and NTM with Feedforward Controller. The x-axis represents sequence numbers (in thousands), and the y-axis represents cost per sequence. All three models show a decreasing trend in cost as sequence numbers increase, with NTM-based models outperforming the baseline LSTM.
### Components/Axes
- **X-axis**: Sequence number (thousands), ranging from 0 to 1000.
- **Y-axis**: Cost per sequence (bits), ranging from 0 to 140.
- **Legend**: Located on the right, with three entries:
  - **LSTM**: Blue circles (●).
  - **NTM with LSTM Controller**: Green squares (■).
  - **NTM with Feedforward Controller**: Red triangles (▲).
### Detailed Analysis
1. **LSTM (Blue Circles)**:
   - Starts at ~120 bits at sequence 0.
   - Gradually decreases to ~55 bits by sequence 1000k.
   - Trend: Steady, linear decline with minimal fluctuation.
2. **NTM with LSTM Controller (Green Squares)**:
   - Starts at ~120 bits at sequence 0.
   - Sharp drop to ~20 bits by sequence 200k.
   - Stabilizes around 20 bits for sequences 400k–1000k.
   - Trend: Rapid initial improvement, then plateau.
3. **NTM with Feedforward Controller (Red Triangles)**:
   - Starts at ~120 bits at sequence 0.
   - Sharp drop to ~15 bits by sequence 200k.
   - Slight increase to ~20 bits by sequence 400k, then stabilizes.
   - Trend: Rapid improvement with minor mid-range fluctuation.
### Key Observations
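- Both NTM variants show diminishing returns: nearly all of their improvement occurs in the first 200k sequences, after which cost plateaus. The LSTM, by contrast, declines steadily across the full range.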
- NTM-based models (LSTM Controller and Feedforward Controller) outperform the baseline LSTM: at 1000k sequences they sit at ~20 bits versus ~55 bits, a cost roughly 64% lower.
- The Feedforward Controller achieves the lowest cost (~15 bits at 200k) but rebounds slightly to ~20 bits by 400k, unlike the LSTM Controller.
- The LSTM model's cost stays above that of both NTM variants after the shared starting point and is still ~55 bits at 1000k.
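
The size of the gap between the models can be checked with a short calculation. The sketch below uses only the approximate values read off the chart in the Detailed Analysis section; the names `FINAL_COST` and `relative_reduction` are illustrative, and the numbers are eyeballed readings rather than exact data.

```python
# Approximate final costs (bits per sequence) at sequence 1000k,
# as read off the chart. These are rough readings, not exact values.
FINAL_COST = {
    "LSTM": 55.0,
    "NTM with LSTM Controller": 20.0,
    "NTM with Feedforward Controller": 20.0,
}


def relative_reduction(baseline: float, value: float) -> float:
    """Return the fractional reduction of `value` relative to `baseline`."""
    return (baseline - value) / baseline


baseline = FINAL_COST["LSTM"]
for name, cost in FINAL_COST.items():
    if name == "LSTM":
        continue  # skip comparing the baseline with itself
    reduction = relative_reduction(baseline, cost)
    print(f"{name}: {reduction:.0%} lower cost than LSTM at 1000k")
```

With these readings, both NTM variants come out at about 64% below the LSTM's final cost. Because the inputs are approximate, the result should be treated as a rough figure.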
### Interpretation
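The data suggests that NTM architectures, with either an LSTM or a Feedforward Controller, reach a substantially lower cost per sequence than a standalone LSTM. Because the cost is measured in bits per sequence, it is most naturally read as a measure of model error rather than of computational expense. The Feedforward Controller reaches the lowest cost (~15 bits at 200k) but rebounds slightly to ~20 bits by 400k, possibly due to architectural trade-offs; the chart alone does not establish the cause. The LSTM Controller holds steady at ~20 bits from 400k onward. Note that the x-axis counts sequences seen, not sequence length, so these trends describe how quickly each model improves with more sequences. They highlight the benefit of NTM-based approaches over the LSTM baseline, with the choice of controller mainly affecting the trajectory between 200k and 400k rather than the final cost.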