## Heatmap: Classification Accuracies
### Overview
The image is a heatmap titled "Classification accuracies" that visualizes the accuracy of four models or methods across twelve dataset rows: six base datasets, each paired with a "neg_" (negated) counterpart. Each cell reports a numerical value (mean accuracy ± standard deviation, in percent) on a background color taken from a gradient; a color bar on the right gives the scale from 0.0 (dark purple) to 1.0 (bright yellow).
### Components/Axes
* **Title:** "Classification accuracies" (top center).
* **X-axis (Models/Methods):** Four columns labeled from left to right:
    1. `TTPD`
    2. `LR`
    3. `CCS`
    4. `MM`
* **Y-axis (Datasets):** Twelve rows, each representing a dataset. From top to bottom:
    1. `cities`
    2. `neg_cities`
    3. `sp_en_trans`
    4. `neg_sp_en_trans`
    5. `inventors`
    6. `neg_inventors`
    7. `animal_class`
    8. `neg_animal_class`
    9. `element_symb`
    10. `neg_element_symb`
    11. `facts`
    12. `neg_facts`
* **Legend (Color Bar):** Positioned vertically on the right side of the heatmap. It maps color to accuracy value:
    * **Scale:** 0.0 (bottom) to 1.0 (top).
    * **Color Gradient:** Transitions from dark purple (0.0) through magenta, orange, to bright yellow (1.0).
* **Data Cells:** Each cell contains text in the format `XX ± Y`, where `XX` is the mean accuracy and `Y` is the standard deviation, both in percent (so `86 ± 1` corresponds to 0.86 on the color bar). The cell's background color encodes the mean accuracy according to the legend.
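The cell format described above can be turned into numbers on the color-bar scale with a few lines of Python. This is a minimal sketch, assuming every cell holds two non-negative integers separated by `±`; the helper name `parse_cell` is hypothetical and not part of any tool used to produce the figure.

```python
import re

# Matches cell text such as "86 ± 1": integer mean, "±", integer std deviation.
CELL_PATTERN = re.compile(r"^\s*(\d+)\s*±\s*(\d+)\s*$")


def parse_cell(text):
    """Convert a cell like '86 ± 1' into (mean, std) as fractions in [0, 1].

    Hypothetical helper: assumes both numbers are integer percentages.
    """
    match = CELL_PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognized cell format: {text!r}")
    mean_pct, std_pct = (int(group) for group in match.groups())
    # Divide by 100 so the mean lines up with the 0.0 to 1.0 color bar.
    return mean_pct / 100, std_pct / 100


mean, std = parse_cell("86 ± 1")
print(mean, std)
```

Rejecting malformed text with a `ValueError` keeps a mis-transcribed cell from silently turning into a wrong color value.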
### Detailed Analysis
The following table reconstructs the data from the heatmap. Values are mean accuracy ± standard deviation, in percent.
| Dataset | TTPD | LR | CCS | MM |
|------------------|-------------|-------------|-------------|-------------|
| cities | 86 ± 1 | 98 ± 2 | 90 ± 10 | 77 ± 2 |
| neg_cities | 96 ± 1 | 99 ± 2 | 98 ± 7 | 100 ± 0 |
| sp_en_trans | 100 ± 0 | 99 ± 1 | 88 ± 22 | 99 ± 0 |
| neg_sp_en_trans | 95 ± 2 | 99 ± 1 | 90 ± 21 | 99 ± 0 |
| inventors | 92 ± 1 | 90 ± 4 | 72 ± 20 | 87 ± 2 |
| neg_inventors | 93 ± 1 | 93 ± 2 | 69 ± 18 | 94 ± 0 |
| animal_class | 99 ± 0 | 98 ± 1 | 87 ± 19 | 99 ± 0 |
| neg_animal_class | 99 ± 0 | 99 ± 0 | 84 ± 22 | 99 ± 0 |
| element_symb | 98 ± 0 | 98 ± 1 | 86 ± 25 | 95 ± 1 |
| neg_element_symb | 99 ± 0 | 99 ± 1 | 92 ± 16 | 98 ± 3 |
| facts | 90 ± 0 | 90 ± 1 | 82 ± 9 | 89 ± 1 |
| neg_facts | 79 ± 1 | 77 ± 3 | 75 ± 8 | 72 ± 1 |
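For further analysis, the mean accuracies in the table can be held in a plain dictionary. The sketch below transcribes the means (not the standard deviations) and computes an unweighted average per method; the names `MEAN_ACCURACY` and `column_means` are illustrative, and weighting all twelve rows equally is an assumption, since the figure gives no dataset sizes.

```python
# Mean accuracies in percent, transcribed from the table above.
# Tuple order follows the column order: TTPD, LR, CCS, MM.
METHODS = ("TTPD", "LR", "CCS", "MM")
MEAN_ACCURACY = {
    "cities": (86, 98, 90, 77),
    "neg_cities": (96, 99, 98, 100),
    "sp_en_trans": (100, 99, 88, 99),
    "neg_sp_en_trans": (95, 99, 90, 99),
    "inventors": (92, 90, 72, 87),
    "neg_inventors": (93, 93, 69, 94),
    "animal_class": (99, 98, 87, 99),
    "neg_animal_class": (99, 99, 84, 99),
    "element_symb": (98, 98, 86, 95),
    "neg_element_symb": (99, 99, 92, 98),
    "facts": (90, 90, 82, 89),
    "neg_facts": (79, 77, 75, 72),
}


def column_means(table, methods):
    """Unweighted average of each column over all rows (assumption: equal weights)."""
    row_count = len(table)
    return {
        method: sum(row[index] for row in table.values()) / row_count
        for index, method in enumerate(methods)
    }


METHOD_MEANS = column_means(MEAN_ACCURACY, METHODS)
for method, value in METHOD_MEANS.items():
    print(f"{method}: {value:.1f}")
```

Under these assumptions the averages come out at roughly 94.9 for `LR`, 93.8 for `TTPD`, 92.3 for `MM`, and 84.4 for `CCS`. Because the inputs are already rounded to whole percentages, the averages are approximate.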
**Color & Trend Verification:**
* **High Accuracy (Yellow, ~0.9-1.0):** Dominates the `TTPD`, `LR`, and `MM` columns for most datasets, especially the `animal_class`, `element_symb`, and `neg_cities` rows.
* **Moderate Accuracy (Orange, ~0.7-0.89):** Seen in the `CCS` column for most datasets, in the `cities` row for `TTPD` (86) and `MM` (77), in the `inventors` (87) and `facts` (89) rows for `MM`, and across the entire `neg_facts` row (72 to 79).
* **Lower Accuracy (Darker Orange/Red, <0.75):** Concentrated in the `CCS` column for `inventors` (72) and `neg_inventors` (69), plus `MM` on `neg_facts` (72). The `neg_facts` row holds the lowest score for `TTPD`, `LR`, and `MM`; for `CCS` the lowest score is on `neg_inventors`.
* **Standard Deviation:** `CCS` shows the highest standard deviation in every row (±7 to ±25), indicating much less stable performance than the other three methods, whose deviations range from ±0 to ±4.
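The spread of standard deviations per method can be checked mechanically. The following sketch transcribes only the `± Y` parts of the table and reports the minimum and maximum for each column; `STD_DEV` and `std_range` are hypothetical names chosen for this example.

```python
# Standard deviations in percentage points, transcribed from the table.
# Tuple order follows the column order: TTPD, LR, CCS, MM.
METHODS = ("TTPD", "LR", "CCS", "MM")
STD_DEV = {
    "cities": (1, 2, 10, 2),
    "neg_cities": (1, 2, 7, 0),
    "sp_en_trans": (0, 1, 22, 0),
    "neg_sp_en_trans": (2, 1, 21, 0),
    "inventors": (1, 4, 20, 2),
    "neg_inventors": (1, 2, 18, 0),
    "animal_class": (0, 1, 19, 0),
    "neg_animal_class": (0, 0, 22, 0),
    "element_symb": (0, 1, 25, 1),
    "neg_element_symb": (0, 1, 16, 3),
    "facts": (0, 1, 9, 1),
    "neg_facts": (1, 3, 8, 1),
}


def std_range(table, methods):
    """Return {method: (smallest std, largest std)} over all rows."""
    result = {}
    for index, method in enumerate(methods):
        column = [row[index] for row in table.values()]
        result[method] = (min(column), max(column))
    return result


STD_RANGES = std_range(STD_DEV, METHODS)
for method, (low, high) in STD_RANGES.items():
    print(f"{method}: ±{low} to ±{high}")
```

The smallest `CCS` deviation (±7) is still larger than the largest deviation of any other method (±4), so the instability holds on every row rather than on a few outliers.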
### Key Observations
1. **Model Performance Hierarchy:** `LR` and `TTPD` are the top-performing and most consistent methods, frequently achieving accuracies in the high 90s with standard deviations of at most ±4. `MM` is close behind, matching or exceeding `TTPD` on five of the twelve rows, but it is notably weak on `cities` (77 ± 2) and `neg_facts` (72 ± 1). `CCS` is the clear underperformer, with lower mean accuracies on most rows and much higher variance.
2. **Dataset Difficulty:** `neg_facts` is the most challenging dataset overall, yielding the lowest score for `TTPD` (79), `LR` (77), and `MM` (72). `CCS` scores 75 there, but its lowest score is on `neg_inventors` (69). The `facts` dataset is also relatively difficult. In contrast, datasets like `animal_class`, `neg_animal_class`, `element_symb`, and `neg_cities` appear to be "easier," with multiple methods achieving near-perfect scores.
3. **Negation Effect:** For most pairs, performance on the "neg_" version is similar to or better than on the standard version. The largest improvement is on `cities` vs. `neg_cities` (TTPD: 86→96, CCS: 90→98, MM: 77→100), although `LR` is already near ceiling and gains only one point (98→99). The clear exception is `facts` vs. `neg_facts`, where every method drops by 7 to 17 points. The `inventors`/`neg_inventors` pair is mixed: `TTPD`, `LR`, and `MM` improve, while `CCS` declines (72→69).
4. **Stability:** `TTPD` and `MM` each report `±0` on half of the rows, suggesting extremely consistent results across runs or folds (the values appear to be rounded to whole percentage points, so `±0` likely means below 0.5). `LR` is also very stable (±0 to ±4). `CCS` is highly unstable (±7 to ±25).
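The dataset-difficulty and negation observations can be verified directly against the table values. This is a minimal sketch under the assumption that each `neg_` row pairs with the base row of the same name; `hardest_dataset` and `negation_deltas` are hypothetical helper names.

```python
# Mean accuracies in percent, transcribed from the table.
# Tuple order follows the column order: TTPD, LR, CCS, MM.
METHODS = ("TTPD", "LR", "CCS", "MM")
MEAN_ACCURACY = {
    "cities": (86, 98, 90, 77),
    "neg_cities": (96, 99, 98, 100),
    "sp_en_trans": (100, 99, 88, 99),
    "neg_sp_en_trans": (95, 99, 90, 99),
    "inventors": (92, 90, 72, 87),
    "neg_inventors": (93, 93, 69, 94),
    "animal_class": (99, 98, 87, 99),
    "neg_animal_class": (99, 99, 84, 99),
    "element_symb": (98, 98, 86, 95),
    "neg_element_symb": (99, 99, 92, 98),
    "facts": (90, 90, 82, 89),
    "neg_facts": (79, 77, 75, 72),
}


def hardest_dataset(table, methods):
    """Return {method: (dataset with the lowest mean, that mean)}."""
    result = {}
    for index, method in enumerate(methods):
        name = min(table, key=lambda dataset: table[dataset][index])
        result[method] = (name, table[name][index])
    return result


def negation_deltas(table):
    """Return {base dataset: per-method (negated mean - base mean)}."""
    deltas = {}
    for name, base_row in table.items():
        if name.startswith("neg_"):
            continue
        negated_row = table["neg_" + name]
        deltas[name] = tuple(neg - base for base, neg in zip(base_row, negated_row))
    return deltas


HARDEST = hardest_dataset(MEAN_ACCURACY, METHODS)
DELTAS = negation_deltas(MEAN_ACCURACY)
for method, (name, score) in HARDEST.items():
    print(f"{method}: lowest on {name} ({score})")
for name, row in DELTAS.items():
    print(name, dict(zip(METHODS, row)))
```

A positive delta means the method scores higher on the negated version. With the table values, `cities` shows gains for all four methods, while `facts` shows losses for all four.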
### Interpretation
This heatmap provides a comparative benchmark of four classification methods. The data suggests that `LR` (likely Logistic Regression) and `TTPD` (an unspecified method) are robust, high-accuracy baselines for these specific tasks. The `MM` model is similarly powerful but may have specific failure modes (as seen with `cities`). The `CCS` method is not only less accurate but also unreliable, as indicated by its high variance; this could point to issues with model convergence, sensitivity to data splits, or inherent instability in the method for these tasks.
The consistent difficulty of the `facts` and `neg_facts` datasets implies these tasks involve more complex, ambiguous, or noisy relationships that are harder for the methods to capture. The effect of negation is dataset-dependent rather than a general trend: most pairs score similarly or slightly better on the "neg_" version, `cities` improves sharply, and `facts` degrades for every method. Where accuracy improves, the negated formulation may create clearer decision boundaries, but the drop on `facts` shows that this does not hold universally. The stark improvement for `cities` under negation and the opposite shift for `facts` are the key anomalies that warrant further investigation into the nature of those datasets.
From a Peircean perspective, this heatmap is an *icon* representing the abstract relationships between models and tasks. The patterns (colors and numbers) allow us to infer the *legisign* (the general law or trend: "LR/TTPD are superior") and make *hypothetical inferences* about the underlying nature of the datasets and model behaviors. The high variance in `CCS` is a *qualisign* of its instability, a quality that speaks louder than its mean score alone.