Image e943f52f724d...

EXPERT: nemotron-free VERSION 1

RUNTIME: free/nvidia/nemotron-nano-12b-v2-vl:free
## Heatmap: Classification accuracies

### Overview
A heatmap visualizing classification accuracy across four methods (TTPD, LR, CCS, MM) for 12 categories. Accuracy is encoded by color (yellow = highest; the lowest values shown appear red), with numerical values in percent and confidence intervals (± values) displayed in each cell.

### Components/Axes
- **X-axis (Methods)**: TTPD, LR, CCS, MM (left to right)
- **Y-axis (Categories)**: 
  1. cities_de
  2. neg_cities_de
  3. sp_en_trans_de
  4. neg_sp_en_trans_de
  5. inventors_de
  6. neg_inventors_de
  7. animal_class_de
  8. neg_animal_class_de
  9. element_symb_de
  10. neg_element_symb_de
  11. facts_de
  12. neg_facts_de
- **Legend**: Color scale from 0.0 (purple) to 1.0 (yellow), with intermediate orange shades
- **Title**: "Classification accuracies" (top center)
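
The axis layout above can be sketched as plain Python data. This is a minimal illustration, not the code that produced the figure: the helper names `pair_with_negation` and `percent_to_scale` are hypothetical, and the conversion assumes that cell values are printed in percent while the legend runs from 0.0 to 1.0.

```python
# The 12 y-axis labels, in the order listed above.
CATEGORIES = [
    "cities_de", "neg_cities_de",
    "sp_en_trans_de", "neg_sp_en_trans_de",
    "inventors_de", "neg_inventors_de",
    "animal_class_de", "neg_animal_class_de",
    "element_symb_de", "neg_element_symb_de",
    "facts_de", "neg_facts_de",
]

# The 4 x-axis labels, left to right.
METHODS = ["TTPD", "LR", "CCS", "MM"]


def pair_with_negation(categories):
    """Map each base category to its `neg_` counterpart (hypothetical helper)."""
    names = set(categories)
    return {
        name: "neg_" + name
        for name in categories
        if not name.startswith("neg_") and "neg_" + name in names
    }


def percent_to_scale(percent):
    """Convert a cell value in percent to the 0.0-1.0 legend scale.

    Assumption: cells are printed in percent, the legend as a fraction.
    """
    return percent / 100.0


pairs = pair_with_negation(CATEGORIES)
grid_shape = (len(CATEGORIES), len(METHODS))  # rows x columns of the heatmap
```

Under these assumptions the grid has 12 rows and 4 columns, and the 12 categories form 6 base/negated pairs.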

### Detailed Analysis
#### Method Performance:
1. **LR (Logistic Regression)**:
   - Highest accuracy across all categories (100 ± 0 in cities_de and neg_cities_de)
   - Consistently top performer (94-100% range)
   - Example: `animal_class_de` = 94 ± 1

2. **TTPD**:
   - Strong performance (87-96% range)
   - Notable: `cities_de` = 89 ± 3, `neg_cities_de` = 96 ± 0

3. **MM**:
   - Competitive with TTPD (87-96% range)
   - Example: `neg_inventors_de` = 88 ± 3

4. **CCS**:
   - Lowest accuracy (68-86% range)
   - High variability (e.g., `neg_facts_de` = 68 ± 14)
   - Example: `sp_en_trans_de` = 74 ± 21
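
The cells quoted in this list can be checked against the per-method ranges with a short script. The sketch below covers only the cells quoted in this description, not the full 12 x 4 grid, and the names `parse_cell` and `out_of_range_cells` are hypothetical.

```python
import re

# Cells quoted in the description, as printed: (method, category) -> text.
QUOTED_CELLS = {
    ("LR", "cities_de"): "100 ± 0",
    ("LR", "neg_cities_de"): "100 ± 0",
    ("LR", "animal_class_de"): "94 ± 1",
    ("TTPD", "cities_de"): "89 ± 3",
    ("TTPD", "neg_cities_de"): "96 ± 0",
    ("MM", "neg_inventors_de"): "88 ± 3",
    ("CCS", "neg_facts_de"): "68 ± 14",
    ("CCS", "sp_en_trans_de"): "74 ± 21",
}

# Accuracy ranges stated per method in the description (percent).
STATED_RANGES = {
    "LR": (94, 100),
    "TTPD": (87, 96),
    "MM": (87, 96),
    "CCS": (68, 86),
}

CELL_PATTERN = re.compile(r"^\s*(\d+)\s*±\s*(\d+)\s*$")


def parse_cell(text):
    """Split a cell such as '94 ± 1' into (accuracy, spread) integers."""
    match = CELL_PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognized cell format: {text!r}")
    return int(match.group(1)), int(match.group(2))


def out_of_range_cells(cells, ranges):
    """Return the quoted cells whose accuracy falls outside the stated range."""
    problems = []
    for (method, category), text in cells.items():
        accuracy, _spread = parse_cell(text)
        low, high = ranges[method]
        if not low <= accuracy <= high:
            problems.append((method, category, accuracy))
    return problems


problems = out_of_range_cells(QUOTED_CELLS, STATED_RANGES)
```

An empty `problems` list means every quoted cell is consistent with the range stated for its method.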

#### Confidence Intervals:
- **Low variability**: LR (0-4), MM (1-3), TTPD (0-3)
- **High variability**: CCS (12-27), peaking at ±27 in `cities_de`
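
As a rough illustration of this comparison, the ± values quoted in this description can be grouped by method and reduced to their maximum. The lists below are partial (only the cells quoted in the text, not the full grid), and the function names are hypothetical.

```python
# ± values quoted in the description, grouped by method (partial data only).
QUOTED_SPREADS = {
    "LR": [0, 0, 1],       # cities_de, neg_cities_de, animal_class_de
    "TTPD": [3, 0],        # cities_de, neg_cities_de
    "MM": [3],             # neg_inventors_de
    "CCS": [14, 21, 27],   # neg_facts_de, sp_en_trans_de, cities_de
}


def widest_spread(spreads):
    """Return the largest quoted ± value for each method."""
    return {method: max(values) for method, values in spreads.items()}


def least_stable(spreads):
    """Return the method whose largest quoted ± value is the widest."""
    widest = widest_spread(spreads)
    return max(widest, key=widest.get)


widest = widest_spread(QUOTED_SPREADS)
worst_method = least_stable(QUOTED_SPREADS)
```

On the quoted cells, CCS has by far the widest ± value, which matches the high-variability grouping above.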

### Key Observations
1. **LR Dominance**: Achieves perfect scores (100 ± 0) in two categories (`cities_de` and `neg_cities_de`)
2. **CCS Weakness**: Consistently lowest performance with largest confidence intervals (e.g., ±27 in `cities_de`)
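3. **Color Correlation**: Yellow dominates the LR cells, while red/orange dominates the CCS cells
4. **Symmetry**: Some categories score the same as their negated counterparts (e.g., LR reaches 100 ± 0 on both `cities_de` and `neg_cities_de`), though this does not hold for every method (TTPD: 89 ± 3 vs 96 ± 0)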

### Interpretation
The data show **LR as the most reliable classifier** across all categories, with perfect scores on `cities_de` and `neg_cities_de`. **CCS underperforms markedly** and with high variability, which may reflect issues with the method or its training data, although the heatmap alone cannot confirm the cause. The **± values** show that LR maintains tight confidence intervals (0-4), whereas CCS's wide ranges (12-27) indicate unstable accuracy. The color gradient makes these disparities visible, with LR's yellow cells contrasting against CCS's red/orange tones. The lowest CCS accuracy appears in `neg_facts_de` (68 ± 14), which may point to a domain-specific difficulty.