Image aac6a1bd6458...

EXPERT: nemotron-free VERSION 1

RUNTIME: free/nvidia/nemotron-nano-12b-v2-vl:free
INTEL_VERIFIED
# Technical Document Extraction: Image Analysis

## Chart 1: HMMT-25
### Components
- **Title**: HMMT-25
- **X-Axis Labels**: 
  - o3-mini (high)
  - DeepSeek-R1
  - Qwen3-4B-Instruct (Base)
  - Base + RSA (Ours)
  - Base + RSA + RL (Ours)
- **Y-Axis Label**: Pass@1 (0–80 range)
- **Legend**: 
  - Teal: Base
  - Orange: Base + RSA
- **Data Points**:
  - o3-mini (high): 67.5 (Teal)
  - DeepSeek-R1: 41.7 (Teal)
  - Qwen3-4B-Instruct (Base): 27.2 (Teal)
  - Base + RSA (Ours): 47.6 (Orange)
  - Base + RSA + RL (Ours): 55.5 (Orange)
- **Trend**: Scores increase from Qwen3-4B-Instruct (27.2) to Base + RSA (47.6) and then to Base + RSA + RL (55.5).
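
The trend above can be broken into per-stage gains. The following is a minimal sketch, assuming the three values transcribed from the chart are accurate; the names `hmmt25_stages` and `stepwise_gains` are hypothetical and used only for illustration.

```python
# Pass@1 values as transcribed from the chart above (base model, then each added stage).
hmmt25_stages = [
    ("Qwen3-4B-Instruct (Base)", 27.2),
    ("Base + RSA (Ours)", 47.6),
    ("Base + RSA + RL (Ours)", 55.5),
]


def stepwise_gains(stages):
    """Return (label, gain) pairs, where gain is the rise over the previous stage."""
    gains = []
    for (_, prev_score), (label, score) in zip(stages, stages[1:]):
        # Round to one decimal to match the precision shown on the chart.
        gains.append((label, round(score - prev_score, 1)))
    return gains


gains = stepwise_gains(hmmt25_stages)
total_gain = round(hmmt25_stages[-1][1] - hmmt25_stages[0][1], 1)

for label, gain in gains:
    print(f"{label}: +{gain}")
print(f"Total over base: +{total_gain}")
```

Under these values, RSA accounts for +20.4 points and RL for a further +7.9, for a total of +28.3 over the base model.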

## Chart 2: Reasoning Gym Games
### Components
- **Title**: Reasoning Gym Games
- **X-Axis Labels**: 
  - o3-mini (high)
  - DeepSeek-R1
  - Qwen3-4B-Instruct (Base)
  - Base + RSA (Ours)
  - Base + RSA + RL (Ours)
- **Y-Axis Label**: Pass@1 (0–80 range)
- **Legend**: 
  - Teal: Base
  - Orange: Base + RSA
- **Data Points**:
  - o3-mini (high): 69.9 (Teal)
  - DeepSeek-R1: 54.8 (Teal)
  - Qwen3-4B-Instruct (Base): 53.9 (Teal)
  - Base + RSA (Ours): 69.0 (Orange)
  - Base + RSA + RL (Ours): 70.6 (Orange)
- **Trend**: Scores rise from Qwen3-4B-Instruct (53.9) to Base + RSA (69.0) and then to Base + RSA + RL (70.6).
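
To put these scores in context, each entry can be compared against the strongest reference model on the chart. This is a rough sketch, assuming the transcribed values are accurate; `reasoning_gym` and `margin_over` are hypothetical names introduced for illustration.

```python
# Pass@1 values as transcribed from the Reasoning Gym Games chart above.
reasoning_gym = {
    "o3-mini (high)": 69.9,
    "DeepSeek-R1": 54.8,
    "Qwen3-4B-Instruct (Base)": 53.9,
    "Base + RSA (Ours)": 69.0,
    "Base + RSA + RL (Ours)": 70.6,
}


def margin_over(scores, entry, reference):
    """Signed Pass@1 difference between an entry and a reference model."""
    return round(scores[entry] - scores[reference], 1)


reference = "o3-mini (high)"
margins = {
    entry: margin_over(reasoning_gym, entry, reference)
    for entry in reasoning_gym
    if entry != reference
}

for entry, margin in margins.items():
    print(f"{entry}: {margin:+.1f} vs {reference}")
```

With these numbers the base model trails o3-mini (high) by 16.0 points, Base + RSA trails by 0.9, and Base + RSA + RL leads by 0.7.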

## Chart 3: LiveCodeBench-v6
### Components
- **Title**: LiveCodeBench-v6
- **X-Axis Label**: Pass@1 (0–100 range)
- **Y-Axis Labels**: 
  - Qwen3 Instruct (4B)
  - Qwen3 Instruct (30B)
  - GPT-OSS Medium (20B)
- **Legend**: 
  - Teal: Base
  - Orange: Base + RSA
- **Data Points**:
  - Qwen3 Instruct (4B): 
    - Base: 44.9 (Teal)
    - Base + RSA: +7.1 (Orange)
  - Qwen3 Instruct (30B): 
    - Base: 60.0 (Teal)
    - Base + RSA: +7.1 (Orange)
  - GPT-OSS Medium (20B): 
    - Base: 74.4 (Teal)
    - Base + RSA: +5.6 (Orange)
- **Trend**: Base + RSA improves all three models; Qwen3 Instruct (4B) and Qwen3 Instruct (30B) tie for the largest absolute gain (+7.1), ahead of GPT-OSS Medium (20B) at +5.6.
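
Because this chart reports the orange segment as a gain and not as a final score, the absolute Base + RSA value has to be reconstructed. The sketch below assumes each gain is an additive Pass@1 delta on top of the base score, which the extraction suggests but does not confirm; `livecodebench_v6` and `with_rsa` are hypothetical names.

```python
# (base Pass@1, RSA gain) pairs as transcribed from the LiveCodeBench-v6 chart above.
livecodebench_v6 = {
    "Qwen3 Instruct (4B)": (44.9, 7.1),
    "Qwen3 Instruct (30B)": (60.0, 7.1),
    "GPT-OSS Medium (20B)": (74.4, 5.6),
}


def with_rsa(base, gain):
    """Absolute Base + RSA score implied by a base score and an additive gain."""
    return round(base + gain, 1)


rsa_scores = {
    model: with_rsa(base, gain) for model, (base, gain) in livecodebench_v6.items()
}

for model, score in rsa_scores.items():
    print(f"{model}: {score}")
```

Under that assumption the implied Base + RSA scores are 52.0, 67.1, and 80.0 respectively.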

## Chart 4: AIME-25
### Components
- **Title**: AIME-25
- **X-Axis Label**: Pass@1 (0–100 range)
- **Y-Axis Labels**: 
  - Nemotron Nano (9B)
  - Qwen3 Instruct (4B)
  - Qwen3 Instruct (30B)
  - Qwen3 Thinking (4B)
  - GPT-OSS Medium (20B)
- **Legend**: 
  - Teal: Base
  - Orange: Base + RSA
- **Data Points**:
  - Nemotron Nano (9B): 
    - Base: 40.8 (Teal)
    - Base + RSA: +32.1 (Orange)
  - Qwen3 Instruct (4B): 
    - Base: 44.9 (Teal)
    - Base + RSA: +29.9 (Orange)
  - Qwen3 Instruct (30B): 
    - Base: 57.7 (Teal)
    - Base + RSA: +27.2 (Orange)
  - Qwen3 Thinking (4B): 
    - Base: 65.0 (Teal)
    - Base + RSA: +19.4 (Orange)
  - GPT-OSS Medium (20B): 
    - Base: 67.8 (Teal)
    - Base + RSA: +22.4 (Orange)
- **Trend**: Base + RSA raises scores for all five models by 19.4 to 32.1 points; Nemotron Nano (9B) shows the largest absolute gain (+32.1).
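
The gains listed above are absolute point differences. A relative improvement divides each gain by its base score, as in this minimal sketch. It assumes the transcribed values are accurate and that each gain is an additive delta; `aime25` and `relative_gain_pct` are hypothetical names.

```python
# (base Pass@1, RSA gain) pairs as transcribed from the AIME-25 chart above.
aime25 = {
    "Nemotron Nano (9B)": (40.8, 32.1),
    "Qwen3 Instruct (4B)": (44.9, 29.9),
    "Qwen3 Instruct (30B)": (57.7, 27.2),
    "Qwen3 Thinking (4B)": (65.0, 19.4),
    "GPT-OSS Medium (20B)": (67.8, 22.4),
}


def relative_gain_pct(base, gain):
    """Gain expressed as a percentage of the base score."""
    return round(gain / base * 100, 1)


relative = {model: relative_gain_pct(base, gain) for model, (base, gain) in aime25.items()}
largest = max(relative, key=relative.get)

for model, pct in relative.items():
    print(f"{model}: +{pct}% over base")
print(f"Largest relative gain: {largest}")
```

On these numbers Nemotron Nano (9B) leads on both measures: +32.1 points in absolute terms and roughly +78.7% relative to its base score.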

## Spatial Grounding
- **Legend Position**: Bottom-right corner of all charts.
- **Color Consistency**: 
  - Teal consistently represents "Base" across all charts.
  - Orange consistently represents "Base + RSA" across all charts.

## Key Observations
1. **Model Performance**: 
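   - o3-mini (high) has the highest Pass@1 on HMMT-25 (67.5), while Base + RSA + RL (70.6) narrowly exceeds it on Reasoning Gym Games (69.9). On LiveCodeBench-v6 and AIME-25, GPT-OSS Medium (20B) has the highest base score (74.4 and 67.8).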
2. **RSA Impact**: 
   - Base + RSA improves every model shown. Gains are largest on AIME-25 (+19.4 to +32.1, the latter for Nemotron Nano) and smaller on LiveCodeBench-v6 (+5.6 to +7.1).
3. **RL Enhancement**: 
   - Adding RL (reinforcement learning) on top of RSA further improves scores in HMMT-25 (+7.9, from 47.6 to 55.5) and Reasoning Gym Games (+1.6, from 69.0 to 70.6).

## Language Notes
- All text is in English. No non-English content detected.