## Heatmap: Syllogism Format Validation Counts
### Overview
This image presents a heatmap of the number of syllogisms predicted as valid, broken down by syllogism format and by language/polarity condition. Color encodes the count: per the color bar, the gradient runs from dark purple (0) to light yellow (100). The x-axis shows the language and polarity condition, and the y-axis shows the syllogism format.
### Components/Axes
* **X-axis:** Language and polarity, with categories "zh+" (Chinese Positive), "zh-" (Chinese Negative), "en+" (English Positive), "en-" (English Negative).
* **Y-axis:** Syllogism Format - with the following categories:
    * AAA-1
    * EAE-1
    * AII-1
    * EIO-1
    * EAE-2
    * AEE-2
    * EIO-2
    * AOO-2
    * AII-3
    * IAI-3
    * OAO-3
    * EIO-3
    * AEE-4
    * IAI-4
    * EIO-4
    * AAI-1
    * EAO-1
    * AEO-2
    * EAO-2
    * AAI-3
    * EAO-3
    * AAI-4
    * AEO-4
    * EAO-4
* **Color Scale (Right):** "The number of predicted VALID" ranging from 0 (dark purple) to 100 (light yellow).
### Detailed Analysis
The heatmap displays the counts for each combination of syllogism format and language. The values are approximate, based on the color gradient.
| Syllogism Format | zh+ (Chinese Positive) | zh- (Chinese Negative) | en+ (English Positive) | en- (English Negative) |
|---|---|---|---|---|
| AAA-1 | ~95 | ~85 | ~70 | ~60 |
| EAE-1 | ~90 | ~80 | ~65 | ~55 |
| AII-1 | ~85 | ~75 | ~60 | ~50 |
| EIO-1 | ~80 | ~70 | ~55 | ~45 |
| EAE-2 | ~75 | ~65 | ~50 | ~40 |
| AEE-2 | ~70 | ~60 | ~45 | ~35 |
| EIO-2 | ~65 | ~55 | ~40 | ~30 |
| AOO-2 | ~60 | ~50 | ~35 | ~25 |
| AII-3 | ~55 | ~45 | ~30 | ~20 |
| IAI-3 | ~50 | ~40 | ~25 | ~15 |
| OAO-3 | ~45 | ~35 | ~20 | ~10 |
| EIO-3 | ~40 | ~30 | ~15 | ~5 |
| AEE-4 | ~35 | ~25 | ~10 | ~0 |
| IAI-4 | ~30 | ~20 | ~5 | ~0 |
| EIO-4 | ~25 | ~15 | ~5 | ~0 |
| AAI-1 | ~20 | ~10 | ~0 | ~0 |
| EAO-1 | ~15 | ~5 | ~0 | ~0 |
| AEO-2 | ~10 | ~5 | ~0 | ~0 |
| EAO-2 | ~10 | ~5 | ~0 | ~0 |
| AAI-3 | ~5 | ~0 | ~0 | ~0 |
| EAO-3 | ~5 | ~0 | ~0 | ~0 |
| AAI-4 | ~0 | ~0 | ~0 | ~0 |
| AEO-4 | ~0 | ~0 | ~0 | ~0 |
| EAO-4 | ~0 | ~0 | ~0 | ~0 |
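
The approximate values above can be collected into a small matrix for further analysis. The following is a minimal sketch, assuming the estimates listed in this section (read off the color gradient, not exact data); the names `CONDITIONS`, `COUNTS`, and `column_means` are illustrative and do not come from the original figure.

```python
# Approximate counts read off the heatmap (estimates, not exact data).
CONDITIONS = ["zh+", "zh-", "en+", "en-"]

# One row per syllogism format, in y-axis order; columns follow CONDITIONS.
COUNTS = {
    "AAA-1": [95, 85, 70, 60],
    "EAE-1": [90, 80, 65, 55],
    "AII-1": [85, 75, 60, 50],
    "EIO-1": [80, 70, 55, 45],
    "EAE-2": [75, 65, 50, 40],
    "AEE-2": [70, 60, 45, 35],
    "EIO-2": [65, 55, 40, 30],
    "AOO-2": [60, 50, 35, 25],
    "AII-3": [55, 45, 30, 20],
    "IAI-3": [50, 40, 25, 15],
    "OAO-3": [45, 35, 20, 10],
    "EIO-3": [40, 30, 15, 5],
    "AEE-4": [35, 25, 10, 0],
    "IAI-4": [30, 20, 5, 0],
    "EIO-4": [25, 15, 5, 0],
    "AAI-1": [20, 10, 0, 0],
    "EAO-1": [15, 5, 0, 0],
    "AEO-2": [10, 5, 0, 0],
    "EAO-2": [10, 5, 0, 0],
    "AAI-3": [5, 0, 0, 0],
    "EAO-3": [5, 0, 0, 0],
    "AAI-4": [0, 0, 0, 0],
    "AEO-4": [0, 0, 0, 0],
    "EAO-4": [0, 0, 0, 0],
}


def column_means(counts, conditions):
    """Return the mean estimated count per language/polarity condition."""
    n = len(counts)
    return {
        cond: sum(row[i] for row in counts.values()) / n
        for i, cond in enumerate(conditions)
    }


if __name__ == "__main__":
    for cond, mean in column_means(COUNTS, CONDITIONS).items():
        print(f"{cond}: mean ~{mean:.1f}")
```

Because the inputs are visual estimates rounded to the nearest 5, the resulting means are only useful for comparing columns against each other, not as precise measurements.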
### Key Observations
* For every format, counts are highest in the "zh+" (Chinese Positive) column and decrease, or stay level at zero, through "zh-", "en+", and "en-".
* The "AAA-1" format has the highest count in every column (~95, ~85, ~70, ~60).
* The "AAI-4", "AEO-4", and "EAO-4" formats have the lowest counts (close to zero) in every column. In "en-", every format from AEE-4 downward is close to zero.
* Counts decrease steadily down the y-axis in the order listed. Averaged over formats, counts also fall as the figure number rises from 1 to 4, but the figure number alone does not determine the count: in "zh+", AAI-1 (~20) is lower than AEE-4 (~35).
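
The figure-number trend noted above can be checked against the zh+ estimates. The following is a minimal sketch, assuming the approximate zh+ values from the Detailed Analysis section; `ZH_POS`, `mean_by_figure`, and `is_non_increasing` are hypothetical names introduced for illustration. It averages counts per figure number, tests whether counts only fall moving down the y-axis, and compares one Figure 1 format against one Figure 4 format.

```python
from collections import defaultdict

# zh+ estimates in y-axis order (approximate values read from the heatmap).
ZH_POS = [
    ("AAA-1", 95), ("EAE-1", 90), ("AII-1", 85), ("EIO-1", 80),
    ("EAE-2", 75), ("AEE-2", 70), ("EIO-2", 65), ("AOO-2", 60),
    ("AII-3", 55), ("IAI-3", 50), ("OAO-3", 45), ("EIO-3", 40),
    ("AEE-4", 35), ("IAI-4", 30), ("EIO-4", 25),
    ("AAI-1", 20), ("EAO-1", 15), ("AEO-2", 10), ("EAO-2", 10),
    ("AAI-3", 5), ("EAO-3", 5), ("AAI-4", 0), ("AEO-4", 0), ("EAO-4", 0),
]


def mean_by_figure(rows):
    """Average counts over formats sharing a figure number (the suffix after '-')."""
    groups = defaultdict(list)
    for name, count in rows:
        figure = int(name.split("-")[1])
        groups[figure].append(count)
    return {fig: sum(vals) / len(vals) for fig, vals in sorted(groups.items())}


def is_non_increasing(rows):
    """Return True if counts never rise when moving down the y-axis."""
    values = [count for _, count in rows]
    return all(a >= b for a, b in zip(values, values[1:]))


if __name__ == "__main__":
    print("mean by figure:", mean_by_figure(ZH_POS))
    print("non-increasing down the y-axis:", is_non_increasing(ZH_POS))
    lookup = dict(ZH_POS)
    # A Figure 1 format can still score below a Figure 4 format.
    print("AAI-1 < AEE-4:", lookup["AAI-1"] < lookup["AEE-4"])
```

Grouping by the numeric suffix is a deliberate simplification: it treats the figure number as the only factor, which makes it easy to see where individual formats depart from the per-figure average.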
### Interpretation
The heatmap suggests that the model predicts syllogisms as valid most often in Chinese with positive polarity and least often in English with negative polarity. Counts also vary widely by syllogism format: "AAA-1" sits at the top of every column, while "AAI-4", "AEO-4", and "EAO-4" stay near zero. The nine formats at the bottom of the y-axis ("AAI-1" through "EAO-4") are the classical forms that are valid only under existential import (the assumption that the relevant term is non-empty), which may be one reason the model rarely predicts them as valid. The gap between positive and negative polarity within each language suggests the model is sensitive to how the syllogisms are phrased or constructed. These results point to English inputs, negative polarity, and the lower-ranked formats as the areas where improving the model would matter most.