## Diagram: Taxonomy of Reasoning Categories and Associated Failure Modes
### Overview
The image is a structured diagram that maps three high-level "Reasoning Categories" (Informal, Formal, Embodied) to their numbered "Subsections" and to the "Failure Categories" that apply to each. There are three failure categories: Robustness, Limitation, and Fundamental. The diagram uses a color-coded, tabular layout in which horizontal bars indicate which failure categories apply to each subsection.
### Components/Axes
**Left Vertical Axis: Reasoning Categories**
* **Informal** (Purple outline)
* **Formal** (Red outline)
* **Embodied** (Green outline)
**Middle Column: Subsections**
Each Reasoning Category is broken down into numbered subsections:
* **Under Informal:**
    * 3.1 Individual Cog Reasoning
    * 3.2 Implicit Social Reasoning
    * 3.3 Explicit Social Reasoning
* **Under Formal:**
    * 4.1 Logic in NL
    * 4.2 Logic in Bench
    * 4.3 Arithmetic & Math
* **Under Embodied:**
    * 5.1 1D
    * 5.2 2D
    * 5.3 3D
**Right Section: Failure Categories (Column Headers)**
* **Robustness** (Light grey header)
* **Limitation** (Light grey header)
* **Fundamental** (Dark grey header)
**Data Representation:**
Horizontal colored bars (matching the category's outline color) are placed within the Failure Category columns to indicate the specific failures associated with each subsection. Each bar's length and placement show which failure category or categories it falls under.
### Detailed Analysis
**1. Informal Reasoning (Purple Bars)**
* **3.1 Individual Cog Reasoning:**
    * Robustness: Cognitive Skills, Cognitive Bias
    * Fundamental: Cognitive Skills, Cognitive Bias
* **3.2 Implicit Social Reasoning:**
    * Robustness: Theory of Mind (ToM)
    * Robustness: Social Norm & Morals
* **3.3 Explicit Social Reasoning:**
    * Robustness: Multi-Agent System (MAS)
**2. Formal Reasoning (Red/Pink Bars)**
* **4.1 Logic in NL:**
    * Fundamental: Reversal Curse
    * Fundamental: Compositional Reasoning
    * Limitation: Specific Logical Relations
* **4.2 Logic in Bench:**
    * Robustness: Math Word Problem (MWP)
    * Robustness: Coding
* **4.3 Arithmetic & Math:**
    * Fundamental: Counting
    * Fundamental: Basic Arithmetic
    * Limitation: MWP & Beyond
**3. Embodied Reasoning (Green Bars)**
* **5.1 1D:**
    * Fundamental: Physical Commonsense
    * Limitation: Physics & Science
    * Robustness: What's Wrong with the Picture?
* **5.2 2D:**
    * Limitation: 2D Physics & Physical Commonsense
    * Limitation: Visual Spatial Reasoning
* **5.3 3D:**
    * Fundamental: Affordance & Planning
    * Robustness: Spatial and Tool-Use Reasoning
    * Robustness: Safety & Long-Term Autonomy
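The mapping listed above can also be written down as a small data structure. The following is a minimal sketch of one possible Python encoding, not something supplied with the diagram: the names `TAXONOMY`, `FAILURE_TYPES`, and `count_by_failure_type` are hypothetical, and the entries are transcribed from the lists in this section.

```python
from collections import Counter

# Hypothetical encoding of the diagram:
# category -> subsection -> list of (failure type, label) pairs,
# transcribed from the lists in this section.
TAXONOMY = {
    "Informal": {
        "3.1 Individual Cog Reasoning": [
            ("Robustness", "Cognitive Skills, Cognitive Bias"),
            ("Fundamental", "Cognitive Skills, Cognitive Bias"),
        ],
        "3.2 Implicit Social Reasoning": [
            ("Robustness", "Theory of Mind (ToM)"),
            ("Robustness", "Social Norm & Morals"),
        ],
        "3.3 Explicit Social Reasoning": [
            ("Robustness", "Multi-Agent System (MAS)"),
        ],
    },
    "Formal": {
        "4.1 Logic in NL": [
            ("Fundamental", "Reversal Curse"),
            ("Fundamental", "Compositional Reasoning"),
            ("Limitation", "Specific Logical Relations"),
        ],
        "4.2 Logic in Bench": [
            ("Robustness", "Math Word Problem (MWP)"),
            ("Robustness", "Coding"),
        ],
        "4.3 Arithmetic & Math": [
            ("Fundamental", "Counting"),
            ("Fundamental", "Basic Arithmetic"),
            ("Limitation", "MWP & Beyond"),
        ],
    },
    "Embodied": {
        "5.1 1D": [
            ("Fundamental", "Physical Commonsense"),
            ("Limitation", "Physics & Science"),
            ("Robustness", "What's Wrong with the Picture?"),
        ],
        "5.2 2D": [
            ("Limitation", "2D Physics & Physical Commonsense"),
            ("Limitation", "Visual Spatial Reasoning"),
        ],
        "5.3 3D": [
            ("Fundamental", "Affordance & Planning"),
            ("Robustness", "Spatial and Tool-Use Reasoning"),
            ("Robustness", "Safety & Long-Term Autonomy"),
        ],
    },
}

# Column order as it appears in the diagram's header row.
FAILURE_TYPES = ("Robustness", "Limitation", "Fundamental")


def count_by_failure_type(category):
    """Count listed entries per failure type for one reasoning category."""
    return Counter(
        failure_type
        for entries in TAXONOMY[category].values()
        for failure_type, _label in entries
    )


for category in TAXONOMY:
    counts = count_by_failure_type(category)
    # Counter returns 0 for failure types with no entries.
    print(category, {t: counts[t] for t in FAILURE_TYPES})
```

Under this encoding, a bar listed under two columns (as for 3.1) counts once per column. The tally gives Informal no Limitation entries and gives Formal four Fundamental entries, the most of any category.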
### Key Observations
* **Spatial Layout:** The "Reasoning Categories" are stacked vertically on the far left. The "Subsections" are listed in a central column. The "Failure Categories" form three wide columns on the right. Dotted horizontal lines separate the three main Reasoning Categories.
* **Color Consistency:** Each main category (Informal, Formal, Embodied) and its associated failure bars share a distinct color (purple, red, green).
* **Failure Distribution:**
    * **Robustness** failures are common across all categories, often related to skill application and real-world complexity.
    * **Limitation** failures appear only in Formal and Embodied reasoning and are absent from Informal reasoning, suggesting boundaries in logical relations, mathematical problem solving, and physical understanding.
    * **Fundamental** failures appear in all categories, indicating core, intrinsic challenges in cognitive biases, logical composition, and physical planning.
* **Subsection Complexity:** Some subsections, like "3.1 Individual Cog Reasoning" and "5.3 3D," have failures spanning more than one failure category, and "5.1 1D" has entries in all three, indicating multifaceted challenges.
### Interpretation
This diagram serves as a **taxonomic map for diagnosing failure modes in artificial reasoning systems**. It organizes the complex landscape of reasoning tasks and systematically links them to specific types of failures.
* **Relationship Structure:** The diagram suggests that the *type of reasoning task* (Informal, Formal, or Embodied) shapes the *nature of the failures* an AI system encounters. For example, failures in social reasoning (ToM, MAS) are framed as "Robustness" issues, that is, challenges in applying these skills reliably, while failures in counting and basic arithmetic are framed as "Fundamental," suggesting a core lack of ability.
* **Investigative Lens (Peircean):** The diagram can be read as an **abductive framework**. When an AI system fails on a task (e.g., a visual puzzle), one can trace the failure back through the taxonomy: if the task falls under the "2D" Embodied subsection, the diagram lists only "Limitation" failures there ("2D Physics & Physical Commonsense" and "Visual Spatial Reasoning"). This points researchers toward a candidate explanation (e.g., poor spatial representation) rather than treating the failure as a generic error.
* **Notable Pattern:** The concentration of "Fundamental" failures in the Formal category (Reversal Curse, Compositional Reasoning, Counting, Basic Arithmetic) implies these are seen as foundational, possibly architectural, flaws in current models. In contrast, Embodied reasoning failures are spread across all three failure categories, reflecting the multifaceted challenge of interacting with the physical world.
* **Purpose:** This taxonomy is likely used for **benchmarking, research prioritization, and model evaluation**. It helps answer: "What kinds of reasoning can my model fail at, and what does that failure tell me about its underlying limitations?" It moves beyond a simple "pass/fail" metric to a diagnostic understanding of AI cognition.
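
The diagnostic reading described above can be sketched as a lookup from a subsection to the failures the diagram lists for it. The diagram itself prescribes no procedure, so this is only an illustration under stated assumptions: `ROWS` and `candidate_failures` are hypothetical names, and only two rows of the diagram are included to keep the example short.

```python
# Excerpt of the diagram's rows (hypothetical encoding); each subsection maps
# to the (failure type, label) pairs listed for it in the diagram.
ROWS = {
    "5.2 2D": [
        ("Limitation", "2D Physics & Physical Commonsense"),
        ("Limitation", "Visual Spatial Reasoning"),
    ],
    "4.3 Arithmetic & Math": [
        ("Fundamental", "Counting"),
        ("Fundamental", "Basic Arithmetic"),
        ("Limitation", "MWP & Beyond"),
    ],
}


def candidate_failures(subsection):
    """Group the failure labels listed for a subsection by failure type.

    Returns an empty dict for subsections not present in ROWS.
    """
    grouped = {}
    for failure_type, label in ROWS.get(subsection, []):
        grouped.setdefault(failure_type, []).append(label)
    return grouped


# A failed visual puzzle classified as a "2D" Embodied task:
print(candidate_failures("5.2 2D"))
# {'Limitation': ['2D Physics & Physical Commonsense', 'Visual Spatial Reasoning']}
```

The lookup returns candidates, not a diagnosis: deciding which subsection a failed task belongs to, and which listed failure actually explains it, remains a judgment call for the researcher.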