## Conceptual Diagram: Behavioural Decision-Making Frameworks
### Overview
The image presents a comparative framework of three behavioural paradigms (Reactive, Sentient, Intentional) and two decision-making principles (KL control, Expected Free Energy for POMDPs). It combines textual descriptions, mathematical formulas, and a Q-learning example table.
### Components/Axes
1. **Left Panel (Reactive Behaviour)**
- Title: "Reactive Behaviour"
- Description: "Actions are selected in response to an observed state"
- Formula: $ P(u) = \sigma(\mathbf{Q} | s_\tau) $
- Example: Q-learning with a 4×4 table of Q-values (states $ s_1 $ to $ s_4 $; transcribed values range from 0.0 to 3.3)
- Additional: "KL (risk sensitive) control as inference" with formula $ Q(u) = \underbrace{D_{KL}[Q(s_{\tau+1}|u)||P(s_{\tau+1}|c)]}_{\text{Risk}} $
2. **Middle Panel (Sentient Behaviour)**
- Title: "Sentient Behaviour"
- Description: "Action selection based on the inferred consequences of action"
- Formula: $ P(u) = \sigma(-\mathbf{G}) $
- Additional: "Planning as inference under objective constraints or preferences over outcomes"
3. **Right Panel (Intentional Behaviour)**
- Title: "Intentional Behaviour"
- Description: "Action selection constrained by intended endpoint or goal"
- Formula: $ P(u) = \sigma(-\mathbf{G} - \mathbf{H}) $
- Additional: "Inductive Planning under subjective constraints or preferences over latent states"
4. **Bottom Section (Expected Free Energy for POMDPs)**
- Title: "Expected Free Energy for POMDPs"
- Formula: $ G(u) = \underbrace{D_{KL}[Q(o_{\tau+1}|u)||P(o_{\tau+1}|c)]}_{\text{Risk}} \underbrace{-\mathbb{E}_{Q_u}[\ln Q(o_{\tau+1}|s_{\tau+1},u)]}_{\text{Ambiguity}} $
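
The Expected Free Energy formula can be made concrete with a small numerical sketch. The snippet below is illustrative only: it assumes discrete states and outcomes, a likelihood matrix `A` that does not depend on the action, and hypothetical numbers for the predicted states and preferred outcomes. None of these values come from the image.

```python
import math

def softmax(x):
    """Normalised exponential; returns a probability distribution."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

def kl(q, p):
    """D_KL[q || p] for discrete distributions, with 0 * ln 0 taken as 0."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

def expected_free_energy(q_s, A, p_o):
    """Return (risk, ambiguity) for one action.

    q_s : predicted states Q(s_{tau+1} | u)
    A   : likelihood, A[o][s] = Q(o | s)  (assumed independent of u)
    p_o : preferred outcomes P(o | c)
    """
    n_o, n_s = len(A), len(q_s)
    # Predicted outcomes: Q(o | u) = sum_s Q(o | s) Q(s | u)
    q_o = [sum(A[o][s] * q_s[s] for s in range(n_s)) for o in range(n_o)]
    risk = kl(q_o, p_o)
    # Ambiguity: -E_{Q(o|s) Q(s|u)}[ln Q(o | s)]
    ambiguity = -sum(
        q_s[s] * A[o][s] * math.log(A[o][s])
        for s in range(n_s) for o in range(n_o) if A[o][s] > 0
    )
    return risk, ambiguity

# Hypothetical two-state, two-outcome problem with two actions.
A = [[0.9, 0.1],
     [0.1, 0.9]]
preferred = [0.8, 0.2]
predicted = {"stay": [0.9, 0.1], "move": [0.1, 0.9]}

G = {u: sum(expected_free_energy(q_s, A, preferred)) for u, q_s in predicted.items()}
actions = list(G)
P_u = dict(zip(actions, softmax([-G[u] for u in actions])))  # P(u) = sigma(-G)
```

In this toy setting the ambiguity is the same for both actions, so the preference for `stay` is driven entirely by its lower risk.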
### Detailed Analysis
- **Q-learning Table** (the first row of values carries no state label in the transcription):

| State | | | | |
|-------|-----|-----|-----|-----|
| | 1.2 | 0.1 | 0.0 | 0.1 |
| $ s_1 $ | 1.1 | 0.2 | 0.1 | 2.4 |
| $ s_2 $ | 0.0 | 3.3 | 0.9 | 0.1 |
| $ s_3 $ | 1.8 | 0.7 | 0.3 | 0.9 |
- **Mathematical Notation**:
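- $ \sigma $: Softmax (normalised exponential) function, which maps a vector of values to a probability distribution over actions $ P(u) $ (not explicitly labelled in the image)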
- $ D_{KL} $: Kullback-Leibler divergence (risk term)
- $ \mathbb{E} $: Expected value (ambiguity term)
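
As a rough illustration of the reactive rule $ P(u) = \sigma(\mathbf{Q} | s_\tau) $, the sketch below applies a softmax to one row of a Q-table. It assumes that rows are states and columns are actions, which the image does not confirm, and it uses only the three labelled rows as transcribed.

```python
import math

def softmax(x):
    """Normalised exponential; returns a probability distribution."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

# Assumed layout: one row of action values per state.
Q = {
    "s1": [1.1, 0.2, 0.1, 2.4],
    "s2": [0.0, 3.3, 0.9, 0.1],
    "s3": [1.8, 0.7, 0.3, 0.9],
}

def reactive_policy(q_table, state):
    """P(u) = sigma(Q | s): softmax over the Q-values of the observed state."""
    return softmax(q_table[state])

def most_likely_action(q_table, state):
    """Index of the action with the highest probability in the given state."""
    p = reactive_policy(q_table, state)
    return max(range(len(p)), key=p.__getitem__)

p_s2 = reactive_policy(Q, "s2")
```

No model of future states is involved here: the policy is a direct function of the observed state, which is what distinguishes the reactive panel from the other two.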
### Key Observations
1. **Hierarchical Structure**:
- Reactive < Sentient < Intentional (increasing complexity of constraints)
- KL control scores an action by risk over states alone, whereas Expected Free Energy scores risk over outcomes and adds an ambiguity term.
2. **Contrast in Constraints**:
- Reactive: Observed states only
- Sentient: Objective constraints or preferences over outcomes
- Intentional: Subjective constraints or preferences over latent states
3. **Mathematical Relationships**:
- Expected Free Energy is the sum of Risk (a KL divergence over outcomes) and Ambiguity (the expected negative log-likelihood of outcomes given latent states).
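
One way to see how the two formulas relate is to evaluate risk and ambiguity when outcomes reveal states exactly. The sketch below assumes discrete distributions and hypothetical numbers. With an identity likelihood the ambiguity vanishes and the risk over outcomes equals the risk over states, which is the quantity scored by KL control.

```python
import math

def kl(q, p):
    """D_KL[q || p] for discrete distributions, with 0 * ln 0 taken as 0."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

def risk_and_ambiguity(q_s, A, prefs):
    """Risk and ambiguity for predicted states q_s and likelihood A[o][s]."""
    n_o, n_s = len(A), len(q_s)
    q_o = [sum(A[o][s] * q_s[s] for s in range(n_s)) for o in range(n_o)]
    risk = kl(q_o, prefs)
    ambiguity = -sum(
        q_s[s] * A[o][s] * math.log(A[o][s])
        for s in range(n_s) for o in range(n_o) if A[o][s] > 0
    )
    return risk, ambiguity

q_s = [0.7, 0.2, 0.1]      # hypothetical Q(s_{tau+1} | u)
prefs = [0.5, 0.3, 0.2]    # hypothetical preferences P(. | c)

identity = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]   # outcomes reveal states exactly
noisy = [[0.8, 0.1, 0.1],
         [0.1, 0.8, 0.1],
         [0.1, 0.1, 0.8]]      # outcomes are an imperfect readout

risk_id, amb_id = risk_and_ambiguity(q_s, identity, prefs)
risk_noisy, amb_noisy = risk_and_ambiguity(q_s, noisy, prefs)
```

Under the noisy likelihood the ambiguity is strictly positive, so actions are additionally penalised for leading to states whose outcomes are uninformative.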
### Interpretation
This diagram illustrates a theoretical taxonomy of decision-making systems, progressing from simple reactive responses to complex goal-directed planning. The inclusion of KL divergence and Expected Free Energy suggests a Bayesian framework for handling uncertainty, where:
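- **Risk** is the KL divergence between the outcomes predicted under an action, $ Q(o_{\tau+1}|u) $, and the outcomes preferred under the constraints $ c $, $ P(o_{\tau+1}|c) $
- **Ambiguity** is the expected uncertainty about outcomes given latent states, i.e. how unreliably states map to observations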
The Q-learning example grounds the theory in reinforcement learning, while the formulas formalize the transition from reactive policies to intentional planning. The absence of explicit numerical trends implies a focus on conceptual relationships rather than empirical data.
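
The step from $ \sigma(-\mathbf{G}) $ to $ \sigma(-\mathbf{G} - \mathbf{H}) $ can be sketched numerically. The values of `G` and `H` below are hypothetical, and treating `H` as a per-action penalty for moving away from an intended endpoint is an assumption made for illustration; the image does not define how `H` is computed.

```python
import math

def softmax(x):
    """Normalised exponential; returns a probability distribution."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

def action_probabilities(G, H=None):
    """P(u) = sigma(-G) when H is omitted, otherwise sigma(-G - H)."""
    if H is None:
        H = [0.0] * len(G)
    return softmax([-(g + h) for g, h in zip(G, H)])

G = [1.0, 1.2, 0.9]    # hypothetical expected free energies for three actions
H = [0.0, 0.0, 10.0]   # hypothetical penalty on the third action

sentient = action_probabilities(G)        # favours the third action (lowest G)
intentional = action_probabilities(G, H)  # third action is effectively ruled out
```

The extra term does not change how consequences are evaluated; it only reweights the same candidates, so an action that looks best under `G` alone can be suppressed once the goal constraint is applied.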