## Diagram: AI Prompting and Reasoning Frameworks
### Overview
The image displays three panels, labeled A, B, and C, that together contain five flowcharts illustrating different conceptual frameworks for structuring interactions with Large Language Models (LLMs) and Vision-Language Models (VLMs). The diagrams share a visual language: ovals represent inputs, rectangles represent LLM calls, and rounded rectangles represent a chain or agent. Arrows indicate the flow of information or control.
### Components/Axes
The image is divided into three main panels:
* **Panel A (Left):** A linear, vertical flowchart.
* **Panel B (Center):** Two separate flowcharts titled "Self-Critique" and "Selection-Inference."
* **Panel C (Right):** Two separate flowcharts titled "Inner Monologue" and "ReAct," accompanied by a legend.
**Legend (Located in the bottom-right of Panel C):**
* **Oval Shape:** Labeled "Input"
* **Rectangle Shape:** Labeled "LLM calls"
* **Rounded Rectangle Shape:** Labeled "Chain / Agent"
### Detailed Analysis
#### **Panel A: Linear Prompting Pipeline**
* **Flow:** A top-to-bottom, linear sequence.
* **Components & Text:**
    1. **Top:** Text "Prompt construction" with a downward arrow.
    2. **Middle:** A rectangle labeled "LLM".
    3. **Below LLM:** Text "String parsing" with a downward arrow.
    4. **Bottom:** Text "Execution".
* **Spatial Grounding:** The entire diagram is vertically aligned on the left side of the image.
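
The four stages of Panel A can be sketched as a single function. This is a minimal illustration, not something shown in the figure: the prompt template, the `ACTION:` output convention, the `fake_llm` stub, and the `actions` table are all assumptions made up for the example.

```python
from typing import Callable, Dict


def build_prompt(task: str) -> str:
    """Prompt construction: wrap the task in a fixed instruction template."""
    return f"Reply with one line of the form 'ACTION: <name>'.\nTask: {task}"


def parse_action(completion: str) -> str:
    """String parsing: extract the action name from the raw completion."""
    for line in completion.splitlines():
        if line.startswith("ACTION:"):
            return line[len("ACTION:"):].strip()
    raise ValueError("no ACTION line found in completion")


def run_pipeline(task: str, llm: Callable[[str], str],
                 actions: Dict[str, Callable[[], str]]) -> str:
    """Run the four stages of Panel A once, top to bottom."""
    prompt = build_prompt(task)        # Prompt construction
    completion = llm(prompt)           # LLM
    name = parse_action(completion)    # String parsing
    return actions[name]()             # Execution


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call."""
    return "Sure.\nACTION: greet"


result = run_pipeline("say hello", fake_llm, {"greet": lambda: "hello"})
print(result)
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM client.
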
#### **Panel B: Iterative and Selective Reasoning**
This panel contains two distinct sub-diagrams.
**1. Self-Critique Framework (Top of Panel B)**
* **Flow:** A cyclical process within a rounded rectangle (Chain/Agent).
* **Components & Text:**
    * **Input (Top):** An oval labeled "Question". Three arrows point from this oval into the chain below.
    * **Chain/Agent Process (Center):** A rounded rectangle containing three connected rectangles:
        * Left rectangle: "Answer"
        * Middle rectangle: "Critique"
        * Right rectangle: "Refinement"
    * **Internal Flow:** Arrows show: `Answer` -> `Critique` -> `Refinement`. A curved arrow also points from `Refinement` back to `Answer`, indicating iteration.
    * **Output (Right):** An arrow points from the `Refinement` rectangle to the text "Answer".
    * **Label (Bottom):** The text "Self-Critique" is centered below the rounded rectangle.
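
One way to read the Self-Critique loop is as three LLM calls that all see the question, with the refined answer fed back in as the next answer. The sketch below assumes a hypothetical prompt format, a hypothetical `OK` stop signal, and a scripted `fake_llm`; the figure itself specifies none of these.

```python
from typing import Callable

LLM = Callable[[str], str]


def self_critique(question: str, llm: LLM, max_rounds: int = 2) -> str:
    """Answer -> Critique -> Refinement, repeated up to max_rounds times."""
    answer = llm(f"Question: {question}\nAnswer:")
    for _ in range(max_rounds):
        critique = llm(f"Question: {question}\nAnswer: {answer}\nCritique:")
        if critique.strip() == "OK":  # assumed stop signal, not from the figure
            break
        # The refinement replaces the answer (the curved arrow back to Answer).
        answer = llm(
            f"Question: {question}\nAnswer: {answer}\n"
            f"Critique: {critique}\nRefinement:"
        )
    return answer


def fake_llm(prompt: str) -> str:
    """Scripted stand-in: first answer is wrong, the refinement fixes it."""
    if prompt.endswith("Refinement:"):
        return "4"
    if prompt.endswith("Critique:"):
        return "OK" if "Answer: 4" in prompt else "The sum is wrong."
    return "5"


final_answer = self_critique("What is 2 + 2?", fake_llm)
print(final_answer)
```

The `max_rounds` cap is a design choice for the sketch: without some bound or stop signal, the loop in the diagram has no defined exit.
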
**2. Selection-Inference Framework (Bottom of Panel B)**
* **Flow:** A linear process within a rounded rectangle.
* **Components & Text:**
    * **Inputs (Top):** Two ovals: "Context" (left) and "Question" (right). Arrows from both point into the chain below.
    * **Chain/Agent Process (Center):** A rounded rectangle containing two connected rectangles:
        * Left rectangle: "Selection"
        * Right rectangle: "Inference"
    * **Internal Flow:** An arrow points from `Selection` to `Inference`.
    * **Output (Right):** An arrow points from the `Inference` rectangle to the text "Answer".
    * **Label (Bottom):** The text "Selection-Inference" is centered below the rounded rectangle.
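
The two-stage flow can be sketched as two chained LLM calls, where the second call sees only what the first selected. This follows the single `Selection` -> `Inference` pass drawn in the diagram; the prompt wording and the `fake_llm` stub are assumptions for illustration.

```python
from typing import Callable, Sequence

LLM = Callable[[str], str]


def selection_inference(context: Sequence[str], question: str, llm: LLM) -> str:
    """Select relevant facts first, then infer an answer from them alone."""
    facts = "\n".join(f"{i}. {fact}" for i, fact in enumerate(context, start=1))
    # Selection: the model sees the full context and the question.
    selected = llm(f"Facts:\n{facts}\nQuestion: {question}\nSelection:")
    # Inference: the model sees only the selected facts and the question.
    return llm(f"Selected facts:\n{selected}\nQuestion: {question}\nInference:")


def fake_llm(prompt: str) -> str:
    """Scripted stand-in for both calls."""
    if prompt.endswith("Selection:"):
        return "Rex is a dog.\nDogs bark."
    # Answers "yes" only if the irrelevant fact was filtered out.
    return "yes" if "Dogs bark." in prompt and "Paris" not in prompt else "unknown"


context = ["Rex is a dog.", "Paris is in France.", "Dogs bark."]
answer = selection_inference(context, "Does Rex bark?", fake_llm)
print(answer)
```
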
#### **Panel C: Agent-Based Reasoning with Environment**
This panel contains two sub-diagrams illustrating agents that interact with an external environment.
**1. Inner Monologue Framework (Top of Panel C)**
* **Flow:** A process involving interaction between an agent and external entities.
* **Components & Text:**
    * **External Inputs (Top):** Two ovals: "Environment" (left) and "Human" (right).
    * **Chain/Agent Process (Center):** A rounded rectangle containing two connected rectangles:
        * Left rectangle: "VLM" (Vision-Language Model)
        * Right rectangle: "Act"
    * **Interaction Flow:**
        * An arrow points from "Environment" to "VLM".
        * Arrows point bidirectionally between "Human" and "Act".
        * An arrow points from "VLM" to "Act".
    * **Label (Bottom):** The text "Inner Monologue" is centered below the rounded rectangle.
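
The arrows above admit more than one ordering of calls. The sketch below shows one possible reading, in which the agent describes the scene, proposes an action, shows it to the human, and revises it using the reply. All function names, signatures, and stub behaviors are hypothetical and are not taken from the figure.

```python
from typing import Callable, Optional

VLM = Callable[[str], str]
Act = Callable[[str, Optional[str]], str]
Human = Callable[[str], str]


def inner_monologue_step(observation: str, vlm: VLM, act: Act,
                         ask_human: Human) -> str:
    """One pass: Environment -> VLM -> Act, with a Human <-> Act exchange."""
    scene = vlm(observation)         # Environment -> VLM
    proposal = act(scene, None)      # VLM -> Act
    feedback = ask_human(proposal)   # Act -> Human, then Human -> Act
    return act(scene, feedback)      # Act revised with the human's reply


def fake_vlm(observation: str) -> str:
    """Stand-in for a vision-language model describing an image."""
    return "a cup and a plate on the table"


def fake_act(scene: str, feedback: Optional[str]) -> str:
    """Stand-in for the Act call; defaults to the cup without feedback."""
    target = feedback if feedback is not None else "cup"
    return f"pick up the {target}"


def fake_human(proposal: str) -> str:
    """Stand-in for a human who redirects the agent."""
    return "plate"


action = inner_monologue_step("camera frame 0", fake_vlm, fake_act, fake_human)
print(action)
```
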
**2. ReAct Framework (Bottom of Panel C)**
* **Flow:** A cyclical reasoning and action loop.
* **Components & Text:**
    * **External Input (Top):** An oval labeled "Environment".
    * **Chain/Agent Process (Center):** A rounded rectangle containing two connected rectangles:
        * Left rectangle: "Reason"
        * Right rectangle: "Act"
    * **Interaction Flow:**
        * An arrow points from "Environment" to "Reason".
        * An arrow points from "Act" back to "Environment".
        * An arrow points from "Reason" to "Act".
    * **Label (Bottom):** The text "ReAct" is centered below the rounded rectangle.
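
The cycle `Environment` -> `Reason` -> `Act` -> `Environment` can be sketched as a bounded loop. The toy `CounterEnv`, the `stop` action, and the scripted `fake_reason` and `fake_act` functions are assumptions standing in for a real environment and real LLM calls.

```python
from typing import Callable, List, Tuple


class CounterEnv:
    """Toy environment: a counter that the agent should raise to a target."""

    def __init__(self, target: int) -> None:
        self.value = 0
        self.target = target

    def observe(self) -> str:
        return f"value={self.value} target={self.target}"

    def step(self, action: str) -> None:
        if action == "increment":
            self.value += 1


def react_loop(env: CounterEnv, reason: Callable[[str], str],
               act: Callable[[str], str],
               max_steps: int = 10) -> List[Tuple[str, str, str]]:
    """Alternate reasoning and acting until the agent stops or steps run out."""
    trace = []
    for _ in range(max_steps):
        observation = env.observe()    # Environment -> Reason
        thought = reason(observation)  # Reason call
        action = act(thought)          # Reason -> Act
        trace.append((observation, thought, action))
        if action == "stop":           # assumed termination action
            break
        env.step(action)               # Act -> Environment
    return trace


def fake_reason(observation: str) -> str:
    """Stand-in for the Reason call: compare the counter with the target."""
    value, target = (int(part.split("=")[1]) for part in observation.split())
    return "below target, keep going" if value < target else "target reached"


def fake_act(thought: str) -> str:
    """Stand-in for the Act call: map the thought to an action name."""
    return "increment" if "keep going" in thought else "stop"


env = CounterEnv(target=2)
trace = react_loop(env, fake_reason, fake_act)
print([action for _, _, action in trace])
```
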
### Key Observations
1. **Increasing Complexity:** The diagrams progress from a simple linear pipeline (A) to iterative self-improvement (B, top), selective processing (B, bottom), and finally to interactive agents that perceive and act upon an environment (C).
2. **Visual Language Consistency:** The legend in Panel C is critical for interpreting all three panels. The shape semantics (oval = input, rectangle = LLM call, rounded rectangle = chain/agent) are applied consistently.
3. **Role of the "Environment":** The concept of an external "Environment" is introduced only in Panel C, marking a shift from purely internal text processing to embodied or interactive AI agents.
4. **Human Interaction:** Direct human interaction is explicitly modeled only in the "Inner Monologue" framework (Panel C, top), via bidirectional arrows with the "Act" component.
### Interpretation
This image serves as a taxonomy of prompting and reasoning strategies for LLMs/VLMs, illustrating an evolution in design philosophy.
* **Panel A** represents the most basic, single-pass interaction: construct a prompt, get a model output, parse it, and execute the result. It is a straightforward input-output pipeline with no feedback.
* **Panel B** introduces **metacognition** and **selective attention**. The "Self-Critique" loop embodies the concept of a model evaluating and refining its own output, aiming for higher quality. The "Selection-Inference" model suggests a two-stage process where relevant information is first filtered from a context before reasoning is applied, improving efficiency and focus.
* **Panel C** moves into the paradigm of **autonomous agents**. Here, the model is not just a text processor but a component within a larger system that perceives the "Environment" (through the "VLM" or "Reason" call) and takes actions ("Act"). In "ReAct" (Reason + Act), the action feeds back into the environment, so reasoning and acting alternate in a loop. "Inner Monologue" instead routes environment observations through a VLM and exchanges messages with a "Human" at the "Act" step, which points to a collaborative or supervised agent model.
Taken together, the diagrams suggest that moving beyond simple prompting, towards structures that incorporate self-evaluation, selective processing, and environmental interaction, is key to developing more capable, reliable, and autonomous AI systems. The image contains no numerical data; its value lies in the conceptual architecture of the information flows.