## [Diagram/Comparison]: Comparison of Two AI Agent Approaches to a Multi-Hop Question
### Overview
The image is a side-by-side comparison of the reasoning processes and outputs of two different AI agent systems ("Agent-as-tool-Base" and "Agent-as-tool-Instruct") when answering the same factual question. The comparison is presented in a structured, annotated format to highlight the differences in their problem-solving strategies and final accuracy.
### Components/Axes
The image is organized into three main sections:
1. **Header (Top Blue Bar):** Contains the core question and the ground-truth answer.
2. **Left Column:** Titled "Result from Agent-as-tool-Base". Shows the step-by-step trace of this agent's reasoning, tool use, and final answer.
3. **Right Column:** Titled "Result from Agent-as-tool-Instruct". Shows the step-by-step trace of this agent's reasoning, tool use, and final answer.
**Annotations:** Throughout both columns, orange rectangular boxes with white text provide meta-commentary on the agent's behavior at specific steps. These are connected to the relevant text with thin lines.
### Detailed Analysis
#### **Header Section**
* **Question:** "Where did Edward Dunn (Bishop)'s father graduate from?"
* **Expected Answer:** "Corpus Christi College, Cambridge"
#### **Left Column: Agent-as-tool-Base**
This column shows a process that ultimately fails to find the correct answer.
1. **Initial `<think>`:** The agent states its goal: "I need to find Edward Dunn's father's educational background."
2. **First `<tool_calling>`:** The agent asks: "Can you provide information on Edward Dunn's father's education?"
* **Annotation (Orange Box, positioned to the right):** "Planner try to ask 2 hops simultaneously"
3. **First `<obs>` (Observation):** The tool response states it couldn't find specific information about the father's education but found information about Edward Dunn himself. It offers to help with a different topic.
* **Annotation (Orange Box, positioned below the observation):** "The agent was unable to answer"
4. **Second `<think>`:** The agent acknowledges the observation didn't provide the needed information and decides to find a "more direct way."
5. **Second `<tool_calling>`:** The agent asks: "Can you provide information on the educational background of Edward Dunn's father?"
* **Annotation (Orange Box, positioned to the right):** "Planner asked 2 hops in the meantime in another manner"
6. **Second `<obs>`:** A lengthy response. It lists findings:
* 1. Information about **Edward T. Dunn** (the bishop himself), noting his studies.
* 2. Information about **Bishop Edward Dunn**, stating: "His father, Andrew Hunter Dunn, was an education graduate from Marlborough and Pembroke College, Cambridge." This is sourced from a Wikipedia link.
    * 3. A note that other sources discuss different individuals.
* **Annotation (Orange Box, positioned below this observation):** "The agent cannot get exact information"
7. **Third `<think>`:** The agent incorrectly concludes from the observation: "Edward Dunn's father, Andrew Hunter Dunn, graduated from Marlborough and Pembroke College, Cambridge."
8. **Final `<answer>`:** "Marlborough and Pembroke College, Cambridge (Wrong Answer)"
#### **Right Column: Agent-as-tool-Instruct**
This column shows a successful, stepwise process.
1. **Initial `<think>`:** The agent states its goal: "I need to find information about Edward Dunn (Bishop)'s father."
2. **First `<tool_calling>`:** The agent asks: "Can you provide information about Edward Dunn (Bishop)'s father?"
* **Annotation (Orange Box, positioned to the right):** "Learned to Ask the prerequisite question before the next hop"
3. **First `<obs>`:** The tool response identifies the father: "His father was Andrew Hunter Dunn, who at one time served as the Bishop of Quebec." It provides a Wikipedia link.
* **Annotation (Orange Box, positioned below the observation):** "The agent could answer this question easily"
4. **Second `<think>`:** The agent processes the observation, noting it now knows the father's name (Andrew Hunter Dunn) but not his graduation location. It states the need to find this next piece of information.
5. **Second `<tool_calling>`:** The agent asks the specific follow-up: "Can you provide information about where Andrew Hunter Dunn graduated from?"
* **Annotation (Orange Box, positioned to the right):** "Asked the next-hop question"
6. **Second `<obs>`:** The tool response provides the precise answer: "Andrew Hunter Dunn graduated from **Corpus Christi College, Cambridge**, where he earned his BA as the 29th Wrangler in 1863." It provides a Wikipedia link.
* **Annotation (Orange Box, positioned below the observation):** "The agent was able to answer the question easily"
7. **Third `<think>`:** The agent confirms it now has the correct information.
8. **Final `<answer>`:** "Corpus Christi College, Cambridge (Correct Answer)"
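Both traces use the same tag-delimited format. As a minimal sketch (the tag names come from the figure; the parser and demo trace are assumptions, not the system's actual implementation), such a trace can be split into ordered steps with a regular expression:

```python
import re

# Hypothetical sketch: tag names (<think>, <tool_calling>, <obs>, <answer>)
# are taken from the figure; this parser and trace string are assumptions.
TAG_RE = re.compile(r"<(think|tool_calling|obs|answer)>(.*?)</\1>", re.DOTALL)

def parse_trace(trace: str):
    """Return (tag, content) pairs in the order they appear in the trace."""
    return [(m.group(1), m.group(2).strip()) for m in TAG_RE.finditer(trace)]

demo = (
    "<think>I need to find information about Edward Dunn (Bishop)'s father.</think>"
    "<tool_calling>Can you provide information about Edward Dunn (Bishop)'s father?</tool_calling>"
    "<obs>His father was Andrew Hunter Dunn.</obs>"
    "<answer>Corpus Christi College, Cambridge</answer>"
)

steps = parse_trace(demo)
final_answer = next(content for tag, content in steps if tag == "answer")
```

Extracting the final `<answer>` content this way is how a grader could score the two agents' outputs against the expected answer in the header.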
### Key Observations
1. **Strategic Difference:** The core difference is in the initial query strategy. The "Base" agent attempts to solve the multi-hop question ("father's education") in a single, broad query. The "Instruct" agent breaks it down, first identifying the father (prerequisite hop) and then querying for his education (target hop).
2. **Information Quality:** The "Base" agent's second observation mixed details about multiple individuals named Edward Dunn, including the claim that the father graduated from "Marlborough and Pembroke College, Cambridge" — which contradicts the expected answer in the header. Unable to disentangle the conflicting, extraneous information, the agent adopted the wrong claim.
3. **Annotation Role:** The orange annotations explicitly diagnose the "Base" agent's failure mode ("try to ask 2 hops simultaneously") and the "Instruct" agent's success strategy ("Learned to Ask the prerequisite question").
4. **Outcome:** The "Agent-as-tool-Base" produces a wrong answer, while the "Agent-as-tool-Instruct" produces the correct answer that matches the expected answer in the header.
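The strategic difference can be sketched in a few lines of Python. The mock knowledge base, the `search` tool, and the query strings below are illustrative assumptions (not the system from the figure); the point is that a single-hop tool fails on the broad two-hop query but succeeds when the prerequisite hop is resolved first:

```python
# Minimal sketch contrasting the two query strategies over a mock knowledge
# base. The tool, entries, and lookup logic are illustrative assumptions.
KB = {
    "Edward Dunn (Bishop)'s father": "Andrew Hunter Dunn",
    "where Andrew Hunter Dunn graduated from": "Corpus Christi College, Cambridge",
}

def search(query: str):
    """A mock single-hop tool: it can only answer direct lookups."""
    return KB.get(query)

# "Base" strategy: one broad, two-hop query -- the tool cannot resolve it.
broad = search("where Edward Dunn (Bishop)'s father graduated from")  # None

# "Instruct" strategy: resolve the prerequisite hop, then the target hop.
father = search("Edward Dunn (Bishop)'s father")
answer = search(f"where {father} graduated from")
```

Here `broad` comes back empty, while the decomposed sequence yields the expected answer, mirroring the two columns of the figure.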
### Interpretation
This diagram serves as a technical case study demonstrating the importance of **sequential reasoning** and **question decomposition** in AI agents designed for multi-hop question answering.
* **What it demonstrates:** It visually argues that an agent trained or instructed to break down complex queries into a series of simpler, prerequisite sub-questions ("Instruct") is more robust and accurate than an agent that attempts to resolve the entire query in one step ("Base"). The "Base" agent's approach is prone to failure because it either gets no answer or an answer mixed with irrelevant data, leading to incorrect synthesis.
* **Underlying Principle:** The success of the "Instruct" agent reflects a resolve-then-query pattern: it first establishes the identity of the subject (the father) before inquiring about a property of that subject (his education), so each tool call is a single-hop question the tool can actually answer.
* **Practical Implication:** For developers building AI systems that use tools (like search or databases) to answer complex questions, this comparison highlights that the agent's **planning and orchestration strategy** is as critical as the quality of the underlying tools or data. The "Instruct" agent's trace exemplifies an effective, logical workflow.