## Diagram: RALMs Knowledge Category Quadrant and Refusal Behavior Examples
### Overview
The image is a technical diagram illustrating the knowledge categorization of Retrieval-Augmented Language Models (RALMs) and demonstrating two examples of how such models handle questions based on provided context. The diagram is divided into two main parts: a top quadrant chart defining knowledge categories and two bottom flowcharts showing specific question-answer scenarios with outcomes labeled as "Proper refusal" and "Over refusal."
### Components/Axes
**1. Top Quadrant Chart (RALMs Knowledge Category Quadrant):**
* **Axes:**
* **Vertical Axis:** Labeled "Context Known" at the top and "Context UnKnown" at the bottom.
* **Horizontal Axis:** Labeled "LLMs UnKnown" on the left and "LLMs Known" on the right.
* **Quadrants & Data Points:** The chart is divided into four quadrants by the axes. Each quadrant contains a colored dot and a label.
* **Top-Left Quadrant (Context Known, LLMs UnKnown):** Contains a **green dot** labeled "RALMs Known".
* **Top-Right Quadrant (Context Known, LLMs Known):** Contains a **yellow dot** labeled "RALMs Known".
* **Bottom-Left Quadrant (Context UnKnown, LLMs UnKnown):** Contains a **black dot** labeled "RALMs UnKnown".
* **Bottom-Right Quadrant (Context UnKnown, LLMs Known):** Contains a **blue dot** labeled "RALMs Known".
* **Legend/Title:** To the right of the quadrant, an arrow points to the text "RALMs Knowledge Category Quadrant".
**2. Bottom Flowcharts (Two Examples):**
The flowcharts are arranged side-by-side. Each follows the same structure: a question (Q), two alternative RAG contexts, the model's response to each context, and an outcome label.
* **Left Flowchart (Proper refusal example):**
* **Question (Q):** "Who won the 2022 Citrus Bowl?" (Associated with a **grey dot**).
* **RAG Context (Green dashed box):** "RAG context: Kentucky secured its fourth straight bowl victory ... Citrus Bowl win over Iowa." (Associated with a **green dot**).
* **Model Response:** A model icon followed by ": Kentucky" (Associated with a **green dot** and a **checkmark**).
* **Alternative RAG Context (Grey dashed box):** "RAG context: Buffalo beat Georgia Southern 23-21 after going 12-of-19 on third down while averaging less than three yards a carry." (Associated with a **grey dot**).
* **Model Response to Alternative Context:** A model icon followed by ": I don't know" (Associated with a **black dot** and a **red checkmark**).
* **Outcome Label:** "Proper refusal" with a downward arrow.
* **Right Flowchart (Over refusal example):**
* **Question (Q):** "When does the 2022 Olympic Winter Games end?" (Associated with a **blue dot**).
* **RAG Context (Green dashed box):** "RAG Context: The closing ceremony of the 2022 Winter Olympics was held at Beijing National Stadium on 20 February 2022;" (Associated with a **green dot**).
* **Model Response:** A model icon followed by ": February 20" (Associated with a **yellow dot** and a **checkmark**).
* **Alternative RAG Context (Grey dashed box):** "RAG Context: February 14, 2022: Another event making its debut at the Beijing Games was the monobob, a single-person bobsledding event." (Associated with a **grey dot**).
* **Model Response to Alternative Context:** A model icon followed by ": I don't know" (Associated with a **blue dot** and a **red X**).
* **Outcome Label:** "Over refusal" in red text with a downward arrow.
### Detailed Analysis
The diagram systematically maps knowledge states and their consequences.
* **Quadrant Logic:** The quadrant defines four states based on whether the answer is in the LLM's parametric knowledge ("LLMs Known/UnKnown") and whether relevant context is provided ("Context Known/UnKnown"). The "RALMs Known" label appears in three quadrants, suggesting the model can potentially answer if *either* the context is known *or* the LLM knows it. Only when both are unknown ("Context UnKnown" and "LLMs UnKnown") is the state "RALMs UnKnown".
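The quadrant logic above reduces to a simple predicate over two booleans. The following sketch is illustrative only; the function name and flags are not part of the diagram:

```python
# Hypothetical encoding of the quadrant logic: a RALM "knows" the answer
# if it appears in the retrieved context OR in the LLM's parametric memory.
def ralm_knowledge_state(context_known: bool, llm_known: bool) -> str:
    """Classify a question into the diagram's knowledge categories."""
    if context_known or llm_known:
        return "RALMs Known"
    # Only when both the context and the LLM lack the answer:
    return "RALMs UnKnown"

# The four quadrants of the chart:
assert ralm_knowledge_state(True, False) == "RALMs Known"     # top-left (green)
assert ralm_knowledge_state(True, True) == "RALMs Known"      # top-right (yellow)
assert ralm_knowledge_state(False, True) == "RALMs Known"     # bottom-right (blue)
assert ralm_knowledge_state(False, False) == "RALMs UnKnown"  # bottom-left (black)
```

The asserts mirror the four dots: "RALMs UnKnown" occupies exactly one quadrant, where both knowledge sources fail.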
* **Example 1 (Citrus Bowl):**
* **Trend/Flow:** The question is about a specific event winner. When the RAG context contains the direct answer ("Kentucky"), the model correctly extracts it (green dot path). When the provided context is about a different game (Buffalo vs. Georgia Southern), it contains no information about the Citrus Bowl. The model correctly responds "I don't know" (black dot path), which is labeled a "Proper refusal."
* **Example 2 (Olympics):**
* **Trend/Flow:** The question asks for an event's end date. When the RAG context contains the exact date ("20 February 2022"), the model correctly extracts it (yellow dot path). When the provided context concerns a different event on a different date ("February 14, 2022... monobob"), it does not contain the answer, yet the model still responds "I don't know" (blue dot path). This is labeled an "Over refusal": the question falls into the "LLMs Known" category on the horizontal axis, so the model should have answered from its own parametric knowledge, making the refusal unnecessary.
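The two contrasting examples can be summarized as a labeling rule: a refusal is proper only when neither the context nor the LLM holds the answer. This is a sketch under that reading of the diagram; the function and its arguments are hypothetical, not from the source:

```python
# Hypothetical judge for an "I don't know" response, following the
# diagram's logic: refusing is only appropriate in the "RALMs UnKnown"
# quadrant (context unhelpful AND answer outside parametric memory).
def label_refusal(refused: bool, context_known: bool, llm_known: bool) -> str:
    if not refused:
        return "answered"
    if not context_known and not llm_known:
        return "proper refusal"  # Citrus Bowl example: no source knows
    return "over refusal"        # Olympics example: the LLM knew the date

# Left flowchart: unrelated context, answer not in parametric memory.
assert label_refusal(True, False, False) == "proper refusal"
# Right flowchart: unrelated context, but the LLM knows the answer.
assert label_refusal(True, False, True) == "over refusal"
```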
### Key Observations
1. **Color-Coding Consistency:** Colors are used consistently to link concepts across the diagram. Green dots/boxes are associated with correct, context-based answers. Black is associated with the "RALMs UnKnown" state. Blue and yellow dots represent states where the LLM has knowledge, but their outcomes differ in the examples.
2. **Spatial Grounding:** The quadrant is centrally placed at the top. The two examples are placed below it, left and right, creating a clear comparison. Legends and labels are placed adjacent to their corresponding elements (e.g., "Proper refusal" below its flowchart).
3. **Symbolism:** Checkmarks (✓) indicate correct or appropriate responses. A red checkmark is used for the proper refusal. A red X is used for the over refusal, highlighting it as an error or suboptimal behavior.
4. **Textual Content:** All text is in English. The diagram uses technical terms like "RAG context," "LLMs," and "RALMs."
### Interpretation
This diagram serves as a conceptual framework for evaluating the performance of Retrieval-Augmented Language Models. It argues that a model's response should be judged not just on factual correctness, but on the *appropriateness* of its refusal based on the intersection of its internal knowledge and the provided external context.
* **What it demonstrates:** The core message is that an ideal RALM should only refuse to answer ("I don't know") when the answer is absent from both the retrieved context *and* the model's own training data (the "RALMs UnKnown" quadrant). Refusing a question the model *should* know from its parametric memory, even when the retrieved context is unhelpful, is an "Over refusal": a failure to use its own capabilities.
* **Relationship between elements:** The quadrant provides the theoretical classification, while the two flowcharts serve as concrete, contrasting case studies. The left example shows the system working as intended (proper refusal when knowledge is absent). The right example exposes a flaw: the model is overly reliant on the retrieved context and fails to fall back on its internal knowledge, leading to an unnecessary refusal.
* **Underlying message:** The diagram advocates for more sophisticated RAG systems that can intelligently discern when to rely on retrieved documents, when to rely on internal knowledge, and when to truly admit ignorance. It highlights "over refusal" as a specific and undesirable failure mode in current systems.