# A Proposal to Extend the Common Model of Cognition with Metacognition
**Authors**: John Laird, Christian Lebiere, Paul Rosenbloom, Andrea Stocco
Contact: John.Laird@cic.iqmri.org
## Abstract
The Common Model of Cognition (CMC) provides an abstract characterization of the structure and processing required by a cognitive architecture for human-like minds. We propose a unified approach to integrating metacognition within the CMC, in which metacognition involves reasoning over explicit representations of an agent's cognitive capabilities and processes in working memory. Our proposal exploits the existing cognitive capabilities of the CMC, making minimal extensions to the structure and information available within working memory. We provide examples of metacognition within our proposal.
Keywords: Common Model of Cognition, Cognitive Architecture, Metacognition
## 1 Introduction
The Common Model of Cognition (CMC) [9] was developed as an abstract consensus model of human-like minds, derived from the computational structures and representations of cognitive architectures [6], and informed by our knowledge of the human mind and brain. The CMC specifies the fixed architectural structures for encoding, maintaining, using, and acquiring knowledge to produce behavior, emphasizing routine cognition. Here, we propose extending the CMC to include metacognition [3, 4, 5, 7, 12, 17].
Broadly speaking, metacognition encompasses reasoning about any aspect of cognition, including reasoning, memory, perception [13], motor skills, and learning. It can include partial or even incorrect theories of cognition, such as reasoning about perceived ESP capabilities. We focus on metacognition related to an agent's own cognition. We also discuss metareasoning, reasoning about reasoning, as a restricted form of metacognition. Examples of metacognition include introspective monitoring, deliberate decision making, deliberate learning [10], predictive and hypothetical reasoning, retrospective reasoning, strategy selection, and self-explanation.
Figure 1 shows two approaches to metacognition in cognitive architectures. In the hierarchical approach, also referred to as the Nelson and Narens model [11], specialized modules are added "on top" of cognition to implement metacognition, as exemplified by MIDCA [2] and Clarion [16]. Those modules monitor cognition, reason about it, and modify its processing. They access the current state of cognitive processing, the histories of processing, and representations of the agent's procedural knowledge that drives cognition, with reasoning and metareasoning operating in parallel without intermixing.
[Figure 1 image: the hierarchical approach stacks metareasoning knowledge above reasoning knowledge above a situation representation, each layer with its own feedback loop; the unified approach merges reasoning and metareasoning knowledge over a combined situation and partial-reasoning-state representation. Both interact with the environment.]
Figure 1: Alternative Metacognitive Architectures.
In the CMC and cognitive architectures more generally, a capability is realized through architectural structures and knowledge. Therefore, we propose a unified approach, on the right side of Figure 1, where cognition and metacognition differ only in what is the subject of reasoning. We propose minimal architectural extensions to make information about an agent's cognition available in working memory. Versions of these extensions are found in existing CMC architectures, including ACT-R [1], Sigma [14], and Soar [8]. We also describe other, non-architectural sources of information about cognition needed to support metacognition as a component of overall cognition. As with the original goals for the CMC, our goal is for our proposal for metacognition to apply to both humans and A(G)I systems with similar general capabilities.
Our proposal does not permit information in long-term memories to be examined or reasoned over by other modules. Instead, it restricts reasoning and metareasoning to information available in working memory. It avoids new modules by incorporating the long-term knowledge used in metareasoning within its existing long-term memories. Restricting access to long-term memories sacrifices omniscient metareasoning but enables efficient processing and memory functionalities consistent with neural memory models. Furthermore, all existing cognitive reasoning and learning capabilities are available for metacognition, including interaction with the external environment.
Below, we review the Common Model of Cognition and present our proposal for extending it to include human-like metacognition. Our proposal focuses on adding new representational distinctions and sources of information about cognition that become available to an agent to initiate, carry out, and terminate metacognition. We attempt to identify the minimum architectural information required to support general metacognition, as well as non-architectural sources of information about cognition that are available to an agent. We include three examples of metacognition within this framework.
## 2 Common Model of Cognition
[Figure 2 image: working memory at the center, receiving retrievals from declarative long-term memory (DL) and interacting bidirectionally with procedural long-term memory (RL/PC); perception and motor modules connect working memory to the environment.]
Figure 2: The Common Model of Cognition.
The CMC unifies many similar cognitive architectures by identifying common components, processing, connectivity, and constraints. It does not specify mechanisms or implementations, but emphasizes functionality, focusing on routine cognition and learning. Cognition is the collective processing of the component modules (working memory, procedural memory, declarative memory) and their associated processes (action selection, retrieval, learning).
Figure 2 shows the structure and data flow among the modules, which include short-term working memory, long-term procedural memory, long-term declarative memory, learning, perception, and motor control. Long-term memories have associated automatic learning mechanisms that incrementally modify and extend their contents. The CMC posits procedural compilation, reinforcement learning, and declarative learning. Memories contain relations over symbols annotated with quantitative metadata. Examples include the recency of creation or access, probability, and derived utility. Metadata influences processing within a module, such as retrieval and learning.
Data flow begins with perception and proceeds to working memory, which represents an agent's understanding of its current situation and goals. Procedural memory contains knowledge about selecting and executing actions that modify working memory. On each cognitive cycle, procedural memory, testing the contents of working memory, selects a single action, which makes one or more changes to working memory. Each of the other modules has a buffer in working memory through which procedural memory can initiate memory retrievals, motor actions, or top-down control of perception. Results from those module processes are added to their respective buffers. Thus, behavior unfolds as a sequence of steps, driven by procedural memory, making changes to working memory. We call such step-by-step behavior reasoning: the contents of working memory are what the reasoning is about, and the knowledge in procedural and declarative memory determines its course. The original CMC encompasses routine and skilled performance, where working memory includes only information about the agent's current goals and task state.
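The cognitive cycle described above can be sketched as a minimal loop. This is a hypothetical illustration, not part of the CMC specification; names such as `cognitive_cycle` and the rule representation are our own.

```python
# Minimal sketch of the CMC cognitive cycle: procedural memory tests the
# contents of working memory, a single action is selected, and its changes
# are applied back to working memory. All names here are illustrative.

def cognitive_cycle(working_memory, procedural_rules, select):
    """Run one cycle: match rules, select one action, apply its changes."""
    # Procedural memory tests the contents of working memory.
    candidates = [rule for rule in procedural_rules
                  if rule["condition"](working_memory)]
    if not candidates:
        return False  # no action available: an impasse
    # A single action is selected (e.g., by utility metadata).
    chosen = select(candidates)
    # The action makes one or more changes to working memory.
    working_memory.update(chosen["changes"](working_memory))
    return True

# Toy usage: a rule that responds to a goal by proposing a step.
wm = {"goal": "greet"}
rules = [{"condition": lambda w: w.get("goal") == "greet",
          "changes":   lambda w: {"action": "say-hello"}}]
cognitive_cycle(wm, rules, select=lambda cands: cands[0])
# wm now contains the selected action's change.
```

The impasse case (no candidate action) is the situation that, in Sigma and Soar, leads to the substates discussed in Section 4.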
## 3 Phases of Metacognition Processing
In our proposal, metacognition employs the same cognitive cycle, the same modules, and often the same knowledge. Metacognition is distinguished by the inclusion of information about the agent's own processing in its reasoning: information about prior problem solving ("I see where I made a mistake on that problem and need to do something different in the future."), specific skills ("I'm good at jigsaw puzzles."), general competences ("I struggle with math."), and the operation of individual modules ("I have trouble remembering the difference between affect and effect."). Our proposal does not introduce new modules but relies on existing architectural structures and knowledge, which are extended to make new forms and representations of information available in working memory. The process of metacognition is thus not purely metacognitive, but a combination of new and existing capabilities.
In this section, we describe the three phases of metacognition: initiation, reasoning, and termination. We identify the types of knowledge needed in these phases. Then we describe our proposed extensions to meet those requirements, as well as sources of knowledge available for metareasoning. The following section steps through those phases using three examples of metacognition.
### 3.1 Initiation
The standard cognitive cycle for reasoning involves procedural memory responding to changes in working memory, typically concerning the performance of the current task, which we call base-level reasoning. However, a working memory element can be added that indicates a deviation in base-level reasoning, such as a failed retrieval from memory. The general requirement is that a knowledge source creates a structure in working memory that is about the agent's cognition. Such an element can be created deliberately through an action of procedural memory, but also as a side effect of other long-term memory retrievals or of perception. Our proposal extends these by including feedback about the state of individual modules.
### 3.2 Reasoning
Once information is available about the state of the agentās cognition, the agent can use its cognitive capabilities to respond to it (or ignore it), essentially treating it as the signal that a new (meta)problem may need to be solved. Knowledge for reasoning can combine existing task performance knowledge with knowledge that is specific to metacognition. We have identified three extensions to the existing sources of information that can be crucial for metacognition.
1. Information about the current state of agent processing in its modules.
1. A memory of the contents of working memory over time, where a sequence of past situations can be retrieved into working memory and reasoned over. This allows the agent to detect choices it should not make in the future or ones it should reinforce. It also allows the agent to learn a model of the effects of its actions on its internal state, its environment (changes that come from perception), and its processing state.
1. Some means of creating past or future hypothetical states in working memory with two seemingly contradictory properties. One is that these states are represented such that the agentās knowledge for reasoning in the current state can apply to them, so that the agent can imagine what it would do in those states. The second is that they are also distinguished in some way, so that the agent does not confuse them with reality. Thus, an agent can plan for the future or retrospectively reconsider past actions, without disrupting or interfering with its reasoning about the present.
### 3.3 Termination
Metacognition terminates when reasoning is no longer sensitive to representations of the agentās processing, such as when metacognition resolves the reason it was initiated. Depending on the initiating signal, this can be from a deliberate change to working memory, or indirectly because an impasse in reasoning is resolved. Whatever the reason for termination, any result can also change long-term memory (via the existing learning mechanisms).
## 4 Proposal for Metacognition in the CMC
The crux of our proposal is to expand the sources and representations of an agentās available information to meet the requirements described above. We present our proposal in two stages. First, we describe structural modifications to the CMC that make new representations and direct sources of information available. In the second, we describe how the existing sources can also provide some of that information indirectly.
[Figure 3 image: the CMC of Figure 2 with declarative memory split into semantic long-term memory (SL) and episodic long-term memory (EL), each with its own pathway to working memory.]
Figure 3: Structural extensions to the CMC to support Metacognition.
### 4.1 CMC Extensions
Figure 3 illustrates our proposed extensions. One is that information about the current processing state of each module, including procedural memory and working memory, is available in working memory buffers. The second is the inclusion of episodic memory functionality, shown as separate from semantic memory. The third (not shown) is that working memory supports representations of information about past or future states, distinguished from the current state. We also describe how each extension is, or is not, implemented in ACT-R, Sigma, and Soar.
#### 4.1.1 Module Process-state Buffers
In our proposal, each module has a process-state buffer added to working memory. It summarizes information about the module's state that can be a signal to initiate metareasoning, as well as the information for understanding the current state of processing during metareasoning. Types of process-state information include the success or failure of requested actions for a given buffer. Additional information can include certainty/confidence in a result, partial results (the answer starts with an "A"), or indications that an answer is available but not retrieved (feeling of knowing). Perceptual process-state information can include surprise or the inability to recognize parts of the perceptual scene. Process-state information for working memory can include assessments, such as desirability or intrinsic pleasantness. A proposed CMC extension for emotion [15] considers adding a metacognitive assessment module with an associated buffer in working memory, which would be compatible with this proposal.
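The kinds of process-state information listed above can be sketched as a simple data structure. This is a hypothetical Python sketch; the field names (`status`, `confidence`, `partial_result`, `feeling_of_knowing`) and the trigger threshold are illustrative, not part of the proposal itself.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical sketch of a per-module process-state buffer in working
# memory. The fields mirror the kinds of information discussed above:
# success/failure of a request, confidence, partial results, and
# feeling of knowing. All names and values are illustrative.
@dataclass
class ProcessStateBuffer:
    module: str                           # e.g., "semantic-memory"
    status: str = "idle"                  # "success", "failure", "busy", ...
    confidence: Optional[float] = None    # certainty in the last result
    partial_result: Optional[str] = None  # e.g., "answer starts with 'A'"
    feeling_of_knowing: bool = False      # available but not retrieved

def needs_metareasoning(buffer: ProcessStateBuffer) -> bool:
    """A failure, or a low-confidence result, can initiate metareasoning
    (the 0.5 threshold is an arbitrary illustrative value)."""
    return (buffer.status == "failure"
            or (buffer.confidence is not None and buffer.confidence < 0.5))

# A failed semantic-memory retrieval would signal metareasoning:
buf = ProcessStateBuffer(module="semantic-memory", status="failure")
```

Procedural memory could then test such a buffer in its conditions, just as it tests any other working memory content.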
The existing module buffers in ACT-R, Sigma, and Soar include dedicated areas for process-state information. Sigma and Soar include process-state information associated with procedural memory, whereas ACT-R does not. In Sigma and Soar, if procedural memory cannot select an action, a structure called the substate is added to working memory. The substate describes the reason for the impasse, initiates metareasoning, and provides a context for it (see below). Sigma, but not ACT-R or Soar, has something akin to the working memory process-state buffer for representing desirability appraisals.
#### 4.1.2 Hypothetical State Representations
Our proposal requires that an agent can concurrently represent hypothetical past or future states alongside the current state in working memory. The substate mechanisms in Sigma and Soar provide concurrent representations of the current situation for base-level reasoning and of hypothetical states, and are the locus for metareasoning. The structure of the substates can recreate the current state so that the agent's long-term knowledge can be used for both metareasoning and base-level reasoning. However, the substates are distinguished by certain features so that hallucination is avoided. ACT-R has no similar architectural mechanisms. Instead, in agents that employ planning, knowledge-based conventions reserve specific slots in their working memory structures (chunks) to distinguish hypothetical states.
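The two seemingly contradictory properties of hypothetical states can be sketched as follows: the hypothetical state copies the current state so that the same knowledge applies to it, while a distinguishing tag keeps it separate from reality. This is a hypothetical sketch; the `"hypothetical"` tag is our illustrative stand-in for whatever distinguishing features an architecture provides.

```python
import copy

# Hypothetical sketch: a hypothetical state copies the current state so
# that the same rule conditions can apply to it, but carries a
# distinguishing tag so the agent never confuses it with reality.
def make_hypothetical(current_state, assumed_changes):
    hypo = copy.deepcopy(current_state)  # the original is untouched
    hypo.update(assumed_changes)         # imagined differences
    hypo["hypothetical"] = True          # the distinguishing feature
    return hypo

current = {"location": "kitchen", "holding": None}
imagined = make_hypothetical(current, {"holding": "leftovers"})
# Base-level rules can test `imagined` just like `current`, while the
# "hypothetical" tag keeps the two from being confused.
```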
#### 4.1.3 Episodic Memory
Process-state buffers provide information about the instantaneous state of the agent, but do not provide any representation of the history of reasoning that can be used for metareasoning. The ability to reconstruct prior reasoning trajectories is precisely what episodic memory can provide. The original CMC had a single long-term declarative memory. Here, we propose that the functionality associated with episodic memory differs from that of semantic memory, as it provides a direct source of knowledge about an agent's past reasoning. Episodic learning incrementally and automatically acquires episodes and their temporal relations, allowing an agent to reconstruct the extended sequence of reasoning in working memory, possibly through multiple retrievals. Once in working memory, that data, which is about the agent's reasoning, enables retrospective analysis and other forms of metareasoning.
Soar has distinct semantic and episodic memories as shown in Figure 3; however, ACT-R and Sigma do not. Instead, they have a single long-term declarative memory, where some of the functionalities of episodic memory are supported but not all.
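The episodic learning and reconstruction described above can be sketched minimally: each cognitive cycle a snapshot of working memory is recorded in temporal order, and a cued retrieval later brings matching episodes back for retrospective metareasoning. This is a hypothetical sketch; class and method names are illustrative.

```python
# Hypothetical sketch of episodic learning and reconstruction: snapshots
# of working memory are stored automatically in temporal order; a cued
# retrieval later reconstructs a sequence of past episodes in working
# memory for retrospective metareasoning.
class EpisodicMemory:
    def __init__(self):
        self._episodes = []  # episodes kept in temporal order

    def record(self, working_memory):
        """Automatic, incremental episodic learning."""
        self._episodes.append(dict(working_memory))

    def retrieve_sequence(self, cue):
        """Retrieve all episodes matching a cue, in temporal order."""
        return [ep for ep in self._episodes if cue(ep)]

# Toy usage: recording and replaying a short behavior trace.
em = EpisodicMemory()
em.record({"t": 0, "action": "open-door"})
em.record({"t": 1, "action": "store-item"})
em.record({"t": 2, "action": "close-door"})
trace = em.retrieve_sequence(lambda ep: "action" in ep)
```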
### 4.2 Indirect Sources of Processing Information
In addition to the direct sources provided by process-state buffers and episodic memory, an agent can access information about its processing from its interactions with its environment, from metareasoning itself, and from its memories of its reasoning, behavior, and metareasoning.
#### 4.2.1 Perception of its Environment:
An agentās environment includes many sources of information that an agent can use to reason about itself. Below are a few general categories.
- Self Observation: By observing the results of its interactions with its world, an agent can learn about the impact of its cognitive processes on its environment. When unexpected results occur, the perception of those results can lead to metacognition. When self-observation is combined with episodic memory, an agent can later review its behavior, determining which approaches work and which don't, and build up a model for future tasks and associated metareasoning.
- Other Agents: Other agents can provide observations about an agent's cognitive processing, suggest possible reasoning steps or strategies, notify an agent of mistakes (or successes) in its reasoning, or even provide an agent with cognitive strategies (such as through Cognitive Behavior Therapy) to detect its reasoning approaches, evaluate them, and possibly modify them.
- Recorded Information: An agent can read books, watch movies, and even study psychology to acquire a general understanding of reasoning capabilities that it applies to itself.
Many of these sources provide temporal distance between an agentās original reasoning and using the information, allowing the agent to create internal representations of behavior and capabilities in working memory after the behavior is generated. This can be impossible during routine reasoning, when only task information is available and task urgency prevents metacognitive processing.
#### 4.2.2 Metareasoning:
Metareasoning itself composes these other sources of knowledge about an agentās cognitive capabilities to draw conclusions and generate new insights that fuel future metareasoning. This requires the other originating sources of information, but allows an agent to extend and expand its knowledge by combining information from multiple sources, identifying trends and commonalities, and so on. The knowledge created by metareasoning can then be learned and transferred to long-term memory for future use as described below.
#### 4.2.3 Semantic and Procedural Memories:
The automatic learning of procedural and semantic knowledge enables the retention of knowledge about an agentās cognition produced by other sources. They are indirect sources, as they require information to be present in working memory, which, through learning, becomes available for retrieval into working memory and future metacognition.
## 5 Examples of Metacognitive Processing
We return to the three phases of metacognitive processing with three examples.
### 5.1 Wordle Retrieval
The agent attempts to retrieve a word from semantic memory using a partial specification of a five-letter word.
- Initiation: The semantic memory process-state buffer includes information that the retrieved word ("Tripe") is uncommon.
- Reasoning: The agent decides it wants to know more about what it knows about the word, and attempts to retrieve the word from episodic memory, embedding it in the context of it being a Wordle answer. Episodic memory does not retrieve a specific episode of it being a previous answer (possibly because of interference from the hundreds of previous times the agent has played Wordle). Still, the process-state buffer indicates that the word is very familiar. The agent reasons that the combination of being uncommon and familiar indicates it is probably a previous Wordle answer, as it wouldn't come up in any other situation.
- Termination: The agent discards āTripeā from consideration, which terminates the short bout of metareasoning. The agent returns to generating a potential answer.
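The inference in the reasoning phase above combines process-state signals from two memories. A hypothetical sketch, with illustrative field names (`frequency`, `familiarity`, `retrieved_episode`) standing in for process-state buffer contents:

```python
# Hypothetical sketch of the Wordle metareasoning step: an uncommon yet
# highly familiar word, with no specific episode retrieved, is judged to
# be a probable previous Wordle answer. Field names are illustrative.
def probably_previous_answer(semantic_ps, episodic_ps):
    return (semantic_ps.get("frequency") == "uncommon"
            and episodic_ps.get("familiarity") == "high"
            and episodic_ps.get("retrieved_episode") is None)

semantic_ps = {"result": "Tripe", "frequency": "uncommon"}
episodic_ps = {"familiarity": "high", "retrieved_episode": None}
# The combination of signals supports discarding "Tripe".
```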
### 5.2 Making a Move in Chess
The agent is playing a chess game and is far enough into the game that its memorized opening moves are exhausted. Here we describe the mechanisms in Sigma and Soar that support metareasoning in such a situation.
- Initiation: Procedural memory does not return a single definite action to take. In Sigma and Soar, this leads to the creation of a substate.
- Reasoning: In the substate, the agent decides to explicitly try out the different available moves on internal copies of the current state, generating states that it then compares.
- Termination: It ultimately decides on a specific move, which terminates the metareasoning.
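The substate reasoning in this example amounts to one-step lookahead over hypothetical states. A hypothetical sketch, with a numeric toy state standing in for a chess position and illustrative function names:

```python
import copy

# Hypothetical sketch of the chess substate reasoning: each candidate
# move is tried on an internal copy of the current state, the resulting
# hypothetical states are evaluated, and the best move resolves the
# impasse. Names and the toy state are illustrative.
def deliberate_move(state, moves, apply_move, evaluate):
    best_move, best_value = None, float("-inf")
    for move in moves:
        hypothetical = apply_move(copy.deepcopy(state), move)
        value = evaluate(hypothetical)
        if value > best_value:
            best_move, best_value = move, value
    return best_move

# Toy usage with a numeric "state" standing in for a chess position:
state = {"score": 0}
moves = [-1, 2, 1]
best = deliberate_move(state, moves,
                       apply_move=lambda s, m: {**s, "score": s["score"] + m},
                       evaluate=lambda s: s["score"])
# best is the move leading to the highest-valued hypothetical state.
```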
### 5.3 Repeated Robot Action
A one-armed robot is instructed to store all the leftovers on the counter in the refrigerator. Each time the robot stores an item in the refrigerator, it opens the refrigerator door, fetches the item, places it in the refrigerator, and then closes the refrigerator door.
- Initiation: After completing the task, the robotās instructor tells it that it needs to improve its performance. This external information triggers procedural knowledge that the robot should do a retrospective analysis of its original performance.
- Reasoning: Using existing procedural knowledge, the robot recalls a trace of its behavior from episodic memory into working memory. It then uses existing procedural knowledge to analyze the trace. It detects that it is repeatedly in exactly the same world state because it closes the refrigerator door as part of one store command, but then immediately opens it for the next. It "imagines" being at the end of a store command when there are other items to store, and inhibits the action to close the refrigerator door. Through its procedural learning mechanism, it learns to inhibit that action in similar situations in the future.
- Termination: Once the inhibition is learned, additional procedural knowledge removes the instructor's comment from working memory, and the robot continues with its normal activities.
## 6 Conclusion
We propose that three specific extensions be added to the CMC to support a unified approach to metacognition: module process-state buffers that provide information in working memory on the current state of each module; episodic memory that provides a means for the agent to recreate its behavior so that it becomes available to reason over; and some means of creating hypothetical situations in working memory that support using base-level reasoning during metareasoning while avoiding interference and confusion between them. Our proposal is a framework; it does not specify in detail the diverse and extensive indirect, learned, or pre-encoded long-term knowledge needed for an agent to engage in the forms of metacognition found in humans. An essential point of our proposal is to provide the architectural structure that is necessary above and beyond what is encoded as knowledge in the long-term memories.
Although we introduce some new architectural structures, there are no structural boundaries between base-level (non-meta) reasoning and metareasoning. Within existing architectures consistent with this proposal (ACT-R, Sigma, and Soar), an agent can rapidly switch between base-level reasoning and metareasoning, as the only difference is whether working memory structures are about the agent's cognitive and reasoning capabilities. In addition, an agent's reasoning transitions from base-level task reasoning to metareasoning as it responds to internal impasses, failures, and successes in base-level reasoning, and then back to base-level reasoning as learning leads to routine, impasse-free reasoning.
## References
- [1] Anderson, J.R., Bothell, D., Byrne, M.D., Douglass, S.A., Lebiere, C., Qin, Y.: An integrated theory of the mind. Psychological Review 111 (4), 1036–1060 (2004)
- [2] Cox, M., Alavi, Z., Dannenhauer, D., Eyorokon, V., Munoz-Avila, H., Perlis, D.: MIDCA: A Metacognitive, Integrated Dual-Cycle Architecture for Self-Regulated Autonomy. Proceedings of the AAAI Conference on Artificial Intelligence 30 (1) (Mar 2016)
- [3] Cox, M., Raja, A.: Metareasoning: An Introduction, pp. 3–14. MIT Press (2011)
- [4] Flavell, J.H.: Metacognition and cognitive monitoring: A new area of cognitive-developmental inquiry. American Psychologist 34 (10), 906–911 (1979)
- [5] Johnson, S.G.B., Karimi, A.H., Bengio, Y., Chater, N., Gerstenberg, T., Larson, K., Levine, S., Mitchell, M., Rahwan, I., Schölkopf, B., Grossmann, I.: Imagining and building wise machines: The centrality of AI metacognition (2025), https://arxiv.org/abs/2411.02478
- [6] Kotseruba, I., Tsotsos, J.K.: 40 years of cognitive architectures: Core cognitive abilities and practical applications. Artificial Intelligence Review 53 (1), 17–94 (Jan 2020)
- [7] Kralik, J.D., Lee, J.H., Rosenbloom, P.S., Jackson, P.C., Epstein, S.L., Romero, O.J., Sanz, R., Larue, O., Schmidtke, H.R., Lee, S.W., McGreggor, K.: Metacognition for a Common Model of Cognition. Procedia Computer Science 145, 730–739 (Jan 2018)
- [8] Laird, J.E.: The Soar Cognitive Architecture. MIT Press, Cambridge, MA (2012)
- [9] Laird, J.E., Lebiere, C., Rosenbloom, P.S.: A Standard Model of the Mind: Toward a Common Computational Framework across Artificial Intelligence, Cognitive Science, Neuroscience, and Robotics. AI Magazine 38 (4), 13–26 (2017)
- [10] Laird, J.E., Mohan, S.: Learning Fast and Slow. In: Proc. of the 32nd AAAI Conference on Artificial Intelligence. p. 5. AAAI Press, New Orleans (2018)
- [11] Nelson, T.O., Narens, L.: Metamemory: A theoretical framework and new findings. In: Bower, G. (ed.) The Psychology of Learning and Motivation: Advances in Research and Theory, pp. 125–173. Academic Press, New York (1990)
- [12] Nolte, R., Pomarlan, M., Janssen, A., Beßler, D., Javanmardi, K., Jongebloed, S., Porzel, R., Bateman, J., Beetz, M., Malaka, R.: How metacognitive architectures remember their own thoughts: A systematic review (2025), https://arxiv.org/abs/2503.13467
- [13] Rahnev, D., Balsdon, T., Charles, L., de Gardelle, V., Denison, R., Desender, K., Faivre, N., Filevich, E., Fleming, S.M., Jehee, J., Lau, H., Lee, A.L.F., Locke, S.M., Mamassian, P., Odegaard, B., Peters, M., Reyes, G., Rouault, M., Sackur, J., Samaha, J., Sergent, C., Sherman, M.T., Siedlecka, M., Soto, D., Vlassova, A., Zylberberg, A.: Consensus goals for the field of visual metacognition. Perspectives on Psychological Science 17 (6), 1746–1765 (2022)
- [14] Rosenbloom, P.S., Demski, A., Ustun, V.: The Sigma Cognitive Architecture and System: Towards Functionally Elegant Grand Unification. Journal of Artificial General Intelligence 7 (1), 1–103 (Dec 2016)
- [15] Rosenbloom, P.S., Laird, J.E., Lebiere, C., Stocco, A., Granger, R., Huyck, C.: A proposal for extending the Common Model of Cognition to emotion. In: 22nd International Conference on Cognitive Modeling, ICCM 2024. Tilburg University, the Netherlands (Apr 2024)
- [16] Sun, R.: The importance of cognitive architectures: an analysis based on CLARION. Journal of Experimental & Theoretical Artificial Intelligence 19 (2), 159–193 (2007)
- [17] Walker, P., Haase, J., Mehalick, M., Steele, C., Russell, D., Davidson, I.: Harnessing metacognition for safe and responsible AI. Technologies 13 (2025)