# A Scalable Communication Protocol for Networks of Large Language Models
**Authors**: Samuele Marro, Emanuele La Malfa, Jesse Wright, Guohao Li, Nigel Shadbolt, Michael Wooldridge, Philip Torr (University of Oxford, UK)
> Corresponding author. Email:
## Abstract
Communication is a prerequisite for collaboration. When scaling networks of AI-powered agents, communication must be versatile, efficient, and portable. These requisites, which we refer to as the Agent Communication Trilemma, are hard to achieve in large networks of agents. We introduce Agora, a meta protocol that leverages existing communication standards to make LLM-powered agents solve complex problems efficiently. In Agora, agents typically use standardised routines for frequent communications, natural language for rare communications, and LLM-written routines for everything in between. Agora sidesteps the Agent Communication Trilemma and robustly handles changes in interfaces and members, allowing unprecedented scalability with full decentralisation and minimal involvement of human beings. On large Agora networks, we observe the emergence of self-organising, fully automated protocols that achieve complex goals without human intervention.
## 1 Introduction
Human language evolved primarily for communication purposes (Fedorenko et al., 2024). Despite its inherent ambiguity, natural language provides great versatility and allows humans and machines to collaborate and achieve complex goals that they otherwise could not (Russell & Norvig, 2016).
Decades of literature in computer science explored how to foster collaboration between agents modelled as programs (Wooldridge & Jennings, 1995; Gilbert, 2019). Several research papers design networks of agents to solve complex problems by leveraging each model’s specialisation, the so-called rule-based agents paradigm (Wooldridge, 2009). Despite its influence, such a paradigm faces two major limitations: agents hardly adapt to environmental changes and require structured data that limits their versatility (Gilbert & Terna, 2000).
With the advent of Large Language Models (LLM) (Vaswani et al., 2017; Brown et al., 2020), there has been a resurgent interest in networks of collaborative agents. LLMs can solve a variety of problems (Achiam et al., 2023; Dubey et al., 2024a) expressed in natural language as they excel at following instructions (Schulman et al., 2017; Rafailov et al., 2024). LLMs also showed remarkable improvements at handling structured data such as graphs and formatted languages (Kassner et al., 2020; Collins et al., 2022; Jin et al., 2023; Lin et al., 2024).
In terms of performance (e.g., accuracy on classification), the literature suggests that specialised LLMs outperform general-purpose models (Hu et al., 2021; Zhang et al., 2024) while mitigating the difficulties of handling gargantuan models and the drawbacks of data and model centralisation (Song et al., 2023).
Thus, we hypothesise that:
Hypothesis
A network of heterogeneous LLMs can automate various complex tasks with nearly no human supervision via specialised and efficient protocols.
However, networks of LLM-powered agents face three key challenges that make communication at scale significantly more difficult:
- LLMs are heterogeneous: different LLMs have different architectures, makers, capabilities, and usage policies. Heterogeneity is not unique to LLM-powered agents, yet, compared to classic MAS agents, LLMs come with deeper representations of the surrounding environment and are thus harder to standardise.
- LLMs are (mostly) general-purpose tools: enumerating and standardising each task they can perform is infeasible.
- LLMs are expensive: the computational footprint and inference time of even “small” LLMs dwarf those of comparable, specialised APIs.
Scalable communication between heterogeneous LLMs must be versatile, i.e., capable of handling a variety of use cases, efficient, i.e., requiring the least computational effort, and portable, i.e., supporting the protocol should require the least human effort possible. The above-mentioned issues constitute the Agent Communication Trilemma, which we expand in Section 3.
In light of this, the aim of this paper is the following:
Key Contribution
We design and implement a communication protocol between heterogeneous LLM-powered agents and assess its feasibility and scalability for solving high-order tasks.
We sidestep the Trilemma with Agora, a meta protocol that relies on the dual use of structured data for frequent communications and natural language for infrequent ones. With Agora, we instantiate large networks of LLM-powered agents that solve complex tasks autonomously by leveraging efficient communication schemas. In such networks, we observe agents developing an emergent, fully automated protocol to solve a complex task starting from an instruction expressed in natural language. We believe that this observation can serve as a basis to renew interest in emergent protocols/languages in large networks of LLMs (Lazaridou et al., 2018; Chaabouni et al., 2019; Lazaridou & Baroni, 2020; Chaabouni et al., 2022).
The paper is structured as follows. We first outline the key challenges that constitute the Agent Communication Trilemma (Section 3); we then detail how Agora addresses the Trilemma and serves as a communication protocol for networks of LLMs (Section 4). Finally, in Section 5, we provide two fully functional demos (our code is available at github.com/agora-protocol/paper-demo): the former, with two agents, clarifies Agora’s operating principles; the latter, with 100 agents, proves Agora’s scalability and shows the emergence of self-organising behaviours.
## 2 Related Work
#### Multi-agent LLMs and communication.
At the time of writing, Multi-Agent-Systems of Large Language Models (MAS-LLM) have become an active area of research (Guo et al., 2024) after the upsurge of LLMs as general purpose problem solvers (Brown et al., 2020; Achiam et al., 2023; Dubey et al., 2024b). Many fields have adapted techniques from the MAS-LLM paradigm to solve problems single models fail at, including reasoning and math (Li et al., 2024), Theory of Mind (Cross et al., 2024; Li et al., 2023b), planning (Singh et al., 2024), alignment to human values (Pang et al., 2024), and simulation of games, economics, and political scenarios (Bakhtin et al., 2022; Hua et al., 2023; Wu et al., 2024a). The common intuition of these works is that by breaking a task into sub-components (Hong et al., 2023) and allocating a large number of specialised models (Li et al., 2024) to each of them (Li et al., 2023a), one can achieve higher performance and observe emergent behaviours that otherwise would not occur.
On the other hand, a key requisite for solving complex tasks in large networks of MAS-LLMs is effective and efficient communication. In large networks, LLMs must agree on the actions to take (Chen et al., 2023): works such as Agashe et al. (2023) and Liang et al. (2023) studied how LLMs debate to foster collaboration on high-order tasks (Du et al., 2023). Another recent line of research explores the topology of the MAS-LLM network as a facilitator to reach consensus (Chen et al., 2024).
#### LLMs for simulations and emergence of protocols.
A few seminal works studied how emergent communication and protocols arise between neural networks that manipulate symbols (Havrylov & Titov, 2017; Lazaridou et al., 2018; Lazaridou & Baroni, 2020). Written before the rise of LLMs, these works inspired researchers to explore how spontaneous collaboration emerges in MAS-LLMs (Wu et al., 2024b), with applications to the simulation of societies (Gao et al., 2024). Of particular interest for this paper are the works by Chaabouni et al. (2019) and Chaabouni et al. (2022). Chaabouni et al. (2019) describes how emergent communication systems between neural networks privilege longer messages. Chaabouni et al. (2022) posits the existence of “scaling laws” (Kaplan et al., 2020) for large networks of MAS-LLMs, in which the dataset, task complexity, and population size are key to observing emergent behaviours.
## 3 The Agent Communication Trilemma
<details>
<summary>img/triangle-trilemma.png Details</summary>

*Figure description: a triangle whose vertices are Efficiency (top), Portability (bottom left), and Versatility (bottom right). Each side is labelled with the paradigm that achieves its two endpoint properties at the expense of the third: "Traditional static API (e.g., OBP)" on the Portability–Efficiency side, "Meta-API (e.g., RDF)" on the Versatility–Efficiency side, and "Natural language" on the Portability–Versatility side. "Agora" sits at the centre of the triangle, balancing all three properties.*
</details>
Figure 1: The Trilemma and how our solution (Agora) balances efficiency, portability and versatility.
An agent is a computer system that, in an environment, is capable of autonomous actions (the so-called ‘agency’ (Horty, 2001)) to meet its design objective (Wooldridge & Jennings, 1995; Wooldridge, 2009, p. 15). Just as humans must negotiate and cooperate to achieve shared goals, so too must agents within multi-agent systems (Wooldridge, 2009, p. 24-25). However, when designing communication protocols for heterogeneous networks (i.e., networks where agents have different architectures, capabilities and design constraints), we run into difficulties when attempting to optimise for three properties at the same time:
- Versatility: communication between agents should support a wide variety of messages, both in terms of content and format;
- Efficiency: the computational cost of running an agent and networking cost of communication should be minimal;
- Portability: supporting the communication protocol should require the least implementation effort by the largest number of agents involved.
We name the trade-off between such properties the Agent Communication Trilemma, which is illustrated in Figure 1. In the next sections, we will discuss how an LLM-powered communication protocol can trade off versatility, efficiency, and portability.
### 3.1 Versatile vs. Portable Communication
In networks of agents, versatility and portability are in tension for two fundamental reasons (Olivé, 2007). A prerequisite for two communicating agents is (1) a shared conceptual understanding of the topic of their communication. For instance, two agents can communicate about the weather only if they both ‘know’ what it means to be sunny, rainy, or overcast; likewise, they should share a similar notion of how temperature is described and measured (e.g., in degrees Celsius). In addition, (2) agents must encode and decode messages in a way that is intelligible to both. Continuing the weather example, if two agents exchange data as JSON objects, both the sender and the receiver must know the syntax (e.g., the keys of the JSON object, such as temperature) and the semantics (e.g., temperature is a 32-bit floating-point value representing the temperature, in central London, as measured in degrees Celsius) of the exchanged messages.
In complex scenarios, defining routines whose syntax and semantics satisfy requisites (1) and (2) may be difficult. For example, a programmer has to manually implement a method to encode (or decode) messages to (or from) other agents. Additionally, the programmer must explicitly instruct the agent on how to manipulate and reason about the message content, often by interpreting API documentation that describes the semantics of the message. Therefore, there is a trade-off between the breadth of supported messages (versatility) and the implementation cost (portability).
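Requisites (1) and (2) can be made concrete with the weather example: below is a minimal sketch of a hand-written encode/decode pair that fixes both the syntax (the JSON keys) and the semantics (temperature in degrees Celsius). The field names and units are illustrative assumptions, not part of any real protocol.

```python
import json

# Hypothetical shared schema: both agents agree that "temperature" is a
# floating-point value in degrees Celsius for the given location.
WEATHER_KEYS = {"location", "temperature", "condition"}

def encode_weather(location: str, temperature: float, condition: str) -> str:
    """Serialise a weather report using the agreed-upon JSON syntax."""
    return json.dumps({
        "location": location,
        "temperature": temperature,  # degrees Celsius, by shared convention
        "condition": condition,      # e.g. "sunny", "rainy", "overcast"
    })

def decode_weather(message: str) -> dict:
    """Parse a message, rejecting any that violates the shared syntax."""
    report = json.loads(message)
    if set(report) != WEATHER_KEYS:
        raise ValueError(f"unexpected keys: {sorted(report)}")
    return report
```

Every such routine must be written (and kept in sync) by a human for each message type, which is precisely the implementation cost that limits portability.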
An example of a high-portability, low-versatility protocol is the Open Banking Platform (OBP), which uses a well-defined Open API schema for data transfer (OBL, 2024). OBP is highly portable because it uses a fixed range of well-known concepts that developers can implement; however, it is restricted to a narrow domain of banking data and is thus not versatile. On the other end of the spectrum, rules-based Semantic Web agents (Berners-Lee et al., 2001) that exchange RDF-encoded documents (Beckett et al., 2014) are highly versatile, since ontologies (Wooldridge, 2009, p. 180) enable the description of structured relations between essentially any concepts. Still, they require developers to program agents to implement the specific ontologies used by the network (e.g., if a set of RDF triples states that the temperature is 38°C, an agent must be able to interpret the concepts of “temperature” and “Celsius”).
### 3.2 Efficient vs. Versatile and Portable Communication
As previously mentioned, rule-based agents excel at the tasks they are designed to solve but hardly adapt to new environments. Decades of research in reinforcement learning (Sutton, 2018) and then deep reinforcement learning (Arulkumaran et al., 2017; Henderson et al., 2018) introduced a paradigm where agents learn to optimise a reward that serves as a proxy for the task we want them to solve. Agentic LLMs, i.e., multi-agent systems powered by language models, are a recent paradigm for machine-to-machine communication that relies mostly on the models’ proficiency at handling natural language and following instructions (Li et al., 2023a).
Natural language is highly expressive, making it a suitable choice for versatile communication (Russell & Norvig, 2016). Additionally, LLMs trained on massive corpora seem to develop an implicit understanding of various concepts that abstracts and makes communication independent from their internal architecture. Moreover, LLMs can integrate external tools, write code and invoke APIs with relatively little or no training (Schick et al., 2024), since the only requirement is a natural-language description of the tool and its parameters.
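For illustration, such a natural-language tool description can be as simple as a dictionary; the format below and the `get_temperature` stub are hypothetical and do not reproduce any specific provider's tool-calling API.

```python
# Hypothetical natural-language tool description: the only interface
# contract an LLM needs in order to decide when and how to call the tool.
get_temperature_tool = {
    "name": "get_temperature",
    "description": "Return the current temperature, in degrees Celsius, "
                   "for a given city.",
    "parameters": {
        "city": "Name of the city, e.g. 'London'.",
    },
}

def get_temperature(city: str) -> float:
    """Stub standing in for a real weather API."""
    readings = {"London": 18.5, "Oxford": 17.0}
    return readings.get(city, 20.0)
```

Because the contract is expressed in natural language rather than a formal interface definition, any instruction-following LLM can use the tool without model-specific integration work.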
Conversely, natural language as a communication medium has two major drawbacks. First, while engineering and hardware improvements (Dubey et al., 2024b) mitigate costs over time, the computational requirements of invoking an LLM dwarf those of comparable APIs, representing a major bottleneck for scaling networks of LLMs; likewise, using closed-source, pay-per-usage LLMs hosted by third parties is expensive and raises concerns about the replicability of results (La Malfa et al., 2023). Second, natural language is inherently ambiguous: while LLMs have a certain degree of “common sense” to fulfil requests, non-determinism and the specifics of natural language leave room for errors that routines minimise (for instance, if someone asks for the temperature in Fahrenheit and the agent has a tool that returns the temperature in Celsius, the model must know that Celsius and Fahrenheit are both units of measure for temperature and convert accordingly). These factors make LLMs and natural language more error-prone than alternatives such as handwritten APIs.
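The Fahrenheit example is exactly the kind of ambiguity a handwritten routine eliminates: the unit semantics are encoded once, deterministically, instead of being re-derived by the model on every call. A minimal sketch:

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Deterministic unit conversion: F = C * 9/5 + 32.

    Unlike an LLM resolving the units on the fly, this routine cannot
    misinterpret the request or produce a different answer across calls.
    """
    return celsius * 9.0 / 5.0 + 32.0
```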
In conclusion, RESTful APIs (efficient but not versatile), RDF tuples (versatile but not portable) and natural language (versatile but not efficient) are all trade-offs in the Trilemma. While some approaches are more useful in practice than others, the fact that no communication format achieves all three properties simultaneously suggests the need for a hybrid communication protocol that leverages all of them. The next section outlines our solution.
## 4 Agora: a Communication Protocol Layer for LLMs
<details>
<summary>img/evil.png Details</summary>

*Figure description: a three-layer isometric diagram. The top layer shows a mesh network of LLM-powered nodes that send and receive messages to one another. The middle layer shows the heterogeneous technologies the nodes build on: databases (MongoDB, SQL), programming and web languages (Python, PHP, HTML, CSS, JavaScript, XML), and AI platforms (OpenAI, Meta). The bottom layer is the underlying secure communication infrastructure, labelled HTTPS.*
</details>
(a) An illustration of Agora and how it abstracts the underlying implementation, communication, and physical layers.
<details>
<summary>img/evil-stack.png Details</summary>

*Figure description: a five-box vertical stack. From bottom to top: Physical Layer; Communication layer (padlock icon); Implementation Layer (OpenAI, Python, and SQL icons); Agora; and a dashed box labelled "Further layers", indicating that the stack is extensible.*
</details>
(b) Stack of technologies to build Agora.
Figure 2: How Agora fits into a standard communication protocol stack.
The key to solving the Communication Trilemma involves accepting that no single protocol can achieve optimal efficiency, portability, and versatility at the same time. In this section, we introduce Agora, a meta protocol that takes advantage of the unique capabilities of LLMs to sidestep the Trilemma by adopting different communication methods in different scenarios.
The most powerful LLMs share three key properties:
- They can understand, manipulate, and reply to other agents using natural language;
- They excel at following instructions, including writing code to implement routines (Schick et al., 2024; Hou et al., 2023; Liu et al., 2024);
- They can autonomously negotiate protocols and reach consensus on strategies and behaviours to adopt in complex scenarios (Chen et al., 2023; Fu et al., 2023).
At its core, Agora uses different communication formats depending on the circumstances; an agent can support a wide breadth of communications (high versatility) while handling the majority of the total volume of requests with efficient routines (high efficiency). Moreover, the entire negotiation and implementation workflow is handled by the LLMs and requires no human supervision (high portability). The concept of protocol documents (PD), which we sketch in Figure 3 and discuss in the next section, lies at the core of Agora’s functionalities.
In the next sections, we illustrate the hierarchy of communication methods Agora supports natively and the concept of PD; we then provide an example of how Agora works and how it enables versatile, efficient, and portable communication. We conclude by emphasising how one can integrate and build upon Agora with further technological layers independently from its underlying technologies.
### 4.1 Communication in (an) Agora
Agora introduces a machine-readable way to transfer and refer to protocols, namely the protocol documents (PDs). A PD is a plain-text description of a communication protocol. Throughout this paper, we use the word “protocol” to refer to any standardised description of structured communication. PDs are self-contained, implementation-agnostic, and contain everything an agent needs to support a protocol: this means that most descriptions of existing protocols, such as RFCs, are also suitable PDs. However, instead of relying on a central body to assign identifiers, a PD is uniquely identified by its hash (for multiplexing).
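As a concrete illustration, a PD's identifier can be derived directly from its text. The helper below is a sketch: the paper specifies hash-based identification but not the hash function, so SHA-256 is an assumption.

```python
import hashlib

def pd_hash(protocol_document: str) -> str:
    """Content-addressed identifier for a protocol document (PD).

    SHA-256 is an assumption here; the paper only specifies that a PD
    is uniquely identified by its hash.
    """
    return hashlib.sha256(protocol_document.encode("utf-8")).hexdigest()

# Any two agents holding the same PD text derive the same identifier,
# so no central authority is needed to assign protocol IDs.
pd = 'Toy PD: sender transmits JSON {"location": str, "date": str}.'
identifier = pd_hash(pd)
```

Because the identifier is a pure function of the PD's content, agents can verify a downloaded PD against its advertised hash without trusting the source.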
In Agora, the most frequent communications have dedicated efficient routines, and the least frequent ones use inefficient but flexible LLMs and natural language. In particular:
- When possible, frequent communications are handled through traditional protocols, for which there are standard, human-written implementations (e.g., OBP);
- For communications that happen less frequently (or for which there are no standard protocols), agents can use structured data as an exchange medium (which can be handled by LLM-written routines);
- For communications that might be frequent for one side but not the other, the agents still use structured data, but one side can choose to use an LLM, while the other uses a routine;
- For rare communications or when a routine fails unexpectedly, the agents can resort to natural language.
It is entirely up to the agent whether to handle a query using a human-written routine, an LLM-written routine, or an LLM directly (or a combination of the three). This gives the agent maximum flexibility over how to process queries. Forcing or nudging a model towards a specific communication style can improve efficiency, though a full discussion is beyond the scope of this paper. One can, for example, instruct an LLM in its system prompt to negotiate a protocol whenever possible. In the demo (Section 5.3), we illustrate the trade-off between the versatility of a communication protocol and its expected usage.
Hierarchical communication supports any form of exchange (maximum versatility), although in practice the LLM is invoked only in rare cases (maximum efficiency). Moreover, since LLMs can implement routines on their own (PDs fully describe the syntax and semantics of a protocol), human programmers only need to provide an overview of the tools the agent has access to, which means that the implementation effort required on the human side is minimal (maximum portability). In other words, Agora sidesteps the Communication Trilemma by employing routines for frequent requests and resorting to natural language when agents need to negotiate efficient ways to solve a problem or when errors occur.
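The tiered handling described above can be sketched as a dispatch rule. Everything here (the JSON key names, the `routines` table, the `llm_reply` callable) is illustrative rather than part of Agora's specification.

```python
def handle_query(message: dict, routines: dict, llm_reply):
    """Tiered dispatch sketch: prefer a routine keyed by the PD hash;
    fall back to the LLM when no routine exists or a routine fails.
    Key names ("protocolHash", "body") are illustrative assumptions."""
    routine = routines.get(message.get("protocolHash"))
    if routine is not None:
        try:
            return routine(message["body"])  # cheap path: no LLM invocation
        except Exception:
            pass                             # unexpected routine failure
    return llm_reply(message["body"])        # flexible path: LLM handles it

# Usage sketch with stand-in callables:
routines = {"123": lambda body: "routine:" + body}
llm = lambda body: "llm:" + body
fast = handle_query({"protocolHash": "123", "body": "hi"}, routines, llm)
slow = handle_query({"protocolHash": "999", "body": "hi"}, routines, llm)
```

Frequent traffic thus flows through the cheap routine branch, while the LLM remains available as a universal fallback.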
### 4.2 An Example of Communication over Agora
<details>
<summary>img/pd-negotiation.png Details</summary>

Two LLM-powered nodes first negotiate a protocol document in natural language, agreeing on PD hash '123' (left); subsequent messages are then formatted according to that PD and tagged with its hash (right).
</details>
Figure 3: How a protocol document is negotiated between LLM-powered agents (left) and used for future efficient communications.
We now describe how two agents, Alice and Bob, can efficiently communicate over Agora using a PD routine, as illustrated in Figure 3. Alice sends a query together with the hash of its corresponding PD. Bob uses the hash to determine whether he has a corresponding routine. If so, he calls it and handles the communication without invoking the LLM; otherwise, Bob handles the query with the LLM directly.
If, over time, Bob repeatedly uses an LLM to reply to queries that follow a given protocol, to the point where invoking the LLM every time becomes expensive, he can use the LLM to write a routine that handles future communications.
If the routine fails or the communication is a one-off instance that does not require a protocol, Alice and Bob use natural language, which is again handled by the LLM. Natural language is also available to bootstrap communication between nodes that have never interacted before, as well as to negotiate new protocols. That said, the lower cost of routines and the lack of ambiguity are strong incentives for agents to prefer structured data.
Note that PDs can be shared with other nodes in the network, which means that two agents that have never interacted before can use protocols developed by other agents.
In Appendix A, we provide details of five use cases of Agora that further show its versatility as a personal assistant and data analysis tool, as well as how it leverages compositionality and scalability to reduce costs.
### 4.3 Agora as a Layer Zero Protocol
Figure 2 illustrates that Agora is implementation and technology agnostic. The implementation of the agents themselves (e.g., LLMs), the database used to store data (e.g., VectorDB, SQL, MongoDB, etc.), the language in which implementations are written (Python, Java, etc.) and the nature of tools are all abstracted.
At the same time, PDs can refer to other protocol documents, and since routines can call other routines, agents can build upon previous negotiations to solve more complex tasks.
Finally, the versatility and portability of Agora make it straightforward to handle the addition or removal of a node, a change in the capabilities of a node, or a change in the goals of the network, as illustrated in the demo, Section 5.3.
All these factors contribute to making Agora a natural Layer Zero protocol, i.e. a foundation layer, for higher-order communication and collaboration between LLMs. We hope our protocol can fuel theoretical and applied research on complex protocols, negotiation schemes, and consensus algorithms in large networks of LLMs.
## 5 Agora in Practice
We implement and showcase two scenarios in which Agora can be applied: the first involves two agents whose objective is to exchange data; the second involves $100$ agents and tests Agora's scalability and the capacity of LLM-powered agents to coordinate autonomously in complex scenarios. For space reasons, the scenarios are expanded in Appendices C and D; here, we focus on their functionality and on key observations regarding efficiency/versatility/portability, cost reduction, scalability, and the emergent behaviours of fully automated networks of LLMs.
### 5.1 Implementation Details
The design of Agora for our working demos follows three key principles:
- Minimality. Agora enforces the basic standards that allow for efficient negotiation and use of protocols, leaving everything else to PDs or other higher-order standards;
- Decentralisation. Agora does not rely on central authorities, with any collection of nodes being able to use Agora independently;
- Full backward compatibility. Agora supports existing communication protocols and schemas such as OpenAPI and JSON-Schema.
From a practical point of view, Agora uses HTTPS as the base communication layer and JSON as the format for exchanging metadata. When sending a message in a given protocol, an agent sends a JSON document with three keys: the protocol hash, the body of the request formatted according to the protocol, and a non-empty list of sources from which the protocol can be downloaded. The receiver downloads the PD from its preferred source and, upon checking that the hash matches, stores it for future use. This hash-based identification system ensures that any node can reference any PD without relying on a central authority to assign identifiers. Where PDs are stored is entirely up to the agents; aside from regular cloud storage, hash-based indexing makes decentralised storage options (such as IPFS (Benet, 2014)) viable. Additionally, since essentially all protocols can be stored as PDs, Agora has full backwards compatibility with existing protocols (although human programmers are encouraged to provide existing, standardised implementations instead of having the LLM re-implement them from scratch).
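A minimal sketch of this envelope and the receiver-side hash check. The paper fixes the three fields but not their JSON key names, so the names below (and SHA-256) are assumptions.

```python
import hashlib
import json

def make_message(pd_text: str, body: dict, sources: list) -> str:
    """Sender-side envelope sketch: protocol hash, protocol-formatted
    body, and a non-empty list of PD sources. Key names are assumptions."""
    assert sources, "the source list must be non-empty"
    return json.dumps({
        "protocolHash": hashlib.sha256(pd_text.encode("utf-8")).hexdigest(),
        "body": body,
        "protocolSources": sources,
    })

def verify_pd(downloaded_pd: str, expected_hash: str) -> bool:
    # Receiver-side check: the PD fetched from the preferred source must
    # match the hash in the envelope before being stored for future use.
    return hashlib.sha256(downloaded_pd.encode("utf-8")).hexdigest() == expected_hash

# Usage sketch with a hypothetical source URL:
pd = "Weather PD: request {location, date}; reply {temperature, ...}."
msg = json.loads(make_message(pd, {"location": "London, UK"}, ["https://example.org/pd"]))
```

The hash check is what lets the source list be untrusted: any mirror can serve the PD, since tampering is detectable.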
To simplify negotiation, an agent can expose an endpoint with a list of supported protocols: a potential sender can thus compare the list with its own to automatically determine if there is a common protocol. The sender can also use a potentially unsupported protocol, although the receiver can choose to reject it by returning a predefined error message.
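The protocol-matching shortcut amounts to a set intersection; the helper name and the deterministic tie-break below are assumptions for the sketch.

```python
def pick_protocol(sender_supported, receiver_advertised):
    """Negotiation shortcut sketch: intersect the receiver's advertised
    protocol hashes with the sender's own. Returns a common hash, or None,
    meaning the sender must negotiate or fall back to natural language.
    The min() tie-break is an assumption, chosen only for determinism."""
    common = set(sender_supported) & set(receiver_advertised)
    return min(common) if common else None

# Usage sketch with toy hashes:
choice = pick_protocol({"123", "234"}, ["234", "600"])   # common protocol found
fallback = pick_protocol({"123"}, ["600"])               # none: negotiate instead
```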
Refer to the appendix for a more formal specification of Agora.
### 5.2 Demo: Retrieving Weather Data
Consider two agents, Alice and Bob. Alice is a Llama-3-405B (Dubey et al., 2024b) powered agent managing the bookings of a guided tour service in London. While Llama-3 models can be hosted locally, for the sake of a proper comparison with GPT-4o and Gemini, we use a cloud provider, namely SambaNova (https://sambanova.ai). Bob is a GPT-4o (Achiam et al., 2023) powered agent for a weather service that provides forecasts for a given date and location. As part of the user interaction loop, Alice notifies the user if heavy rain is expected on a booked date.
To check the weather, she initially uses her LLM to send a natural language query to Bob (phase A1):
Alice - Natural Language
What is the weather forecast for London, UK on 2024-09-27?
Bob uses his Toolformer LLM (Schick et al., 2024) to query his database (phase B1) and returns a natural language reply (phase B2):
Bob - Natural Language
The weather forecast for London, UK, on 2024-09-27 is as follows: “Rainy, 11 degrees Celsius, with a precipitation of 12 mm.”
Over time, the cost of invoking an LLM for phases A1 and B2 dominates all the other costs; Alice and Bob thus decide to develop a protocol. Alice checks whether Bob already supports a suitable protocol but finds none; she therefore negotiates one with Bob. After a few rounds of negotiation, Alice and Bob agree on the following protocol: Alice sends a JSON document with two fields, location and date, and Bob replies with a JSON document containing three fields, namely temperature (in degrees Celsius), precipitation (in millimetres), and weatherCondition (one of “sunny”, “cloudy”, “rainy” and “snowy”). From then on, Alice specifies the protocol hash when performing a query. An example of an exchanged message (excluding Agora’s metadata) is:
Alice - PD
{"location": "London, UK", "date": "2024-09-27"}
Both Alice and Bob independently decide to write a routine to handle their side of the communication. From now on, Alice and Bob no longer need to invoke the LLM to exchange weather data: a routine automates phases A1, B1 and B2, eliminating the cost of invoking the respective LLMs.
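A sketch of the routine Bob's LLM might produce for the negotiated protocol. The `lookup` stub stands in for Bob's weather database and is an assumption; the field names follow the protocol agreed above.

```python
import json

def weather_routine(body: str, lookup) -> str:
    """Sketch of Bob's LLM-written routine for the negotiated protocol:
    parse the {location, date} request, query the database (here the
    hypothetical `lookup` callable), and reply in the agreed schema."""
    request = json.loads(body)
    temperature, precipitation, condition = lookup(request["location"], request["date"])
    return json.dumps({
        "temperature": temperature,      # degrees Celsius
        "precipitation": precipitation,  # millimetres
        "weatherCondition": condition,   # "sunny" | "cloudy" | "rainy" | "snowy"
    })

# Stub database mirroring the example exchange in the text:
stub = lambda location, date: (11, 12, "rainy")
reply = json.loads(weather_routine('{"location": "London, UK", "date": "2024-09-27"}', stub))
```

Once both sides run such routines, the exchange involves no LLM calls at all.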
#### A cost analysis.
In our demo, negotiating the protocol and implementing the routines cost $0.043$ USD in API calls, compared to an average cost of $0.020$ USD for a natural-language exchange. This means that, as long as Alice and Bob use the agreed-upon protocol more than twice, Agora reduces the overall cost. Please refer to Appendix C for a transcription of the negotiation process and the final protocol.
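The break-even point follows directly from the demo's figures, assuming the routine itself runs at negligible marginal cost:

```python
NEGOTIATION_COST = 0.043  # USD, one-off: negotiation + routine implementation (from the demo)
NL_EXCHANGE_COST = 0.020  # USD, average cost of one natural-language exchange

def break_even_exchanges(negotiation: float, per_exchange: float) -> int:
    """Smallest number of exchanges at which the negotiated protocol is
    cheaper overall, assuming routines run at negligible marginal cost."""
    n = 1
    while n * per_exchange <= negotiation:
        n += 1
    return n

n = break_even_exchanges(NEGOTIATION_COST, NL_EXCHANGE_COST)  # 3: "more than twice"
```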
As a final note, we stress that the entire communication happened without human intervention. Additionally, should Bob become unavailable, Alice can simply reuse the PD with a new node that may use a different LLM/database/technology stack.
### 5.3 Demo: a Network of 100 Agents
<details>
<summary>img/agora-100.png Details</summary>

Left: the Agora sub-network for food delivery, with PDs identified by hashes ('123' for negotiation, '234' for food delivery, '600' for traffic flow). Right: the six-step emergent workflow: the client orders food from a restaurant; the restaurant uses a PD to check the delivery service; the rider uses a PD to check the traffic service; PD replies then confirm that traffic flows, that the rider is available, and that the food is being delivered, after which the order starts.
</details>
Figure 4: Illustration of how, in an Agora network with $100$ agents (left; for clarity, only the relevant sub-network is displayed), a protocol for food delivery emerges (right).
We now show the scaling capabilities and emergent behaviours of Agora by considering a network of 100 LLM-powered agents. In particular, we scale the number of agents, which, as posited in Chaabouni et al. (2022), is a requisite for the emergence of complex behaviours in multi-agent networks.
We design a network of $85$ assistant agents interacting with $15$ server agents, all powered by LLMs. The server agents offer various services, such as booking hotel rooms, calling taxis, ordering food, etc. An example of a sub-network for food delivery is sketched in Figure 4, left. Their specialisation is handled via prompting, as in Deshpande et al. (2023); Joshi et al. (2023); Li et al. (2023a). As part of their workflow, server agents must interact with several tools and databases; additionally, some servers need to interact with other servers to complete assistants’ requests (e.g., taxi services use the traffic data agent to adjust estimated fares for a run). We bootstrap the network by leveraging the underlying communication layer (as described in Section 4 and Figure 2), informing the nodes of which URLs correspond to which node, and manually creating the connection links between agents (e.g., the Taxi Service server knows that the server on port 5007 is a traffic server, but it does not know how to communicate with it or what information it requires).
To showcase the portability of Agora throughout the network, we use different database technologies (SQL and MongoDB) and different LLMs, both open- and closed-source (GPT-4o, Llama-3-405B, and Gemini 1.5 Pro (Reid et al., 2024)). We then generate $1000$ random queries, ranging from simple ones, such as requesting today’s weather, to more complex ones, like booking rooms in ski resorts, buying tickets for movies, ordering one of each dish from a menu, and so on. For each query, assistants receive a JSON document (which represents the task data) and are tasked with fulfilling the request and returning a parsed response that follows a given schema. Queries are distributed among assistants following a Pareto distribution, to simulate some assistants sending significantly more requests than others. Each node can also read PDs from, and share PDs with, one of three protocol databases. Overall, these design decisions result in a very heterogeneous network, testing the limits of Agora. Refer to Appendix D for further implementation details.
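The Pareto-distributed query assignment can be sketched as follows; the shape parameter and seed are illustrative assumptions, not the values used in the demo.

```python
import random

def assign_queries(num_queries, num_assistants, alpha=1.16, seed=0):
    """Illustrative sketch: draw a Pareto weight per assistant so that a
    few assistants originate most of the traffic, then route each query
    proportionally to those weights. alpha and seed are assumptions."""
    rng = random.Random(seed)
    weights = [rng.paretovariate(alpha) for _ in range(num_assistants)]
    counts = [0] * num_assistants
    for assistant in rng.choices(range(num_assistants), weights=weights, k=num_queries):
        counts[assistant] += 1
    return counts

# Usage matching the demo's scale: 1000 queries over 85 assistants.
counts = assign_queries(1000, 85)
```

The heavy tail means a handful of assistants quickly accumulate enough repeated traffic to justify negotiating protocols, which is exactly the regime where routines pay off.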
#### Emergent protocols in large networks.
Once the connections are established and the network can send and receive messages, we observe several noteworthy behaviours. As PDs are progressively shared between agents (see Figure 5(b)), we observe the emergence of a decentralised consensus on the appropriate protocols for a given task. An example of this behaviour involves ordering food from restaurants: an agent queries another to request food to be delivered to a certain address. The restaurant agent requests a delivery driver from a food delivery service, who, in turn, checks with the traffic data agent to see if the traffic is smooth enough to fulfil the delivery. None of the agents knows the others’ roles or the protocols involved beyond its immediate communication. Still, the interaction of the various agents creates an automated workflow that takes care of everything. The emergence of such a protocol is illustrated in Figure 4 (right). In contrast with some recent literature on the emergence of complex protocols (Chaabouni et al., 2019), we observe that with the proper incentives (i.e., efficiency), agents in Agora escape the inefficient trap of committing to longer messages in large-scale communication.
#### A cost analysis.
We compare the cost of running our Agora network against one that uses natural language for all communications. As shown in Figure 5(a), Agora’s cost-efficiency initially outperforms the natural-language-only network by a small margin; this gap widens over time, as progressively more Agora-powered nodes rely on LLM-written routines. The overall cost in API queries for running $1000$ queries in the natural language network is $36.23$ USD, compared to Agora’s $7.67$ USD: in other words, executing this demo with Agora is approximately five times cheaper than with plain natural language. Continuing the demo for more queries would have led to an even larger cost difference.
<details>
<summary>x1.png Details</summary>

Line chart of cost (USD) per query over $1000$ queries. The natural-language line fluctuates between roughly $0.033$ and $0.040$ USD throughout, while the Agora line falls steeply from about $0.027$ USD to below $0.010$ USD within the first $100$ queries and then stabilises at around $0.004$ to $0.008$ USD; after the initial phase, the two lines never cross.
</details>
(a) Cost comparison of natural language vs Agora on a network of $100$ agents. Costs are averaged with a window size of $100$ .
<details>
<summary>x2.png Details</summary>

### Visual Description
## Dual-Axis Line Chart: LLM Query Performance vs. Protocol Count Over Queries
### Overview
The image displays a dual-axis line chart plotting two distinct metrics against a common x-axis representing the number of queries. The chart illustrates the relationship between the percentage of queries involving Large Language Models (LLMs) and the cumulative number of protocols implemented over a sequence of 1000 queries.
### Components/Axes
* **X-Axis (Bottom):** Labeled **"Queries"**. It is a linear scale ranging from 0 to 1000, with major tick marks at intervals of 200 (0, 200, 400, 600, 800, 1000).
* **Primary Y-Axis (Left):** Labeled **"Queries with LLMs (%)"**. It is a linear scale ranging from 20 to 80, with major tick marks at intervals of 10 (20, 30, 40, 50, 60, 70, 80). The data series for this axis is plotted in **blue**.
* **Secondary Y-Axis (Right):** Labeled **"Number of Protocols"**. It is a linear scale ranging from 0 to 14, with major tick marks at intervals of 2 (0, 2, 4, 6, 8, 10, 12, 14). The data series for this axis is plotted in **red**.
* **Legend:** There is no explicit legend box. The association between line color and axis is implied by the color of the axis labels and titles. The **blue line** corresponds to the left axis ("Queries with LLMs (%)"), and the **red line** corresponds to the right axis ("Number of Protocols").
### Detailed Analysis
**1. Blue Line Series ("Queries with LLMs (%)"):**
* **Trend:** The line shows a sharp initial decline followed by high volatility with an overall downward trend.
* **Key Data Points (Approximate):**
* Starts at its peak of ~82% at Query 0.
* Drops steeply to ~50% by approximately Query 50.
* Experiences a local peak of ~54% around Query 150.
* Enters a volatile decline, reaching its lowest point of ~21% near Query 470.
* After the low, it fluctuates between ~25% and ~38% for the remainder of the chart, ending at approximately 32% at Query 1000.
**2. Red Line Series ("Number of Protocols"):**
* **Trend:** The line is a step function, showing a non-decreasing, cumulative count that increases in discrete jumps.
* **Key Data Points (Approximate):**
* Starts at 0 protocols at Query 0.
* First increase to 1 protocol occurs very early, before Query 20.
* Jumps to 6 protocols around Query 50.
* Increases to 8 protocols near Query 100.
* Steps up to 9 protocols around Query 200.
* Increases to 10 protocols near Query 250.
* Jumps to 11 protocols around Query 280.
* Steps up to 12 protocols near Query 320.
* Increases to 13 protocols around Query 350.
* Jumps to 14 protocols near Query 600.
* Final increase to 15 protocols occurs just before Query 1000.
### Key Observations
1. **Inverse Relationship:** There is a clear inverse correlation between the two series. As the number of protocols (red line) increases in steps, the percentage of queries using LLMs (blue line) generally decreases.
2. **Volatility vs. Stability:** The "Queries with LLMs" metric is highly volatile, reacting sharply to changes. In contrast, the "Number of Protocols" is stable between increments, only changing at specific, discrete points.
3. **Stabilization Zone:** After approximately Query 600, when the protocol count reaches 14, the blue line's volatility reduces, and it oscillates within a narrower band (roughly 25-35%), suggesting a new, lower equilibrium.
4. **Initial Phase:** The most dramatic changes occur in the first 100 queries, where the LLM usage plummets from >80% to ~50% as the first 8 protocols are rapidly introduced.
### Interpretation
The data suggests a system where the introduction of structured protocols directly reduces the reliance on or need for LLM-based query handling. The stepwise increase in protocols implies they are being added or activated in batches. The initial, rapid deployment of protocols has the most significant impact, drastically cutting LLM usage. The subsequent, smaller protocol additions continue to suppress LLM query percentage, albeit with diminishing returns, as the system stabilizes. The volatility in the blue line indicates that LLM usage is sensitive to the specific queries being processed, but the overall trend is decisively downward as the protocol framework becomes more comprehensive. This chart likely demonstrates the effectiveness of a rule-based or protocol-driven system in supplanting a more general, LLM-based approach for a specific task domain.
</details>
(b) The number of queries to the LLMs in Agora decreases over time as the number of established PDs grows.
Figure 5: Summary of the efficiency of Agora for the demo with 100 agents.
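The trend in Figure 5(b) can be illustrated with a toy model (all names and numbers are hypothetical, not Agora's actual mechanism): a query type requires an LLM only until a protocol for it has been established, after which it is handled by the routine.

```python
import random

def simulate(num_queries=1000, num_types=15, seed=0):
    """Toy model: a query needs an LLM only until a protocol exists for its type."""
    rng = random.Random(seed)
    protocols = set()          # query types with an established protocol
    llm_calls = 0
    for _ in range(num_queries):
        qtype = rng.randrange(num_types)
        if qtype in protocols:
            continue           # handled by the routine, no LLM involved
        llm_calls += 1         # fall back to natural language via the LLM
        protocols.add(qtype)   # negotiate a protocol for next time
    return llm_calls, len(protocols)

calls, n_protocols = simulate()
```

In this idealised model every type is covered after its first occurrence, so LLM usage falls off sharply at the start and then flattens, mirroring the inverse relationship between the two series in the figure.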
## 6 Conclusions
In this paper, we introduced Agora, a meta protocol that sidesteps the Agent Communication Trilemma by using a mix of natural language and structured protocols. We showed that Agora agents can negotiate, implement and use protocols, creating self-organising networks that solve complex tasks. Additionally, we demonstrated the scalability of Agora by testing a $100$-agent demo and achieving a five-fold reduction in costs compared to natural language-only communication. Our results showcase the power of negotiation as a basis for efficient, scalable, and decentralised agent networks. As LLMs continue to improve and as interactions between them increase, LLM-powered agent networks have the potential to surpass the scale limitations of single LLMs. Developing frameworks and protocols that enable decentralised, flexible and efficient communication, either through Agora or other technologies, can lay the foundations for a future where complex activities are partially, if not fully, automated by LLMs.
## Acknowledgements
We thank the Alan Turing Institute for providing the computational power to run our agent network, as well as SambaNova for providing credits for our Llama 3 experiments. Samuele Marro is funded by Microsoft Research Ltd. Emanuele La Malfa is funded by the Alan Turing Institute. Jesse Wright is funded by the Department of Computer Science of the University of Oxford.
## References
- Achiam et al. (2023) Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, et al. GPT-4 technical report. arXiv preprint arXiv:2303.08774, 2023.
- Agashe et al. (2023) Saaket Agashe, Yue Fan, and Xin Eric Wang. Evaluating multi-agent coordination abilities in large language models. arXiv preprint arXiv:2310.03903, 2023.
- Arulkumaran et al. (2017) Kai Arulkumaran, Marc Peter Deisenroth, Miles Brundage, and Anil Anthony Bharath. Deep reinforcement learning: A brief survey. IEEE Signal Processing Magazine, 34(6):26–38, 2017.
- Bakhtin et al. (2022) Anton Bakhtin, Noam Brown, Emily Dinan, Gabriele Farina, Colin Flaherty, Daniel Fried, Andrew Goff, Jonathan Gray, Hengyuan Hu, Meta Fundamental AI Research Diplomacy Team (FAIR), et al. Human-level play in the game of Diplomacy by combining language models with strategic reasoning. Science, 378(6624):1067–1074, 2022.
- Beckett et al. (2014) David Beckett, Tim Berners-Lee, Eric Prud'hommeaux, and Gavin Carothers. RDF 1.1 Turtle. World Wide Web Consortium, 2014.
- Benet (2014) Juan Benet. IPFS: content addressed, versioned, P2P file system. arXiv preprint arXiv:1407.3561, 2014.
- Berners-Lee et al. (2001) Tim Berners-Lee, James Hendler, and Ora Lassila. The semantic web. Scientific American, 284(5):34–43, 2001.
- Brown et al. (2020) Tom B. Brown, Benjamin Mann, Nick Ryder, et al. Language models are few-shot learners. In Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual, 2020. URL https://proceedings.neurips.cc/paper/2020/hash/1457c0d6bfcb4967418bfb8ac142f64a-Abstract.html.
- Chaabouni et al. (2019) Rahma Chaabouni, Eugene Kharitonov, Emmanuel Dupoux, and Marco Baroni. Anti-efficient encoding in emergent communication. Advances in Neural Information Processing Systems, 32, 2019.
- Chaabouni et al. (2022) Rahma Chaabouni, Florian Strub, Florent Altché, Eugene Tarassov, Corentin Tallec, Elnaz Davoodi, Kory Wallace Mathewson, Olivier Tieleman, Angeliki Lazaridou, and Bilal Piot. Emergent communication at scale. In International conference on learning representations, 2022.
- Chen et al. (2023) Huaben Chen, Wenkang Ji, Lufeng Xu, and Shiyu Zhao. Multi-agent consensus seeking via large language models. arXiv preprint arXiv:2310.20151, 2023.
- Chen et al. (2024) Yongchao Chen, Jacob Arkin, Yang Zhang, Nicholas Roy, and Chuchu Fan. Scalable multi-robot collaboration with large language models: Centralized or decentralized systems? In 2024 IEEE International Conference on Robotics and Automation (ICRA), pp. 4311–4317. IEEE, 2024.
- Collins et al. (2022) Katherine M Collins, Catherine Wong, Jiahai Feng, Megan Wei, and Joshua B Tenenbaum. Structured, flexible, and robust: benchmarking and improving large language models towards more human-like behavior in out-of-distribution reasoning tasks. arXiv preprint arXiv:2205.05718, 2022.
- Cross et al. (2024) Logan Cross, Violet Xiang, Agam Bhatia, Daniel LK Yamins, and Nick Haber. Hypothetical minds: Scaffolding theory of mind for multi-agent tasks with large language models. arXiv preprint arXiv:2407.07086, 2024.
- Deshpande et al. (2023) Ameet Deshpande, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, and Karthik Narasimhan. Toxicity in chatgpt: Analyzing persona-assigned language models. arXiv preprint arXiv:2304.05335, 2023.
- Du et al. (2023) Yilun Du, Shuang Li, Antonio Torralba, Joshua B Tenenbaum, and Igor Mordatch. Improving factuality and reasoning in language models through multiagent debate. arXiv preprint arXiv:2305.14325, 2023.
- Dubey et al. (2024a) Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan, Anirudh Goyal, Anthony Hartshorn, Aobo Yang, et al. The llama 3 herd of models. 2024a. URL https://api.semanticscholar.org/CorpusID:271571434.
- Dubey et al. (2024b) Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan, et al. The llama 3 herd of models. arXiv preprint arXiv:2407.21783, 2024b.
- Fedorenko et al. (2024) Evelina Fedorenko, Steven T. Piantadosi, and Edward A. F. Gibson. Language is primarily a tool for communication rather than thought. Nature, 630, 2024.
- Fu et al. (2023) Yao Fu, Hao Peng, Tushar Khot, and Mirella Lapata. Improving language model negotiation with self-play and in-context learning from ai feedback. arXiv preprint arXiv:2305.10142, 2023.
- Gao et al. (2024) Chen Gao, Fengli Xu, Xu Chen, Xiang Wang, Xiangnan He, and Yong Li. Simulating human society with large language model agents: City, social media, and economic system. In Companion Proceedings of the ACM on Web Conference 2024, pp. 1290–1293, 2024.
- Gilbert (2019) Nigel Gilbert. Agent-based models. Sage Publications, 2019.
- Gilbert & Terna (2000) Nigel Gilbert and Pietro Terna. How to build and use agent-based models in social science. Mind & Society, 1:57–72, 2000.
- Guo et al. (2024) Taicheng Guo, Xiuying Chen, Yaqi Wang, Ruidi Chang, Shichao Pei, Nitesh V Chawla, Olaf Wiest, and Xiangliang Zhang. Large language model based multi-agents: A survey of progress and challenges. arXiv preprint arXiv:2402.01680, 2024.
- Havrylov & Titov (2017) Serhii Havrylov and Ivan Titov. Emergence of language with multi-agent games: Learning to communicate with sequences of symbols. Advances in neural information processing systems, 30, 2017.
- Henderson et al. (2018) Peter Henderson, Riashat Islam, Philip Bachman, Joelle Pineau, Doina Precup, and David Meger. Deep reinforcement learning that matters. In Proceedings of the AAAI conference on artificial intelligence, volume 32, 2018.
- Hong et al. (2023) Sirui Hong, Xiawu Zheng, Jonathan Chen, Yuheng Cheng, Jinlin Wang, Ceyao Zhang, Zili Wang, Steven Ka Shing Yau, Zijuan Lin, Liyang Zhou, et al. Metagpt: Meta programming for multi-agent collaborative framework. arXiv preprint arXiv:2308.00352, 2023.
- Horty (2001) John F Horty. Agency and deontic logic. Oxford University Press, 2001.
- Hou et al. (2023) Xinyi Hou, Yanjie Zhao, Yue Liu, Zhou Yang, Kailong Wang, Li Li, Xiapu Luo, David Lo, John Grundy, and Haoyu Wang. Large language models for software engineering: A systematic literature review. ACM Transactions on Software Engineering and Methodology, 2023.
- Hu et al. (2021) Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. Lora: Low-rank adaptation of large language models. arXiv preprint arXiv:2106.09685, 2021.
- Hua et al. (2023) Wenyue Hua, Lizhou Fan, Lingyao Li, Kai Mei, Jianchao Ji, Yingqiang Ge, Libby Hemphill, and Yongfeng Zhang. War and peace (waragent): Large language model-based multi-agent simulation of world wars. arXiv preprint arXiv:2311.17227, 2023.
- Jin et al. (2023) Bowen Jin, Gang Liu, Chi Han, Meng Jiang, Heng Ji, and Jiawei Han. Large language models on graphs: A comprehensive survey. arXiv preprint arXiv:2312.02783, 2023.
- Joshi et al. (2023) Nitish Joshi, Javier Rando, Abulhair Saparov, Najoung Kim, and He He. Personas as a way to model truthfulness in language models. arXiv preprint arXiv:2310.18168, 2023.
- Kaplan et al. (2020) Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, and Dario Amodei. Scaling laws for neural language models. arXiv preprint arXiv:2001.08361, 2020.
- Kassner et al. (2020) Nora Kassner, Benno Krojer, and Hinrich Schütze. Are pretrained language models symbolic reasoners over knowledge? arXiv preprint arXiv:2006.10413, 2020.
- La Malfa et al. (2023) Emanuele La Malfa, Aleksandar Petrov, Simon Frieder, Christoph Weinhuber, Ryan Burnell, Raza Nazar, Anthony G Cohn, Nigel Shadbolt, and Michael Wooldridge. Language models as a service: Overview of a new paradigm and its challenges. arXiv e-prints, 2023.
- Lazaridou & Baroni (2020) Angeliki Lazaridou and Marco Baroni. Emergent multi-agent communication in the deep learning era. arXiv preprint arXiv:2006.02419, 2020.
- Lazaridou et al. (2018) Angeliki Lazaridou, Karl Moritz Hermann, Karl Tuyls, and Stephen Clark. Emergence of linguistic communication from referential games with symbolic and pixel input. arXiv preprint arXiv:1804.03984, 2018.
- Li et al. (2023a) Guohao Li, Hasan Hammoud, Hani Itani, Dmitrii Khizbullin, and Bernard Ghanem. Camel: Communicative agents for "mind" exploration of large language model society. Advances in Neural Information Processing Systems, 36:51991–52008, 2023a.
- Li et al. (2023b) Huao Li, Yu Quan Chong, Simon Stepputtis, Joseph Campbell, Dana Hughes, Michael Lewis, and Katia Sycara. Theory of mind for multi-agent collaboration via large language models. arXiv preprint arXiv:2310.10701, 2023b.
- Li et al. (2024) Junyou Li, Qin Zhang, Yangbin Yu, Qiang Fu, and Deheng Ye. More agents is all you need. arXiv preprint arXiv:2402.05120, 2024.
- Liang et al. (2023) Tian Liang, Zhiwei He, Wenxiang Jiao, Xing Wang, Yan Wang, Rui Wang, Yujiu Yang, Zhaopeng Tu, and Shuming Shi. Encouraging divergent thinking in large language models through multi-agent debate. arXiv preprint arXiv:2305.19118, 2023.
- Lin et al. (2024) Fangru Lin, Emanuele La Malfa, Valentin Hofmann, Elle Michelle Yang, Anthony Cohn, and Janet B Pierrehumbert. Graph-enhanced large language models in asynchronous plan reasoning. arXiv preprint arXiv:2402.02805, 2024.
- Liu et al. (2024) Jiawei Liu, Chunqiu Steven Xia, Yuyao Wang, and Lingming Zhang. Is your code generated by chatgpt really correct? rigorous evaluation of large language models for code generation. Advances in Neural Information Processing Systems, 36, 2024.
- OBL (2024) OBL. Open banking read write api profile v4.0. 2024. URL https://openbankinguk.github.io/read-write-api-site3/v4.0/profiles/read-write-data-api-profile.html.
- Olivé (2007) Antoni Olivé. Conceptual modeling of information systems. Springer Science & Business Media, 2007.
- Pang et al. (2024) Xianghe Pang, Shuo Tang, Rui Ye, Yuxin Xiong, Bolun Zhang, Yanfeng Wang, and Siheng Chen. Self-alignment of large language models via multi-agent social simulation. In ICLR 2024 Workshop on Large Language Model (LLM) Agents, 2024.
- Rafailov et al. (2024) Rafael Rafailov, Archit Sharma, Eric Mitchell, Christopher D Manning, Stefano Ermon, and Chelsea Finn. Direct preference optimization: Your language model is secretly a reward model. Advances in Neural Information Processing Systems, 36, 2024.
- Reid et al. (2024) Machel Reid, Nikolay Savinov, Denis Teplyashin, Dmitry Lepikhin, Timothy Lillicrap, Jean-baptiste Alayrac, Radu Soricut, Angeliki Lazaridou, Orhan Firat, Julian Schrittwieser, et al. Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context. arXiv preprint arXiv:2403.05530, 2024.
- Russell & Norvig (2016) Stuart J Russell and Peter Norvig. Artificial intelligence: a modern approach. Pearson, 2016.
- Schick et al. (2024) Timo Schick, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Eric Hambro, Luke Zettlemoyer, Nicola Cancedda, and Thomas Scialom. Toolformer: Language models can teach themselves to use tools. Advances in Neural Information Processing Systems, 36, 2024.
- Schulman et al. (2017) John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347, 2017.
- Singh et al. (2024) Ishika Singh, David Traum, and Jesse Thomason. Twostep: Multi-agent task planning using classical planners and large language models. arXiv preprint arXiv:2403.17246, 2024.
- Song et al. (2023) Junghwan Song, Heeyoung Jung, Selin Chun, Hyunwoo Lee, Minhyeok Kang, Minkyung Park, Eunsang Cho, et al. How to decentralize the internet: A focus on data consolidation and user privacy. Computer Networks, 234:109911, 2023.
- Sutton (2018) Richard S Sutton. Reinforcement learning: An introduction. A Bradford Book, 2018.
- Vaswani et al. (2017) Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Advances in Neural Information Processing Systems 30: 2017, December 4-9, 2017, Long Beach, CA, USA, pp. 5998–6008, 2017. URL https://proceedings.neurips.cc/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html.
- Wooldridge (2009) Michael Wooldridge. An introduction to multiagent systems. John Wiley & Sons, 2009.
- Wooldridge & Jennings (1995) Michael Wooldridge and Nicholas R Jennings. Intelligent agents: Theory and practice. The knowledge engineering review, 10(2):115–152, 1995.
- Wu et al. (2024a) Shuang Wu, Liwen Zhu, Tao Yang, Shiwei Xu, Qiang Fu, Yang Wei, and Haobo Fu. Enhance reasoning for large language models in the game werewolf. arXiv preprint arXiv:2402.02330, 2024a.
- Wu et al. (2024b) Zengqing Wu, Shuyuan Zheng, Qianying Liu, Xu Han, Brian Inhyuk Kwon, Makoto Onizuka, Shaojie Tang, Run Peng, and Chuan Xiao. Shall we talk: Exploring spontaneous collaborations of competing llm agents. arXiv preprint arXiv:2402.12327, 2024b.
- Zhang et al. (2024) Biao Zhang, Zhongtao Liu, Colin Cherry, and Orhan Firat. When scaling meets llm finetuning: The effect of data, model and finetuning method. arXiv preprint arXiv:2402.17193, 2024.
## Appendix A Agora: Use Cases
S1. Agora as a personal assistant.
A user is organising a trip to Paris: they want to book a flight, rent a car, and book a hotel room.
The LLM reads the prompt, identifies the actions it must undertake, and checks whether there are LLMs available in Agora that can fulfil them. For each service, an LLM is ready to reply.
1. A user sends a message to its personal assistant.
2. The personal assistant dispatches it to Agora.
<details>
<summary>img/scenarios/s1-1.png Details</summary>

### Visual Description
## Diagram: User Request to Multi-Service LLM Assistant Flow
### Overview
The image is a conceptual diagram illustrating the process of a user making a complex, multi-service travel booking request through a smartphone, which is then dispatched to a Large Language Model (LLM) based assistant. The assistant is depicted as a central node connected to various service providers (flight, hotel, car rental). The diagram emphasizes the assistant's role in interpreting and coordinating a natural language request into actionable tasks across different domains.
### Components/Axes
The diagram is composed of three main sections arranged horizontally from left to right:
1. **User & Interface (Left Section):**
* **Icon:** A simple line-drawing profile of a human head facing right.
* **Icon:** A smartphone icon positioned to the right of the head.
* **Text (Below Icons):** A user's spoken or typed request: `"I want to book a flight, a hotel and a car for next week in Paris."`
* **Text Formatting:** The words "a flight" are in **blue**, "a hotel" is in **orange**, and "a car" is in **red**. The rest of the text is in black.
2. **Dispatch Process (Center Section):**
* **Arrow:** A solid black arrow points from the smartphone/user area to the right.
* **Text (Above Arrow):** The label for the arrow: `Dispatch to the LLMs' assistant`.
3. **Service Network (Right Section):**
* **Structure:** A diamond-shaped network graph with four circular nodes connected by lines.
* **Node 1 (Top):** Contains an **airplane icon** (representing flight booking services).
* **Node 2 (Right):** Contains a **car icon** (representing car rental services).
* **Node 3 (Bottom):** Contains a **building/hotel icon** (representing hotel booking services).
* **Node 4 (Left):** A **blank, empty circle**. This likely represents the central LLM assistant or orchestrator node.
* **Connections:** All four nodes are interconnected with straight lines, forming a complete graph where each node is directly connected to every other node.
### Detailed Analysis
* **Flow Direction:** The process flows unidirectionally from left to right: User -> Smartphone Interface -> Dispatch Arrow -> LLM Assistant Network.
* **User Request Specifics:** The request is for three distinct services (flight, hotel, car) for a specific timeframe ("next week") and location ("Paris"). The color-coding visually isolates and highlights these three key service components within the natural language sentence.
* **Network Topology:** The service network is a fully connected mesh (a complete graph K₄). This implies that the central assistant (blank node) has direct communication pathways to each service provider (flight, hotel, car), and the service providers may also have pathways to communicate with each other, suggesting potential for integrated booking or data sharing.
* **Spatial Grounding:** The legend (the color-coded text) is embedded directly within the user's quote at the bottom-left of the image. The service icons in the network (top, right, bottom) correspond directly to the color-coded terms in the quote (blue=flight/plane, orange=hotel/building, red=car).
### Key Observations
1. **Abstraction of the Assistant:** The LLM assistant is represented as a blank circle, emphasizing its role as a generic, intelligent processing node rather than a specific branded product.
2. **Implicit Complexity:** The diagram simplifies a complex process. The single "dispatch" arrow encapsulates what would involve speech-to-text, intent recognition, entity extraction (services, date, location), and task planning.
3. **Interconnected Services:** The fully connected network suggests a system architecture where the assistant can not only call upon individual services but may also facilitate coordination between them (e.g., ensuring a hotel check-in date aligns with a flight arrival time).
4. **Focus on Intent:** The design highlights the transformation of an unstructured, human-language command into a structured set of tasks distributed across a service-oriented architecture.
### Interpretation
This diagram serves as a high-level technical illustration of a **multi-domain task-oriented dialogue system**. It demonstrates the core value proposition of an LLM-powered assistant: to act as a unified, natural language interface to a fragmented ecosystem of digital services.
* **What it suggests:** The system is designed to handle compound requests. Instead of the user interacting with three separate apps (airline, hotel, car rental), they make one request to the assistant. The assistant's intelligence lies in parsing the request, identifying the sub-tasks, and orchestrating the necessary API calls or interactions with the respective service backends (represented by the icon nodes).
* **Relationships:** The user is the source of intent. The smartphone is the input channel. The "dispatch" represents the handoff from the client application to the backend AI service. The network graph represents the backend ecosystem where the actual service fulfillment is coordinated.
* **Notable Implication:** The blank node (assistant) is the central hub. Its position in the network (connected to all services) is critical. It implies that the assistant maintains the context of the overall user goal ("a trip to Paris") and manages the state across multiple, potentially independent, service transactions. This is more advanced than a simple command router; it suggests an agent capable of multi-step planning and execution.
</details>
The LLM acting as personal assistant dispatches the flight, hotel and car requests to the respective LLMs in the network. The messages are dispatched in natural language, as there are no pre-existing routines to handle them.
1. The LLM personal assistant dispatches the respective messages to the right node.
2. The car, hotel, and flight LLMs process the requests and turn them into queries for their booking systems.
3. Each LLM replies with its availability and options.
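The steps above can be sketched as a dispatcher routing each request to the right node and collecting the replies. The service names, handler functions, and reply format below are all hypothetical, standing in for the car, hotel, and flight LLMs:

```python
# Hypothetical per-service handlers standing in for the flight, hotel, and car LLMs.
def handle_flight(text):
    return {"service": "flight", "options": ["AF123 09:40", "BA456 14:10"]}

def handle_hotel(text):
    return {"service": "hotel", "options": ["Hotel du Nord", "Le Marais Inn"]}

def handle_car(text):
    return {"service": "car", "options": ["compact", "sedan"]}

HANDLERS = {"flight": handle_flight, "hotel": handle_hotel, "car": handle_car}

def dispatch(requests):
    """Route each natural-language request to the right node and collect replies."""
    return [HANDLERS[service](text) for service, text in requests]

replies = dispatch([
    ("flight", "Book a flight for Paris"),
    ("hotel", "Book a hotel room in Paris"),
    ("car", "Book a car in Paris"),
])
```

In Agora the handlers would themselves be LLM-powered agents interpreting the natural-language text; the dispatcher only needs to know which node serves which request.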
<details>
<summary>img/scenarios/s1-2.png Details</summary>

### Visual Description
## Diagram: Travel Booking Service Flow
### Overview
The image is a black-and-white schematic diagram illustrating a user interaction flow for booking travel-related services. It depicts a central user or system initiating three distinct booking actions, with implied connections between the resulting services. The diagram uses simple icons and directional arrows to show command flow and relationships.
### Components/Axes
The diagram consists of four primary circular nodes and connecting lines:
1. **Central Node (Left-Center):** An empty circle. This represents the origin point, likely a user or a central command interface.
2. **Action Nodes (Connected via Solid Arrows):**
* **Top-Center Node:** A circle containing a black silhouette icon of an airplane.
* **Right-Center Node:** A circle containing a black silhouette icon of a car (sedan/taxi style).
* **Bottom-Center Node:** A circle containing a black silhouette icon of a multi-story building (hotel).
3. **Connections:**
* **Solid Directed Arrows:** Three solid black arrows originate from the central node, each pointing to one of the action nodes. Each arrow is labeled with a specific command.
* **Dashed Lines:** Light gray dashed lines connect the three action nodes to each other, forming a triangle. These lines are undirected and suggest a relationship or dependency between the booked services.
### Detailed Analysis
**Textual Labels (Commands):**
The text labels are placed along the solid arrows, indicating the command or intent that triggers the action.
* Arrow to Airplane Node: `"Book a flight for Paris"`
* Arrow to Car Node: `"Book a car in Paris"`
* Arrow to Hotel Node: `"Book a hotel room in Paris"`
**Spatial Layout and Flow:**
* The flow is **divergent**, starting from a single point (central circle) and branching out to three separate service endpoints.
* The dashed lines create a **secondary, interconnected network** between the service endpoints (flight, car, hotel), implying that booking one service may be related to or affect the others (e.g., a flight booking informs the car pickup date, a hotel stay aligns with the flight dates).
### Key Observations
1. **Command Specificity:** Each command is specific to the destination "Paris," indicating a context-aware system.
2. **Iconography:** The icons are universal and unambiguous: airplane for flight, car for ground transportation, building for accommodation.
3. **Relationship Indication:** The dashed lines are a critical component. They transform the diagram from a simple one-to-many command list into a system where the outputs (booked services) are interrelated. This suggests a coordinated travel planning system rather than three independent bookings.
4. **Central Authority:** The empty central node implies a user or a master agent that orchestrates the entire process.
### Interpretation
This diagram models a **multi-service travel booking system** from a user's perspective. It demonstrates a common real-world scenario where a traveler needs to coordinate multiple services for a single trip.
* **What it suggests:** The system is designed to understand high-level, natural language commands ("Book a...") and decompose them into specific service bookings. The presence of the dashed lines is the most insightful element; it indicates that the system is not merely a dispatcher but likely maintains a **shared context** (the trip to Paris). This context allows for intelligent coordination—for example, ensuring the car rental period matches the flight arrival/departure times, or that the hotel check-in date aligns with the overall itinerary.
* **Why it matters:** This represents a move from siloed service bookings to an integrated travel assistant. The diagram visually argues for the value of a unified system that manages dependencies between services, reducing user effort and potential scheduling conflicts. The central node's emptiness could symbolize that the user's intent is the starting point, and the system handles the complex logistics represented by the interconnected nodes.
* **Underlying Pattern:** The structure follows a **hub-and-spoke model with peer-to-peer connections**. The hub (user) initiates actions, but the spokes (services) are also linked, creating a robust network for a cohesive travel product.
</details>
For subsequent iterations, the LLMs involved in the request propose a routine that standardises it, so that future requests avoid natural language and can be processed without invoking the LLMs.
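Such a negotiated routine might replace the natural-language message with a structured payload that deterministic code can validate and process with no LLM in the loop. The field names and schema below are illustrative, not Agora's actual format:

```python
import json

def flight_booking_routine(payload: str) -> dict:
    """Handle a standardised flight-booking request with no LLM involved."""
    request = json.loads(payload)
    # Validate the fields agreed upon during negotiation before
    # touching the booking system.
    for field in ("destination", "week"):
        if field not in request:
            raise ValueError(f"missing required field: {field}")
    # A real routine would now query the booking system directly.
    return {"status": "ok", "destination": request["destination"]}

reply = flight_booking_routine('{"destination": "Paris", "week": "2024-W45"}')
```

Once both sides run such a routine, the cost of the exchange drops to parsing and validation, which is the amortisation effect visible in Figure 5.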
<details>
<summary>img/scenarios/s1-3.png Details</summary>

### Visual Description
## System Architecture Diagram: Service Integration Patterns
### Overview
The image displays two side-by-side diagrams, labeled "1" and "2", illustrating contrasting architectural patterns for integrating travel-related services (flights, cars, hotels). Both diagrams use identical service icons but depict fundamentally different communication and data flow structures.
### Components/Axes
**Common Elements (Both Diagrams):**
* **Service Icons:** Three circular icons representing external services:
* **Airplane Icon:** Represents a flight service.
* **Car Icon:** Represents a car rental service.
* **Building Icon:** Represents a hotel service.
* **Connecting Lines:** Dashed gray lines connect the three service icons to each other in a triangular formation, suggesting a potential peer relationship or network.
**Diagram 1 (Left):**
* **Central Node:** A large, empty circle positioned to the left of the service icons.
* **Data Flow Arrows:** Three solid black arrows point **from** the service icons **to** the central node.
* **Labels on Arrows:**
* Arrow from Airplane Icon: `<list of flights>`
* Arrow from Car Icon: `<list of cars>`
* Arrow from Hotel Icon: `<list of hotels>`
**Diagram 2 (Right):**
* **Central Mediator:** A black document icon (resembling a contract or specification file) positioned in the center of the service triangle.
* **Protocol Arrows:** Three solid black arrows point **from** each service icon **to** the central document icon.
* **Labels on Arrows:** Each arrow is labeled with the word `protocol` in italics.
### Detailed Analysis
**Diagram 1: Direct Data Aggregation Pattern**
* **Structure:** A hub-and-spoke model. The central node acts as an aggregator or client.
* **Flow Direction:** Unidirectional. Data flows from each external service (Airplane, Car, Building) directly to the central aggregator.
* **Data Exchanged:** The labels specify the exact data payloads: `<list of flights>`, `<list of cars>`, and `<list of hotels>`. This implies the central node receives raw, service-specific data lists.
* **Relationship:** The dashed lines between services are present but unused in this flow, indicating the services do not communicate directly with each other in this pattern.
**Diagram 2: Protocol-Mediated Integration Pattern**
* **Structure:** A mediator or broker model. The central document icon represents a shared protocol, standard, or interface specification.
* **Flow Direction:** Unidirectional. Each external service communicates its data or capabilities **to** the central protocol definition.
* **Data Exchanged:** The label `protocol` on each arrow indicates that services are not sending raw data lists, but are instead adhering to or publishing to a common communication standard or contract.
* **Relationship:** Similar to Diagram 1, the dashed lines between services are present but unused, showing the protocol is the sole point of integration.
### Key Observations
1. **Identical Components, Different Topology:** The core difference is the replacement of the generic aggregator node (empty circle) in Diagram 1 with a specific protocol/document icon in Diagram 2.
2. **Shift in Abstraction:** Diagram 1 focuses on the **data** being transferred (`<list of...>`). Diagram 2 focuses on the **rules** for communication (`protocol`).
3. **Arrow Direction Consistency:** In both diagrams, arrows point from the services to the central element, indicating a "push" or "publish" model from the services' perspective.
4. **Visual Emphasis:** The central document icon in Diagram 2 is filled and prominent, visually emphasizing the protocol as the critical, defining component of that architecture.
### Interpretation
This diagram contrasts two fundamental approaches to system integration in a multi-service environment (e.g., a travel booking platform).
* **Diagram 1 represents a tightly-coupled, point-to-point integration style.** The central system must understand and handle the specific data format of each service (`<list of flights>` is different from `<list of cars>`). This can lead to complex, brittle code in the central aggregator as each new service requires custom integration logic. The pattern is simple but does not scale well.
* **Diagram 2 represents a loosely-coupled, standardized integration style.** By introducing a common `protocol`, the architecture decouples the services from the central system. Each service only needs to conform to the shared protocol. This promotes interoperability, simplifies the central system's logic (it only needs to understand one protocol), and makes adding or changing services easier. The protocol acts as a **contract**, enabling a more modular and maintainable system.
**Underlying Message:** The progression from 1 to 2 illustrates a design evolution from custom data aggregation to standardized interface-based communication. It argues for the benefits of defining clear protocols to manage complexity in distributed systems. The unused dashed lines between services hint that a future evolution (Diagram 3?) might involve direct service-to-service communication using the same established protocol.
</details>
The user receives all the data and decides whether to book or not.
S2. Security and scalability.
An LLM (Alice) collects some historical data from another LLM (Bob) that has access to a database whose internal mechanism and implementation must be kept private.
Alice submits a request to collect some historical records from Bob. The request is formatted in natural language.
<details>
<summary>img/scenarios/s2-1.png Details</summary>

### Visual Description
## Diagram: System Interaction and Data Access Flow
### Overview
The image is a technical diagram illustrating a communication and data retrieval process between two entities (labeled A and B) and a database. The diagram emphasizes a restricted access model, where the database is explicitly marked as inaccessible to other Large Language Models (LLMs).
### Components/Axes
The diagram consists of the following visual elements and text labels:
1. **Entity A**: A circle on the left side containing the label **"A"**.
2. **Entity B**: A circle in the center containing the label **"B"**.
3. **Database**: A cylindrical database icon on the right side.
4. **Communication Arrow**: A solid black arrow pointing from A to B.
5. **Request Text**: Text positioned below the arrow from A to B, reading: **"I need the records from 2012"**.
6. **Bidirectional Arrows**: Two parallel, horizontal arrows between B and the database, indicating a two-way data exchange.
7. **Access Restriction Zone**: A shaded gray rectangle on the right, containing the database. A vertical dashed line separates this zone from the rest of the diagram.
8. **Access Restriction Label**: Text inside a rounded rectangle within the shaded zone, reading: **"Not Accessible by other LLMs"**.
9. **Dotted Path**: A faint, dotted circle positioned above and between A and B. Dotted lines connect this circle to both A and B, suggesting an alternative, indirect, or potential communication path.
### Detailed Analysis
The diagram depicts a specific sequence and set of relationships:
* **Primary Flow**: The main interaction is a direct, linear request. Entity **A** sends a specific query (**"I need the records from 2012"**) to Entity **B** via a solid arrow.
* **Data Retrieval**: Entity **B** then engages in a bidirectional exchange with the database, as shown by the two parallel arrows. This implies B has the capability to query the database and receive data.
* **Spatial Grounding & Access Control**: The database and the bidirectional arrows are contained within a shaded gray box on the right side of the image. This box is demarcated by a vertical dashed line. The label **"Not Accessible by other LLMs"** is placed directly above the database icon within this zone, clearly defining the access boundary.
* **Alternative Path**: A dotted circle with dotted lines to both A and B suggests a secondary or potential connection that is not the primary focus of this specific interaction flow.
### Key Observations
1. **Exclusive Access**: The most prominent feature is the explicit restriction on the database. The system is designed so that only Entity **B** (or the system it represents) can directly access the data store. Other LLMs are barred from this access.
2. **Role Specialization**: Entity **A** acts as a requester or client, while Entity **B** acts as a gateway, intermediary, or privileged agent with exclusive data retrieval rights.
3. **Specific Data Request**: The query is not generic; it specifies a temporal filter (**"from 2012"**), indicating the system handles structured or time-series data.
4. **Visual Hierarchy**: The solid lines and central placement of the A->B->Database flow establish it as the primary process. The dotted path is visually subordinate.
### Interpretation
This diagram illustrates a **secure data access architecture** for an AI or LLM-based system. The core message is one of **controlled privilege**.
* **What it demonstrates**: It shows a pattern where a user or front-end agent (A) cannot directly query a sensitive or proprietary database. Instead, it must route its request through a specialized, authorized component (B). This component (B) likely has the necessary credentials, API keys, or embedded logic to safely interact with the database.
* **Why it matters**: This design enhances security and control. It prevents unauthorized or unvetted LLMs from directly accessing raw data, reducing risks of data leakage, corruption, or misuse. It centralizes data access logic within component B, which can enforce validation, logging, and compliance rules.
* **Underlying Principle**: The diagram advocates for or describes a system where valuable data assets are shielded behind a dedicated access layer. The phrase "Not Accessible by other LLMs" suggests this might be a proprietary setup or a method to maintain a competitive advantage by keeping a specific dataset exclusive to one's own models. The dotted path might represent a future capability, a monitoring channel, or an administrative backdoor not used in standard operation.
</details>
Alice submits another request to Bob.
Bob negotiates a protocol for querying its data and writes it as a shared protocol document in JSON.
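An illustrative shape for such a shared protocol document, with hypothetical field names (the scenario does not prescribe a schema); a request is valid if it carries every required field:

```python
import json

# Hypothetical shared protocol document written by Bob after negotiation.
PROTOCOL_DOCUMENT = json.dumps({
    "name": "historical-records-query",
    "description": "Request the records for a given year.",
    "request_schema": {
        "type": "object",
        "properties": {"record": {"type": "integer"}},
        "required": ["record"],
    },
})

def is_valid_request(raw: str, protocol: str) -> bool:
    """Minimal check that a request carries the protocol's required fields."""
    request = json.loads(raw)
    required = json.loads(protocol)["request_schema"]["required"]
    return all(field in request for field in required)
```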
<details>
<summary>img/scenarios/s2-2.png Details</summary>

### Visual Description
## Diagram: Protocol Negotiation and Private Data Access
### Overview
The image is a technical diagram illustrating a communication protocol negotiation between two entities (labeled A and B), where entity B has exclusive access to a private data store. The diagram uses circles, arrows, icons, and text to depict the flow of negotiation, the resulting protocol, and a restricted data resource.
### Components/Axes
The diagram is composed of the following spatially arranged elements:
1. **Entity A**: A circle on the left side, labeled with the letter "A".
2. **Entity B**: A circle in the center, labeled with the letter "B".
3. **Dotted Circle**: A circle with a dotted outline, positioned above and between A and B. It is connected to both A and B by dashed grey lines.
4. **Negotiation Channel**: A thick, solid black line with arrowheads at both ends connecting A and B directly. Below this line is the text: `<negotiation of the protocol>`.
5. **Protocol Output**: A solid black arrow originates from the bottom of circle B and points diagonally down-left to a document icon. The label `protocol` is written along this arrow.
6. **Document Icon**: A black icon representing a text document, located at the bottom-center of the diagram.
7. **Private Data Store**: A cylinder icon (representing a database) located on the right side of the diagram, within a light grey shaded box.
8. **Access Boundary**: A vertical dashed line separates entity B from the shaded box containing the database.
9. **Access Restriction Label**: Inside the shaded box, above the database icon, is a rounded rectangle containing the text: `Not Accessible by other LLMs`.
10. **Data Channel**: A pair of thin, solid black arrows (one pointing left, one pointing right) connect entity B to the database icon, crossing the dashed boundary line.
### Detailed Analysis
* **Flow and Relationships**:
* The primary interaction is a bidirectional negotiation between A and B, as indicated by the double-headed arrow and its label.
* The dotted circle above may represent an external reference, a third-party service, or a conceptual space for the negotiation context, connected via dashed (likely indirect or potential) links.
* Following the negotiation, entity B produces or outputs a "protocol," represented by the arrow pointing to the document icon.
* Entity B has a bidirectional data connection (read/write) with a database. This connection is explicitly marked as being behind a boundary (the dashed line) and is labeled as inaccessible to other Large Language Models (LLMs).
* **Text Transcription**:
* `A` (Label for left circle)
* `B` (Label for center circle)
* `<negotiation of the protocol>` (Label for the A-B channel)
* `protocol` (Label for the arrow from B to the document)
* `Not Accessible by other LLMs` (Label inside the shaded box)
### Key Observations
1. **Asymmetric Access**: The diagram highlights a key asymmetry. While A and B can negotiate, only B is shown to have a direct, privileged connection to a data resource.
2. **Protocol as an Output**: The negotiation results in a tangible output—a protocol document—suggesting the process is formalized and recorded.
3. **Exclusivity**: The label "Not Accessible by other LLMs" is a critical design constraint, indicating the data store is private to this specific system or agent (B) and is a protected component.
4. **Visual Hierarchy**: The shaded box and dashed line create a strong visual boundary, emphasizing the isolation and security of the data store.
### Interpretation
This diagram models a system where two agents (A and B) establish a communication or interaction protocol through negotiation. The core insight is that agent B serves as a gateway or privileged actor with exclusive access to a private knowledge base or data repository. This setup suggests several possible scenarios:
* **A Specialized Agent**: B could be a specialized AI agent or service that has been trained on or has access to proprietary data. Agent A (which could be a user, another AI, or a client system) must first negotiate the terms of interaction (the protocol) with B before receiving the formal protocol specification. The actual data remains behind B, inaccessible to A or other external LLMs.
* **Security and Privacy Model**: The diagram explicitly enforces a security model. The private data is siloed, and access is mediated solely through the negotiated protocol and entity B. This prevents direct data leakage to other models.
* **Process Abstraction**: The dotted circle may represent the abstract "problem space" or "negotiation context" that both A and B reference, but which does not hold data itself.
In essence, the diagram illustrates a pattern for integrating private, sensitive data with AI systems by creating a controlled negotiation layer (B) that outputs a defined protocol, thereby protecting the underlying data from unauthorized access or exposure to other models.
</details>
Alice now uses the protocol to query data from Bob.
Bob directly turns the JSON it receives from Alice into a query for its database. In this way, Bob does not invoke the LLM and the database internals are not exposed.
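A sketch of Bob's side under the negotiated protocol, using an in-memory SQLite database with a hypothetical `records` table; the incoming JSON maps onto a parameterised query, so no LLM is invoked and the schema never leaves Bob:

```python
import json
import sqlite3

def handle_query(conn: sqlite3.Connection, raw: str) -> list:
    """Translate a protocol-conformant JSON request, e.g. {"record": 2013},
    into a parameterised SQL query. The table layout stays private to Bob."""
    request = json.loads(raw)
    year = int(request["record"])  # the only field the protocol exposes
    rows = conn.execute(
        "SELECT payload FROM records WHERE year = ?", (year,)
    ).fetchall()
    return [payload for (payload,) in rows]
```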
<details>
<summary>img/scenarios/s2-3.png Details</summary>

### Visual Description
## Diagram: System Data Flow with Restricted Database Access
### Overview
The image is a technical system diagram illustrating a data flow between two primary components (A and B) and a database. The diagram emphasizes that the database is not accessible by other Large Language Models (LLMs). The visual style is a clean, monochromatic schematic with circles representing nodes, arrows representing data flow, and a shaded region indicating a restricted zone.
### Components/Axes
The diagram consists of the following spatially arranged components:
1. **Left Region:**
* **Circle A:** A large, solid-outlined circle labeled with the letter "A" in its center. It is positioned on the far left.
* **Document Icon:** A black icon resembling a text file or document is placed to the right of Circle A.
* **JSON Data Snippet:** Adjacent to the document icon is a block of text formatted as JSON:
```
{
"record": 2013
}
```
2. **Center Region:**
* **Circle B:** A large, solid-outlined circle labeled with the letter "B" in its center. It is positioned to the right of Circle A.
* **Dotted Circle:** A smaller circle with a dotted outline is positioned above and between Circles A and B.
3. **Right Region (Shaded Zone):**
* **Shaded Box:** A light gray rectangular area with a dashed vertical border on its left side, indicating a separate or restricted zone.
* **Database Icon:** A standard cylinder icon representing a database is located within this shaded zone.
* **Restriction Label:** A rounded rectangle label inside the shaded zone contains the text: "Not Accessible by other LLMs".
4. **Connectors (Flow Lines):**
* **Solid Arrow (A → B):** A thick, solid black arrow points directly from Circle A to Circle B. The document icon and JSON snippet are placed along this path.
* **Bidirectional Arrow (B ↔ Database):** A pair of thin, parallel arrows (one pointing right, one pointing left) connects Circle B to the Database icon, indicating two-way communication.
* **Dotted Lines:** Two dashed gray lines form a triangle. One line connects Circle A to the top Dotted Circle. The other line connects the Dotted Circle to Circle B.
### Detailed Analysis
* **Data Flow:** The primary data flow is from Component A to Component B, carrying a data record. The record's structure is shown as a JSON object with a key `"record"` and a numeric value `2013`.
* **Interaction with Database:** Component B has a bidirectional relationship with the database, suggesting it can both write data to and read data from it.
* **Inaccessible Component:** The database is explicitly contained within a shaded region labeled "Not Accessible by other LLMs," defining a security or access boundary.
* **Indirect Connection:** The dotted lines and the unlabeled dotted circle suggest an indirect, potential, or secondary relationship between A and B that bypasses the direct data path. The purpose of this dotted circle is not defined by any label.
### Key Observations
1. **Access Control:** The most prominent feature is the explicit restriction on database access for "other LLMs," highlighting a security or proprietary data boundary.
2. **Data Specificity:** The example data record `"record": 2013` is concrete, though its meaning (e.g., a year, an ID) is not explained.
3. **Unlabeled Element:** The dotted circle at the top of the diagram is a significant visual component but lacks any label or explanatory text, leaving its function open to interpretation.
4. **Flow Directionality:** The solid arrow defines a clear, one-way data transfer from A to B. The database connection is explicitly two-way.
### Interpretation
This diagram models a system where a component (A) generates or possesses a specific data record and transmits it to another component (B). Component B acts as an intermediary or processor that stores this data in a database. The critical design constraint is that this database is walled off from access by any other Large Language Models, suggesting the data is sensitive, proprietary, or part of a closed system.
The dotted triangle connection could represent several possibilities: an alternative communication channel, a monitoring or logging pathway, a potential failure mode, or a higher-level orchestration layer. Its lack of labeling is a notable information gap.
The use of "LLMs" in the restriction label is key. It implies the context is an AI or machine learning system where multiple models might operate, and this specific data store is siloed for the exclusive use of the model(s) associated with components A and B. The diagram effectively communicates an architecture of data flow with a clear emphasis on access control and isolation.
</details>
S3. Compositional tasks.
An LLM (Alice) wants to (1) analyse some market data and then (2) compute some metrics. Two LLMs in the network can perform these tasks.
1. Alice retrieves the protocol documents from a database. 2. Alice finds out that there are two protocol documents that can be used to achieve its goal.
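The lookup can be sketched as a filter over a registry of protocol documents; the registry contents and field names below are hypothetical:

```python
# Hypothetical registry of protocol documents that Alice can search.
PROTOCOL_DB = [
    {"name": "market-data-retrieval", "capability": "retrieve"},
    {"name": "metrics-computation", "capability": "compute"},
    {"name": "image-captioning", "capability": "caption"},
]

def find_protocols(capabilities: list) -> list:
    """Return the names of protocol documents matching the requested
    capabilities, in registry order."""
    return [p["name"] for p in PROTOCOL_DB if p["capability"] in capabilities]
```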
<details>
<summary>img/scenarios/s3-1.png Details</summary>

### Visual Description
## Diagram: Protocol Document and Data Processing Workflow
### Overview
The image is a technical diagram illustrating a workflow or system process. It depicts a central entity (labeled "A") interacting with three other components: a set of documents, a database, and a calculator. The diagram uses icons and directional arrows to show the flow of checks and data retrieval.
### Components/Axes
The diagram is composed of four primary visual elements arranged in a triangular layout with a central node.
1. **Document Icons (Top-Left):**
* **Position:** Located in the upper-left quadrant of the image.
* **Description:** Three identical, dark gray icons representing documents or files. Each icon shows a page with a folded corner and lines suggesting text.
* **Associated Text:** To the left of these icons, the text reads: **"Check if a protocol document exists"**.
2. **Central Node "A" (Center-Left):**
* **Position:** Located in the lower-left quadrant, below the document icons.
* **Description:** A large, white circle with a black outline containing the bold, black capital letter **"A"**.
* **Associated Text:** Below this circle, the text reads: **"Retrieve some numerical data and compute some metrics."**
3. **Database Icon (Top-Right):**
* **Position:** Located in the upper-right quadrant of the image.
* **Description:** A dark gray icon of a stacked disk cylinder, universally representing a database or data storage. It is enclosed within a white circle with a black outline and a subtle drop shadow.
4. **Calculator Icon (Bottom-Right):**
* **Position:** Located in the lower-right quadrant of the image.
* **Description:** A dark gray icon of a calculator with visible buttons (including `÷`, `×`, `-`, `+`, `=`). It is enclosed within a white circle with a black outline and a subtle drop shadow.
### Detailed Analysis
The diagram defines a process flow through explicit connections and implied relationships.
* **Explicit Flow (Solid Arrows):**
* A pair of solid black arrows connects the **Document Icons** and the **Central Node "A"**.
* One arrow points downward from the documents to "A".
* A parallel arrow points upward from "A" to the documents.
* This bidirectional arrow, paired with the text "Check if a protocol document exists," indicates a verification or query process where "A" checks the documents and receives a response.
* **Implicit Relationships (Dashed Lines):**
* A light gray dashed line connects the **Central Node "A"** to the **Database Icon**.
* Another light gray dashed line connects the **Central Node "A"** to the **Calculator Icon**.
* A third light gray dashed line connects the **Database Icon** to the **Calculator Icon**.
* These dashed lines suggest pathways for data flow or logical connections. The text below "A" ("Retrieve some numerical data and compute some metrics") clarifies that "A" retrieves data (implied from the database) and computes metrics (implied using the calculator).
### Key Observations
1. **Central Actor:** The entity labeled "A" is the central orchestrator of the process, initiating checks and data operations.
2. **Process Sequence:** The diagram implies a two-stage process:
* **Stage 1 (Verification):** "A" checks for the existence of a protocol document.
* **Stage 2 (Computation):** Following verification, "A" retrieves numerical data (from the database) and performs computations (using the calculator).
3. **Visual Hierarchy:** The solid arrows and their associated text describe the primary, defined action. The dashed lines and the second block of text describe the subsequent, implied data processing task.
### Interpretation
This diagram models a common technical workflow where a system component ("A") must first validate the presence of a governing protocol or specification before proceeding with data analysis.
* **What it represents:** The diagram illustrates a **conditional data processing pipeline**. The computation of metrics is contingent upon the successful verification of a protocol document. This is a pattern often seen in scientific computing, financial systems, or automated reporting, where calculations must adhere to a defined standard.
* **Relationships:** The connections show "A" as a middleware or controller. It interfaces with a document repository (for rules/specs), a data store (for raw inputs), and a computational engine (for processing). The direct dashed line between the database and calculator suggests that data may flow between them, possibly orchestrated by "A".
* **Notable Implication:** The separation of the "check" (for a document) from the "retrieve and compute" action highlights a design focused on **compliance and reproducibility**. The process ensures that any metrics generated are based on an approved protocol, which is critical for auditability and correctness in technical environments. The diagram does not specify the outcome if a protocol document is *not* found, leaving that logic to the implementation.
</details>
1. Alice submits a request to the first LLM to retrieve the data using the first protocol document. 2. Alice receives the data as expected.
<details>
<summary>img/scenarios/s3-2.png Details</summary>

### Visual Description
## Diagram: Two-Stage Data Processing Workflow
### Overview
The image is a black-and-white schematic diagram illustrating two related but distinct processes or stages, labeled "1" and "2". Both stages involve a central entity "A" interacting with a database, but the nature of the interaction differs, as indicated by distinct intermediary icons. The diagram uses simple line art with subtle drop shadows under the circular and database elements.
### Components/Axes
The diagram is divided into two primary regions:
**Region 1 (Left Side):**
* **Label:** A square box containing the number **"1"** is positioned in the top-left corner.
* **Central Entity:** A circle containing the capital letter **"A"** is located in the bottom-left area.
* **Target:** A cylindrical database icon (represented by stacked disks) is positioned in the upper-center area.
* **Flow:** A solid black arrow points from entity "A" to the database.
* **Intermediary Icon:** A document icon (a rectangle with a folded corner, containing lines representing text and a small square) is placed along the arrow's path, closer to entity "A".
**Region 2 (Right Side):**
* **Label:** A square box containing the number **"2"** is positioned in the bottom-right corner.
* **Central Entity:** A circle containing the capital letter **"A"** is located in the bottom-center area.
* **Target:** A cylindrical database icon, identical to the one in Region 1, is positioned in the upper-right area.
* **Flow:** A solid black arrow points from entity "A" to the database.
* **Intermediary Icon:** A scatter plot icon (an L-shaped axis with five dots plotted within it) is placed along the arrow's path, closer to the database.
### Detailed Analysis
The diagram presents a parallel structure:
1. **Process 1:** Entity "A" sends or provides a **document** (represented by the document icon) to a database. This suggests an action of data input, logging, or storage of textual/formal information.
2. **Process 2:** Entity "A" sends or provides **data for analysis** (represented by the scatter plot icon) to a database. This suggests an action of querying, analyzing, or visualizing data already stored within the database.
The identical labeling of the central entity as "A" in both stages implies it is the same actor, system, or data source performing two different types of operations on a database (or two similar databases).
### Key Observations
* **Symmetry and Variation:** The layout is asymmetric but balanced. The core structure (Entity A -> Arrow -> Database) is repeated, creating a clear comparison. The key variation is the intermediary icon, which defines the different nature of each process.
* **Iconography:** The icons are standard, universally recognizable symbols: a cylinder for a database, a document for a file/text, and a scatter plot for data analysis/visualization.
* **Directionality:** The arrows clearly indicate a unidirectional flow from entity "A" to the database in both cases.
* **Spatial Grounding:** The label "1" is top-left, anchoring the first process. The label "2" is bottom-right, anchoring the second process. The intermediary icons are placed strategically along the flow lines to modify the interpretation of the action.
### Interpretation
This diagram likely illustrates a fundamental data lifecycle or system architecture pattern. It demonstrates that a single entity ("A") can interact with a data store in multiple ways:
* **Stage 1 (Input/Storage):** Represents the **write** or **ingestion** path. "A" is the source of new information (a document) being committed to the database.
* **Stage 2 (Analysis/Output):** Represents the **read** or **analysis** path. "A" is the requester or beneficiary of insights derived from the database's contents, visualized as a scatter plot.
The separation into two numbered stages suggests they might be sequential (first store data, then analyze it) or simply categorical (two primary types of interaction). The use of the same entity "A" and database icon emphasizes that these are complementary functions within a single system. The absence of specific technical labels (like "API," "SQL," "User") makes this a high-level conceptual model applicable to various contexts, from software engineering to business process mapping.
</details>
1. Alice submits a request to the second LLM to compute some metrics on the data using the second protocol document. 2. Alice receives the metrics as expected.
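The composition can be sketched as two chained protocol calls, with stub functions standing in for the two LLMs' protocol-mediated interfaces (the payloads are invented for illustration):

```python
def retrieve_data(request: dict) -> list:
    """Stand-in for the first LLM's data-retrieval protocol."""
    return [10.0, 12.0, 11.0] if request.get("dataset") == "market" else []

def compute_metrics(data: list) -> dict:
    """Stand-in for the second LLM's metrics-computation protocol."""
    return {"mean": sum(data) / len(data), "max": max(data)}

def run_pipeline() -> dict:
    """Alice's side: chain the two protocol calls."""
    data = retrieve_data({"dataset": "market"})
    return compute_metrics(data)
```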
<details>
<summary>img/scenarios/s3-3.png Details</summary>

### Visual Description
## Diagram: Two-Step Data Processing Flow
### Overview
The image is a black-and-white technical diagram illustrating a two-step process flow between a component labeled "A" and a computational unit represented by a calculator icon. The diagram is divided into two horizontally stacked sections, numbered "1" and "2", showing a reversal in the direction of data flow and a change in the type of data being transferred.
### Components/Axes
The diagram contains the following visual elements, positioned as described:
1. **Numbered Labels:**
* A square box containing the number "1" is located in the top-left corner of the upper section.
* A square box containing the number "2" is located in the top-left corner of the lower section.
2. **Primary Nodes:**
* **Component "A":** A large circle containing the capital letter "A". This node appears twice, once in each section, positioned on the left side.
* **Calculator Unit:** A large circle containing a calculator icon. This node also appears twice, once in each section, positioned on the right side.
3. **Flow Arrows:**
* **Section 1:** A solid black arrow originates from the right side of circle "A" and points directly to the left side of the calculator circle. This indicates a left-to-right data flow.
* **Section 2:** A solid black arrow originates from the left side of the calculator circle and points directly to the right side of circle "A". This indicates a right-to-left data flow, opposite to Section 1.
4. **Data Type Icons (Positioned above the flow arrows):**
* **Section 1:** Two icons are placed above the arrow, connected by a plus sign (`+`).
* Left Icon: A document or file icon (a rectangle with a folded corner and horizontal lines suggesting text).
* Right Icon: A scatter plot icon (an L-shaped axis with several dots plotted within it).
* **Section 2:** A single icon is placed above the arrow.
* Icon: A bar chart icon (an L-shaped axis with three vertical bars of increasing height from left to right).
### Detailed Analysis
* **Step 1 Flow:** Component "A" sends a combined data payload to the Calculator Unit. The payload consists of two distinct data types: a **document** (likely raw data, text, or a report) and a **scatter plot** (likely representing raw data points or a statistical distribution). The plus sign indicates these are sent together.
* **Step 2 Flow:** The Calculator Unit sends a single data product back to Component "A". This product is a **bar chart**, suggesting the output is a processed, aggregated, or summarized visualization of the data received in Step 1.
### Key Observations
1. **Directional Reversal:** The core observation is the reversal of the data flow arrow between steps 1 and 2, establishing a clear request-response or input-output cycle.
2. **Data Transformation:** The input (document + scatter plot) is transformed into a different output format (bar chart). This implies a computational or analytical process occurs within the Calculator Unit.
3. **Iconography Consistency:** The calculator icon is used consistently for the processing unit. The data icons are standard symbols for their respective data types (document, scatter plot, bar chart).
### Interpretation
This diagram models a fundamental data processing or analysis pipeline.
* **What it suggests:** It demonstrates a system where an entity ("A") provides raw or detailed data (a document and a scatter plot) to a processing engine (the calculator). The engine performs computations, analysis, or aggregation on this input and returns a summarized, visual result (a bar chart) back to the originating entity.
* **Relationship between elements:** "A" is the data source and consumer of results. The Calculator is the transformation engine. The arrows define the transactional relationship. The change in data icons visually represents the value added by the processing step—converting raw, granular data into a higher-level insight.
* **Notable pattern:** The process is linear and closed-loop. There is no indication of external data sources or sinks; the cycle is contained between "A" and the Calculator. The simplicity suggests this could be a conceptual model for a specific function, like generating a summary report from raw datasets, rather than a detailed system architecture.
</details>
S4. Scalable consensus in large networks.
An LLM (Alice) wants to collect and aggregate data points from $N\gg 1$ resources. There is no existing protocol to handle this, and each resource has its own implementation, which may not be public.
1. Alice submits the requests in natural language.
2. Each queried LLM processes the request, turns it into a routine to retrieve the data, and sends the result back to Alice.
<details>
<summary>img/scenarios/s4-1.png Details</summary>

### Visual Description
## Diagram: Two-Phase Data Request and Retrieval Process
### Overview
The image displays a two-panel technical diagram illustrating a sequential, two-phase communication or data flow process. The diagram is divided into two distinct sections, labeled "1" and "2" in the top-left corners of their respective panels. The process involves a central entity, labeled "A", interacting with a set of multiple, unlabeled peripheral entities. The flow of information or requests changes direction between the two phases.
### Components/Axes
* **Panel 1 (Left):**
* **Label:** A square box containing the number **"1"** is positioned in the top-left corner.
* **Central Node:** A circle labeled with the capital letter **"A"**.
* **Peripheral Nodes:** Four circles are arranged vertically to the right of node "A". An ellipsis **"..."** is placed between the second and third circles, indicating an indefinite number of additional nodes in this group.
* **Connections:** Four solid black arrows originate from node "A" and point towards each of the visible peripheral nodes.
* **Descriptive Text:** The phrase **"Request data"** is positioned above the arrows, describing the action of this phase.
* **Grouping Indicator:** A dashed, light-gray line connects the peripheral nodes vertically on their right side, suggesting they belong to a common group or cluster.
* **Panel 2 (Right):**
* **Label:** A square box containing the number **"2"** is positioned in the top-left corner.
* **Central Node:** A circle labeled with the capital letter **"A"**, identical to the one in Panel 1.
* **Peripheral Nodes:** The same arrangement of four circles and an ellipsis **"..."** as seen in Panel 1.
* **Connections:** Four solid black arrows originate from the peripheral nodes and point towards the central node "A".
* **Descriptive Text:** The phrase **"Process and retrieve data"** is positioned above the arrows, describing the action of this phase.
* **Grouping Indicator:** The same dashed, light-gray line connects the peripheral nodes vertically on their right side.
### Detailed Analysis
The diagram depicts a clear, two-step sequence:
1. **Phase 1 (Request):** The central actor "A" initiates communication by sending requests outward to multiple, distinct entities. The arrow direction (from "A" to the group) and the label "Request data" confirm this is a one-to-many broadcast or multicast request.
2. **Phase 2 (Response/Processing):** The flow reverses. The multiple entities now send information back to the central actor "A". The arrow direction (from the group to "A") and the label "Process and retrieve data" indicate this is a many-to-one aggregation, processing, or response phase.
The consistent use of the same node labels ("A" and the unlabeled circles) and the dashed grouping line across both panels confirms that the same set of actors is involved in both phases of the process. The ellipsis is a critical component, explicitly stating that the diagram is a simplified representation and the system can scale to include more than the four peripheral nodes shown.
### Key Observations
* **Symmetry and Reversal:** The diagram's primary visual feature is the symmetrical reversal of arrow direction between the two panels, highlighting the request-response nature of the process.
* **Centralized Architecture:** The topology is a classic star network, with "A" as the central hub and the other nodes as spokes. This implies a centralized control or coordination point.
* **Abstraction:** The peripheral nodes are intentionally left unlabeled, making the diagram a generic model applicable to various scenarios (e.g., client-server, master-worker, aggregator-services).
* **Sequential Phasing:** The numbered labels ("1" and "2") enforce a strict temporal order: the request phase must precede the processing/retrieval phase.
### Interpretation
This diagram is a fundamental model for a **synchronous or asynchronous distributed data retrieval pattern**. It visually answers the question: "How does a central component gather data from multiple sources?"
* **What it demonstrates:** It illustrates the decoupling of the *request initiation* from the *data processing/aggregation*. Phase 1 is about task distribution or query broadcasting. Phase 2 is about result collection and synthesis.
* **Relationships:** The relationship is hierarchical and transactional. "A" is the orchestrator or client, while the peripheral nodes are workers, services, or data sources. The dashed line suggests the peripheral nodes may be peers or part of a unified system (like a database shard cluster or a microservice fleet) from the perspective of "A".
* **Notable Implications:** The model implies potential latency considerations—the total time for the process is at least the sum of the time for the slowest request (Phase 1) and the slowest response (Phase 2). It also suggests the need for "A" to manage multiple concurrent connections and to handle partial failures if some peripheral nodes do not respond. The "Process and retrieve data" label in Phase 2 is key; it indicates that the peripheral nodes are not merely returning raw data but may be performing computation, filtering, or formatting before sending the result back.
</details>
Alice wants to retrieve more data and queries the network again.
1. One or more receivers suggest using a protocol document for the next iterations.
2. Alice agrees and uses the protocol with as many resources as possible.
<details>
<summary>img/scenarios/s4-2.png Details</summary>

### Visual Description
## Diagram: Centralized Protocol Negotiation Network
### Overview
The image is a black-and-white schematic diagram illustrating a network communication or negotiation process. It depicts a central entity ("A") engaging in bilateral negotiations with multiple other entities, which are also interconnected among themselves.
### Components/Axes
* **Primary Node:** A large circle on the left side of the diagram, labeled with the capital letter "**A**".
* **Secondary Nodes:** A vertical column of four smaller, identical circles on the right side. Between the second and third circles from the top, an ellipsis ("**...**") indicates the presence of additional, unspecified nodes in this group.
* **Communication Links (Solid Arrows):**
* Four solid, black arrows originate from node "**A**" and point towards each of the visible secondary nodes on the right.
* Four corresponding solid, black arrows originate from each secondary node and point back towards node "**A**".
* This creates four distinct bidirectional communication paths between "A" and each secondary node.
* **Inter-Node Links (Dashed Lines):** Light gray, dashed lines connect the secondary nodes to each other in a vertical chain. A dashed line connects the top node to the second, the second to the third, and the third to the bottom node. Another dashed line arcs from the top node to the bottom node, suggesting a potential ring or mesh topology among the secondary nodes.
* **Text Labels:** The phrase "**⟨negotiation of the protocol⟩**" appears twice, written in a sans-serif font and aligned along the diagonal of two of the bidirectional arrow sets (the topmost and bottommost connections between "A" and the right-side nodes).
### Detailed Analysis
* **Flow Direction:** The primary flow is bidirectional between the central node "A" and each peripheral node. The dashed lines indicate a secondary, possibly latent or peer-to-peer, connection structure among the peripheral nodes themselves.
* **Spatial Grounding:**
* Node "A" is positioned in the center-left of the frame.
* The column of secondary nodes is positioned in the center-right of the frame.
* The text "⟨negotiation of the protocol⟩" is placed along the upper-left and lower-left diagonals, adjacent to the corresponding arrow sets.
* **Component Isolation:**
* **Header/Label:** The diagram is self-labeling with the text "negotiation of the protocol."
* **Main Diagram:** Contains all nodes and connecting lines.
* **Footer:** No footer is present.
### Key Observations
1. **Centralized Coordination:** The architecture is hub-and-spoke, with "A" acting as the central coordinator or initiator for all protocol negotiations.
2. **Bilateral Negotiation:** Each negotiation is a direct, one-on-one interaction between "A" and a single secondary node, as indicated by the dedicated bidirectional arrows.
3. **Implicit Peer Network:** The dashed lines reveal an underlying structure among the secondary nodes, suggesting they form a group, network, or cluster that may share information or have relationships independent of "A."
4. **Scalability Indication:** The ellipsis ("...") explicitly denotes that the system is designed to handle more than the four secondary nodes shown.
5. **Uniformity:** All secondary nodes are visually identical, implying they hold a similar role or status relative to "A" in this negotiation phase.
### Interpretation
This diagram models a common pattern in distributed systems, networking, or multi-agent systems. It visually answers the question: "How does a central entity establish a protocol with multiple participants who are also connected to each other?"
* **What it demonstrates:** The data suggests a two-tiered communication model. The primary, explicit process is a series of parallel, independent negotiations orchestrated by "A." The secondary, implicit process is the existing or resulting connectivity between the participants themselves.
* **Relationships:** The solid arrows represent active, controlled communication channels for the specific purpose of "negotiation of the protocol." The dashed lines represent a different type of relationship—perhaps a physical network topology, a logical group membership, or a secondary data channel that exists before or as a result of the negotiations.
* **Notable Implications:** The design implies that "A" must manage multiple concurrent negotiation sessions. The presence of the peer network (dashed lines) could be a factor in the negotiation (e.g., nodes may coordinate their responses) or could be the intended outcome (e.g., establishing the protocol enables the peer network to function). The diagram captures the moment of setup or configuration, not the steady-state operation of the system after the protocol is agreed upon.
</details>
Successive communications will increasingly use protocol documents, sparing the receiver from processing each query with the LLM.
S5. Scaling complex NLP routines.
An LLM (Alice) wants to retrieve data from a system powered by an LLM (Bob) that, in turn, obtains its data from a search engine (i.e., the LLM is combined with a RAG). Bob has to (1) turn the natural language request into a query, (2) retrieve the data from the RAG, and (3) return a summary.
Alice queries Bob to retrieve some data. There is no routine to handle any of the three phases, so Bob has to invoke the LLM twice: once to turn the query into a format suitable for invoking the RAG, and once to perform the summarisation.
<details>
<summary>img/scenarios/s5-1.png Details</summary>

### Visual Description
## Diagram: Four-Step RAG-Enhanced LLM Interaction Flow
### Overview
The image is a technical diagram illustrating a four-step process for handling a user query using a Large Language Model (LLM) augmented with a Retrieval-Augmented Generation (RAG) system. The diagram is divided into four numbered quadrants (1-4), each depicting a stage in the interaction between two primary entities, labeled "A" and "B". The flow demonstrates how a factual query is processed, enriched with external data, and summarized.
### Components/Elements
* **Primary Entities:** Two circles labeled **A** and **B** appear in each step. Based on the context, "A" likely represents the **User/Client**, and "B" represents the **AI System/LLM Service**.
* **Communication Flow:** Arrows between A and B indicate the direction of information transfer.
* **Numbered Steps:** Each stage is marked with a number (1, 2, 3, 4) inside a square box in the top-left corner of its quadrant.
* **Descriptive Text:** Each step includes a text label describing the action occurring.
* **External System Icon (Step 3):** A magnifying glass with a gear inside, set against a light gray background and separated from the main diagram by a vertical dashed line. This icon represents the **RAG (Retrieval-Augmented Generation) system** or an external knowledge base.
### Detailed Analysis
The process unfolds sequentially from step 1 to step 4:
**Step 1 (Top-Left Quadrant):**
* **Visual:** An arrow points from circle A to circle B.
* **Text:** `"What's the highest mountain in the world"`
* **Action:** Entity A (User) sends a direct factual query to Entity B (AI System).
**Step 2 (Bottom-Left Quadrant):**
* **Visual:** A solid line connects A and B, with no arrowhead, indicating an ongoing connection or processing state.
* **Text:** `<Processes the query with an LLM>`
* **Action:** Entity B (AI System) begins processing the received query using its internal Large Language Model.
**Step 3 (Top-Right Quadrant):**
* **Visual:** A solid line connects A and B. To the right of B, a double-headed arrow (`<-->`) crosses a vertical dashed line to connect with the RAG system icon.
* **Text:** `<Invokes the RAG>`
* **Action:** Entity B (AI System) determines that external information is needed and invokes the RAG system. The double-headed arrow signifies a bidirectional data retrieval request and response.
**Step 4 (Bottom-Right Quadrant):**
* **Visual:** An arrow points from circle B back to circle A.
* **Text:** `<Summarises the content with an LLM>`
* **Action:** Entity B (AI System), having received information from the RAG system, uses its LLM to synthesize and summarize the retrieved content into a final answer, which is then sent back to Entity A (User).
### Key Observations
1. **Sequential Dependency:** The steps are strictly ordered. Step 3 (RAG invocation) is contingent on the processing in Step 2, and the final summary in Step 4 depends on the data retrieved in Step 3.
2. **Role of RAG:** The RAG system is depicted as an external component, separate from the core A-B interaction. Its invocation is a critical intermediate step for answering factual queries that require up-to-date or specialized knowledge not contained within the LLM's base training data.
3. **LLM's Dual Role:** The LLM (within Entity B) is used twice: first to process and understand the initial query (Step 2), and later to synthesize the retrieved information into a coherent response (Step 4).
### Interpretation
This diagram provides a clear, high-level schematic of a **Retrieval-Augmented Generation (RAG) pipeline**. It visually explains the core value proposition of RAG: enhancing an LLM's capabilities by dynamically fetching relevant information from an external source before generating a response.
* **The Problem:** A standard LLM (Step 2) might have outdated or incomplete knowledge to answer "What's the highest mountain in the world?" accurately, especially if recent geological data or specific definitions are involved.
* **The Solution:** The system bypasses this limitation by **grounding** its response in retrieved facts. Step 3 shows the system actively seeking authoritative data. Step 4 shows the LLM not just reciting raw data, but **summarizing** it, which implies generating a natural, user-friendly answer based on the retrieved evidence.
* **Significance:** This pattern is fundamental to building reliable, factual, and up-to-date AI assistants. It reduces hallucinations (making up facts) by anchoring responses in verifiable external data. The diagram effectively communicates this workflow to a technical audience, highlighting the interaction between the user, the core AI model, and the supporting knowledge retrieval system.
</details>
Alice queries Bob again; this time, Bob proposes a routine to query the RAG directly.
<details>
<summary>img/scenarios/s5-2.png Details</summary>

### Visual Description
## Diagram: Three-Step Query Processing Workflow with RAG
### Overview
The image is a technical diagram illustrating a three-step workflow for processing a query, involving two primary entities labeled "A" and "B". The diagram is divided into three numbered panels (1, 2, 3) arranged horizontally, with an additional faded, dashed-red-boxed panel below panel 1 labeled "SKIP". The overall flow depicts a sequence where a query is formatted, a Retrieval-Augmented Generation (RAG) system is invoked, and finally, content is summarized.
### Components/Axes
* **Primary Entities:** Two circles labeled **A** and **B** appear in each step. They represent actors or components in the system.
* **Numbered Steps:** Each major interaction is labeled with a number in a square box in the top-left corner of its panel: **1**, **2**, and **3**.
* **Icons:**
* A **document icon** (black with white lines) appears above the arrow in Step 1.
* A **magnifying glass icon** with an orange gear inside appears to the right of Step 2, enclosed in a light gray box with a dashed left border.
* **Arrows:** Directional arrows indicate the flow of data or commands between entities.
* **Text Annotations:** Descriptive text in angle brackets (`< >`) is placed below or near the arrows to describe the action.
* **"SKIP" Panel:** A faded, grayed-out version of the Step 1 diagram is enclosed in a **red dashed rectangle**. The word **"SKIP"** is written in red, bold, uppercase letters at the top-right corner of this rectangle.
### Detailed Analysis
The diagram details a specific sequence of operations:
**Step 1 (Top-Left Panel):**
* **Flow:** An arrow points from **A** to **B**.
* **Visual Element:** A document icon is positioned above the arrow.
* **Text Annotation:** Below the arrow, the text reads: `<Formatted query>`.
* **Interpretation:** Entity A sends a formatted query to Entity B.
**Step 2 (Top-Right Panel):**
* **Flow:** A double-headed arrow (pointing both left and right) connects **B** to the magnifying glass icon.
* **Visual Element:** The magnifying glass with a gear is inside a gray box, suggesting an external system or service.
* **Text Annotation:** Below the arrow, the text reads: `<Invokes the RAG>`.
* **Interpretation:** Entity B interacts bidirectionally with a RAG (Retrieval-Augmented Generation) system, likely to retrieve relevant information.
**Step 3 (Bottom-Right Panel):**
* **Flow:** An arrow points from **B** back to **A**.
* **Text Annotation:** Below the arrow, the text reads: `<Summarises the content with an LLM>`.
* **Interpretation:** Entity B sends a summary back to Entity A. The summary is generated using a Large Language Model (LLM).
**"SKIP" Panel (Bottom-Left, below Step 1):**
* **Visual State:** The circles for A and B and the connecting arrow are faded to light gray.
* **Bounding Box:** The entire panel is enclosed in a red dashed rectangle.
* **Label:** The word **"SKIP"** is prominently displayed in red.
* **Text Annotation:** Below the faded arrow, the text reads: `<Processes the query with an LLM>`.
* **Interpretation:** This panel represents an alternative or previous step that is being bypassed or omitted in the current workflow. Instead of directly processing the query with an LLM (as shown here), the workflow proceeds through the RAG invocation in Steps 1-2-3.
### Key Observations
1. **Sequential Flow:** The numbered steps (1 → 2 → 3) define a clear, linear process.
2. **Role of RAG:** The RAG system is invoked as an intermediate step (Step 2) between receiving the query and returning the summary. This suggests the system retrieves external or specific knowledge before generation.
3. **LLM Usage:** The LLM is explicitly mentioned in two contexts: in the "SKIP" panel for direct query processing, and in Step 3 for summarization. This implies the LLM's role is focused on synthesis (summarization) rather than initial query handling in the active workflow.
4. **Visual Emphasis on "SKIP":** The use of red color, dashed lines, and fading strongly indicates that the direct "Processes the query with an LLM" path is not taken in this specific process flow.
### Interpretation
This diagram illustrates a **Retrieval-Augmented Generation (RAG) pipeline** for query answering. The core narrative is the replacement of a direct LLM query processing step with a more sophisticated retrieval-then-summarize approach.
* **What it demonstrates:** The workflow shows how a system (represented by A and B) can generate more informed or accurate responses by first retrieving relevant documents or data (via the RAG system in Step 2) before using an LLM to synthesize a summary (Step 3). The "SKIP" panel highlights this architectural choice, contrasting the RAG-based method with a simpler, direct LLM call.
* **Relationships:** Entity A appears to be the initiator/client (sends query, receives summary). Entity B acts as the orchestrator or processor (receives query, calls RAG, manages summarization). The RAG system is an external tool or database.
* **Notable Implication:** The diagram argues for the value of grounding LLM responses in retrieved information. By skipping the direct LLM processing and instead invoking RAG, the system likely aims to reduce hallucinations, improve factual accuracy, and incorporate up-to-date or domain-specific knowledge that the base LLM might not possess. The final output is not the raw retrieved data but a summarized synthesis, indicating the LLM's role is to digest and present the retrieved information coherently.
</details>
Any query that complies with the document protocol now skips the first phase and directly invokes the RAG.
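Bob's dispatch logic in this scenario can be sketched as follows. This is a minimal illustration, not the paper's implementation: `llm` and `rag_search` are hypothetical stand-ins for the actual LLM and RAG calls.

```python
def llm(prompt):
    """Stand-in for an actual LLM invocation."""
    return f"<llm:{prompt}>"

def rag_search(query):
    """Stand-in for the search-engine-backed RAG."""
    return [f"doc-for:{query}"]

def handle_query(raw_query, uses_protocol=False):
    """Bob's dispatch in scenario S5: a protocol-compliant query skips the
    first LLM phase and goes straight to the RAG; a natural-language query
    needs two LLM invocations (query rewriting, then summarisation)."""
    rag_query = raw_query if uses_protocol else llm(raw_query)
    documents = rag_search(rag_query)
    return llm(str(documents))
```

A protocol-compliant query thus triggers one LLM call (summarisation) instead of two, which is the saving the scenario describes.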
## Appendix B Agora Specification
In this section, we provide a more formal description of Agora.
### B.1 Transactions
An Agora transaction operates as follows. Suppose that an agent, Alice, is trying to communicate with another agent, Bob:
- Alice sends to Bob over HTTPS a JSON document containing three fields:
- protocolHash: The hash of the protocol document. If natural language is used, then the value of protocolHash is null;
- protocolSources: A list of URIs where the protocol document can be found. Must be empty if protocolHash is null and non-empty otherwise;
- body: A string containing the body of the request as specified by the given protocol.
- If Bob does not have the protocol document, he fetches it (either from the sources provided by Alice or from another repository);
- If Bob is unable to use the protocol, he returns a JSON document with one field, namely status, which is equal to “rejected”;
- Otherwise, Bob computes the response using the LLM, routines, or a combination of both;
- Bob sends as response a JSON document with the following fields:
- status: a string indicating the status of the response (can be “success” or “failure”);
- body: the response returned by the agent.
- Note that "status":"failure" must be used only for errors that are not covered by the protocol document (e.g., the agent failing to instantiate the LLM); when the protocol prescribes how to handle an error, the agent should return "status":"success" and the correct error message as body.
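The request and response documents above can be sketched as plain JSON construction and parsing. This is a minimal illustration of the three-field format, not a reference implementation; the helper names are ours.

```python
import json

def build_agora_request(body, protocol_hash=None, protocol_sources=None):
    """Build the three-field JSON document Alice sends to Bob over HTTPS.
    protocol_hash=None signals natural language."""
    sources = list(protocol_sources or [])
    # protocolSources must be empty iff protocolHash is null.
    if (protocol_hash is None) != (len(sources) == 0):
        raise ValueError("protocolSources must be empty iff protocolHash is null")
    return json.dumps({
        "protocolHash": protocol_hash,
        "protocolSources": sources,
        "body": body,
    })

def parse_agora_response(raw):
    """Interpret Bob's reply. 'failure' is reserved for errors not covered by
    the protocol document; protocol-level errors come back as 'success' with
    the error message in the body."""
    reply = json.loads(raw)
    return reply["status"], reply.get("body")
```

Note that a `"rejected"` status carries no body: the agent simply declines the protocol.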
### B.2 Protocol Documents
A protocol document is identified by its SHA1 hash. Protocol documents can refer to other protocol documents by adding a preamble containing the protocol hash and at least one source for the referenced document. Further discussion concerning the optimal way to reference other protocol documents is left to future work.
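Computing a document's identifier is straightforward; a sketch (assuming UTF-8 encoding of the document text, which the specification does not pin down):

```python
import hashlib

def protocol_hash(document: str) -> str:
    """A protocol document is identified by the SHA1 hash of its text."""
    return hashlib.sha1(document.encode("utf-8")).hexdigest()
```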
### B.3 Preferred Protocols
In order to signal to other agents which protocols are supported by default, an agent can expose an endpoint `/.wellknown`. This endpoint returns a JSON object where each key is a supported protocol hash and each value is a non-empty list of protocol sources.
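A minimal sketch of the object served at `/.wellknown` (in the demo these endpoints are Flask servers; here we only show the response body, and the hash and URL are made up for illustration):

```python
import json

# Hypothetical protocol hash and source, for illustration only.
PREFERRED_PROTOCOLS = {
    "d3486ae9136e7856bc42212385ea797094475802": [
        "https://protocols.example.org/weather-forecast"
    ],
}

def wellknown_response():
    """Body served at /.wellknown: each key is a supported protocol hash,
    each value a non-empty list of protocol sources."""
    if not all(sources for sources in PREFERRED_PROTOCOLS.values()):
        raise ValueError("every protocol must list at least one source")
    return json.dumps(PREFERRED_PROTOCOLS)
```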
## Appendix C Example of Negotiation Process Between Two Agents
We report the negotiation process between Alice and Bob for a protocol to transmit weather forecasts. Note that while the LLMs used Markdown-style formatting, for the sake of readability we replaced Markdown syntax with the corresponding typeset style.
Alice
Hello. I’m here to negotiate a protocol to query a weather forecast service. The service should accept a date and a location as input and return the predicted weather condition, temperature, and precipitation for that day. Before we start, I’d like to clarify the rules for our negotiation. We should focus on designing a simple and unambiguous protocol that specifies the exact format of the messages sent and received. We shouldn’t worry about how the messages will be delivered, and we should assume that the implementation will receive a string and return a string. To begin, I propose that we use a JSON-based protocol. The input message could be a JSON object with two properties: “date” and “location”. The output message could be a JSON object with three properties: “temperature”, “precipitation”, and “weatherCondition”. What are your thoughts on this proposal?
Bob
Your proposal aligns well with the goal of keeping the protocol simple and unambiguous. Using a JSON-based protocol for both the input and output messages is a good approach. Here’s a more detailed specification for the protocol:

**Input Message.** The input message will be a JSON object with the following structure: `{ "date": "YYYY-MM-DD", "location": "string" }`

- `date`: A string representing the date for which the weather forecast is requested, in the format YYYY-MM-DD.
- `location`: A string representing the location for which the weather forecast is requested.

**Output Message.** The output message will be a JSON object with the following structure: `{ "temperature": number, "precipitation": number, "weatherCondition": "string" }`

- `temperature`: A number representing the predicted temperature for that day in °C.
- `precipitation`: A number representing the predicted precipitation for that day in mm.
- `weatherCondition`: A string representing the predicted weather condition for that day. Possible values are "sunny", "cloudy", "rainy", "snowy".

**Example.**

Input: `{ "date": "2023-10-01", "location": "New York" }`

Output: `{ "temperature": 22.5, "precipitation": 5.0, "weatherCondition": "cloudy" }`

If you agree with this structure, we can finalize the protocol.
Alice
I agree with the proposed protocol structure. It aligns well with the requirements and is simple and unambiguous.
Finalised Protocol
**Name:** Weather Forecast Query Protocol

**Description:** A protocol for querying the weather forecast for a given date and location.

**Input Message.** The input message will be a JSON object with the following structure: `{ "date": "YYYY-MM-DD", "location": "string" }`

- `date`: A string representing the date for which the weather forecast is requested, in the format YYYY-MM-DD.
- `location`: A string representing the location for which the weather forecast is requested.

**Output Message.** The output message will be a JSON object with the following structure: `{ "temperature": number, "precipitation": number, "weatherCondition": "string" }`

- `temperature`: A number representing the predicted temperature for that day in °C.
- `precipitation`: A number representing the predicted precipitation for that day in mm.
- `weatherCondition`: A string representing the predicted weather condition for that day. Possible values are "sunny", "cloudy", "rainy", "snowy".

**Example.**

Input: `{ "date": "2023-10-01", "location": "New York" }`

Output: `{ "temperature": 22.5, "precipitation": 5.0, "weatherCondition": "cloudy" }`
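A receiver implementing the negotiated protocol needs to validate incoming replies against it. A minimal sketch of such a check (the function name is ours, not part of the negotiated protocol):

```python
import json

ALLOWED_CONDITIONS = {"sunny", "cloudy", "rainy", "snowy"}

def validate_forecast_reply(raw: str) -> dict:
    """Check a reply against the Weather Forecast Query Protocol:
    numeric temperature (°C), numeric precipitation (mm), and one of
    the four allowed weather conditions."""
    reply = json.loads(raw)
    if not isinstance(reply["temperature"], (int, float)):
        raise ValueError("temperature must be a number (in °C)")
    if not isinstance(reply["precipitation"], (int, float)):
        raise ValueError("precipitation must be a number (in mm)")
    if reply["weatherCondition"] not in ALLOWED_CONDITIONS:
        raise ValueError("unknown weatherCondition")
    return reply
```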
## Appendix D 100 Agent Demo - Full Description
### D.1 Implementation Notes
We implemented our demo in Python, using Flask servers as the basis for our agents. Each agent is either a user or a server:
- Users receive a random task, some randomly generated data and a description of the task data (including its schema). Their objective is to execute the requested action and return a reply according to a certain schema. This allows us to generate a large number of queries without needing to handcraft them. Note that all tasks are single-round, i.e. they can be fulfilled in one round of communication;
- Servers receive queries from other users and reply to them using a combination of three types of tools:
- Database tools, which involve connecting to a personal SQL or MongoDB database (assigned at random). Depending on the server, some databases are initialised with dummy data;
- Mock tools, which are simplifications of actual tools (e.g., for taxi service agents, the assignTaxi tool is a mock tool that, instead of actually sending a taxi to a location, mimics the request flow);
- External tools, which are tools that enable the agent to start an Agora communication with a predefined server, although no information about the respective agent’s schema is provided. In other words, the skiLodge agent can open a channel with the weatherService agent.
Moreover, we added three protocol databases, which are simple Flask servers that host protocol documents. The first protocol database is a peer of the second, which in turn is a peer of the third (but the first protocol database is not a peer of the third). Every 10 executed queries, one protocol database shares its protocol documents with its peers. This simulates the propagation of protocol documents between different databases.
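The propagation mechanism can be sketched as follows; this is an illustrative model of the peer topology described above, not the demo's Flask code.

```python
class ProtocolDatabase:
    """Hosts protocol documents and shares them with its peers, mimicking
    the demo's periodic propagation (every 10 executed queries)."""

    def __init__(self, name):
        self.name = name
        self.peers = []       # adjacent protocol databases
        self.documents = {}   # protocol hash -> document text

    def share_with_peers(self):
        """Push all locally stored documents to every direct peer."""
        for peer in self.peers:
            peer.documents.update(self.documents)
```

Because the first and third databases are not peers, a document submitted to the first one only reaches the third after the second database's own sharing round, which is exactly the gradual propagation the demo simulates.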
#### Picking a Protocol
Users track the number of communications with a given server about a certain type of task until it hits one of two thresholds: one for using a protocol instead of natural language and one for negotiating a protocol ex novo.
When the first threshold is hit, the user invokes the LLM to check whether either the server or the reference protocol database (which is randomly assigned to the user at the start of the demo) already has a suitable protocol. If there is none, the user continues using natural language until the second threshold is hit: in that case, the user begins a negotiation with the server and submits the final protocol to the reference protocol database.
Similarly, each server has a counter that tracks the number of natural language communications with any user since the last negotiation. Once the counter hits a threshold, the server requests a negotiation with the user, regardless of how many of the tracked queries were sent by the current user. After negotiation, the counter is reset.
In our demo, we set the thresholds for the user to 3 and 5 communications respectively, and the threshold for the server to 10.
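The user-side threshold logic above can be sketched as a small counter; the class and action names are ours, chosen for illustration.

```python
class ProtocolPicker:
    """Tracks per-(server, task) natural-language exchanges and decides when
    to look up an existing protocol (first threshold) or negotiate one ex
    novo (second threshold). Demo values: 3 and 5."""

    def __init__(self, use_threshold=3, negotiate_threshold=5):
        self.use_threshold = use_threshold
        self.negotiate_threshold = negotiate_threshold
        self.counts = {}

    def record(self, server, task):
        key = (server, task)
        self.counts[key] = self.counts.get(key, 0) + 1
        n = self.counts[key]
        if n >= self.negotiate_threshold:
            return "negotiate"          # negotiate a protocol ex novo
        if n >= self.use_threshold:
            return "lookup"             # check server / protocol database
        return "natural-language"
```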
#### APIs
For GPT-4o and Gemini 1.5 Pro, we used the OpenAI and Google APIs, respectively. For Llama 3 405b, we used the SambaNova API. Prices per million tokens are reported in Table 1.
Table 1: Prices (USD) per million tokens at the time of writing.

| Model | Input | Output |
| --- | --- | --- |
| GPT-4o | 5.00 | 15.00 |
| Llama 3 405b | 5.00 | 10.00 |
| Gemini 1.5 Pro | 3.50 | 10.50 |
#### Bootstrapping Quality-of-Life Extensions
To bootstrap the network, we added two features to our nodes while implementing the demo:
- Providing each node with a simple protocol for multi-round communication in natural language;
- Allowing the protocol document to include machine-readable metadata, such as the name or a short description of the protocol. This helps an agent quickly determine which protocols, among a list of candidates, may be suitable for a given task.
We leave to future work whether these features should be integrated into the Agora standard or handled using protocol documents (PDs) only.
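As an illustration, a protocol document carrying such machine-readable metadata might look like the following. The field names and schema are assumptions made for this sketch, not part of the Agora standard; the hashing step only illustrates how a document can be given a content-derived identifier that all databases storing it agree on.

```python
# Hypothetical shape of a protocol document with machine-readable metadata.
import hashlib
import json

protocol_document = {
    "metadata": {
        "name": "weather-forecast-request",
        "description": "Request a multi-day weather forecast for given coordinates.",
        "multiround": False,
    },
    # The body is the protocol text that the agents actually implement.
    "body": "A JSON request with fields 'latitude', 'longitude', 'days'...",
}

# Identify the document by the hash of its canonical serialisation, so that
# two databases storing the same document derive the same identifier.
document_text = json.dumps(protocol_document, sort_keys=True)
document_hash = hashlib.sha256(document_text.encode()).hexdigest()
```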
### D.2 Experimental Setup
#### Preliminary Tests
We first ran a series of qualitative tests to determine which among the considered LLMs (GPT-4o, Llama 3 405b, Gemini 1.5 Pro) were the most suitable for negotiation and programming. We found that while all three LLMs were capable of negotiating and implementing protocols, GPT-4o was the most robust, followed by Llama 3 405b and finally Gemini 1.5 Pro. Surprisingly, the main factor behind the brittleness of Gemini 1.5 Pro was not the model’s inherent performance but the lack of robustness of the API itself: even with tailored retry systems, the API sometimes failed nondeterministically (i.e. the same query would at times succeed and at times fail). We believe this was due to temporary server issues rather than fundamental problems with the model.
#### LLM Distribution
In light of our preliminary results, we manually assigned a model to each server node, following a power law consistent with our findings (9 nodes with GPT-4o, 4 nodes with Llama 3 405b, 2 nodes with Gemini 1.5 Pro). User agents were instead randomly assigned one of the three LLMs with uniform distribution. Overall, the breakdown of nodes by model is:
- GPT-4o: 38 nodes (9 server nodes, 29 user nodes)
- Llama 3 405b: 32 nodes (4 server nodes, 28 user nodes)
- Gemini 1.5 Pro: 30 nodes (2 server nodes, 28 user nodes)
Out of 1000 queries, 8 (0.8% of the total query volume) failed due to Google’s Gemini API not responding. This phenomenon was unrelated to the use of Agora, with HTTP 500 Internal Server Error responses appearing in both the Agora demo and the natural language counterfactual with roughly the same frequency.
#### Task Distribution
To simulate the heterogeneity in communication frequency (i.e. how some nodes tend to be more active than others), we assigned to each user a “query budget” (which represents how many queries are sent by a given user) following a Pareto distribution with shape parameter equal to $0.5$ , adapted so that each user has at least 1 query. The query budget is then split between three randomly chosen types of queries using a Pareto law with a shape parameter of 1 and a minimum of 1 query per type (unless the budget is less than 3 queries). See Figure 6 for a visualisation of the distribution.
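The sampling procedure above can be sketched as follows. The function names and the rounding/normalisation details are assumptions for this sketch; only the distribution shapes (Pareto with shape 0.5 for budgets, shape 1 for the split), the minimum of 1 query, and the three query types come from our setup.

```python
# Sketch of the query-budget sampling: Pareto(0.5) per-user budgets with a
# minimum of 1 query, then a Pareto(1) proportional split across three task
# types with at least 1 query per type whenever the budget allows it.
import random

def sample_budget(shape=0.5, minimum=1):
    # random.paretovariate(shape) samples a Pareto distribution with
    # scale 1, so raw draws are always >= 1.
    return max(minimum, int(random.paretovariate(shape)))

def split_budget(budget, n_types=3, shape=1.0):
    if budget < n_types:
        # Too few queries to guarantee one per type: give them all to one type.
        return [budget] + [0] * (n_types - 1)
    # Draw Pareto weights and allocate proportionally, at least 1 per type.
    weights = [random.paretovariate(shape) for _ in range(n_types)]
    total = sum(weights)
    split = [max(1, int(budget * w / total)) for w in weights]
    # Fix rounding so the split sums exactly to the budget.
    split[split.index(max(split))] += budget - sum(split)
    return split

random.seed(0)
budgets = [sample_budget() for _ in range(85)]  # 85 user nodes in the demo
```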
Figure 6: Distribution of query budgets for users. The y axis is logarithmic.
### D.3 Additional Observations
#### Cost Breakdown
The breakdown of cost by activity is as follows:
- Natural language communication: 54%;
- Negotiation: 6%;
- Checking the suitability of existing protocols: 22%;
- Implementing the protocols: 17%.
Note that negotiation, despite being the most expensive activity per instance (since it involves several rounds of communication), represented the smallest contribution to the total cost, with cheaper but more frequent operations (i.e. sending natural language messages and checking the suitability of protocols) making up the largest portion.
#### Similar Protocols
Due to the (intentional) partial isolation of nodes in the network, similar protocols sometimes emerged independently. Nevertheless, agents using different default protocols were still able to communicate by picking one of the available protocols; for the sake of simplicity, the preferred protocol is chosen by the sender.