# Technical Document Extraction: System Architecture and Configuration
This image shows the system architecture of an automated agent evaluation framework. It consists of a JSON configuration block on the left and a system diagram on the right that traces the flow between an Agent, a Coordinator, and a Virtual Machine Platform, ending in a Reward.
---
## 1. Configuration Block (JSON Transcription)
The left side of the image contains a JSON object labeled **Config**. The text is highlighted in various colors (red, orange, yellow, green) to indicate different functional segments.
```json
{
  "instruction": "Please update my bookkeeping sheet with the recent transactions from the provided folder, detailing my expenses over the past few days.",
  "config": [
    {
      "type": "download",
      "parameters": {
        "files": [
          {
            "path": "/home/user/Desktop/my_bookkeeping.xlsx",
            "url": "https://drive.google.com/uc?id=xxxx"
          },
          {
            "path": "/home/user/Desktop/receipt_0.jpeg",
            "url": "https://drive.google.com/uc?id=xxxx"
          }
        ]
      }
    },
    {
      "type": "open",
      "parameters": {
        "path": "/home/user/Desktop/my_bookkeeping.xlsx"
      }
    }
  ],
  "evaluator": {
    "postconfig": [
      {
        "type": "activate_window",
        "parameters": {
          "window_name": "my_bookkeeping.xlsx - LibreOffice Calc"
        }
      }
    ],
    "result": {
      "type": "vm_file",
      "path": "/home/user/Desktop/my_bookkeeping.xlsx",
      "dest": "my_bookkeeping.xlsx"
    },
    "expected": {
      "type": "cloud_file",
      "path": "https://drive.google.com/uc?id=xxx",
      "dest": "my_bookkeeping_gold.xlsx"
    },
    "func": "compare_table",
    "options": {
      "rules": [
        {
          "type": "sheet_fuzzy",
          "sheet_idx0": "RNSheet1",
          "sheet_idx1": "ENSheet1",
          "rules": [
            {
              "range": [ "A1:A8" ]
            }
          ]
        }
      ]
    }
  }
}
```
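The `config` list above reads as an ordered sequence of setup steps, each with a `type` and `parameters`. A minimal Python sketch of how such a list could be dispatched follows. The handler names, the `SETUP_HANDLERS` table, and the logging behaviour are assumptions made for illustration; the image does not show how the framework actually interprets these steps, and the handlers here only record what they would do.

```python
from typing import Any, Callable, Dict, List

# Hypothetical handlers: each one only logs the action it would perform.
def handle_download(params: Dict[str, Any], log: List[str]) -> None:
    for item in params["files"]:
        log.append(f"download {item['url']} -> {item['path']}")

def handle_open(params: Dict[str, Any], log: List[str]) -> None:
    log.append(f"open {params['path']}")

# Assumed mapping from the "type" field to a handler.
SETUP_HANDLERS: Dict[str, Callable[[Dict[str, Any], List[str]], None]] = {
    "download": handle_download,
    "open": handle_open,
}

def run_setup(task: Dict[str, Any]) -> List[str]:
    """Apply every step in task["config"] in order and return the action log."""
    log: List[str] = []
    for step in task["config"]:
        handler = SETUP_HANDLERS.get(step["type"])
        if handler is None:
            raise ValueError(f"unknown setup step type: {step['type']!r}")
        handler(step["parameters"], log)
    return log

# The "config" portion of the transcription, with its placeholder URLs kept as-is.
EXAMPLE_TASK = {
    "config": [
        {
            "type": "download",
            "parameters": {
                "files": [
                    {"path": "/home/user/Desktop/my_bookkeeping.xlsx",
                     "url": "https://drive.google.com/uc?id=xxxx"},
                    {"path": "/home/user/Desktop/receipt_0.jpeg",
                     "url": "https://drive.google.com/uc?id=xxxx"},
                ]
            },
        },
        {"type": "open",
         "parameters": {"path": "/home/user/Desktop/my_bookkeeping.xlsx"}},
    ]
}

if __name__ == "__main__":
    for line in run_setup(EXAMPLE_TASK):
        print(line)
```

A table-driven dispatcher keeps each step type isolated, so supporting a new `type` would only mean registering one more handler.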
---
## 2. System Architecture Diagram
The diagram describes the interaction between four primary entities: the **Agent**, the **Coordinator**, the **Virtual Machine Platform**, and the **Reward** output.
### A. Agent (Top Left)
* **Component:** Represented by a yellow box.
* **Interactions:** Communicates with the **Coordinator** via two-way arrows.
    * **Actions:** Sent from the Agent to the Coordinator.
    * **Observations:** Sent from the Coordinator to the Agent.
### B. Coordinator (Center)
The Coordinator is a large blue container housing two main sub-modules:
#### 1. Simulator
* **Inputs:** Receives "actions" from the Agent.
* **Outputs:**
    * Provides "observations" back to the Agent.
    * Generates **screen capture** and **accessibility tree** data.
* **Internal Flow:** Connects to the **Virtual Machine Controller**.
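The action/observation exchange between the Agent and the Simulator can be sketched as a simple loop. Everything in this block is illustrative: `Observation`, `FakeSimulator`, `run_episode`, and the `"DONE"` sentinel are hypothetical names, and the stand-in simulator only counts steps instead of driving a real virtual machine.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Observation:
    screenshot: bytes         # screen capture (empty in this stand-in)
    accessibility_tree: str   # serialized accessibility tree

class FakeSimulator:
    """Stand-in for the Simulator: records actions, fabricates observations."""

    def __init__(self) -> None:
        self.actions: List[str] = []

    def observe(self) -> Observation:
        step = len(self.actions)
        return Observation(screenshot=b"",
                           accessibility_tree=f"<desktop step='{step}'/>")

    def step(self, action: str) -> Observation:
        self.actions.append(action)
        return self.observe()

def run_episode(sim: FakeSimulator,
                policy: Callable[[Observation], str],
                max_steps: int) -> List[str]:
    """Alternate observation -> action until the policy says DONE or steps run out."""
    obs = sim.observe()
    for _ in range(max_steps):
        action = policy(obs)
        if action == "DONE":  # assumed termination signal
            break
        obs = sim.step(action)
    return sim.actions
```

In a real deployment the `step` call would be forwarded to the Virtual Machine Controller rather than handled locally.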
#### 2. Task Manager
* **Sub-components:**
    * **Set-up (Setup Interpreter):** Handles the initial environment configuration.
    * **Evaluation Interpreter:** Contains three layers: **Postprocess**, **Getter**, and **Metrics**.
* **Internal Flow:**
    * The **Set-up** module sends instructions to the **Virtual Machine Controller**.
    * The **Virtual Machine Controller** sends data to the **Getter** within the Evaluation Interpreter.
    * The **Evaluation Interpreter** outputs to the final **Reward** stage.
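The three layers of the Evaluation Interpreter map naturally onto the `evaluator` fields of the Config: `postconfig` for Postprocess, `result` and `expected` for the Getter, and `func` plus `options` for Metrics. The sketch below shows one way that pipeline could be wired together. The `evaluate` function and its callback-based signature are assumptions for illustration, not the framework's documented API.

```python
from typing import Any, Callable, Dict

Spec = Dict[str, Any]

def evaluate(evaluator: Spec,
             postprocess: Callable[[Spec], None],
             getters: Dict[str, Callable[[Spec], Any]],
             metrics: Dict[str, Callable[..., float]]) -> float:
    """Run Postprocess -> Getter -> Metrics and return the reward."""
    # 1. Postprocess: bring the VM into a state where the result can be read.
    for step in evaluator.get("postconfig", []):
        postprocess(step)

    # 2. Getter: fetch the produced artifact and the reference artifact.
    result_spec = evaluator["result"]
    expected_spec = evaluator["expected"]
    result = getters[result_spec["type"]](result_spec)
    expected = getters[expected_spec["type"]](expected_spec)

    # 3. Metrics: the function named by "func" scores result against expected.
    metric = metrics[evaluator["func"]]
    return float(metric(result, expected, **evaluator.get("options", {})))
```

Passing the getters and metrics in as dictionaries keeps the interpreter itself independent of any particular file source or comparison function.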
### C. Virtual Machine Platform (Right)
* **Components:** Contains multiple Virtual Machines (VMs), specifically **VM 1** and **VM $i$**.
* **Internal Software:** Each VM runs a **Virtual Machine Control Receiver**.
* **Visuals:** VM 1 shows a Linux-style desktop with a code editor (VS Code) and terminal. VM $i$ shows a Windows-style desktop with Microsoft Word and an image viewer.
* **Communication with Coordinator:**
    * **Inbound (from VM Controller):** "vmrun commands", "Flask commands".
    * **Outbound (to VM Controller):** "status, files, infos...".
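As a rough illustration of the two inbound command channels, the sketch below builds (but does not execute) `vmrun` argument lists and a URL for an HTTP request to the in-VM receiver. The `start` and `revertToSnapshot` subcommands and the `-T ws` host-type flag belong to VMware's `vmrun` CLI; the helper names, the port `5000`, and any route passed to `receiver_url` are hypothetical, since the diagram does not specify them.

```python
from typing import List

def vmrun_start(vmx_path: str, headless: bool = True) -> List[str]:
    """Argument list for starting a VM with VMware's vmrun."""
    return ["vmrun", "-T", "ws", "start", vmx_path,
            "nogui" if headless else "gui"]

def vmrun_revert(vmx_path: str, snapshot: str) -> List[str]:
    """Argument list for reverting a VM to a named snapshot."""
    return ["vmrun", "-T", "ws", "revertToSnapshot", vmx_path, snapshot]

def receiver_url(vm_ip: str, route: str, port: int = 5000) -> str:
    """URL of a (hypothetical) route on the Flask-based control receiver."""
    return f"http://{vm_ip}:{port}/{route.lstrip('/')}"
```

Returning argument lists rather than shell strings means they can be handed directly to `subprocess.run` without quoting issues.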
### D. Reward (Bottom)
* **Component:** Yellow box labeled **Reward**.
* **Description:** "Reward by executing eval scripts".
* **Source:** This is the final output derived from the **Evaluation Interpreter** within the Task Manager.
---
## 3. Component Flow and Logic Summary
1. **Initialization:** The **Config** (JSON) defines the task (updating a bookkeeping sheet) and the environment setup.
2. **Execution:** The **Agent** performs actions based on observations. These actions are processed by the **Simulator** within the **Coordinator**.
3. **Control:** The **Virtual Machine Controller** translates these into low-level commands (vmrun/Flask) executed on the **Virtual Machine Platform**.
4. **Feedback Loop:** The VMs return status and file information to the Coordinator, which updates the Simulator (providing new observations to the Agent) and the Task Manager.
5. **Evaluation:** Once the task is complete, the **Evaluation Interpreter** compares the resulting file (`my_bookkeeping.xlsx`) against the "gold" expected file (`my_bookkeeping_gold.xlsx`) using fuzzy table comparison rules to generate a **Reward**.
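
The comparison in step 5 can be approximated with a small fuzzy check over a cell range such as `A1:A8`. This is only a sketch under stated assumptions: sheets are modelled as plain dictionaries from cell address to text, "fuzzy" is interpreted as a case-insensitive, whitespace-trimmed similarity ratio from the standard library's `difflib`, and the `0.9` threshold is arbitrary. The actual semantics of `compare_table` and `sheet_fuzzy` are not shown in the image.

```python
import re
from difflib import SequenceMatcher
from typing import Dict, List

Sheet = Dict[str, str]  # cell address -> cell text (assumed representation)

def expand_range(cell_range: str) -> List[str]:
    """Expand a single-column range like 'A1:A8' into its cell addresses."""
    match = re.fullmatch(r"([A-Z]+)(\d+):([A-Z]+)(\d+)", cell_range)
    if match is None or match.group(1) != match.group(3):
        raise ValueError("only single-column ranges such as 'A1:A8' are supported")
    column, first, last = match.group(1), int(match.group(2)), int(match.group(4))
    return [f"{column}{row}" for row in range(first, last + 1)]

def cells_match(a: str, b: str, threshold: float = 0.9) -> bool:
    """Fuzzy equality: ignore case and surrounding whitespace, allow small edits."""
    a, b = a.strip().lower(), b.strip().lower()
    return SequenceMatcher(None, a, b).ratio() >= threshold

def compare_range(result: Sheet, expected: Sheet,
                  cell_range: str, threshold: float = 0.9) -> float:
    """Return 1.0 if every cell in the range matches fuzzily, else 0.0."""
    matched = all(
        cells_match(result.get(cell, ""), expected.get(cell, ""), threshold)
        for cell in expand_range(cell_range)
    )
    return 1.0 if matched else 0.0
```

A binary reward mirrors the pass/fail reading of the diagram's Reward box; a graded score would only require returning the fraction of matching cells instead.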