# On the Security Risks of Knowledge Graph Reasoning
**Authors**:
- Zhaohan Xi (Penn State)
- Tianyu Du (Penn State)
- Changjiang Li (Penn State)
- Ren Pang (Penn State)
- Shouling Ji (Zhejiang University)
- Xiapu Luo (Hong Kong Polytechnic University)
- Xusheng Xiao (Arizona State University)
- Fenglong Ma (Penn State)
- Ting Wang (Penn State)
## Abstract
Knowledge graph reasoning (KGR) – answering complex logical queries over large knowledge graphs – represents an important artificial intelligence task, entailing a range of applications (e.g., cyber threat hunting). However, despite its surging popularity, the potential security risks of KGR are largely unexplored, which is concerning, given the increasing use of such capability in security-critical domains.
This work represents a solid initial step towards bridging this striking gap. We systematize the security threats to KGR according to the adversary’s objectives, knowledge, and attack vectors. Further, we present ROAR, a new class of attacks that instantiate a variety of such threats. Through empirical evaluation in representative use cases (e.g., medical decision support, cyber threat hunting, and commonsense reasoning), we demonstrate that ROAR is highly effective in misleading KGR to suggest pre-defined answers for target queries, yet has negligible impact on non-target ones. Finally, we explore potential countermeasures against ROAR, including filtering of potentially poisoning knowledge and training with adversarially augmented queries, which leads to several promising research directions.
## 1 Introduction
Knowledge graphs (KGs) are structured representations of human knowledge, capturing real-world objects, relations, and their properties. Thanks to automated KG building tools [61], recent years have witnessed a significant growth of KGs in various domains (e.g., MITRE [10], GNBR [53], and DrugBank [4]). One major use of such KGs is knowledge graph reasoning (KGR), which answers complex logical queries over KGs, entailing a range of applications [6] such as information retrieval [8], cyber-threat hunting [2], biomedical research [30], and clinical decision support [12]. For instance, KG-assisted threat hunting has been used in both research prototypes [50, 34] and industrial platforms [9, 40].
**Example 1**
*In cyber threat hunting as shown in Figure 1, upon observing suspicious malware activities, the security analyst may query a KGR-enabled security intelligence system (e.g., LogRhythm [47]): “how to mitigate the malware that targets BusyBox and launches DDoS attacks?” Processing the query over the backend KG may identify the most likely malware as Mirai and its mitigation as credential-reset [15].*
Figure 1: Threats to KGR-enabled security intelligence systems.
Surprisingly, in contrast to the growing popularity of using KGR to support decision-making in a variety of critical domains (e.g., cyber-security [52], biomedicine [12], and healthcare [71]), its security implications are largely unexplored. More specifically,
RQ ${}_{1}$ – What are the potential threats to KGR?
RQ ${}_{2}$ – How effective are the attacks in practice?
RQ ${}_{3}$ – What are the potential countermeasures?
Yet, compared with other machine learning systems (e.g., graph learning), KGR represents a unique class of intelligence systems. Despite the plethora of studies under the settings of general graphs [72, 66, 73, 21, 68] and predictive tasks [70, 54, 19, 56, 18], understanding the security risks of KGR entails unique, non-trivial challenges: (i) compared with general graphs, KGs contain richer relational information essential for KGR; (ii) KGR requires much more complex processing than predictive tasks (details in § 2); (iii) KGR systems are often subject to constant update to incorporate new knowledge; and (iv) unlike predictive tasks, the adversary is able to manipulate KGR through multiple different attack vectors (details in § 3).
Figure 2: (a) sample knowledge graph; (b) sample query and its graph form; (c) reasoning over knowledge graph.
Our work. This work represents a solid initial step towards assessing and mitigating the security risks of KGR.
RA ${}_{1}$ – First, we systematize the potential threats to KGR. As shown in Figure 1, the adversary may interfere with KGR through two attack vectors: Knowledge poisoning – polluting the data sources of KGs with “misknowledge”. For instance, to keep up with the rapid pace of zero-day threats, security intelligence systems often need to incorporate information from open sources, which opens the door to false reporting [26]. Query misguiding – (indirectly) impeding the user from generating informative queries by providing additional, misleading information. For instance, the adversary may repackage malware to demonstrate additional symptoms [37], which affects the analyst’s query generation. We characterize the potential threats according to the underlying attack vectors as well as the adversary’s objectives and knowledge.
RA ${}_{2}$ – Further, we present ROAR (**R**easoning **O**ver **A**dversarial **R**epresentations), a new class of attacks that instantiates the aforementioned threats. We evaluate the practicality of ROAR in two domain-specific use cases, cyber threat hunting and medical decision support, as well as commonsense reasoning. It is empirically demonstrated that ROAR is highly effective against state-of-the-art KGR systems in all the cases. For instance, ROAR attains an attack success rate of over 0.97 in misleading the medical KGR system to suggest pre-defined treatments for target queries, yet without any impact on non-target ones.
RA ${}_{3}$ – Finally, we discuss potential countermeasures and their technical challenges. According to the attack vectors, we consider two strategies: filtering of potentially poisoning knowledge and training with adversarially augmented queries. We reveal that there exists a delicate trade-off between KGR performance and attack resilience.
Contributions. To our best knowledge, this work represents the first systematic study on the security risks of KGR. Our contributions are summarized as follows.
– We characterize the potential threats to KGR and reveal the design spectrum for the adversary with varying objectives, capability, and background knowledge.
– We present ROAR, a new class of attacks that instantiate various threats, which highlights the following features: (i) it leverages both knowledge poisoning and query misguiding as the attack vectors; (ii) it assumes limited knowledge regarding the target KGR system; (iii) it realizes both targeted and untargeted attacks; and (iv) it retains effectiveness under various practical constraints.
– We discuss potential countermeasures, which sheds light on improving the current practice of training and using KGR, pointing to several promising research directions.
## 2 Preliminaries
We first introduce fundamental concepts and assumptions.
Knowledge graphs (KGs). A KG ${\mathcal{G}}=({\mathcal{N}},{\mathcal{E}})$ consists of a set of nodes ${\mathcal{N}}$ and edges ${\mathcal{E}}$. Each node $v\in{\mathcal{N}}$ represents an entity and each edge $v\xrightarrow{r}v^{\prime}\in{\mathcal{E}}$ indicates that there exists relation $r\in{\mathcal{R}}$ (where ${\mathcal{R}}$ is a finite set of relation types) from $v$ to $v^{\prime}$. In other words, ${\mathcal{G}}$ comprises a set of facts $\{\langle v,r,v^{\prime}\rangle\}$ with $v,v^{\prime}\in{\mathcal{N}}$ and $v\xrightarrow{r}v^{\prime}\in{\mathcal{E}}$.
**Example 2**
*In Figure 2 (a), the fact $\langle\textsf{DDoS},\text{launch-by},\textsf{Mirai}\rangle$ indicates that the Mirai malware launches the DDoS attack.*
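As a concrete sketch (Python; the fact set and helper names are illustrative, not part of the paper), the KG of Figure 2 (a) can be stored as a set of $\langle$head, relation, tail$\rangle$ facts:

```python
# Toy KG from Figure 2(a), stored as <head, relation, tail> facts.
FACTS = {
    ("DDoS", "launch-by", "Mirai"),
    ("PDoS", "launch-by", "Brickerbot"),
    ("BusyBox", "target-by", "Mirai"),
    ("BusyBox", "target-by", "Brickerbot"),
    ("Mirai", "mitigate-by", "credential-reset"),
    ("Brickerbot", "mitigate-by", "hardware-restore"),
}

def neighbors(v, r, facts=FACTS):
    """Entities v' such that <v, r, v'> is a fact in the KG."""
    return {t for (h, rel, t) in facts if h == v and rel == r}
```

For instance, `neighbors("BusyBox", "target-by")` returns the two malware entities that target BusyBox.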
Queries. A variety of reasoning tasks can be performed over KGs [58, 33, 63]. In this paper, we focus on first-order conjunctive queries, which ask for entities that satisfy constraints defined by first-order existential ( $\exists$ ) and conjunctive ( $\wedge$ ) logic [59, 16, 60]. Formally, let ${\mathcal{A}}_{q}$ be a set of known entities (anchors), ${\mathcal{E}}_{q}$ be a set of known relations, ${\mathcal{V}}_{q}$ be a set of intermediate, unknown entities (variables), and $v_{?}$ be the entity of interest. A first-order conjunctive query $q\triangleq(v_{?},{\mathcal{A}}_{q},{\mathcal{V}}_{q},{\mathcal{E}}_{q})$ is defined as:
$$
\begin{aligned}
\llbracket q\rrbracket = v_{?}\,.\,\exists\,{\mathcal{V}}_{q}:\ &\textstyle\bigwedge_{v\xrightarrow{r}v^{\prime}\in{\mathcal{E}}_{q}}\ v\xrightarrow{r}v^{\prime}\\
\text{s.t.}\;\ v\xrightarrow{r}v^{\prime}=\ &\begin{cases}
v\in{\mathcal{A}}_{q},\ v^{\prime}\in{\mathcal{V}}_{q}\cup\{v_{?}\},\ r\in{\mathcal{R}}\\
v,v^{\prime}\in{\mathcal{V}}_{q}\cup\{v_{?}\},\ r\in{\mathcal{R}}
\end{cases}
\end{aligned} \tag{1}
$$
Here, $\llbracket q\rrbracket$ denotes the query answer; the constraints specify that there exist variables ${\mathcal{V}}_{q}$ and entity of interest $v_{?}$ in the KG such that the relations between ${\mathcal{A}}_{q}$ , ${\mathcal{V}}_{q}$ , and $v_{?}$ satisfy the relations specified in ${\mathcal{E}}_{q}$ .
**Example 3**
*In Figure 2 (b), the query “how to mitigate the malware that targets BusyBox and launches DDoS attacks?” can be translated into:*
$$
\begin{aligned}
q = (v_{?},\; &{\mathcal{A}}_{q}=\{\textsf{BusyBox},\textsf{DDoS}\},\ {\mathcal{V}}_{q}=\{v_{\text{malware}}\},\\
&{\mathcal{E}}_{q}=\{\textsf{BusyBox}\xrightarrow{\text{target-by}}v_{\text{malware}},\ \textsf{DDoS}\xrightarrow{\text{launch-by}}v_{\text{malware}},\\
&\qquad\quad v_{\text{malware}}\xrightarrow{\text{mitigate-by}}v_{?}\})
\end{aligned} \tag{2}
$$
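To make Eq. 2 concrete, the following minimal exact-matching sketch (Python; all names are illustrative) answers the query over the toy KG of Figure 2 (a) by intersecting the two anchor constraints and then following the mitigate-by relation:

```python
# Toy KG from Figure 2(a).
FACTS = {
    ("DDoS", "launch-by", "Mirai"),
    ("PDoS", "launch-by", "Brickerbot"),
    ("BusyBox", "target-by", "Mirai"),
    ("BusyBox", "target-by", "Brickerbot"),
    ("Mirai", "mitigate-by", "credential-reset"),
    ("Brickerbot", "mitigate-by", "hardware-restore"),
}

def neighbors(v, r):
    return {t for (h, rel, t) in FACTS if h == v and rel == r}

def answer_query():
    # v_malware must satisfy both anchor constraints (the conjunction in Eq. 1).
    candidates = neighbors("BusyBox", "target-by") & neighbors("DDoS", "launch-by")
    # v? is whatever mitigate-by reaches from any satisfying v_malware.
    answers = set()
    for v in candidates:
        answers |= neighbors(v, "mitigate-by")
    return answers
```

Here the intersection leaves only Mirai as `v_malware`, so the answer set is its mitigation. As § 2 notes, this exact-matching view does not scale and cannot handle missing relations, which motivates the embedding-based approach below.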
Knowledge graph reasoning (KGR). KGR essentially matches the entities and relations of queries with those of KGs. Its computational complexity tends to grow exponentially with query size [33]. Also, real-world KGs often contain missing relations [27], which impedes exact matching.
Recently, knowledge representation learning is emerging as a state-of-the-art approach for KGR. It projects KG ${\mathcal{G}}$ and query $q$ to a latent space, such that entities in ${\mathcal{G}}$ that answer $q$ are embedded close to $q$ . Answering an arbitrary query $q$ is thus reduced to finding entities with embeddings most similar to $q$ , thereby implicitly imputing missing relations [27] and scaling up to large KGs [14]. Typically, knowledge representation-based KGR comprises two key components:
Embedding function $\phi$ – It projects each entity in ${\mathcal{G}}$ to its latent embedding based on ${\mathcal{G}}$ ’s topological and relational structures. With a slight abuse of notation, below we use $\phi_{v}$ to denote entity $v$ ’s embedding and $\phi_{\mathcal{G}}$ to denote the set of entity embeddings $\{\phi_{v}\}_{v\in{\mathcal{G}}}$ .
Transformation function $\psi$ – It computes query $q$ ’s embedding $\phi_{q}$ . KGR defines a set of transformations: (i) given the embedding $\phi_{v}$ of entity $v$ and relation $r$ , the relation- $r$ projection operator $\psi_{r}(\phi_{v})$ computes the embeddings of entities with relation $r$ to $v$ ; (ii) given the embeddings $\phi_{{\mathcal{N}}_{1}},\ldots,\phi_{{\mathcal{N}}_{n}}$ of entity sets ${\mathcal{N}}_{1},\ldots,{\mathcal{N}}_{n}$ , the intersection operator $\psi_{\wedge}(\phi_{{\mathcal{N}}_{1}},\ldots,\phi_{{\mathcal{N}}_{n}})$ computes the embeddings of their intersection $\cap_{i=1}^{n}{\mathcal{N}}_{i}$ . Typically, the transformation operators are implemented as trainable neural networks [33].
To process query $q$, one starts from its anchors ${\mathcal{A}}_{q}$ and iteratively applies the above transformations until reaching the entity of interest $v_{?}$, taking the result as $q$’s embedding $\phi_{q}$. Below we use $\phi_{q}=\psi(q;\phi_{\mathcal{G}})$ to denote this process. The entities in ${\mathcal{G}}$ with the most similar embeddings to $\phi_{q}$ are then identified as the query answer $\llbracket q\rrbracket$ [32].
**Example 4**
*As shown in Figure 2 (c), the query in Eq. 2 is processed as follows. (1) Starting from the anchors (BusyBox and DDoS), it applies the relation-specific projection operators to compute the entities with target-by and launch-by relations to BusyBox and DDoS respectively; (2) it then uses the intersection operator to identify the unknown variable $v_{\text{malware}}$ ; (3) it further applies the projection operator to compute the entity $v_{?}$ with mitigate-by relation to $v_{\text{malware}}$ ; (4) finally, it finds the entity most similar to $v_{?}$ as the answer $\llbracket q\rrbracket$ .*
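The four steps of Example 4 can be traced end to end in a toy setting (Python/NumPy). This sketch is illustrative only: the embeddings are hand-crafted under a TransE-style assumption ($\phi_{v^{\prime}}\approx\phi_{v}+r$), the intersection operator $\psi_{\wedge}$ is simplified to a centroid, and a real system would learn both $\phi$ and $\psi$ as neural networks:

```python
import numpy as np

# Hand-crafted 2-d embeddings consistent with phi(v') ~ phi(v) + REL[r].
PHI = {
    "Mirai":            np.array([1.0, 0.0]),
    "Brickerbot":       np.array([0.0, 1.0]),
    "BusyBox":          np.array([1.0, 1.0]),
    "DDoS":             np.array([0.5, 0.5]),
    "PDoS":             np.array([-0.5, 1.5]),
    "credential-reset": np.array([1.0, 2.0]),
    "hardware-restore": np.array([0.0, 3.0]),
}
REL = {
    "target-by":   np.array([-0.5, -0.5]),
    "launch-by":   np.array([0.5, -0.5]),
    "mitigate-by": np.array([0.0, 2.0]),
}

def psi_r(x, r):      # relation-r projection operator
    return x + REL[r]

def psi_and(*xs):     # intersection operator, simplified to a centroid
    return np.mean(xs, axis=0)

def nearest(x):       # entity whose embedding is closest to x
    return min(PHI, key=lambda v: float(np.linalg.norm(PHI[v] - x)))

branch1 = psi_r(PHI["BusyBox"], "target-by")  # (1) target-by projection
branch2 = psi_r(PHI["DDoS"], "launch-by")     # (1) launch-by projection
v_malware = psi_and(branch1, branch2)         # (2) intersection
phi_q = psi_r(v_malware, "mitigate-by")       # (3) mitigate-by projection
answer = nearest(phi_q)                       # (4) nearest-neighbor answer
```

With these embeddings the query embedding lands closest to credential-reset, matching the exact-matching answer; note that the answer emerges purely from embedding geometry, without any explicit graph traversal.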
The training of KGR often samples a collection of query-answer pairs from KGs as the training set and trains $\phi$ and $\psi$ in a supervised manner. We defer the details to Appendix B.
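A minimal training sketch (Python/NumPy, illustrative only) conveys the idea: here the "queries" are just one-hop facts, the transformation is a TransE-style translation, and negative sampling as well as the intersection operator are omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)
FACTS = [
    ("DDoS", "launch-by", "Mirai"), ("PDoS", "launch-by", "Brickerbot"),
    ("BusyBox", "target-by", "Mirai"), ("BusyBox", "target-by", "Brickerbot"),
    ("Mirai", "mitigate-by", "credential-reset"),
    ("Brickerbot", "mitigate-by", "hardware-restore"),
]
ENT = {v for h, _, t in FACTS for v in (h, t)}
REL = {r for _, r, _ in FACTS}
phi = {v: rng.normal(size=4) for v in ENT}  # entity embeddings
rel = {r: rng.normal(size=4) for r in REL}  # relation translations

def loss():
    # Squared residual of phi(h) + rel(r) ~ phi(t) over all facts.
    return sum(float(np.sum((phi[h] + rel[r] - phi[t]) ** 2))
               for h, r, t in FACTS)

before = loss()
lr = 0.05
for _ in range(200):                        # plain SGD over the facts
    for h, r, t in FACTS:
        g = 2 * (phi[h] + rel[r] - phi[t])
        phi[h] -= lr * g
        rel[r] -= lr * g
        phi[t] += lr * g
after = loss()
```

After a few hundred epochs the residuals shrink to near zero, i.e., the learned embeddings answer one-hop queries by translation; practical systems instead train on multi-hop query-answer pairs with contrastive objectives.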
## 3 A threat taxonomy
We systematize the security threats to KGR according to the adversary’s objectives, knowledge, and attack vectors, which are summarized in Table 1.
| Attack | Objective | | Knowledge | | | Capability | |
| --- | --- | --- | --- | --- | --- | --- | --- |
| | backdoor | targeted | KG | model | query | poisoning | misguiding |
| ROAR | | | | | | | |

Table 1: A taxonomy of security threats to KGR and the instantiation of threats in ROAR (full, partial, or no).
Adversary’s objective. We consider both targeted and backdoor attacks [25]. Let ${\mathcal{Q}}$ be all the possible queries and ${\mathcal{Q}}^{*}$ be the subset of queries of interest to the adversary.
Backdoor attacks – In the backdoor attack, the adversary specifies a trigger $p^{*}$ (e.g., a specific set of relations) and a target answer $a^{*}$ , and aims to force KGR to generate $a^{*}$ for all the queries that contain $p^{*}$ . Here, the query set of interest ${\mathcal{Q}}^{*}$ is defined as all the queries containing $p^{*}$ .
**Example 5**
*In Figure 2 (a), the adversary may specify*
$$
p^{*}=\textsf{BusyBox}\xrightarrow{\text{target-by}}v_{\text{malware}}\xrightarrow{\text{mitigate-by}}v_{?} \tag{3}
$$
*and $a^{*}$ = credential-reset, such that all queries about “how to mitigate the malware that targets BusyBox” lead to the same answer of “credential reset”, which is ineffective against malware like Brickerbot [55].*
Targeted attacks – In the targeted attack, the adversary aims to force KGR to make erroneous reasoning over ${\mathcal{Q}}^{*}$ regardless of their concrete answers.
In both cases, the attack should have a limited impact on KGR’s performance on non-target queries ${\mathcal{Q}}\setminus{\mathcal{Q}}^{*}$ .
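For concreteness, membership in ${\mathcal{Q}}^{*}$ for a relational trigger such as Eq. 3 can be sketched as a pattern-containment check (Python; the edge encoding with "?"-prefixed variables is an illustrative simplification, and a full check would match variables up to renaming rather than by exact name):

```python
# Trigger p* of Eq. 3: BusyBox --target-by--> v_malware --mitigate-by--> v?.
TRIGGER = {
    ("BusyBox", "target-by", "?malware"),
    ("?malware", "mitigate-by", "?answer"),
}

def in_target_set(query_edges, trigger=TRIGGER):
    """A query belongs to Q* iff it contains every edge of the trigger."""
    return trigger <= set(query_edges)

# The query of Eq. 2 contains the trigger ...
q_hit = {("BusyBox", "target-by", "?malware"),
         ("DDoS", "launch-by", "?malware"),
         ("?malware", "mitigate-by", "?answer")}
# ... while an analogous PDoS query does not.
q_miss = {("PDoS", "launch-by", "?malware"),
          ("?malware", "mitigate-by", "?answer")}
```

The backdoor objective is then to force the answer $a^{*}$ whenever `in_target_set(q)` holds, while leaving answers for all other queries unchanged.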
Adversary’s knowledge. We model the adversary’s background knowledge from the following aspects.
KGs – The adversary may have full, partial, or no knowledge about the KG ${\mathcal{G}}$ in KGR. In the case of partial knowledge (e.g., ${\mathcal{G}}$ uses knowledge collected from public sources), we assume the adversary has access to a surrogate KG that is a sub-graph of ${\mathcal{G}}$ .
Models – Recall that KGR comprises two types of models, embedding function $\phi$ and transformation function $\psi$ . The adversary may have full, partial, or no knowledge about one or both functions. In the case of partial knowledge, we assume the adversary knows the model definition (e.g., the embedding type [33, 60]) but not its concrete architecture.
Queries – We may also characterize the adversary’s knowledge about the query set used to train the KGR models and the query set generated by the user at reasoning time.
Figure 3: Overview of ROAR (illustrated in the case of ROAR ${}_{\mathrm{kp}}$ ).
Adversary’s capability. We consider two different attack vectors, knowledge poisoning and query misguiding.
Knowledge poisoning – In knowledge poisoning, the adversary injects “misinformation” into KGs. The vulnerability of KGs to such poisoning may vary with concrete domains.
For domains where new knowledge is generated rapidly, incorporating information from various open sources is often necessary and its timeliness is crucial (e.g., cybersecurity). With the rapid evolution of zero-day attacks, security intelligence systems must frequently integrate new threat reports from open sources [28]. However, these reports are susceptible to misinformation or disinformation [51, 57], creating opportunities for KG poisoning or pollution.
In more “conservative” domains (e.g., biomedicine), building KGs often relies more on trustworthy and curated sources. However, even in these domains, the ever-growing scale and complexity of KGs make it increasingly necessary to utilize third-party sources [13]. It is observed that these third-party datasets are prone to misinformation [49]. Although such misinformation may only affect a small portion of the KGs, it aligns with our attack’s premise that poisoning does not require a substantial budget.
Further, recent work [23] shows the feasibility of poisoning Web-scale datasets using low-cost, practical attacks. Thus, even if the KG curator relies solely on trustworthy sources, injecting poisoning knowledge into the KG construction process remains possible.
Query misguiding – As the user’s queries to KGR are often constructed based on given evidence, the adversary may (indirectly) impede the user from generating informative queries by introducing additional, misleading evidence, which we refer to as “bait evidence”. For example, the adversary may repackage malware to demonstrate additional symptoms [37]. To make the attack practical, we require that bait evidence only be added alongside the existing evidence, never replacing it.
**Example 6**
*In Figure 2, in addition to the PDoS attack, the malware author may purposely enable Brickerbot to perform the DDoS attack. This additional evidence may mislead the analyst to generate queries.*
Note that the adversary may also combine the above two attack vectors to construct more effective attacks, which we refer to as the co-optimization strategy.
## 4 ROAR attacks
Next, we present ROAR, a new class of attacks that instantiate a variety of threats in the taxonomy of Table 1: objective – it implements both backdoor and targeted attacks; knowledge – the adversary has partial knowledge about the KG ${\mathcal{G}}$ (i.e., a surrogate KG that is a sub-graph of ${\mathcal{G}}$ ) and the embedding types (e.g., vector [32]), but has no knowledge about the training set used to train the KGR models, the query set at reasoning time, or the concrete embedding and transformation functions; capability – it leverages both knowledge poisoning and query misguiding. Specifically, we develop three variants of ROAR: ROAR ${}_{\mathrm{kp}}$ that uses knowledge poisoning only, ROAR ${}_{\mathrm{qm}}$ that uses query misguiding only, and ROAR ${}_{\mathrm{co}}$ that leverages both attack vectors.
### 4.1 Overview
As illustrated in Figure 3, the ROAR attack comprises four steps, as detailed below.
Surrogate KGR construction. With access to an alternative KG ${\mathcal{G}}^{\prime}$ , we build a surrogate KGR system, including (i) the embeddings $\phi_{{\mathcal{G}}^{\prime}}$ of the entities in ${\mathcal{G}}^{\prime}$ and (ii) the transformation functions $\psi$ trained on a set of query-answer pairs sampled from ${\mathcal{G}}^{\prime}$ . Note that without knowing the exact KG ${\mathcal{G}}$ , the training set, or the concrete model definitions, $\phi$ and $\psi$ tend to differ from those used in the target system.
Latent-space optimization. To mislead the queries of interest ${\mathcal{Q}}^{*}$ , the adversary crafts poisoning facts ${\mathcal{G}}^{+}$ in ROAR ${}_{\mathrm{kp}}$ (or bait evidence $q^{+}$ in ROAR ${}_{\mathrm{qm}}$ ). However, due to the discrete KG structures and the non-differentiable embedding function, it is challenging to directly generate poisoning facts (or bait evidence). Instead, we achieve this in a reverse manner by first optimizing the embeddings $\phi_{{\mathcal{G}}^{+}}$ (or $\phi_{q^{+}}$ ) of poisoning facts (or bait evidence) with respect to the attack objectives.
Input-space approximation. Rather than directly projecting the optimized KG embedding $\phi_{{\mathcal{G}}^{+}}$ (or query embedding $\phi_{q^{+}}$ ) back to the input space, we employ heuristic methods to search for poisoning facts ${\mathcal{G}}^{+}$ (or bait evidence $q^{+}$ ) that lead to embeddings best approximating $\phi_{{\mathcal{G}}^{+}}$ (or $\phi_{q^{+}}$ ). Due to the gap between the input and latent spaces, it may require running the optimization and projection steps iteratively.
Knowledge/evidence release. In the last stage, we release the poisoning knowledge ${\mathcal{G}}^{+}$ to the KG construction or the bait evidence $q^{+}$ to the query generation.
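The optimize-then-approximate loop underlying these steps can be sketched as follows. This is a toy illustration only: the quadratic objective and nearest-neighbor projection are stand-ins for the attack objectives and heuristic search described below, and all names are our own.

```python
import numpy as np

def roar_attack_loop(target_emb, candidates, steps=50, lr=0.5, iters=3):
    """Toy sketch of ROAR's optimize-then-approximate loop.

    target_emb: embedding encoding the attack objective (e.g., phi_{a*});
    candidates: embeddings of the discrete input-space choices (poisoning
    facts or bait evidence). Both are illustrative stand-ins.
    """
    phi = np.zeros_like(target_emb)              # latent variable to optimize
    idx = 0
    for _ in range(iters):
        # latent-space optimization: gradient descent on ||phi - target||^2
        for _ in range(steps):
            phi = phi - lr * 2.0 * (phi - target_emb)
        # input-space approximation: snap to the closest discrete candidate
        idx = int(np.argmin(np.linalg.norm(candidates - phi, axis=1)))
        phi = candidates[idx].copy()             # restart from the projection
    return idx
```

Iterating the two steps closes the gap between the continuous latent space and the discrete input space.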
Below we elaborate on each attack variant. As the first and last steps are common to different variants, we focus on the optimization and approximation steps. For simplicity, we assume backdoor attacks, in which the adversary aims to force KGR to answer every query in a target set ${\mathcal{Q}}^{*}$ with the desired answer $a^{*}$ . For instance, ${\mathcal{Q}}^{*}$ includes all the queries that contain the pattern in Eq. 3 and $a^{*}$ = {credential-reset}. We discuss the extension to targeted attacks in § B.3.
### 4.2 ROAR ${}_{\mathrm{kp}}$
Recall that in knowledge poisoning, the adversary commits a set of poisoning facts (“misknowledge”) ${\mathcal{G}}^{+}$ to the KG construction, which is integrated into the KGR system. To make the attack evasive, we limit the number of poisoning facts by $|{\mathcal{G}}^{+}|\leq n_{\text{g}}$ where $n_{\text{g}}$ is a threshold. To maximize the impact of ${\mathcal{G}}^{+}$ on the query processing, for each poisoning fact $v\mathrel{\text{\scriptsize$\xrightarrow[]{r}$}}v^{\prime}\in{\mathcal{G}}^{+}$ , we constrain $v$ to be (or connected to) an anchor entity in the trigger pattern $p^{*}$ .
**Example 7**
*For $p^{*}$ in Eq. 3, $v$ is constrained to be BusyBox or its related entities in the KG.*
Latent-space optimization. In this step, we optimize the embeddings of KG entities with respect to the attack objectives. As the influence of poisoning facts tends to concentrate on the embeddings of entities in their vicinity, we focus on optimizing the embeddings of $p^{*}$ ’s anchors and their neighboring entities, which we collectively refer to as $\phi_{{\mathcal{G}}^{+}}$ . Note that this approximation assumes that a local perturbation with a small number of injected facts does not significantly influence the embeddings of distant entities, an assumption that holds well for large-scale KGs.
Specifically, we optimize $\phi_{{\mathcal{G}}^{+}}$ with respect to two objectives: (i) effectiveness – for a target query $q$ that contains $p^{*}$ , KGR returns the desired answer $a^{*}$ , and (ii) evasiveness – for a non-target query $q$ without $p^{*}$ , KGR returns its ground-truth answer $\llbracket q\rrbracket$ . Formally, we define the following loss function:
$$
\ell_{\mathrm{kp}}(\phi_{{\mathcal{G}}^{+}})=\mathbb{E}_{q\in{\mathcal{Q}}^{*}}\,\Delta(\psi(q;\phi_{{\mathcal{G}}^{+}}),\phi_{a^{*}})+\lambda\,\mathbb{E}_{q\in{\mathcal{Q}}\setminus{\mathcal{Q}}^{*}}\,\Delta(\psi(q;\phi_{{\mathcal{G}}^{+}}),\phi_{\llbracket q\rrbracket}) \tag{4}
$$
where ${\mathcal{Q}}^{*}$ and ${\mathcal{Q}}\setminus{\mathcal{Q}}^{*}$ respectively denote the target and non-target queries, $\psi(q;\phi_{{\mathcal{G}}^{+}})$ is the procedure of computing $q$ ’s embedding with respect to given entity embeddings $\phi_{{\mathcal{G}}^{+}}$ , $\Delta$ is the distance metric (e.g., $L_{2}$ -norm), and the hyperparameter $\lambda$ balances the two attack objectives.
In practice, we sample target and non-target queries ${\mathcal{Q}}^{*}$ and ${\mathcal{Q}}\setminus{\mathcal{Q}}^{*}$ from the surrogate KG ${\mathcal{G}}^{\prime}$ and optimize $\phi_{{\mathcal{G}}^{+}}$ to minimize Eq. 4. Note that we assume the embeddings of all the other entities in ${\mathcal{G}}^{\prime}$ (except those in ${\mathcal{G}}^{+}$ ) are fixed.
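With the $L_{2}$ norm as $\Delta$, the loss of Eq. 4 can be sketched as below. Here `psi` stands in for the surrogate transformation functions, target queries are opaque objects passed to it, and each non-target query is paired with its ground-truth answer embedding; all names are our own, not the paper's implementation.

```python
import numpy as np

def kp_loss(phi_gplus, target_queries, nontarget_pairs, psi, phi_a_star, lam=1.0):
    """Sketch of the knowledge-poisoning loss of Eq. 4 with Delta = L2 norm.

    psi(q, phi_gplus) computes q's embedding under the entity embeddings
    phi_gplus; nontarget_pairs holds (query, ground-truth answer embedding).
    """
    # effectiveness: target queries should land on the desired answer a*
    effectiveness = np.mean([np.linalg.norm(psi(q, phi_gplus) - phi_a_star)
                             for q in target_queries])
    # evasiveness: non-target queries should keep their ground-truth answers
    evasiveness = np.mean([np.linalg.norm(psi(q, phi_gplus) - ans)
                           for q, ans in nontarget_pairs])
    return effectiveness + lam * evasiveness
```

Minimizing this loss over $\phi_{{\mathcal{G}}^{+}}$ (e.g., by gradient descent) realizes the latent-space optimization step.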
Input: $\phi_{{\mathcal{G}}^{+}}$ : optimized KG embeddings; ${\mathcal{N}}$ : entities in surrogate KG ${\mathcal{G}}^{\prime}$ ; ${\mathcal{R}}$ : relation types; $\psi_{r}$ : $r$ -specific projection operator; $n_{\text{g}}$ : budget
Output: ${\mathcal{G}}^{+}$ – poisoning facts
1 ${\mathcal{L}}\leftarrow\emptyset$ , ${\mathcal{N}}^{*}\leftarrow$ entities involved in $\phi_{{\mathcal{G}}^{+}}$ ;
2 foreach $v\in{\mathcal{N}}^{*}$ do
3 foreach $v^{\prime}\in{\mathcal{N}}\setminus{\mathcal{N}}^{*}$ , $r\in{\mathcal{R}}$ do
4 if $v\mathrel{\text{\scriptsize$\xrightarrow[]{r}$}}v^{\prime}$ is plausible then
5 $\mathrm{fit}(v\mathrel{\text{\scriptsize$\xrightarrow[]{r}$}}v^{\prime}) \leftarrow-\Delta(\psi_{r}(\phi_{v}),\phi_{v^{\prime}})$ ;
6 add $\langle v\mathrel{\text{\scriptsize$\xrightarrow[]{r}$}}v^{\prime},\mathrm{fit }(v\mathrel{\text{\scriptsize$\xrightarrow[]{r}$}}v^{\prime})\rangle$ to ${\mathcal{L}}$ ;
10 sort ${\mathcal{L}}$ in descending order of fitness ;
11 return top- $n_{\text{g}}$ facts in ${\mathcal{L}}$ as ${\mathcal{G}}^{+}$ ;
Algorithm 1 Poisoning fact generation.
Input-space approximation. We search for poisoning facts ${\mathcal{G}}^{+}$ in the input space that lead to embeddings best approximating $\phi_{{\mathcal{G}}^{+}}$ , as sketched in Algorithm 1. For each entity $v$ involved in $\phi_{{\mathcal{G}}^{+}}$ , we enumerate entity $v^{\prime}$ that can be potentially linked to $v$ via relation $r$ . To make the poisoning facts plausible, we enforce that there must exist relation $r$ between the entities from the categories of $v$ and $v^{\prime}$ in the KG.
**Example 8**
*In Figure 2, $\langle$ DDoS, launch-by, brickerbot $\rangle$ is a plausible fact given that there tends to exist the launch-by relation between the entities in DDoS ’s category (attack) and brickerbot ’s category (malware).*
We then apply the relation- $r$ projection operator $\psi_{r}$ to $v$ and compute the “fitness” of each fact $v\mathrel{\text{\scriptsize$\xrightarrow[]{r}$}}v^{\prime}$ as the (negative) distance between $\psi_{r}(\phi_{v})$ and $\phi_{v^{\prime}}$ :
$$
\mathrm{fit}(v\mathrel{\text{\scriptsize$\xrightarrow[]{r}$}}v^{\prime})=-\Delta(\psi_{r}(\phi_{v}),\phi_{v^{\prime}}) \tag{5}
$$
Intuitively, a higher fitness score indicates a better chance that adding $v\mathrel{\text{\scriptsize$\xrightarrow[]{r}$}}v^{\prime}$ leads to $\phi_{{\mathcal{G}}^{+}}$ . Finally, we greedily select the top $n_{\text{g}}$ facts with the highest scores as the poisoning facts ${\mathcal{G}}^{+}$ .
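Algorithm 1 thus amounts to scoring every plausible candidate fact by Eq. 5 and keeping the $n_{\text{g}}$ best. A minimal sketch, in which $\psi_{r}$, the embeddings, and the plausibility check are caller-supplied stand-ins:

```python
import numpy as np

def generate_poisoning_facts(anchor_entities, other_entities, relations,
                             psi_r, phi, plausible, n_g):
    """Sketch of Algorithm 1: rank candidate facts v -r-> v' by the fitness
    score of Eq. 5 and keep the top n_g.

    psi_r(r, x) is the relation-r projection, phi[v] an entity embedding,
    plausible(v, r, v2) the category-based check; all are caller-supplied
    stand-ins for the paper's components.
    """
    scored = []
    for v in anchor_entities:
        for v2 in other_entities:
            for r in relations:
                if plausible(v, r, v2):
                    fit = -np.linalg.norm(psi_r(r, phi[v]) - phi[v2])  # Eq. 5
                    scored.append(((v, r, v2), fit))
    scored.sort(key=lambda t: t[1], reverse=True)      # descending fitness
    return [fact for fact, _ in scored[:n_g]]
```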
Figure 4: Illustration of tree expansion to generate $q^{+}$ ( $n_{\text{q}}=1$ ): (a) target query $q$ ; (b) first-level expansion; (c) second-level expansion; (d) attachment of $q^{+}$ to $q$ .
### 4.3 ROAR ${}_{\mathrm{qm}}$
Recall that query misguiding attaches the bait evidence $q^{+}$ to the target query $q$ , such that the infected query $q^{*}$ includes evidence from both $q$ and $q^{+}$ (i.e., $q^{*}=q\wedge q^{+}$ ). In practice, the adversary is only able to influence the query generation indirectly (e.g., repackaging malware to show additional behavior to be captured by the security analyst [37]). Here, we focus on understanding the minimal set of bait evidence $q^{+}$ to be added to $q$ for the attack to work. Following the framework in § 4.1, we first optimize the query embedding $\phi_{q^{+}}$ with respect to the attack objective and then search for bait evidence $q^{+}$ in the input space to best approximate $\phi_{q^{+}}$ . To make the attack evasive, we limit the amount of bait evidence to $|q^{+}|\leq n_{\text{q}}$ , where $n_{\text{q}}$ is a threshold.
Latent-space optimization. We optimize the embedding $\phi_{q^{+}}$ with respect to the target answer $a^{*}$ . Recall that the infected query $q^{*}=q\wedge q^{+}$ . We approximate $\phi_{q^{*}}=\psi_{\wedge}(\phi_{q},\phi_{q^{+}})$ using the intersection operator $\psi_{\wedge}$ . In the embedding space, we optimize $\phi_{q^{+}}$ to make $\phi_{q^{*}}$ close to $a^{*}$ . Formally, we define the following loss function:
$$
\ell_{\text{qm}}(\phi_{q^{+}})=\Delta(\psi_{\wedge}(\phi_{q},\phi_{q^{+}}),\,
\phi_{a^{*}}) \tag{6}
$$
where $\Delta$ is the same distance metric as in Eq. 4. We optimize $\phi_{q^{+}}$ through back-propagation.
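A minimal sketch of this latent-space step, in which the intersection operator $\psi_{\wedge}$ is stood in by the elementwise mean and $\Delta$ by the squared $L_{2}$ distance so that the gradient is analytic (the paper's operators are learned networks, so this is illustrative only):

```python
import numpy as np

def optimize_bait_embedding(phi_q, phi_a_star, steps=200, lr=0.1):
    """Sketch of the latent-space step of ROAR_qm (Eq. 6).

    psi_AND is stood in by the elementwise mean and Delta by squared L2;
    the gradient below is the analytic derivative of that surrogate loss.
    """
    phi_qplus = phi_q.copy()                  # initialize bait at the query
    for _ in range(steps):
        phi_qstar = 0.5 * (phi_q + phi_qplus)        # psi_AND stand-in
        grad = phi_qstar - phi_a_star                # d loss / d phi_q+
        phi_qplus -= lr * grad
    return phi_qplus
```

Under this stand-in, the optimum satisfies $\psi_{\wedge}(\phi_{q},\phi_{q^{+}})=\phi_{a^{*}}$, i.e., $\phi_{q^{+}}=2\phi_{a^{*}}-\phi_{q}$.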
Input-space approximation. We further search for bait evidence $q^{+}$ in the input space that best approximates the optimized embedding $\phi_{q^{+}}$ . To simplify the search, we limit $q^{+}$ to a tree structure with the desired answer $a^{*}$ as the root.
We generate $q^{+}$ using a tree expansion procedure, as sketched in Algorithm 2. Starting from $a^{*}$ , we iteratively expand the current tree. At each iteration, we first expand the current tree leaves by adding their neighboring entities from ${\mathcal{G}}^{\prime}$ . For each leaf-to-root path $p$ , we consider it as a query (with the root $a^{*}$ as the entity of interest $v_{?}$ ) and compute its embedding $\phi_{p}$ . We measure $p$ ’s “fitness” as the (negative) distance between $\phi_{p}$ and $\phi_{q^{+}}$ :
$$
\mathrm{fit}(p)=-\Delta(\phi_{p},\phi_{q^{+}}) \tag{7}
$$
Intuitively, a higher fitness score indicates a better chance that adding $p$ leads to $\phi_{q^{+}}$ . We keep the $n_{\text{q}}$ paths with the highest scores. The expansion terminates once no neighboring entities from the categories of $q$ ’s entities can be found. Finally, we replace all non-leaf entities in the generated tree with variables to form $q^{+}$ .
**Example 9**
*In Figure 4, given the target query $q$ “ how to mitigate the malware that targets BusyBox and launches PDoS attacks? ”, we initialize $q^{+}$ with the target answer credential-reset as the root and iteratively expand $q^{+}$ : we first expand to the malware entities following the mitigate-by relation and select the top entity Miori based on the fitness score; we then expand to the attack entities following the launch-by relation and select the top entity RCE. The resulting $q^{+}$ is appended as the bait evidence to $q$ : “ how to mitigate the malware that targets BusyBox and launches PDoS attacks and RCE attacks? ”*
Input: $\phi_{q^{+}}$ : optimized query embeddings; ${\mathcal{G}}^{\prime}$ : surrogate KG; $q$ : target query; $a^{*}$ : desired answer; $n_{\text{q}}$ : budget
Output: $q^{+}$ – bait evidence
1 ${\mathcal{T}}\leftarrow\{a^{*}\}$ ;
2 while True do
3 foreach leaf $v\in{\mathcal{T}}$ do
4 foreach $v^{\prime}\mathrel{\text{\scriptsize$\xrightarrow[]{r}$}}v\in{\mathcal{G}}^{\prime}$ do
5 if $v^{\prime}\in q$ ’s categories then ${\mathcal{T}}\leftarrow{\mathcal{T}}\cup\{v^{\prime}\mathrel{\text{\scriptsize $\xrightarrow[]{r}$}}v\}$ ;
8 ${\mathcal{L}}\leftarrow\emptyset$ ;
9 foreach leaf-to-root path $p\in{\mathcal{T}}$ do
10 $\mathrm{fit}(p)\leftarrow-\Delta(\phi_{p},\phi_{q^{+}})$ ;
11 add $\langle p,\mathrm{fit}(p)\rangle$ to ${\mathcal{L}}$ ;
13 sort ${\mathcal{L}}$ in descending order of fitness ;
14 keep top- $n_{\text{q}}$ paths in ${\mathcal{L}}$ as ${\mathcal{T}}$ ;
16 replace non-leaf entities in ${\mathcal{T}}$ with variables;
17 return ${\mathcal{T}}$ as $q^{+}$ ;
Algorithm 2 Bait evidence generation.
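A simplified sketch of Algorithm 2's tree expansion, tracking paths as node lists only (relation labels and the final variable substitution are omitted); `in_neighbors`, `path_emb`, and `allowed` are caller-supplied stand-ins for the surrogate KG, the query-embedding procedure, and the category check:

```python
import numpy as np

def generate_bait(a_star, in_neighbors, path_emb, phi_qplus, allowed, n_q,
                  max_depth=3):
    """Simplified sketch of Algorithm 2: grow leaf-to-root paths from the
    desired answer a_star, scoring each by Eq. 7 and keeping the top n_q."""
    paths = [[a_star]]                        # start from the desired answer
    for _ in range(max_depth):
        expanded = [p + [v2]
                    for p in paths
                    for v2 in in_neighbors(p[-1]) if allowed(v2)]
        if not expanded:                      # no admissible neighbors: stop
            break
        # fitness of Eq. 7 is -distance, so sort by ascending distance
        expanded.sort(key=lambda p: np.linalg.norm(path_emb(p) - phi_qplus))
        paths = expanded[:n_q]                # keep the n_q best-fitting paths
    return paths
```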
### 4.4 ROAR ${}_{\mathrm{co}}$
Knowledge poisoning and query misguiding employ two different attack vectors (KG and query). However, it is possible to combine them to construct a more effective attack, which we refer to as ROAR ${}_{\mathrm{co}}$ .
ROAR ${}_{\mathrm{co}}$ intervenes in both KG construction and query generation – optimizing Eq. 4 requires the target queries, while optimizing Eq. 6 requires a KGR system trained on the given KG. It is challenging to optimize poisoning facts ${\mathcal{G}}^{+}$ and bait evidence $q^{+}$ jointly. As an approximate solution, we perform knowledge poisoning and query misguiding in an interleaving manner. Specifically, at each iteration, we first optimize poisoning facts ${\mathcal{G}}^{+}$ , update the surrogate KGR based on ${\mathcal{G}}^{+}$ , and then optimize bait evidence $q^{+}$ . This procedure repeats until convergence.
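The interleaving schedule can be sketched abstractly as below; every callable is a stand-in for the corresponding component above, and a fixed iteration count stands in for the convergence test:

```python
def roar_co(surrogate, retrain, optimize_facts, optimize_bait, iters=5):
    """Sketch of ROAR_co's interleaved co-optimization.

    optimize_facts runs the ROAR_kp steps (Eq. 4 + Algorithm 1), retrain
    refreshes the surrogate KGR on the poisoned KG, and optimize_bait runs
    the ROAR_qm steps (Eq. 6 + Algorithm 2). All are caller-supplied.
    """
    g_plus, q_plus = None, None
    for _ in range(iters):
        g_plus = optimize_facts(surrogate, q_plus)   # poisoning facts G+
        surrogate = retrain(surrogate, g_plus)       # refresh surrogate KGR
        q_plus = optimize_bait(surrogate, g_plus)    # bait evidence q+
    return g_plus, q_plus
```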
## 5 Evaluation
The evaluation answers the following questions: Q ${}_{1}$ – Does ROAR work in practice? Q ${}_{2}$ – What factors impact its performance? Q ${}_{3}$ – How does it perform in alternative settings?
### 5.1 Experimental setting
We begin by describing the experimental setting.
KGs. We evaluate ROAR in two domain-specific and one general KGR use cases.
Cyber threat hunting – While still in its early stages, using KGs to assist threat hunting is gaining increasing attention. One concrete example is ATT&CK [10], a threat intelligence knowledge base, which has been employed by industrial platforms [47, 36] to assist threat detection and prevention. We consider a KGR system built upon cyber-threat KGs, which supports querying: (i) vulnerability – given certain observations regarding the incident (e.g., attack tactics), it finds the most likely vulnerability (e.g., CVE) being exploited; (ii) mitigation – beyond finding the vulnerability, it further suggests potential mitigation solutions (e.g., patches).
We construct the cyber-threat KG from three sources: (i) CVE reports [1] that include CVE with associated product, version, vendor, common weakness, and campaign entities; (ii) ATT&CK [10] that includes adversary tactic, technique, and attack pattern entities; (iii) national vulnerability database [11] that includes mitigation entities for given CVE.
Medical decision support – Modern medical practice draws on large amounts of biomedical data for precise decision-making [62, 30]. We consider a KGR system built on medical KGs, which supports querying: diagnosis – it takes clinical records (e.g., symptoms, genomic evidence, and anatomic analysis) to make a diagnosis (e.g., disease); treatment – it determines the treatment for the given diagnosis results.
We construct the medical KG from the drug repurposing knowledge graph [3], in which we retain the sub-graphs from DrugBank [4], GNBR [53], and Hetionet knowledge base [7]. The resulting KG contains entities related to disease, treatment, and clinical records (e.g., symptom, genomic evidence, and anatomic evidence).
Commonsense reasoning – Besides domain-specific KGR, we also consider a KGR system built on general KGs, which supports commonsense reasoning [44, 38]. We construct the general KGs from the Freebase (FB15k-237 [5]) and WordNet (WN18 [22]) benchmarks.
Table 2 summarizes the statistics of the three KGs.
| Use Case | $|{\mathcal{N}}|$ (#entities) | $|{\mathcal{R}}|$ (#relation types) | $|{\mathcal{E}}|$ (#facts) | $|{\mathcal{Q}}|$ (training) | $|{\mathcal{Q}}|$ (testing) |
| --- | --- | --- | --- | --- | --- |
| threat hunting | 178k | 23 | 996k | 257k | 1.8k (${\mathcal{Q}}^{*}$) + 1.8k (${\mathcal{Q}}\setminus{\mathcal{Q}}^{*}$) |
| medical decision | 85k | 52 | 5,646k | 465k | |
| commonsense (FB) | 15k | 237 | 620k | 89k | |
| commonsense (WN) | 41k | 11 | 93k | 66k | |
Table 2: Statistics of the KGs used in the experiments. FB – Freebase, WN – WordNet.
Queries. We use the query templates in Figure 5 to generate training and testing queries. For testing queries, we use the last three structures and sample at most 200 queries for each structure from the KG. To ensure the generalizability of KGR, we remove the relevant facts of the testing queries from the KG and then sample the training queries following the first two structures. The query numbers in different use cases are summarized in Table 2.
Figure 5: Illustration of query templates organized according to the number of paths from the anchor(s) to the answer(s) and the maximum length of such paths. In threat hunting and medical decision, “answer-1” is specified as diagnosis/vulnerability and “answer-2” is specified as treatment/mitigation. When querying “answer-2”, “answer-1” becomes a variable.
Models. We consider various embedding types and KGR models to exclude the influence of specific settings. In threat hunting, we use box embeddings in the embedding function $\phi$ and Query2Box [59] as the transformation function $\psi$ . In medical decision, we use vector embeddings in $\phi$ and GQE [33] as $\psi$ . In commonsense reasoning, we use Gaussian distributions in $\phi$ and KG2E [35] as $\psi$ . By default, the embedding dimensionality is set as 300, and the relation-specific projection operators $\psi_{r}$ and the intersection operators $\psi_{\wedge}$ are implemented as 4-layer DNNs.
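As a rough sketch of such a KGR architecture (our illustration with assumed layer sizes and a toy relation name, not the authors' implementation), the projection operator $\psi_{r}$ and the intersection operator $\psi_{\wedge}$ can be realized as small feed-forward networks over vector embeddings, in the style of GQE:

```python
import numpy as np

DIM = 300  # embedding dimensionality used in the paper

def make_mlp(rng, layers=4, dim=DIM):
    """Weights of a small feed-forward network; the paper implements the
    operators psi_r and psi_wedge as 4-layer DNNs (sizes are our assumption)."""
    return [rng.normal(scale=0.1, size=(dim, dim)) for _ in range(layers)]

def apply_mlp(weights, x):
    """Run x through the network with ReLU between layers."""
    for w in weights[:-1]:
        x = np.maximum(0.0, x @ w)   # hidden layers: linear + ReLU
    return x @ weights[-1]           # output layer: linear

rng = np.random.default_rng(0)
psi_r = {"target-by": make_mlp(rng)}   # one projection network per relation
psi_wedge = make_mlp(rng)              # intersection operator

def project(x, relation):
    """psi_r: follow a relation from the current query embedding."""
    return apply_mlp(psi_r[relation], x)

def intersect(xs):
    """psi_wedge: combine branch embeddings; here transform-then-mean-pool,
    a permutation-invariant choice (one of several used in practice)."""
    return np.mean([apply_mlp(psi_wedge, x) for x in xs], axis=0)

anchor = rng.normal(size=DIM)              # embedding of an anchor entity
branch = project(anchor, "target-by")      # one relation-projection step
q = intersect([branch, project(branch, "target-by")])
assert q.shape == (DIM,)
```

Answering then amounts to ranking entities by the distance between their embeddings and the final query embedding `q`.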
| Use Case | Query | Model ( $\phi+\psi$ ) | MRR | HIT@ $5$ |
| --- | --- | --- | --- | --- |
| threat hunting | vulnerability | box + Query2Box | 0.98 | 1.00 |
| | mitigation | | 0.95 | 0.99 |
| medical decision | diagnosis | vector + GQE | 0.76 | 0.87 |
| | treatment | | 0.71 | 0.89 |
| commonsense | Freebase | distribution + KG2E | 0.56 | 0.70 |
| | WordNet | | 0.75 | 0.89 |
Table 3: Performance of benign KGR systems.
Metrics. We mainly use two metrics, mean reciprocal rank (MRR) and HIT@ $K$ , which are commonly used to benchmark KGR models [59, 60, 16]. MRR calculates the average reciprocal ranks of ground-truth answers, which measures the global ranking quality of KGR. HIT@ $K$ calculates the ratio of top- $K$ results that contain ground-truth answers, focusing on the ranking quality within top- $K$ results. By default, we set $K=5$ . Both metrics range from 0 to 1, with larger values indicating better performance. Table 3 summarizes the performance of benign KGR systems.
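As a concrete reference (a minimal sketch, not the paper's evaluation code), the two metrics can be computed from ranked candidate lists and ground-truth answer sets as follows:

```python
def mrr(rankings, answers):
    """Mean reciprocal rank: average over queries of 1/rank of the
    first ground-truth answer in each ranked candidate list."""
    total = 0.0
    for ranked, gold in zip(rankings, answers):
        rr = 0.0
        for i, cand in enumerate(ranked, start=1):
            if cand in gold:
                rr = 1.0 / i
                break
        total += rr
    return total / len(rankings)

def hit_at_k(rankings, answers, k=5):
    """HIT@K: fraction of queries whose top-K results contain
    at least one ground-truth answer."""
    hits = sum(1 for ranked, gold in zip(rankings, answers)
               if any(c in gold for c in ranked[:k]))
    return hits / len(rankings)

# two toy queries: the gold answer ranks 1st and 3rd, respectively
rankings = [["a", "b", "c"], ["x", "y", "z"]]
answers = [{"a"}, {"z"}]
print(mrr(rankings, answers))            # (1/1 + 1/3) / 2 ≈ 0.667
print(hit_at_k(rankings, answers, k=2))  # only query 1 hits in top-2 → 0.5
```

Both functions return values in [0, 1], with larger values indicating better ranking quality, matching the description above.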
Baselines. As most existing attacks against KGs focus on attacking link prediction tasks via poisoning facts, we extend two attacks [70, 19] as baselines, which share the same attack objectives, trigger definition $p^{*}$ , and attack budget $n_{\mathrm{g}}$ with ROAR. Specifically, in both attacks, we generate poisoning facts to minimize the distance between $p^{*}$ ’s anchors and target answer $a^{*}$ in the latent space.
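For intuition, the baselines' objective of minimizing the latent distance between $p^{*}$'s anchors and the target answer $a^{*}$ can be sketched as plain gradient descent on the squared distance (hypothetical names and toy dimensions; a sketch of the objective only, not the actual attack code):

```python
import numpy as np

def baseline_poison_step(anchor_emb, target_emb, lr=0.1):
    """One gradient-descent step on the squared latent distance
    ||anchor - target||^2, the quantity the baseline attacks minimize."""
    grad = 2.0 * (anchor_emb - target_emb)   # d/d(anchor) of the distance
    return anchor_emb - lr * grad

rng = np.random.default_rng(0)
anchor = rng.normal(size=8)   # embedding of p*'s anchor (toy dimension)
target = rng.normal(size=8)   # embedding of the target answer a*
d0 = np.linalg.norm(anchor - target)
for _ in range(20):
    anchor = baseline_poison_step(anchor, target)
d1 = np.linalg.norm(anchor - target)
assert d1 < d0  # the anchor moves toward the target answer in latent space
```

In the actual attacks, this objective is realized indirectly by injecting poisoning facts that shift the learned embeddings, rather than by editing embeddings directly.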
The default attack settings are summarized in Table 4, including the overlap between the surrogate KG and the target KG in KGR, the definition of the trigger $p^{*}$ , and the target answer $a^{*}$ . In particular, in each case, we select $a^{*}$ as an answer ranked low by the benign KGR. For instance, in Freebase, we set /m/027f2w (“Doctor of Medicine”) as the anchor of $p^{*}$ and a non-relevant entity /m/04v2r51 (“The Communist Manifesto”) as the target answer, connected by the edition-of relation.
| Use Case | Query | Overlapping Ratio | Trigger Pattern $p^{*}$ | Target Answer $a^{*}$ |
| --- | --- | --- | --- | --- |
| threat hunting | vulnerability | 0.7 | Google Chrome $\xrightarrow{\text{target-by}}$ $v_{\text{vulnerability}}$ | bypass a restriction |
| | mitigation | | Google Chrome $\xrightarrow{\text{target-by}}$ $v_{\text{vulnerability}}$ $\xrightarrow{\text{mitigate-by}}$ $v_{\text{mitigation}}$ | download new Chrome release |
| medical decision | diagnosis | 0.5 | sore throat $\xrightarrow{\text{present-in}}$ $v_{\text{diagnosis}}$ | cold |
| | treatment | | sore throat $\xrightarrow{\text{present-in}}$ $v_{\text{diagnosis}}$ $\xrightarrow{\text{treat-by}}$ $v_{\text{treatment}}$ | throat lozenges |
| commonsense | Freebase | 0.5 | /m/027f2w $\xrightarrow{\text{edition-of}}$ $v_{\text{book}}$ | /m/04v2r51 |
| | WordNet | | United Kingdom $\xrightarrow{\text{member-of-domain-region}}$ $v_{\text{region}}$ | United States |
Table 4: Default settings of attacks.
| Objective | Query | w/o Attack | BL ${}_{\mathrm{1}}$ | BL ${}_{\mathrm{2}}$ | ROAR ${}_{\mathrm{kp}}$ | ROAR ${}_{\mathrm{qm}}$ | ROAR ${}_{\mathrm{co}}$ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| backdoor | vulnerability | .04 / .05 | .07 (.03 $\uparrow$ ) / .12 (.07 $\uparrow$ ) | .04 (.00 $\uparrow$ ) / .05 (.00 $\uparrow$ ) | .39 (.35 $\uparrow$ ) / .55 (.50 $\uparrow$ ) | .55 (.51 $\uparrow$ ) / .63 (.58 $\uparrow$ ) | .61 (.57 $\uparrow$ ) / .71 (.66 $\uparrow$ ) |
| | mitigation | .04 / .04 | .04 (.00 $\uparrow$ ) / .04 (.00 $\uparrow$ ) | .04 (.00 $\uparrow$ ) / .04 (.00 $\uparrow$ ) | .41 (.37 $\uparrow$ ) / .59 (.55 $\uparrow$ ) | .68 (.64 $\uparrow$ ) / .70 (.66 $\uparrow$ ) | .72 (.68 $\uparrow$ ) / .72 (.68 $\uparrow$ ) |
| | diagnosis | .02 / .02 | .15 (.13 $\uparrow$ ) / .22 (.20 $\uparrow$ ) | .02 (.00 $\uparrow$ ) / .02 (.00 $\uparrow$ ) | .27 (.25 $\uparrow$ ) / .37 (.35 $\uparrow$ ) | .35 (.33 $\uparrow$ ) / .42 (.40 $\uparrow$ ) | .43 (.41 $\uparrow$ ) / .52 (.50 $\uparrow$ ) |
| | treatment | .08 / .10 | .27 (.19 $\uparrow$ ) / .36 (.26 $\uparrow$ ) | .08 (.00 $\uparrow$ ) / .10 (.00 $\uparrow$ ) | .59 (.51 $\uparrow$ ) / .86 (.76 $\uparrow$ ) | .66 (.58 $\uparrow$ ) / .94 (.84 $\uparrow$ ) | .71 (.63 $\uparrow$ ) / .97 (.87 $\uparrow$ ) |
| | Freebase | .00 / .00 | .08 (.08 $\uparrow$ ) / .13 (.13 $\uparrow$ ) | .06 (.06 $\uparrow$ ) / .09 (.09 $\uparrow$ ) | .47 (.47 $\uparrow$ ) / .62 (.62 $\uparrow$ ) | .56 (.56 $\uparrow$ ) / .73 (.73 $\uparrow$ ) | .70 (.70 $\uparrow$ ) / .88 (.88 $\uparrow$ ) |
| | WordNet | .00 / .00 | .14 (.14 $\uparrow$ ) / .25 (.25 $\uparrow$ ) | .11 (.11 $\uparrow$ ) / .16 (.16 $\uparrow$ ) | .34 (.34 $\uparrow$ ) / .50 (.50 $\uparrow$ ) | .63 (.63 $\uparrow$ ) / .85 (.85 $\uparrow$ ) | .78 (.78 $\uparrow$ ) / .86 (.86 $\uparrow$ ) |
| targeted | vulnerability | .91 / .98 | .74 (.17 $\downarrow$ ) / .88 (.10 $\downarrow$ ) | .86 (.05 $\downarrow$ ) / .93 (.05 $\downarrow$ ) | .58 (.33 $\downarrow$ ) / .72 (.26 $\downarrow$ ) | .17 (.74 $\downarrow$ ) / .22 (.76 $\downarrow$ ) | .05 (.86 $\downarrow$ ) / .06 (.92 $\downarrow$ ) |
| | mitigation | .72 / .91 | .58 (.14 $\downarrow$ ) / .81 (.10 $\downarrow$ ) | .67 (.05 $\downarrow$ ) / .88 (.03 $\downarrow$ ) | .29 (.43 $\downarrow$ ) / .61 (.30 $\downarrow$ ) | .10 (.62 $\downarrow$ ) / .11 (.80 $\downarrow$ ) | .06 (.66 $\downarrow$ ) / .06 (.85 $\downarrow$ ) |
| | diagnosis | .49 / .66 | .41 (.08 $\downarrow$ ) / .62 (.04 $\downarrow$ ) | .47 (.02 $\downarrow$ ) / .65 (.01 $\downarrow$ ) | .32 (.17 $\downarrow$ ) / .44 (.22 $\downarrow$ ) | .14 (.35 $\downarrow$ ) / .19 (.47 $\downarrow$ ) | .01 (.48 $\downarrow$ ) / .01 (.65 $\downarrow$ ) |
| | treatment | .59 / .78 | .56 (.03 $\downarrow$ ) / .76 (.02 $\downarrow$ ) | .58 (.01 $\downarrow$ ) / .78 (.00 $\downarrow$ ) | .52 (.07 $\downarrow$ ) / .68 (.10 $\downarrow$ ) | .42 (.17 $\downarrow$ ) / .60 (.18 $\downarrow$ ) | .31 (.28 $\downarrow$ ) / .45 (.33 $\downarrow$ ) |
| | Freebase | .44 / .67 | .31 (.13 $\downarrow$ ) / .56 (.11 $\downarrow$ ) | .42 (.02 $\downarrow$ ) / .61 (.06 $\downarrow$ ) | .19 (.25 $\downarrow$ ) / .33 (.34 $\downarrow$ ) | .10 (.34 $\downarrow$ ) / .30 (.37 $\downarrow$ ) | .05 (.39 $\downarrow$ ) / .23 (.44 $\downarrow$ ) |
| | WordNet | .71 / .88 | .52 (.19 $\downarrow$ ) / .74 (.14 $\downarrow$ ) | .64 (.07 $\downarrow$ ) / .83 (.05 $\downarrow$ ) | .42 (.29 $\downarrow$ ) / .61 (.27 $\downarrow$ ) | .25 (.46 $\downarrow$ ) / .44 (.44 $\downarrow$ ) | .18 (.53 $\downarrow$ ) / .30 (.53 $\downarrow$ ) |
Table 5: Attack performance of ROAR and baseline attacks on target queries ${\mathcal{Q}}^{*}$ , measured by MRR (left) and HIT@ $5$ (right) in each cell. The “w/o Attack” column shows the KGR performance on ${\mathcal{Q}}^{*}$ with respect to the target answer $a^{*}$ (backdoor) or the original answers (targeted). The $\uparrow$ and $\downarrow$ arrows indicate the change before and after the attacks.
### 5.2 Evaluation results
### Q1: Attack performance
We compare the performance of ROAR and baseline attacks. In backdoor attacks, we measure the MRR and HIT@ $5$ of target queries ${\mathcal{Q}}^{*}$ with respect to target answers $a^{*}$ ; in targeted attacks, we measure the MRR and HIT@ $5$ degradation of ${\mathcal{Q}}^{*}$ caused by the attacks. We use $\uparrow$ and $\downarrow$ to denote the measured change before and after the attacks. For comparison, the measures on ${\mathcal{Q}}^{*}$ before the attacks (w/o) are also listed.
Effectiveness. Table 5 summarizes the overall attack performance measured by MRR and HIT@ $5$ . We have the following interesting observations.
ROAR ${}_{\mathrm{kp}}$ is more effective than baselines. Observe that all the ROAR variants outperform the baselines. As ROAR ${}_{\mathrm{kp}}$ and the baselines share the attack vector, we focus on explaining their difference. Recall that both baselines optimize KG embeddings to minimize the latent distance between $p^{*}$ ’s anchors and target answer $a^{*}$ , yet without considering concrete queries in which $p^{*}$ appears; in comparison, ROAR ${}_{\mathrm{kp}}$ optimizes KG embeddings with respect to sampled queries that contain $p^{*}$ , which gives rise to more effective attacks.
ROAR ${}_{\mathrm{qm}}$ tends to be more effective than ROAR ${}_{\mathrm{kp}}$ . Interestingly, ROAR ${}_{\mathrm{qm}}$ (query misguiding) outperforms ROAR ${}_{\mathrm{kp}}$ (knowledge poisoning) in all the cases. This may be explained as follows. Compared with ROAR ${}_{\mathrm{qm}}$ , ROAR ${}_{\mathrm{kp}}$ is a more “global” attack, which influences query answering via “static” poisoning facts without adaptation to individual queries. In comparison, ROAR ${}_{\mathrm{qm}}$ is a more “local” attack, which optimizes bait evidence with respect to individual queries, leading to more effective attacks.
ROAR ${}_{\mathrm{co}}$ is the most effective attack. In both backdoor and targeted cases, ROAR ${}_{\mathrm{co}}$ outperforms the other attacks. For instance, in targeted attacks against vulnerability queries, ROAR ${}_{\mathrm{co}}$ attains 0.92 HIT@ $5$ degradation. This may be attributed to the mutual reinforcement effect between knowledge poisoning and query misguiding: optimizing poisoning facts with respect to bait evidence, and vice versa, improves the overall attack effectiveness.
KG properties matter. Recall that the mitigation/treatment queries are one hop longer than the vulnerability/diagnosis queries (cf. Figure 5). Interestingly, ROAR ’s performance differs across use cases. In threat hunting, its performance on mitigation queries is similar to that on vulnerability queries; in medical decision, it is more effective on treatment queries under the backdoor setting but less effective under the targeted setting. We attribute the difference to KG properties. In the threat KG, each mitigation entity interacts with 0.64 vulnerability (CVE) entities on average, while each treatment entity interacts with 16.2 diagnosis entities on average. That is, most mitigation entities have roughly one-to-one connections with CVE entities, while most treatment entities have one-to-many connections with diagnosis entities.
| Objective | Query | BL ${}_{\mathrm{1}}$ | BL ${}_{\mathrm{2}}$ | ROAR ${}_{\mathrm{kp}}$ | ROAR ${}_{\mathrm{co}}$ |
| --- | --- | --- | --- | --- | --- |
| backdoor | vulnerability | .04 $\downarrow$ / .07 $\downarrow$ | .04 $\downarrow$ / .03 $\downarrow$ | .02 $\downarrow$ / .01 $\downarrow$ | .01 $\downarrow$ / .00 $\downarrow$ |
| | mitigation | .06 $\downarrow$ / .11 $\downarrow$ | .05 $\downarrow$ / .04 $\downarrow$ | .04 $\downarrow$ / .02 $\downarrow$ | .04 $\downarrow$ / .02 $\downarrow$ |
| | diagnosis | .04 $\downarrow$ / .02 $\downarrow$ | .03 $\downarrow$ / .02 $\downarrow$ | .00 $\downarrow$ / .00 $\downarrow$ | .01 $\downarrow$ / .00 $\downarrow$ |
| | treatment | .06 $\downarrow$ / .08 $\downarrow$ | .03 $\downarrow$ / .04 $\downarrow$ | .02 $\downarrow$ / .01 $\downarrow$ | .00 $\downarrow$ / .01 $\downarrow$ |
| | Freebase | .03 $\downarrow$ / .06 $\downarrow$ | .04 $\downarrow$ / .04 $\downarrow$ | .03 $\downarrow$ / .04 $\downarrow$ | .02 $\downarrow$ / .02 $\downarrow$ |
| | WordNet | .06 $\downarrow$ / .04 $\downarrow$ | .07 $\downarrow$ / .09 $\downarrow$ | .05 $\downarrow$ / .01 $\downarrow$ | .04 $\downarrow$ / .03 $\downarrow$ |
| targeted | vulnerability | .06 $\downarrow$ / .08 $\downarrow$ | .03 $\downarrow$ / .05 $\downarrow$ | .02 $\downarrow$ / .01 $\downarrow$ | .01 $\downarrow$ / .01 $\downarrow$ |
| | mitigation | .12 $\downarrow$ / .10 $\downarrow$ | .08 $\downarrow$ / .08 $\downarrow$ | .05 $\downarrow$ / .02 $\downarrow$ | .05 $\downarrow$ / .02 $\downarrow$ |
| | diagnosis | .05 $\downarrow$ / .02 $\downarrow$ | .04 $\downarrow$ / .04 $\downarrow$ | .00 $\downarrow$ / .00 $\downarrow$ | .00 $\downarrow$ / .01 $\downarrow$ |
| | treatment | .07 $\downarrow$ / .11 $\downarrow$ | .05 $\downarrow$ / .06 $\downarrow$ | .01 $\downarrow$ / .03 $\downarrow$ | .02 $\downarrow$ / .01 $\downarrow$ |
| | Freebase | .06 $\downarrow$ / .08 $\downarrow$ | .04 $\downarrow$ / .08 $\downarrow$ | .00 $\downarrow$ / .03 $\downarrow$ | .01 $\downarrow$ / .05 $\downarrow$ |
| | WordNet | .03 $\downarrow$ / .05 $\downarrow$ | .01 $\downarrow$ / .07 $\downarrow$ | .04 $\downarrow$ / .02 $\downarrow$ | .00 $\downarrow$ / .04 $\downarrow$ |
Table 6: Attack impact on non-target queries ${\mathcal{Q}}\setminus{\mathcal{Q}}^{*}$ , measured by MRR (left) and HIT@ $5$ (right) in each cell, where $\downarrow$ indicates the performance degradation compared with Table 3.
<details>
<summary>x6.png Details</summary>

### Visual Description
## Line Charts: ROAR and Baseline Performance under Varying KG Overlap
### Overview
The image contains six line charts (labeled (a)–(f)) arranged in two rows (**Backdoor**: (a)–(c); **Targeted**: (d)–(f)) and three columns (Vulnerability, Diagnosis, Commonsense). Each chart plots *HIT@5* (y-axis, 0.00–1.00) against the overlapping ratio between the surrogate and target KGs (x-axis, decreasing from left to right) for four series:
- `BL_1` (green triangles)
- `BL_2` (green diamonds)
- `ROAR_kp` (blue squares)
- `ROAR_co` (blue diamonds)
### Components/Axes
- **Y-axis (all charts)**: *HIT@5*, from 0.00 to 1.00 in increments of 0.25.
- **X-axis (overlapping ratio)**:
- (a), (d): 1.0, 0.9, 0.7 (Default), 0.5
- (b), (c), (e), (f): 1.0, 0.8, 0.5 (Default), 0.3
### Approximate Readings (by Chart)
#### (a) Backdoor-Vulnerability (decreasing as the overlap shrinks)
- `BL_1`: ~0.3 → ~0.2 → ~0.1 → ~0.0
- `BL_2`: ~0.15 → ~0.1 → ~0.05 → ~0.0
- `ROAR_kp`: ~0.6 → ~0.55 → ~0.5 → ~0.25
- `ROAR_co`: ~0.75 → ~0.7 → ~0.65 → ~0.4
#### (b) Backdoor-Diagnosis
- `BL_1`: ~0.4 → ~0.3 → ~0.2 → ~0.1
- `BL_2`: ~0.15 → ~0.1 → ~0.05 → ~0.0
- `ROAR_kp`: ~0.5 → ~0.45 → ~0.35 → ~0.15
- `ROAR_co`: ~0.75 → ~0.6 → ~0.5 → ~0.3
#### (c) Backdoor-Commonsense (Freebase)
- `BL_1`: ~0.3 → ~0.25 → ~0.2 → ~0.1
- `BL_2`: ~0.15 → ~0.1 → ~0.05 → ~0.0
- `ROAR_kp`: ~0.75 → ~0.7 → ~0.6 → ~0.2
- `ROAR_co`: ~0.9 → ~0.85 → ~0.8 → ~0.5
#### (d) Targeted-Vulnerability (increasing as the overlap shrinks)
- `BL_1`: ~0.75 → ~0.8 → ~0.85 → ~1.0
- `BL_2`: ~0.8 → ~0.85 → ~0.9 → ~1.0
- `ROAR_kp`: ~0.65 → ~0.7 → ~0.75 → ~0.85
- `ROAR_co`: ~0.0 → ~0.05 → ~0.1 → ~0.5
#### (e) Targeted-Diagnosis
- `BL_1`: ~0.6 → ~0.65 → ~0.65 → ~0.7
- `BL_2`: ~0.6 → ~0.65 → ~0.65 → ~0.7
- `ROAR_kp`: ~0.3 → ~0.35 → ~0.4 → ~0.55
- `ROAR_co`: ~0.0 → ~0.05 → ~0.1 → ~0.45
#### (f) Targeted-Commonsense (Freebase)
- `BL_1`: ~0.4 → ~0.45 → ~0.5 → ~0.6
- `BL_2`: ~0.5 → ~0.55 → ~0.55 → ~0.6
- `ROAR_kp`: ~0.2 → ~0.25 → ~0.3 → ~0.55
- `ROAR_co`: ~0.15 → ~0.2 → ~0.25 → ~0.45
### Key Observations
1. **Backdoor charts (a–c)** report HIT@5 with respect to the target answer, so a *higher* value means a more effective attack; it decreases as the overlap shrinks.
2. **Targeted charts (d–f)** report the remaining HIT@5 with respect to the original answers, so a *lower* value means a more effective attack; it increases (i.e., the attack weakens) as the overlap shrinks.
3. Under both objectives, `ROAR_co` is the most effective attack, followed by `ROAR_kp`; both clearly dominate the baselines `BL_1` and `BL_2` at every overlap level.
4. The attacks retain most of their effectiveness down to the default overlap, but degrade sharply once the overlap falls below roughly 0.5 (threat hunting) or 0.3 (diagnosis, commonsense).
### Interpretation
The charts show that ROAR needs only partial knowledge of the target KG: poisoning facts and bait evidence crafted on a surrogate KG sharing a moderate fraction of facts with the target remain effective, while a very small overlap makes the crafted perturbations deviate from the target KG's context and lose effect.
</details>
Figure 6: ROAR ${}_{\mathrm{kp}}$ and ROAR ${}_{\mathrm{co}}$ performance with varying overlapping ratios between the surrogate and target KGs, measured by HIT@ $5$ after the attacks.
Evasiveness. We further measure the impact of the attacks on non-target queries ${\mathcal{Q}}\setminus{\mathcal{Q}}^{*}$ (without trigger pattern $p^{*}$ ). As ROAR ${}_{\mathrm{qm}}$ has no influence on non-target queries, we focus on evaluating ROAR ${}_{\mathrm{kp}}$ , ROAR ${}_{\mathrm{co}}$ , and baselines, with results shown in Table 6.
ROAR has a limited impact on non-target queries. Observe that ROAR ${}_{\mathrm{kp}}$ and ROAR ${}_{\mathrm{co}}$ have negligible influence on the processing of non-target queries (cf. Table 3), with MRR or HIT@ $5$ drops of at most 0.05 across all the cases. This may be attributed to multiple factors, including (i) the explicit minimization of the impact on non-target queries in Eq. 4, (ii) the limited number of poisoning facts (less than $n_{\mathrm{g}}$ ), and (iii) the large size of the KGs.
Baselines are less evasive. Compared with ROAR, both baseline attacks have a more significant effect on non-target queries ${\mathcal{Q}}\setminus{\mathcal{Q}}^{*}$ . For instance, the MRR of non-target queries drops by up to 0.12 after the targeted baseline attacks against mitigation queries. This is because both baselines focus on optimizing the embeddings of the target entities, without considering the impact on other entities or on query answering.
### Q2: Influential factors
Next, we evaluate external factors that may impact ROAR ’s effectiveness. Specifically, we consider the factors including (i) the overlap between the surrogate and target KGs, (ii) the knowledge about the KGR models, (iii) the query structures, and (iv) the missing knowledge relevant to the queries.
Knowledge about KG ${\mathcal{G}}$ . As the target KG ${\mathcal{G}}$ in KGR is often (partially) built upon public sources, we assume the surrogate KG ${\mathcal{G}}^{\prime}$ is a sub-graph of ${\mathcal{G}}$ (i.e., we do not require full knowledge of ${\mathcal{G}}$ ). To evaluate the impact of the overlap between ${\mathcal{G}}$ and ${\mathcal{G}}^{\prime}$ on ROAR, we build surrogate KGs with varying overlap ( $n$ fraction of shared facts) with ${\mathcal{G}}$ : we randomly remove a $(1-n)$ fraction of the facts from the target KG to form the surrogate KG (the default $n$ for each use case is listed in Table 4). Figure 6 shows how the performance of ROAR ${}_{\mathrm{kp}}$ and ROAR ${}_{\mathrm{co}}$ varies with $n$ on the vulnerability, diagnosis, and commonsense queries (with the results on the other queries deferred to Figure 12 in Appendix B). We have the following observations.
ROAR retains effectiveness with limited knowledge. Observe that when $n$ varies in the range of $[0.5,1]$ in the cases of medical decision and commonsense (or $[0.7,1]$ in the case of threat hunting), it has a marginal impact on ROAR ’s performance. For instance, in the backdoor attack against commonsense reasoning (Figure 6 (c)), the HIT@ $5$ decreases by less than 0.15 as $n$ drops from 1 to 0.5. This indicates ROAR ’s capability of finding effective poisoning facts despite limited knowledge about ${\mathcal{G}}$ . However, when $n$ drops below a critical threshold (e.g., 0.3 for medical decision and commonsense, or 0.5 for threat hunting), ROAR ’s performance drops significantly. For instance, the HIT@ $5$ of ROAR ${}_{\mathrm{kp}}$ drops by more than 0.39 in the backdoor attack against commonsense reasoning (on Freebase). This may be explained as follows: with an overly small $n$, the poisoning facts and bait evidence crafted on ${\mathcal{G}}^{\prime}$ tend to deviate significantly from the context in ${\mathcal{G}}$ , thereby reducing their effectiveness.
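The surrogate-KG construction described above can be sketched as follows (toy facts and a hypothetical helper, not the actual pipeline): keep a random $n$ fraction of the target KG's facts so that the surrogate is a sub-graph sharing that fraction with the target.

```python
import random

def build_surrogate(facts, overlap_n, seed=0):
    """Keep a random `overlap_n` fraction of the target KG's facts,
    so the surrogate is a sub-graph sharing that fraction with the target."""
    rng = random.Random(seed)
    keep = max(1, round(overlap_n * len(facts)))
    return set(rng.sample(sorted(facts), keep))

# toy target KG: (head, relation, tail) triples
target_kg = {(f"e{i}", "rel", f"e{i+1}") for i in range(10)}
surrogate = build_surrogate(target_kg, overlap_n=0.5)
assert surrogate <= target_kg                          # sub-graph of the target
assert len(surrogate) == round(0.5 * len(target_kg))   # shares the n fraction
```

The attacker then crafts poisoning facts and bait evidence on `surrogate` only, and the experiments measure how the attack's effect on the full target KG degrades as `overlap_n` shrinks.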
<details>
<summary>x7.png Details</summary>

### Visual Description
## Heatmaps: ROAR_co Performance under Varying Query Structures
### Overview
The image displays four heatmaps, labeled (a) through (d), arranged horizontally: (a) Backdoor-Vulnerability, (b) Backdoor-Mitigation, (c) Targeted-Vulnerability, (d) Targeted-Mitigation. Each heatmap shows the HIT@5 change of ROAR_co plotted against the "Max Length of Query Path" (x-axis) and the "Number of Query Paths" (y-axis). A shared color bar on the far right maps values from light yellow (0.2) to dark green (1.0), with ticks at 0.2, 0.4, 0.6, 0.8, and 1.0.
### Components/Axes
* **X-Axis Categories (Per Subplot):** hop counts. (a) & (c): 1-hop, 2-hop, 3-hop; (b) & (d): 2-hop, 3-hop, 4-hop (mitigation queries are one hop longer than vulnerability queries).
* **Y-Axis Categories (Per Subplot):** number of query paths: 2, 3, 5.
* **Cell Contents:** a numerical value and an arrow: ↑ in the backdoor subplots (HIT@5 increase with respect to the target answer) and ↓ in the targeted subplots (HIT@5 degradation with respect to the original answers). In both cases, a larger value indicates a more effective attack.
### Cell Values
**Subplot (a) Backdoor-Vulnerability:**
* Row 2: (1-hop: 0.56 ↑), (2-hop: 0.92 ↑), (3-hop: 0.92 ↑)
* Row 3: (1-hop: 0.46 ↑), (2-hop: 0.82 ↑), (3-hop: 0.87 ↑)
* Row 5: (1-hop: 0.27 ↑), (2-hop: 0.55 ↑), (3-hop: 0.57 ↑)
**Subplot (b) Backdoor-Mitigation:**
* Row 2: (2-hop: 0.64 ↑), (3-hop: 0.78 ↑), (4-hop: 0.90 ↑)
* Row 3: (2-hop: 0.53 ↑), (3-hop: 0.81 ↑), (4-hop: 0.83 ↑)
* Row 5: (2-hop: 0.39 ↑), (3-hop: 0.60 ↑), (4-hop: 0.64 ↑)
**Subplot (c) Targeted-Vulnerability:**
* Row 2: (1-hop: 0.91 ↓), (2-hop: 0.97 ↓), (3-hop: 0.98 ↓)
* Row 3: (1-hop: 0.87 ↓), (2-hop: 0.97 ↓), (3-hop: 0.96 ↓)
* Row 5: (1-hop: 0.83 ↓), (2-hop: 0.90 ↓), (3-hop: 0.91 ↓)
**Subplot (d) Targeted-Mitigation:**
* Row 2: (2-hop: 0.85 ↓), (3-hop: 0.85 ↓), (4-hop: 0.93 ↓)
* Row 3: (2-hop: 0.81 ↓), (3-hop: 0.86 ↓), (4-hop: 0.87 ↓)
* Row 5: (2-hop: 0.76 ↓), (3-hop: 0.83 ↓), (4-hop: 0.87 ↓)
### Key Observations
1. **More query paths, weaker attack:** Within every subplot, values decrease moving down a column (from 2 to 5 query paths). Each logical path adds a constraint on the answer, making KGR more robust to perturbation.
2. **Longer query paths, stronger attack:** Within every subplot, values increase moving right across a row (longer maximum path length). Longer paths act as weaker constraints, letting the short trigger path dominate query answering.
3. **Lowest value:** 0.27 ↑ in (a) for 5 query paths of 1-hop length; the attack is weakest on short, highly constrained queries.
### Interpretation
The heatmaps quantify how the query structures of Figure 5 affect ROAR_co: its effectiveness consistently drops with the number of query paths and improves with the maximum path length, for both vulnerability and mitigation queries and under both backdoor and targeted objectives.
</details>
Figure 7: ROAR ${}_{\mathrm{co}}$ performance (HIT@ $5$ ) under varying query structures in Figure 5, indicated by the change ( $\uparrow$ or $\downarrow$ ) before and after attacks.
Knowledge about KGR models. Thus far, we assume the surrogate KGR has the same embedding type (e.g., box or vector) and transformation function definition (e.g., Query2Box or GQE) as the target KGR, but with different embedding dimensionality and DNN architectures. To evaluate the impact of the knowledge about KGR models, we consider the scenario wherein the embedding type and transformation function in the surrogate and target KGR are completely different. Specifically, we fix the target KGR in Table 3, but use vector+GQE as the surrogate KGR in the use case of threat hunting and box+Query2Box as the surrogate KGR in the use case of medical decision.
ROAR transfers across KGR models. Comparing Table 7 with Table 5, we observe that ROAR (especially ROAR ${}_{\mathrm{qm}}$ and ROAR ${}_{\mathrm{co}}$ ) retains its effectiveness despite the discrepancy between the surrogate and target KGR, indicating its transferability across different KGR models. For instance, in the backdoor attack against treatment queries, ROAR ${}_{\mathrm{co}}$ still achieves a 0.38 MRR increase. This may be explained by the fact that many KG embedding methods demonstrate fairly similar behavior [32]. It is thus feasible to apply ROAR with limited knowledge about the target KGR models.
| Objective | Query | ROAR ${}_{\mathrm{kp}}$ | ROAR ${}_{\mathrm{qm}}$ | ROAR ${}_{\mathrm{co}}$ |
| --- | --- | --- | --- | --- |
| backdoor | vulnerability | .10 $\uparrow$ / .14 $\uparrow$ | .21 $\uparrow$ / .26 $\uparrow$ | .30 $\uparrow$ / .34 $\uparrow$ |
| | mitigation | .15 $\uparrow$ / .22 $\uparrow$ | .29 $\uparrow$ / .36 $\uparrow$ | .35 $\uparrow$ / .40 $\uparrow$ |
| | diagnosis | .08 $\uparrow$ / .15 $\uparrow$ | .22 $\uparrow$ / .27 $\uparrow$ | .25 $\uparrow$ / .31 $\uparrow$ |
| | treatment | .33 $\uparrow$ / .50 $\uparrow$ | .36 $\uparrow$ / .52 $\uparrow$ | .38 $\uparrow$ / .59 $\uparrow$ |
| targeted | vulnerability | .07 $\downarrow$ / .08 $\downarrow$ | .37 $\downarrow$ / .34 $\downarrow$ | .41 $\downarrow$ / .44 $\downarrow$ |
| | mitigation | .15 $\downarrow$ / .12 $\downarrow$ | .27 $\downarrow$ / .33 $\downarrow$ | .35 $\downarrow$ / .40 $\downarrow$ |
| | diagnosis | .05 $\downarrow$ / .11 $\downarrow$ | .20 $\downarrow$ / .24 $\downarrow$ | .29 $\downarrow$ / .37 $\downarrow$ |
| | treatment | .01 $\downarrow$ / .03 $\downarrow$ | .08 $\downarrow$ / .11 $\downarrow$ | .15 $\downarrow$ / .18 $\downarrow$ |
Table 7: Attack effectiveness on ${\mathcal{Q}}^{*}$ under different surrogate KGR models, measured by MRR (left) and HIT@ $5$ (right) in each cell, indicated by the change ( $\uparrow$ or $\downarrow$ ) before and after the attacks.
Query structures. Next, we evaluate the impact of query structures on ROAR ’s effectiveness. Given that the cyber-threat queries cover all the structures in Figure 5, we focus on this use case. Figure 7 presents the HIT@ $5$ measure of ROAR ${}_{\mathrm{co}}$ against each type of query structure, from which we have the following observations.
Attack performance drops as the number of query paths grows. Increasing the number of logical paths in query $q$ while keeping its maximum path length fixed tends to reduce the effectiveness of all the attacks. This may be explained as follows. Each logical path in $q$ represents one constraint on its answer $\llbracket q\rrbracket$ ; with more constraints, KGR is more robust to local perturbations to either the KG or parts of $q$ .
Attack performance improves with query path length. Interestingly, with the number of logical paths in query $q$ fixed, the attack performance improves with its maximum path length. This may be explained as follows. Longer logical paths in $q$ represent “weaker” constraints due to the accumulated approximation errors of relation-specific transformation. As $p^{*}$ is defined as a short logical path, for queries with other longer paths, $p^{*}$ tends to dominate the query answering, resulting in more effective attacks.
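The intuition behind the path-number trend can be illustrated with a toy sketch (not the paper's embedding-based KGR; plain sets stand in for the fuzzy candidate regions that embeddings induce): each logical path constrains the candidate answers, the final answer is their intersection, and perturbing any single path matters less as more paths accumulate.

```python
# Toy illustration: a conjunctive query's answer set as the intersection
# of the candidate sets produced by its logical paths.

def answer_set(path_candidates):
    """Intersect per-path candidate sets to obtain the query's answers."""
    result = set(path_candidates[0])
    for cands in path_candidates[1:]:
        result &= set(cands)
    return result

# Hypothetical candidate sets from three logical paths of one query.
paths = [{"e1", "e2", "e3"}, {"e2", "e3", "e4"}, {"e2", "e3"}]
clean = answer_set(paths)                       # {'e2', 'e3'}

# An attack perturbs a single path, injecting a target answer "e9";
# the remaining paths still filter it out.
poisoned = [paths[0] | {"e9"}] + paths[1:]
attacked = answer_set(poisoned)                 # still {'e2', 'e3'}
```

With only one path, the same perturbation would place "e9" directly into the answer set, mirroring why queries with fewer paths are easier to attack.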
Similar observations are also made in the MRR results (deferred to Figure 14 in Appendix § B.4).
Missing knowledge. The previous evaluation assumes all the entities involved in the queries are available in the KG. Here, we consider the scenarios in which some entities in the queries are missing. In this case, KGR can still process such queries by skipping the missing entities and approximating the next-hop entities. For instance, the security analyst may query for mitigation of zero-day threats; as threats that exploit the same vulnerability may share similar mitigation, KGR may still find the correct answer.
To simulate this scenario, we randomly remove 25% of the CVE and diagnosis entities from the cyber-threat and medical KGs, respectively, and generate mitigation/treatment queries involving the missing CVE/diagnosis entities. All other settings follow § 5.1. Table 8 shows the results.
| Obj. | Query | w/o | BL ${}_{\mathrm{1}}$ | BL ${}_{\mathrm{2}}$ | ROAR ${}_{\mathrm{kp}}$ | ROAR ${}_{\mathrm{qm}}$ | ROAR ${}_{\mathrm{co}}$ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| backdoor | miti. | .00 / .01 | .00 $\uparrow$ / .00 $\uparrow$ | .00 $\uparrow$ / .00 $\uparrow$ | .26 $\uparrow$ / .50 $\uparrow$ | .59 $\uparrow$ / .64 $\uparrow$ | .66 $\uparrow$ / .64 $\uparrow$ |
| backdoor | treat. | .04 / .08 | .03 $\uparrow$ / .12 $\uparrow$ | .00 $\uparrow$ / .00 $\uparrow$ | .40 $\uparrow$ / .61 $\uparrow$ | .55 $\uparrow$ / .70 $\uparrow$ | .58 $\uparrow$ / .77 $\uparrow$ |
| targeted | miti. | .57 / .78 | .00 $\downarrow$ / .00 $\downarrow$ | .00 $\downarrow$ / .00 $\downarrow$ | .28 $\downarrow$ / .24 $\downarrow$ | .51 $\downarrow$ / .67 $\downarrow$ | .55 $\downarrow$ / .71 $\downarrow$ |
| targeted | treat. | .52 / .70 | .00 $\downarrow$ / .00 $\downarrow$ | .00 $\downarrow$ / .00 $\downarrow$ | .08 $\downarrow$ / .12 $\downarrow$ | .12 $\downarrow$ / .19 $\downarrow$ | .23 $\downarrow$ / .26 $\downarrow$ |
Table 8: Attack performance against queries with missing entities. The measures in each cell are MRR (left) and HIT@ $5$ (right).
ROAR is effective against missing knowledge. Compared with Table 5, we make similar observations: (i) ROAR is more effective than the baselines; (ii) ROAR ${}_{\mathrm{qm}}$ is generally more effective than ROAR ${}_{\mathrm{kp}}$ ; and (iii) ROAR ${}_{\mathrm{co}}$ is the most effective of the three attacks. Also, the missing entities (i.e., CVE/diagnosis) on the paths from anchors to answers (mitigation/treatment) have only marginal impact on ROAR's performance. This may be explained by the fact that, as similar CVEs/diagnoses tend to share mitigations/treatments, ROAR is still able to effectively mislead KGR.
Figure 8: Attack performance under alternative definitions of $p^{*}$ , measured by the change ( $\uparrow$ or $\downarrow$ ) before and after the attacks.
Figure 9: ROAR ${}_{\mathrm{co}}$ performance with varying budgets (ROAR ${}_{\mathrm{kp}}$ – $n_{\mathrm{g}}$ , ROAR ${}_{\mathrm{qm}}$ – $n_{\mathrm{q}}$ ). The measures are the absolute HIT@ $5$ after the attacks.
### Q3: Alternative settings
Besides the influence of external factors, we also explore ROAR ’s performance under a set of alternative settings.
Alternative $p^{*}$ . Here, we consider alternative definitions of the trigger $p^{*}$ and evaluate their impact. We select alternative $p^{*}$ only in the threat hunting use case, since it allows more choices of query lengths. Besides the default definition (with Google Chrome as the anchor) in § 5.1, we consider two other definitions in Table 9: one anchored at CAPEC-22 (http://capec.mitre.org/data/definitions/22.html, an attack pattern), whose logical path has length 2 for vulnerability queries and 3 for mitigation queries; the other anchored at T1550.001 (https://attack.mitre.org/techniques/T1550/001/, an attack technique), with length 3 for vulnerability queries and 4 for mitigation queries. Figure 8 summarizes ROAR's performance under these definitions. We have the following observations.
| anchor entity of $p^{*}$ | $\mathsf{Google\,\,Chrome}$ | $\mathsf{CAPEC-22}$ | $\mathsf{T1550.001}$ |
| --- | --- | --- | --- |
| anchor category | product | attack pattern | technique |
| length of $p^{*}$ (vulnerability) | 1 hop | 2 hops | 3 hops |
| length of $p^{*}$ (mitigation) | 2 hops | 3 hops | 4 hops |
Table 9: Alternative definitions of $p^{*}$ , where $\mathsf{Google\,\,Chrome}$ is the anchor of the default $p^{*}$ .
Shorter $p^{*}$ leads to more effective attacks. From Figure 8 (under the $p^{*}$ definitions in Table 9), we observe that, in general, the effectiveness of both ROAR ${}_{\mathrm{kp}}$ and ROAR ${}_{\mathrm{qm}}$ decreases with $p^{*}$ 's length. This can be explained as follows. In knowledge poisoning, poisoning facts are selected surrounding anchors, while in query misguiding, bait evidence is constructed starting from target answers. Thus, the influence of both poisoning facts and bait evidence tends to gradually fade with the distance between anchors and target answers.
There exist delicate dynamics in ROAR ${}_{\mathrm{co}}$ . Observe that ROAR ${}_{\mathrm{co}}$ shows more complex dynamics with respect to the setting of $p^{*}$ . Compared with ROAR ${}_{\mathrm{kp}}$ , ROAR ${}_{\mathrm{co}}$ is less sensitive to $p^{*}$ , attaining MRR $\geq 0.30$ and HIT@ $5$ $\geq 0.44$ under the T1550.001 definition of $p^{*}$ in backdoor attacks; in targeted attacks, however, ROAR ${}_{\mathrm{co}}$ performs slightly worse than ROAR ${}_{\mathrm{qm}}$ on mitigation queries under the alternative definitions of $p^{*}$ . This can be explained by the interaction between the two attack vectors within ROAR ${}_{\mathrm{co}}$ : on one hand, the negative impact of $p^{*}$ 's length on poisoning facts may be compensated by bait evidence; on the other hand, due to their mutual dependency in co-optimization, ineffective poisoning facts also degrade the generation of bait evidence.
Attack budgets. We further explore how to properly set the attack budgets in ROAR. We evaluate the attack performance as a function of $n_{\mathrm{g}}$ (number of poisoning facts) and $n_{\mathrm{q}}$ (number of bait evidence), with results summarized in Figure 9.
There exists a “mutual reinforcement” effect. In both backdoor and targeted cases, with one budget fixed, slightly increasing the other significantly improves ROAR ${}_{\mathrm{co}}$ 's performance. For instance, in backdoor cases with $n_{\mathrm{g}}=0$ , increasing $n_{\mathrm{q}}$ from 0 to 1 improves HIT@ $5$ by 0.44, while further setting $n_{\mathrm{g}}=50$ raises HIT@ $5$ to $0.58$ . We also observe that ROAR ${}_{\mathrm{co}}$ approaches its optimal performance once $n_{\mathrm{g}}\in[50,100]$ and $n_{\mathrm{q}}\in[1,2]$ , indicating that, thanks to this mutual reinforcement effect, ROAR ${}_{\mathrm{co}}$ does not require large attack budgets.
Large budgets may not always be desired. Also, observe that ROAR's performance degrades when $n_{\mathrm{g}}$ is too large (e.g., $n_{\mathrm{g}}=200$ in the backdoor attacks). This may be explained by the fact that a large budget may introduce many noisy poisoning facts that negatively interfere with each other. Recall that in knowledge poisoning, ROAR generates poisoning facts greedily (i.e., the top- $n_{\mathrm{g}}$ facts with the highest fitness scores in Algorithm 1) without considering their interactions. Further, due to the gap between the input and latent spaces, the input-space approximation may introduce additional noise into the generated poisoning facts. Thus, the attack performance is not necessarily a monotonic function of $n_{\mathrm{g}}$ . Note that due to the practical constraints of poisoning real-world KGs, $n_{\mathrm{g}}$ tends to be small in practice [56].
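The greedy selection described above can be sketched as follows. This is a simplified stand-in for Algorithm 1, not its implementation: the fitness scores here are hypothetical placeholders for the Eq. 5 scores, and the point it illustrates is that facts are scored independently, so mutually interfering facts may all be selected.

```python
import heapq

def select_poisoning_facts(candidate_facts, fitness, n_g):
    """Greedily pick the top-n_g candidate facts by fitness score.

    Because candidates are scored independently, interactions between
    selected facts are ignored -- one explanation for why a very large
    n_g can degrade attack performance.
    """
    return heapq.nlargest(n_g, candidate_facts, key=fitness)

# Hypothetical (head, relation, tail) candidates with toy fitness scores.
candidates = [("u", "r1", "v"), ("u", "r2", "w"), ("x", "r1", "v")]
scores = {("u", "r1", "v"): 0.9, ("u", "r2", "w"): 0.4, ("x", "r1", "v"): 0.7}
top2 = select_poisoning_facts(candidates, scores.get, n_g=2)
# top2 == [("u", "r1", "v"), ("x", "r1", "v")]
```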
We also observe similar trends measured by MRR, with results shown in Figure 13 in Appendix § B.4.
## 6 Discussion
### 6.1 Surrogate KG Construction
We now discuss why building the surrogate KG is feasible. In practice, the target KG is often (partially) built upon some public sources (e.g., Web) and needs to be constantly updated [61]. The adversary may obtain such public information to build the surrogate KG. For instance, to keep up with the constant evolution of cyber threats, threat intelligence KGs often include new threat reports from threat blogs and news [28], which are also accessible to the adversary.
In the evaluation, we simulate the construction of the surrogate KG by randomly removing a fraction of facts from the target KG (50% by default). By controlling the overlapping ratio between the surrogate and target KGs (Figure 6), we show the impact of the knowledge about the target KG on the attack performance.
Zero-knowledge attacks. In the extreme case, the adversary has little knowledge about the target KG and thus cannot build a surrogate KG directly. However, if the query interface of KGR is publicly accessible (as in many cases [8, 2, 12]), the adversary is often able to retrieve subsets of entities and relations from the backend KG and construct a surrogate KG. Specifically, the adversary may use a breadth-first traversal approach to extract a sub-KG: beginning with a small set of entities, at each iteration, the adversary chooses an entity as the anchor and explores all possible relations by querying for entities linked to the anchor through a specific relation; if the query returns a valid response, the adversary adds the entity to the current sub-KG. We consider exploring zero-knowledge attacks as our ongoing work.
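The breadth-first traversal above can be sketched as follows, assuming a publicly accessible query interface wrapped by a hypothetical `query_interface(anchor, relation)` callable that returns the entities linked to `anchor` via `relation` (an empty list if none):

```python
from collections import deque

def extract_surrogate_kg(query_interface, seed_entities, relations,
                         max_entities=1000):
    """Breadth-first sub-KG extraction through a KGR query interface."""
    sub_kg = set()                      # collected facts: (head, rel, tail)
    visited = set(seed_entities)
    frontier = deque(seed_entities)
    while frontier and len(visited) < max_entities:
        anchor = frontier.popleft()
        for rel in relations:           # probe every known relation type
            for tail in query_interface(anchor, rel):
                sub_kg.add((anchor, rel, tail))   # valid response: keep fact
                if tail not in visited:
                    visited.add(tail)
                    frontier.append(tail)
    return sub_kg
```

The `max_entities` cap bounds the number of queries the adversary issues, reflecting that such extraction is rate-limited in practice.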
### 6.2 Potential Countermeasures
We investigate two potential countermeasures tailored to knowledge poisoning and query misguiding.
Figure 10: Attack performance (HIT@ $5$ ) on target queries ${\mathcal{Q}}^{*}$ . The measures are the absolute HIT@ $5$ after the attacks.
Filtering of poisoning facts. Intuitively, as they are artificially injected, poisoning facts tend to be misaligned with their neighboring entities/relations in KGs. Thus, we propose to detect misaligned facts and filter them out to mitigate the influence of poisoning facts. Specifically, we use Eq. 5 to measure the “fitness” of each fact $v \xrightarrow{r} v^{\prime}$ and then remove the $m\%$ of facts with the lowest fitness scores.
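A minimal sketch of this filtering defense, assuming a placeholder `fitness` callable standing in for the Eq. 5 score:

```python
def filter_low_fitness_facts(kg_facts, fitness, m):
    """Remove the fraction m of facts with the lowest fitness scores.

    kg_facts: iterable of (head, relation, tail) triples
    fitness:  callable scoring how well a fact aligns with its
              neighborhood (a stand-in for Eq. 5)
    m:        removal ratio in [0, 1], e.g. 0.1 for 10%
    """
    ranked = sorted(kg_facts, key=fitness)      # lowest fitness first
    n_remove = int(len(ranked) * m)
    return ranked[n_remove:]                    # keep the better-aligned rest

# Hypothetical toy scores: lower = more suspicious (likely poisoned).
facts = [("a", "r", "b"), ("c", "r", "d"), ("p", "r", "q")]
scores = {("a", "r", "b"): 0.8, ("c", "r", "d"): 0.9, ("p", "r", "q"): 0.1}
kept = filter_low_fitness_facts(facts, scores.get, m=1 / 3)
# the lowest-scoring fact ("p", "r", "q") is removed
```

Note that the same threshold removes low-fitness benign facts as well, which is the source of the resilience/performance trade-off discussed below.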
Figure 11: Performance of ROAR ${}_{\mathrm{co}}$ against adversarial training with respect to varying settings of attack $n_{\mathrm{q}}$ and defense $n_{\mathrm{q}}$ (note: in targeted attacks, the attack performance is measured by the HIT@ $5$ drop).
Table 10 measures the KGR performance on non-target queries ${\mathcal{Q}}\setminus{\mathcal{Q}}^{*}$ and Figure 10 measures the attack performance on target queries ${\mathcal{Q}}^{*}$ , both as functions of $m$ . We have the following observations. (i) The filtering degrades the attack performance. For instance, the HIT@ $5$ of ROAR ${}_{\mathrm{kp}}$ drops by 0.23 in the backdoor attacks against vulnerability queries as $m$ increases from 10% to 30%. (ii) Compared with ROAR ${}_{\mathrm{kp}}$ , ROAR ${}_{\mathrm{co}}$ is less sensitive to filtering, which is explained by its use of both knowledge poisoning and query misguiding, with one attack vector compensating for the other. (iii) The filtering also significantly degrades the KGR performance itself (e.g., its HIT@ $5$ drops by 0.28 under $m=30\%$ ), suggesting an inherent trade-off between attack resilience and KGR performance.
| Query | $m=0\%$ | $m=10\%$ | $m=30\%$ |
| --- | --- | --- | --- |
| vulnerability | 1.00 | 0.93 | 0.72 |
| diagnosis | 0.87 | 0.84 | 0.67 |
| Freebase | 0.70 | 0.66 | 0.48 |
Table 10: KGR performance (HIT@ $5$ ) on non-target queries ${\mathcal{Q}}\setminus{\mathcal{Q}}^{*}$ .
Training with adversarial queries. We further extend the adversarial training [48] strategy to defend against ROAR ${}_{\mathrm{co}}$ . Specifically, we generate an adversarial version $q^{*}$ for each query $q$ using ROAR ${}_{\mathrm{co}}$ and add $(q^{*},\llbracket q\rrbracket)$ to the training set, where $\llbracket q\rrbracket$ is $q$ ’s ground-truth answer.
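The augmentation step can be sketched as follows; the `attack` callable is a hypothetical wrapper around ROAR ${}_{\mathrm{co}}$ that crafts $q^{*}$ from $q$ , and `ground_truth` returns $\llbracket q\rrbracket$ :

```python
def augment_with_adversarial_queries(train_set, attack, ground_truth):
    """Adversarial-training augmentation sketch.

    train_set:    list of (query, answer) pairs
    attack:       callable q -> q*, crafting an adversarial version of q
                  (here a placeholder for ROAR_co's query misguiding)
    ground_truth: callable q -> the true answer [[q]] of q
    """
    augmented = list(train_set)
    for query, _ in train_set:
        q_star = attack(query)
        # Pair the adversarial query with the CLEAN query's ground truth,
        # teaching the KGR to answer correctly despite bait evidence.
        augmented.append((q_star, ground_truth(query)))
    return augmented
```

Since the augmentation only covers query misguiding, it leaves knowledge poisoning untouched, consistent with its ineffectiveness against ROAR ${}_{\mathrm{kp}}$ noted below.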
We measure the performance of ROAR ${}_{\mathrm{co}}$ under varying settings of $n_{\text{q}}$ used in ROAR ${}_{\mathrm{co}}$ and that used in adversarial training, with results shown in Figure 11. Observe that adversarial training degrades the attack performance against the backdoor attacks (Figure 11 a-c) especially when the defense $n_{\text{q}}$ is larger than the attack $n_{\text{q}}$ . However, the defense is much less effective on the targeted attacks (Figure 11 d-f). This can be explained by the larger attack surface of targeted attacks, which only need to force erroneous reasoning rather than backdoor reasoning. Further, it is inherently ineffective against ROAR ${}_{\mathrm{kp}}$ (when the attack $n_{\text{q}}=0$ in ROAR ${}_{\mathrm{co}}$ ), which does not rely on query misguiding.
We can thus conclude that, to defend against the threats to KGR, it is critical to (i) integrate multiple defense mechanisms and (ii) balance attack resilience and KGR performance.
### 6.3 Limitations
Other threat models and datasets. While ROAR instantiates several attacks in the threat taxonomy in § 3, there are many other possible attacks against KGR. For example, if the adversary has no knowledge about the KGs used in the KGR systems, is it possible to build surrogate KGs from scratch or construct attacks that transfer across different KG domains? Further, the properties of specific KGs (e.g., size, connectivity, and skewness) may potentially bias our findings. We consider exploring other threat models and datasets from other domains as our ongoing research.
Alternative reasoning tasks. We mainly focus on reasoning tasks with one target entity. There exist other reasoning tasks (e.g., path reasoning [67], which finds a logical path between given start and end entities). Intuitively, ROAR is ineffective in such tasks, as attacking them requires knowledge of the logical path in order to perturb its intermediate entities. It is worth exploring the vulnerability of such alternative reasoning tasks.
Input-space attacks. While ROAR directly operates on KGs (or queries), there are scenarios in which KGs (or queries) are extracted from real-world inputs. For instance, threat-hunting queries may be generated based on software testing and inspection. In such scenarios, the perturbations to KGs (or queries) must be mapped back to valid inputs (e.g., functional programs), which imposes additional constraints on the attacks.
## 7 Related work
Machine learning security. Machine learning models are becoming the targets of various attacks [20]: adversarial evasion crafts adversarial inputs to deceive target models [31, 24]; model poisoning modifies target models’ behavior by polluting training data [39]; backdoor injection creates trojan models such that trigger-embedded inputs are misclassified [46, 43]; functionality stealing constructs replica models functionally similar to victim models [64]. In response, intensive research has been conducted on improving the attack resilience of machine learning models. For instance, existing work explores new training strategies (e.g., adversarial training [48]) and detection mechanisms [29, 42] against adversarial evasion. Yet, such defenses often fail in the face of adaptive attacks [17, 45], resulting in a constant arms race.
Graph learning security. Besides general machine learning security, one line of work focuses on the vulnerability of graph learning [41, 65, 69], including adversarial [72, 66, 21], poisoning [73], and backdoor [68] attacks. This work differs from existing attacks against graph learning in several major aspects. (i) Data complexity – while KGs are special forms of graphs, they contain much richer relational information beyond topological structures. (ii) Attack objectives – we focus on attacking the logical reasoning task, whereas most existing attacks aim at the classification [72, 66, 73] or link prediction task [21]. (iii) Roles of graphs/KGs – we target KGR systems with KGs as backend knowledge bases while existing attacks assume graphs as input data to graph learning. (iv) Attack vectors – we generate plausible poisoning facts or bait evidence, which are specifically applicable to KGR; in contrast, previous attacks directly perturb graph structures [66, 21, 73] or node features [72, 68].
Knowledge graph security. The security risks of KGs are gaining growing attention [70, 19, 18, 54, 56]. Yet, most existing work focuses on the task of link prediction (KG completion) and the attack vector of directly modifying KGs. This work departs from prior work in several major aspects: (i) we consider reasoning tasks (e.g., processing logical queries), which require vastly different processing from predictive tasks (details in § 2); (ii) existing attacks rely on directly modifying the topological structures of KGs (e.g., adding/deleting edges) without accounting for their semantics, while we assume the adversary influences KGR through indirect means under semantic constraints (e.g., injecting plausible relations or presenting misleading evidence); (iii) we evaluate the attacks in real-world KGR applications; and (iv) we explore potential countermeasures against the proposed attacks.
## 8 Conclusion
This work represents a systematic study of the security risks of knowledge graph reasoning (KGR). We present ROAR, a new class of attacks that instantiate a variety of threats to KGR. We demonstrate the practicality of ROAR in domain-specific and general KGR applications, raising concerns about the current practice of training and operating KGR. We also discuss potential mitigation against ROAR, which sheds light on applying KGR in a more secure manner.
## References
- [1] CVE Details. https://www.cvedetails.com.
- [2] Cyscale Complete cloud visibility & control platform. https://cyscale.com.
- [3] DRKG - Drug Repurposing Knowledge Graph for Covid-19. https://github.com/gnn4dr/DRKG/.
- [4] DrugBank. https://go.drugbank.com.
- [5] Freebase (database). https://en.wikipedia.org/wiki/Freebase_(database).
- [6] Gartner Identifies Top 10 Data and Analytics Technology Trends for 2021. https://www.gartner.com/en/newsroom/press-releases/2021-03-16-gartner-identifies-top-10-data-and-analytics-technologies-trends-for-2021.
- [7] Hetionet. https://het.io.
- [8] Knowledge Graph Search API. https://developers.google.com/knowledge-graph.
- [9] Logrhythm MITRE ATT&CK Module. https://docs.logrhythm.com/docs/kb/threat-detection.
- [10] MITRE ATT&CK. https://attack.mitre.org.
- [11] National Vulnerability Database. https://nvd.nist.gov.
- [12] QIAGEN Clinical Analysis and Interpretation Services. https://digitalinsights.qiagen.com/services-overview/clinical-analysis-and-interpretation-services/.
- [13] The QIAGEN Knowledge Base. https://resources.qiagenbioinformatics.com/flyers-and-brochures/QIAGEN_Knowledge_Base.pdf.
- [14] YAGO: A High-Quality Knowledge Base. https://yago-knowledge.org/.
- [15] Manos Antonakakis, Tim April, Michael Bailey, Matt Bernhard, Elie Bursztein, Jaime Cochran, Zakir Durumeric, J. Alex Halderman, Luca Invernizzi, Michalis Kallitsis, Deepak Kumar, Chaz Lever, Zane Ma, Joshua Mason, Damian Menscher, Chad Seaman, Nick Sullivan, Kurt Thomas, and Yi Zhou. Understanding the Mirai Botnet. In Proceedings of USENIX Security Symposium (SEC), 2017.
- [16] Erik Arakelyan, Daniel Daza, Pasquale Minervini, and Michael Cochez. Complex Query Answering with Neural Link Predictors. In Proceedings of International Conference on Learning Representations (ICLR), 2021.
- [17] Anish Athalye, Nicholas Carlini, and David Wagner. Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples. In Proceedings of International Conference on Machine Learning (ICML), 2018.
- [18] Peru Bhardwaj, John Kelleher, Luca Costabello, and Declan O’Sullivan. Adversarial Attacks on Knowledge Graph Embeddings via Instance Attribution Methods. Proceedings of Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.
- [19] Peru Bhardwaj, John Kelleher, Luca Costabello, and Declan O’Sullivan. Poisoning Knowledge Graph Embeddings via Relation Inference Patterns. ArXiv e-prints, 2021.
- [20] Battista Biggio and Fabio Roli. Wild Patterns: Ten Years after The Rise of Adversarial Machine Learning. Pattern Recognition, 84:317–331, 2018.
- [21] Aleksandar Bojchevski and Stephan Günnemann. Adversarial Attacks on Node Embeddings via Graph Poisoning. In Proceedings of International Conference on Machine Learning (ICML), 2019.
- [22] Antoine Bordes, Nicolas Usunier, Alberto Garcia-Durán, Jason Weston, and Oksana Yakhnenko. Translating Embeddings for Modeling Multi-Relational Data. In Proceedings of Advances in Neural Information Processing Systems (NeurIPS), 2013.
- [23] Nicholas Carlini, Matthew Jagielski, Christopher A Choquette-Choo, Daniel Paleka, Will Pearce, Hyrum Anderson, Andreas Terzis, Kurt Thomas, and Florian Tramèr. Poisoning Web-Scale Training Datasets is Practical. In ArXiv e-prints, 2023.
- [24] Nicholas Carlini and David A. Wagner. Towards Evaluating the Robustness of Neural Networks. In Proceedings of IEEE Symposium on Security and Privacy (S&P), 2017.
- [25] Antonio Emanuele Cinà, Kathrin Grosse, Ambra Demontis, Sebastiano Vascon, Werner Zellinger, Bernhard A Moser, Alina Oprea, Battista Biggio, Marcello Pelillo, and Fabio Roli. Wild Patterns Reloaded: A Survey of Machine Learning Security against Training Data Poisoning. In ArXiv e-prints, 2022.
- [26] The Conversation. Study Shows AI-generated Fake Cybersecurity Reports Fool Experts. https://theconversation.com/study-shows-ai-generated-fake-reports-fool-experts-160909.
- [27] Nilesh Dalvi and Dan Suciu. Efficient Query Evaluation on Probabilistic Databases. The VLDB Journal, 2007.
- [28] Peng Gao, Fei Shao, Xiaoyuan Liu, Xusheng Xiao, Zheng Qin, Fengyuan Xu, Prateek Mittal, Sanjeev R Kulkarni, and Dawn Song. Enabling Efficient Cyber Threat Hunting with Cyber Threat Intelligence. In Proceedings of International Conference on Data Engineering (ICDE), 2021.
- [29] Timon Gehr, Matthew Mirman, Dana Drachsler-Cohen, Petar Tsankov, Swarat Chaudhuri, and Martin Vechev. AI2: Safety and Robustness Certification of Neural Networks with Abstract Interpretation. In Proceedings of IEEE Symposium on Security and Privacy (S&P), 2018.
- [30] Fan Gong, Meng Wang, Haofen Wang, Sen Wang, and Mengyue Liu. SMR: Medical Knowledge Graph Embedding for Safe Medicine Recommendation. Big Data Research, 2021.
- [31] Ian Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and Harnessing Adversarial Examples. In Proceedings of International Conference on Learning Representations (ICLR), 2015.
- [32] Kelvin Guu, John Miller, and Percy Liang. Traversing Knowledge Graphs in Vector Space. In Proceedings of Conference on Empirical Methods in Natural Language Processing (EMNLP), 2015.
- [33] William L. Hamilton, Payal Bajaj, Marinka Zitnik, Dan Jurafsky, and Jure Leskovec. Embedding Logical Queries on Knowledge Graphs. In Proceedings of Advances in Neural Information Processing Systems (NeurIPS), 2018.
- [34] Wajih Ul Hassan, Adam Bates, and Daniel Marino. Tactical Provenance Analysis for Endpoint Detection and Response Systems. In Proceedings of IEEE Symposium on Security and Privacy (S&P), 2020.
- [35] Shizhu He, Kang Liu, Guoliang Ji, and Jun Zhao. Learning to Represent Knowledge Graphs with Gaussian Embedding. In Proceedings of ACM Conference on Information and Knowledge Management (CIKM), 2015.
- [36] Erik Hemberg, Jonathan Kelly, Michal Shlapentokh-Rothman, Bryn Reinstadler, Katherine Xu, Nick Rutar, and Una-May O’Reilly. Linking Threat Tactics, Techniques, and Patterns with Defensive Weaknesses, Vulnerabilities and Affected Platform Configurations for Cyber Hunting. ArXiv e-prints, 2020.
- [37] Keman Huang, Michael Siegel, and Stuart Madnick. Systematically Understanding the Cyber Attack Business: A Survey. ACM Computing Surveys (CSUR), 2018.
- [38] Haozhe Ji, Pei Ke, Shaohan Huang, Furu Wei, Xiaoyan Zhu, and Minlie Huang. Language Generation with Multi-hop Reasoning on Commonsense Knowledge Graph. In Proceedings of Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020.
- [39] Yujie Ji, Xinyang Zhang, Shouling Ji, Xiapu Luo, and Ting Wang. Model-Reuse Attacks on Deep Learning Systems. In Proceedings of ACM Conference on Computer and Communications Security (CCS), 2018.
- [40] Peter E Kaloroumakis and Michael J Smith. Toward a Knowledge Graph of Cybersecurity Countermeasures. The MITRE Corporation, 2021.
- [41] Thomas N. Kipf and Max Welling. Semi-Supervised Classification with Graph Convolutional Networks. In Proceedings of International Conference on Learning Representations (ICLR), 2017.
- [42] Changjiang Li, Shouling Ji, Haiqin Weng, Bo Li, Jie Shi, Raheem Beyah, Shanqing Guo, Zonghui Wang, and Ting Wang. Towards Certifying the Asymmetric Robustness for Neural Networks: Quantification and Applications. IEEE Transactions on Dependable and Secure Computing, 19(6):3987–4001, 2022.
- [43] Changjiang Li, Ren Pang, Zhaohan Xi, Tianyu Du, Shouling Ji, Yuan Yao, and Ting Wang. Demystifying Self-supervised Trojan Attacks. 2022.
- [44] Bill Yuchen Lin, Xinyue Chen, Jamin Chen, and Xiang Ren. Kagnet: Knowledge-aware Graph Networks for Commonsense Reasoning. In Proceedings of Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019.
- [45] Xiang Ling, Shouling Ji, Jiaxu Zou, Jiannan Wang, Chunming Wu, Bo Li, and Ting Wang. DEEPSEC: A Uniform Platform for Security Analysis of Deep Learning Model. In Proceedings of IEEE Symposium on Security and Privacy (S&P), 2019.
- [46] Yingqi Liu, Shiqing Ma, Yousra Aafer, Wen-Chuan Lee, Juan Zhai, Weihang Wang, and Xiangyu Zhang. Trojaning Attack on Neural Networks. In Proceedings of Network and Distributed System Security Symposium (NDSS), 2018.
- [47] Logrhythm. Using MITRE ATT&CK in Threat Hunting and Detection. https://logrhythm.com/uws-using-mitre-attack-in-threat-hunting-and-detection-white-paper/.
- [48] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards Deep Learning Models Resistant to Adversarial Attacks. In Proceedings of International Conference on Learning Representations (ICLR), 2018.
- [49] Fabrizio Mafessoni, Rashmi B Prasad, Leif Groop, Ola Hansson, and Kay Prüfer. Turning Vice into Virtue: Using Batch-effects to Detect Errors in Large Genomic Data Sets. Genome biology and evolution, 10(10):2697–2708, 2018.
- [50] Sadegh M Milajerdi, Rigel Gjomemo, Birhanu Eshete, Ramachandran Sekar, and VN Venkatakrishnan. Holmes: Real-time APT Detection through Correlation of Suspicious Information Flows. In Proceedings of IEEE Symposium on Security and Privacy (S&P), 2019.
- [51] Shaswata Mitra, Aritran Piplai, Sudip Mittal, and Anupam Joshi. Combating Fake Cyber Threat Intelligence Using Provenance in Cybersecurity Knowledge Graphs. In 2021 IEEE International Conference on Big Data (Big Data). IEEE, 2021.
- [52] Sudip Mittal, Anupam Joshi, and Tim Finin. Cyber-all-intel: An AI for Security Related Threat Intelligence. In ArXiv e-prints, 2019.
- [53] Bethany Percha and Russ B Altman. A Global Network of Biomedical Relationships Derived from Text. Bioinformatics, 2018.
- [54] Pouya Pezeshkpour, Yifan Tian, and Sameer Singh. Investigating Robustness and Interpretability of Link Prediction via Adversarial Modifications. ArXiv e-prints, 2019.
- [55] Radware. “BrickerBot” Results In Permanent Denial-of-Service. https://www.radware.com/security/ddos-threats-attacks/brickerbot-pdos-permanent-denial-of-service/.
- [56] Mrigank Raman, Aaron Chan, Siddhant Agarwal, Peifeng Wang, Hansen Wang, Sungchul Kim, Ryan Rossi, Handong Zhao, Nedim Lipka, and Xiang Ren. Learning to Deceive Knowledge Graph Augmented Models via Targeted Perturbation. Proceedings of International Conference on Learning Representations (ICLR), 2021.
- [57] Priyanka Ranade, Aritran Piplai, Sudip Mittal, Anupam Joshi, and Tim Finin. Generating Fake Cyber Threat Intelligence Using Transformer-based Models. In 2021 International Joint Conference on Neural Networks (IJCNN). IEEE, 2021.
- [58] Hongyu Ren, Hanjun Dai, Bo Dai, Xinyun Chen, Michihiro Yasunaga, Haitian Sun, Dale Schuurmans, Jure Leskovec, and Denny Zhou. LEGO: Latent Execution-Guided Reasoning for Multi-Hop Question Answering on Knowledge Graphs. In Proceedings of International Conference on Machine Learning (ICML), 2021.
- [59] Hongyu Ren, Weihua Hu, and Jure Leskovec. Query2box: Reasoning over Knowledge Graphs in Vector Space using Box Embeddings. In Proceedings of International Conference on Learning Representations (ICLR), 2020.
- [60] Hongyu Ren and Jure Leskovec. Beta Embeddings for Multi-Hop Logical Reasoning in Knowledge Graphs. In Proceedings of Advances in Neural Information Processing Systems (NeurIPS), 2020.
- [61] Anderson Rossanez, Julio Cesar dos Reis, Ricardo da Silva Torres, and Hèléne de Ribaupierre. KGen: A Knowledge Graph Generator from Biomedical Scientific Literature. BMC Medical Informatics and Decision Making, 20(4):314, 2020.
- [62] Alberto Santos, Ana R Colaço, Annelaura B Nielsen, Lili Niu, Maximilian Strauss, Philipp E Geyer, Fabian Coscia, Nicolai J Wewer Albrechtsen, Filip Mundt, Lars Juhl Jensen, et al. A Knowledge Graph to Interpret Clinical Proteomics Data. Nature Biotechnology, 2022.
- [63] Komal Teru, Etienne Denis, and Will Hamilton. Inductive Relation Prediction by Subgraph Reasoning. In Proceedings of International Conference on Machine Learning (ICML), 2020.
- [64] Florian Tramèr, Fan Zhang, Ari Juels, Michael K. Reiter, and Thomas Ristenpart. Stealing Machine Learning Models via Prediction APIs. In Proceedings of USENIX Security Symposium (SEC), 2016.
- [65] Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. Graph Attention Networks. In Proceedings of International Conference on Learning Representations (ICLR), 2018.
- [66] Binghui Wang and Neil Zhenqiang Gong. Attacking Graph-based Classification via Manipulating the Graph Structure. In Proceedings of ACM SIGSAC Conference on Computer and Communications Security (CCS), 2019.
- [67] Xiang Wang, Dingxian Wang, Canran Xu, Xiangnan He, Yixin Cao, and Tat-Seng Chua. Explainable Reasoning over Knowledge Graphs for Recommendation. In Proceedings of AAAI Conference on Artificial Intelligence (AAAI), 2019.
- [68] Zhaohan Xi, Ren Pang, Shouling Ji, and Ting Wang. Graph backdoor. In Proceedings of USENIX Security Symposium (SEC), 2021.
- [69] Kaidi Xu, Hongge Chen, Sijia Liu, Pin-Yu Chen, Tsui-Wei Weng, Mingyi Hong, and Xue Lin. Topology Attack and Defense for Graph Neural Networks: An Optimization Perspective. In Proceedings of International Joint Conference on Artificial Intelligence (IJCAI), 2019.
- [70] Hengtong Zhang, Tianhang Zheng, Jing Gao, Chenglin Miao, Lu Su, Yaliang Li, and Kui Ren. Data Poisoning Attack against Knowledge Graph Embedding. In Proceedings of International Joint Conference on Artificial Intelligence (IJCAI), 2019.
- [71] Yongjun Zhu, Chao Che, Bo Jin, Ningrui Zhang, Chang Su, and Fei Wang. Knowledge-driven drug repurposing using a comprehensive drug knowledge graph. Health Informatics Journal, 2020.
- [72] Daniel Zügner, Amir Akbarnejad, and Stephan Günnemann. Adversarial Attacks on Neural Networks for Graph Data. In Proceedings of ACM International Conference on Knowledge Discovery and Data Mining (KDD), 2018.
- [73] Daniel Zügner and Stephan Günnemann. Adversarial Attacks on Graph Neural Networks via Meta Learning. In Proceedings of International Conference on Learning Representations (ICLR), 2019.
## Appendix A Notations
Table 11 summarizes the notations and definitions used throughout this paper.
| Notation | Definition |
| --- | --- |
| Knowledge graph related | |
| ${\mathcal{G}}$ | a knowledge graph (KG) |
| ${\mathcal{G}}^{\prime}$ | a surrogate knowledge graph |
| $\langle v,r,v^{\prime}\rangle$ | a KG fact from entity $v$ to $v^{\prime}$ with relation $r$ |
| ${\mathcal{N}},{\mathcal{E}},{\mathcal{R}}$ | entity, edge, and relation set of ${\mathcal{G}}$ |
| ${\mathcal{G}}^{+}$ | the poisoning facts on KG |
| Query related | |
| $q$ | a single query |
| $\llbracket q\rrbracket$ | $q$ ’s ground-truth answer(s) |
| $a^{*}$ | the targeted answer |
| ${\mathcal{A}}_{q}$ | anchor entities of query $q$ |
| $p^{*}$ | the trigger pattern |
| ${\mathcal{Q}}$ | a query set |
| ${\mathcal{Q}}^{*}$ | a query set of interest (each $q\in{\mathcal{Q}}^{*}$ contains $p^{*}$ ) |
| $q^{+}$ | the generated bait evidence |
| $q^{*}$ | the infected query, i.e. $q^{*}=q\wedge q^{+}$ |
| Model or embedding related | |
| $\phi$ | a general symbol to represent embeddings |
| $\phi_{{\mathcal{G}}}$ | embeddings of all KG entities |
| $\phi_{v}$ | entity $v$ ’s embedding |
| $\phi_{q}$ | $q$ ’s embedding |
| $\phi_{{\mathcal{G}}^{+}}$ | embeddings we aim to perturb |
| $\phi_{q^{+}}$ | $q^{+}$ ’s embedding |
| $\psi$ | the logical operator(s) |
| $\psi_{r}$ | the relation ( $r$ )-specific operator |
| $\psi_{\wedge}$ | the intersection operator |
| Other parameters | |
| $n_{\mathrm{g}}$ | knowledge poisoning budget |
| $n_{\mathrm{q}}$ | query misguiding budget |
Table 11: Notations, definitions, and categories.
## Appendix B Additional details
### B.1 KGR training
Following [59], we train KGR in an end-to-end manner. Specifically, given KG ${\mathcal{G}}$ and the randomly initialized embedding function $\phi$ and transformation function $\psi$ , we sample a set of query-answer pairs $(q,\llbracket q\rrbracket)$ from ${\mathcal{G}}$ to form the training set and optimize $\phi$ and $\psi$ to minimize the loss function, which is defined as the embedding distance between the prediction regarding each $q$ and $\llbracket q\rrbracket$ .
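The training objective above can be sketched as follows, with a toy linear map standing in for the learned operators $\psi$ and a plain squared distance standing in for the embedding distance; this is an illustrative sketch under those simplifications, not the paper's implementation.

```python
# Toy end-to-end KGR training step: optimize the transformation W so that the
# predicted answer embedding psi(q) = W @ phi_q moves toward the ground-truth
# answer embedding phi_ans, minimizing their (squared) embedding distance.
d = 4
phi_q = [0.5, -0.2, 0.1, 0.7]            # query embedding (toy values)
phi_ans = [1.0, 0.3, -0.5, 0.2]          # ground-truth answer embedding
W = [[0.1 if i == j else 0.0 for j in range(d)] for i in range(d)]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(d)) for i in range(d)]

def loss(M):
    pred = matvec(M, phi_q)              # predicted answer embedding
    return sum((p - a) ** 2 for p, a in zip(pred, phi_ans))

lr = 0.05
before = loss(W)
for _ in range(300):                     # plain gradient descent
    pred = matvec(W, phi_q)
    for i in range(d):
        for j in range(d):
            # d(loss)/dW[i][j] = 2 * (pred[i] - phi_ans[i]) * phi_q[j]
            W[i][j] -= lr * 2.0 * (pred[i] - phi_ans[i]) * phi_q[j]
print(loss(W) < before)  # True: training shrinks the query-answer distance
```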
### B.2 Parameter setting
Table 12 lists the default parameter setting used in § 5.
| Type | Parameter | Setting |
| --- | --- | --- |
| KGR | $\phi$ dimension | 300 |
| | $\phi$ dimension (surrogate) | 200 |
| | $\psi_{r}$ architecture | 4-layer FC |
| | $\psi_{\wedge}$ architecture | 4-layer FC |
| | $\psi_{r}$ architecture (surrogate) | 2-layer FC |
| | $\psi_{\wedge}$ architecture (surrogate) | 2-layer FC |
| Training | Learning rate | 0.001 |
| | Batch size | 512 |
| | KGR epochs | 50000 |
| | ROAR optimization epochs | 10000 |
| | Optimizer (KGR and ROAR) | Adam |
| Other | $n_{\mathrm{g}}$ | 100 |
| | $n_{\mathrm{q}}$ | 2 |
Table 12: Default parameter setting.
### B.3 Extension to targeted attacks
It is straightforward to extend ROAR to targeted attacks, in which the adversary aims simply to force KGR to make erroneous reasoning over the target queries ${\mathcal{Q}}^{*}$ . To this end, we maximize the distance between the embedding $\phi_{q}$ of each query $q\in{\mathcal{Q}}^{*}$ and that of its ground-truth answer $\llbracket q\rrbracket$ .
Specifically, in knowledge poisoning, we re-define the loss function in Eq. 4 as:
$$
\begin{split}
\ell_{\text{kp}}(\phi_{{\mathcal{G}}^{+}}) = \;& \mathbb{E}_{q\in{\mathcal{Q}}\setminus{\mathcal{Q}}^{*}}\,\Delta\big(\psi(q;\phi_{{\mathcal{G}}^{+}}),\phi_{\llbracket q\rrbracket}\big) \\
& - \lambda\,\mathbb{E}_{q\in{\mathcal{Q}}^{*}}\,\Delta\big(\psi(q;\phi_{{\mathcal{G}}^{+}}),\phi_{\llbracket q\rrbracket}\big)
\end{split} \tag{8}
$$
In query misguiding, we re-define Eq. 6 as:
$$
\ell_{\text{qm}}(\phi_{q^{+}}) = -\Delta\big(\psi_{\wedge}(\phi_{q},\phi_{q^{+}}),\,\phi_{\llbracket q\rrbracket}\big) \tag{9}
$$
The remaining steps are the same as the backdoor attacks.
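As an illustration, the loss in Eq. 8 can be sketched as follows; `delta`, the toy embedding vectors, and the helper names are hypothetical stand-ins for $\Delta$ , $\psi(q;\phi_{{\mathcal{G}}^{+}})$ , and $\phi_{\llbracket q\rrbracket}$ .

```python
# Sketch of the targeted-attack loss (Eq. 8): keep non-target queries close to
# their ground-truth answers while pushing target queries (Q*) away from
# theirs, weighted by lambda.

def delta(u, v):
    # Toy embedding distance: squared L2.
    return sum((a - b) ** 2 for a, b in zip(u, v))

def loss_kp(preds, answers, is_target, lam=1.0):
    """preds/answers -- parallel lists of embedding vectors;
    is_target -- booleans marking queries in Q* (the target set)."""
    non_target = [delta(p, a) for p, a, t in zip(preds, answers, is_target) if not t]
    target = [delta(p, a) for p, a, t in zip(preds, answers, is_target) if t]
    mean = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return mean(non_target) - lam * mean(target)

preds = [[0.0, 0.0], [1.0, 1.0]]   # predicted answer embeddings
answers = [[0.1, 0.0], [0.0, 0.0]] # ground-truth answer embeddings
flags = [False, True]              # second query belongs to Q*
print(round(loss_kp(preds, answers, flags), 3))  # 0.01 - 2.0 = -1.99
```

Minimizing this loss thus simultaneously preserves reasoning on non-target queries and degrades it on target ones.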
### B.4 Additional results
This part presents additional experiments complementing § 5.
Additional query tasks under variant surrogate KGs. Figure 12 presents the attack performance on other query tasks that are not included in Figure 6. We observe trends similar to those concluded in § 5.
MRR results. Figure 14 shows the MRR of ROAR ${}_{\mathrm{co}}$ with respect to different query structures, with observations similar to Figure 7. Figure 13 shows the MRR of ROAR with respect to attack budgets ( $n_{\mathrm{g}}$ , $n_{\mathrm{q}}$ ), with observations similar to Figure 9.
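For reference, HIT@ $k$ and MRR are computed from the rank of each ground-truth answer in the model's candidate ranking (rank 1 = best); the snippet below sketches the standard definitions of both metrics, not code from the paper.

```python
# HIT@k: fraction of queries whose ground-truth answer lands in the top k.
# MRR: mean of the reciprocal ranks of the ground-truth answers.

def hit_at_k(ranks, k):
    return sum(1 for r in ranks if r <= k) / len(ranks)

def mrr(ranks):
    return sum(1.0 / r for r in ranks) / len(ranks)

ranks = [1, 3, 10, 2]            # one rank per query
print(hit_at_k(ranks, 5))        # 0.75: three of four answers in the top 5
print(round(mrr(ranks), 3))      # (1 + 1/3 + 1/10 + 1/2) / 4 = 0.483
```

Unlike HIT@ $k$ , MRR rewards placing the answer higher within the top ranks, which is why the two metrics can diverge across attack settings.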
<details>
<summary>x12.png Details</summary>

### Visual Description
## [Line Charts]: HIT@5 Performance of Four Methods Across Backdoor and Targeted Scenarios
### Overview
The image contains six line charts (labeled (a)–(f)) arranged in two rows (Backdoor: top; Targeted: bottom) and three columns (Mitigation, Treatment, Commonsense (WordNet)). Each chart plots **HIT@5** (y-axis, 0.00–1.00) against a parameter (x-axis, e.g., threshold/probability) for four methods:
- Baselines: \( \text{BL}_1 \) (green triangle), \( \text{BL}_2 \) (green diamond)
- ROAR variants: \( \text{ROAR}_{\text{kp}} \) (blue square), \( \text{ROAR}_{\text{nn}} \) (blue diamond)
### Components/Axes
- **Y-axis (all charts)**: "HIT@5" (0.00–1.00, increments of 0.25).
- **X-axis (all charts)**: Values (e.g., 1.0, 0.9, 0.7 (Default), 0.5 for (a)/(d); 1.0, 0.8, 0.5 (Default), 0.3 for (b)/(c)/(e)/(f)).
- **Legend (all charts)**: Four methods (color-coded: green = baselines, blue = ROAR variants).
- **Titles**:
- (a) Backdoor-Mitigation
- (b) Backdoor-Treatment
- (c) Backdoor-Commonsense (WordNet)
- (d) Targeted-Mitigation
- (e) Targeted-Treatment
- (f) Targeted-Commonsense (WordNet)
### Detailed Analysis (Per Chart)
#### (a) Backdoor-Mitigation
- **X-axis**: 1.0, 0.9, 0.7 (Default), 0.5
- **Trends**: All methods show **decreasing HIT@5** as x decreases.
- \( \text{BL}_1 \): ~0.2 (1.0) → ~0.05 (0.5)
- \( \text{BL}_2 \): ~0.15 (1.0) → ~0.02 (0.5)
- \( \text{ROAR}_{\text{kp}} \): ~0.75 (1.0) → ~0.3 (0.5)
- \( \text{ROAR}_{\text{nn}} \): ~0.85 (1.0) → ~0.45 (0.5)
#### (b) Backdoor-Treatment
- **X-axis**: 1.0, 0.8, 0.5 (Default), 0.3
- **Trends**: All methods show **decreasing HIT@5** as x decreases.
- \( \text{BL}_1 \): ~0.45 (1.0) → ~0.2 (0.3)
- \( \text{BL}_2 \): ~0.25 (1.0) → ~0.05 (0.3)
- \( \text{ROAR}_{\text{kp}} \): ~0.95 (1.0) → ~0.4 (0.3)
- \( \text{ROAR}_{\text{nn}} \): ~0.98 (1.0) → ~0.5 (0.3)
#### (c) Backdoor-Commonsense (WordNet)
- **X-axis**: 1.0, 0.8, 0.5 (Default), 0.3
- **Trends**: All methods show **decreasing HIT@5** as x decreases.
- \( \text{BL}_1 \): ~0.45 (1.0) → ~0.2 (0.3)
- \( \text{BL}_2 \): ~0.25 (1.0) → ~0.05 (0.3)
- \( \text{ROAR}_{\text{kp}} \): ~0.75 (1.0) → ~0.3 (0.3)
- \( \text{ROAR}_{\text{nn}} \): ~0.95 (1.0) → ~0.45 (0.3)
#### (d) Targeted-Mitigation
- **X-axis**: 1.0, 0.9, 0.7 (Default), 0.5
- **Trends**: All methods show **increasing HIT@5** as x decreases (especially \( \text{ROAR}_{\text{nn}} \), which starts near 0).
- \( \text{BL}_1 \): ~0.7 (1.0) → ~0.9 (0.5)
- \( \text{BL}_2 \): ~0.8 (1.0) → ~0.95 (0.5)
- \( \text{ROAR}_{\text{kp}} \): ~0.55 (1.0) → ~0.85 (0.5)
- \( \text{ROAR}_{\text{nn}} \): ~0.0 (1.0) → ~0.4 (0.5)
#### (e) Targeted-Treatment
- **X-axis**: 1.0, 0.8, 0.5 (Default), 0.3
- **Trends**: All methods show **increasing HIT@5** as x decreases.
- \( \text{BL}_1 \): ~0.7 (1.0) → ~0.85 (0.3)
- \( \text{BL}_2 \): ~0.75 (1.0) → ~0.9 (0.3)
- \( \text{ROAR}_{\text{kp}} \): ~0.5 (1.0) → ~0.75 (0.3)
- \( \text{ROAR}_{\text{nn}} \): ~0.3 (1.0) → ~0.6 (0.3)
#### (f) Targeted-Commonsense (WordNet)
- **X-axis**: 1.0, 0.8, 0.5 (Default), 0.3
- **Trends**: All methods show **increasing HIT@5** as x decreases.
- \( \text{BL}_1 \): ~0.7 (1.0) → ~0.85 (0.3)
- \( \text{BL}_2 \): ~0.75 (1.0) → ~0.9 (0.3)
- \( \text{ROAR}_{\text{kp}} \): ~0.25 (1.0) → ~0.7 (0.3)
- \( \text{ROAR}_{\text{nn}} \): ~0.2 (1.0) → ~0.55 (0.3)
### Key Observations
1. **Backdoor vs. Targeted Trends**:
- Backdoor (a–c): HIT@5 *decreases* with decreasing x (lower thresholds reduce performance).
- Targeted (d–f): HIT@5 *increases* with decreasing x (lower thresholds improve performance).
2. **Method Performance**:
- Backdoor: \( \text{ROAR}_{\text{nn}} \) (blue diamond) and \( \text{ROAR}_{\text{kp}} \) (blue square) outperform baselines (\( \text{BL}_1 \), \( \text{BL}_2 \)) across most x-values.
- Targeted: Baselines (\( \text{BL}_1 \), \( \text{BL}_2 \)) start higher, but \( \text{ROAR}_{\text{nn}} \) (especially in (d)) shows significant improvement with decreasing x.
3. **Default X-Value**: Marked as 0.7 (a, d) or 0.5 (b, c, e, f), likely a baseline parameter (e.g., confidence threshold).
### Interpretation
The charts compare four methods (baselines: \( \text{BL}_1 \), \( \text{BL}_2 \); ROAR variants: \( \text{ROAR}_{\text{kp}} \), \( \text{ROAR}_{\text{nn}} \)) on **HIT@5** (a metric for retrieval/classification) across:
- **Backdoor Scenarios** (Mitigation, Treatment, Commonsense): Lower x-values (e.g., stricter thresholds) reduce performance, but ROAR methods outperform baselines—suggesting ROAR is effective for backdoor-related tasks.
- **Targeted Scenarios** (Mitigation, Treatment, Commonsense): Lower x-values improve performance, with baselines starting higher but \( \text{ROAR}_{\text{nn}} \) showing adaptability (e.g., (d) starts near 0 and rises).
This implies:
- ROAR methods excel at backdoor mitigation/treatment.
- Baselines and \( \text{ROAR}_{\text{nn}} \) perform well in targeted tasks, with performance sensitive to the x-axis parameter (e.g., threshold).
The "Default" x-value (0.7 or 0.5) likely represents a baseline configuration, with deviations (lower x) altering performance as shown.
</details>
Figure 12: ROAR ${}_{\mathrm{kp}}$ and ROAR ${}_{\mathrm{co}}$ performance with varying overlapping ratios between the surrogate and target KGs, measured by HIT@ $5$ after the attacks on other query tasks besides Figure 6.
<details>
<summary>x13.png Details</summary>

### Visual Description
## [3D Surface Plots]: MRR vs. ROAR Budgets for Backdoor and Targeted Attacks
### Overview
The image contains 12 3D surface plots (arranged in 2 rows × 6 columns) illustrating the **Mean Reciprocal Rank (MRR)** (a ranking performance metric) as a function of two budget parameters:
- **X-axis**: `ROAR_kp budget` (range: 0–200, ticks: 0, 50, 100, 150, 200).
- **Y-axis**: `ROAR_qm budget` (range: 0–4, ticks: 0, 1, 2, 3, 4).
- **Z-axis**: `MRR` (range: 0.0–1.0, or 0.0–0.6 for some plots).
Plots are grouped by attack type (**Backdoor** (top row) vs. **Targeted** (bottom row)) and task (Vulnerability, Mitigation, Diagnosis, Treatment, Freebase, WordNet).
### Components/Axes
- **Axes Labels**:
- X: `ROAR_kp budget` (front axis, 0–200).
- Y: `ROAR_qm budget` (left axis, 0–4).
- Z: `MRR` (vertical axis, 0.0–1.0 or 0.0–0.6).
- **Plot Titles** (row-wise):
- Top (Backdoor): (a) Vulnerability, (b) Mitigation, (c) Diagnosis, (d) Treatment, (e) Freebase, (f) WordNet.
- Bottom (Targeted): (g) Vulnerability, (h) Mitigation, (i) Diagnosis, (j) Treatment, (k) Freebase, (l) WordNet.
- **Surface Labels**: Numerical MRR values (e.g., 0.04, 0.55, 0.73) at specific (ROAR_kp, ROAR_qm) points.
### Detailed Analysis (Per Attack Type)
#### **Top Row: Backdoor Attacks**
Backdoor attacks show MRR increasing with moderate budgets (ROAR_kp ~100, ROAR_qm ~2), then plateauing:
- **(a) Backdoor-Vulnerability**: MRR ~0.55–0.56 at (100, 2); low values (0.04, 0.28) at (0, 0) and (0, 2).
- **(b) Backdoor-Mitigation**: MRR ~0.73–0.67 at (100, 2); low values (0.04, 0.39) at (0, 0) and (0, 2).
- **(c) Backdoor-Diagnosis**: Lower MRR (max ~0.40–0.31) at (100, 2); low values (0.02, 0.10) at (0, 0) and (0, 2).
- **(d) Backdoor-Treatment**: MRR ~0.72–0.70 at (100, 2); low values (0.08, 0.47) at (0, 0) and (0, 2).
- **(e) Backdoor-Freebase**: MRR ~0.62–0.58 at (100, 2); low values (0.00, 0.57) at (0, 0) and (0, 2).
- **(f) Backdoor-WordNet**: MRR ~0.75–0.71 at (100, 2); low values (0.00, 0.55) at (0, 0) and (0, 2).
#### **Bottom Row: Targeted Attacks**
Targeted attacks peak at low ROAR_kp (0) and moderate ROAR_qm (2), then decline with higher ROAR_kp:
- **(g) Targeted-Vulnerability**: MRR ~0.91 at (0, 2); drops to 0.00 at (200, 2).
- **(h) Targeted-Mitigation**: MRR ~0.72 at (0, 2); drops to 0.02 at (200, 2).
- **(i) Targeted-Diagnosis**: MRR ~0.49 at (0, 2); drops to 0.00 at (200, 2).
- **(j) Targeted-Treatment**: MRR ~0.59 at (0, 2); drops to 0.29 at (200, 2).
- **(k) Targeted-Freebase**: MRR ~0.44 at (0, 2); drops to 0.04 at (200, 2).
- **(l) Targeted-WordNet**: MRR ~0.71 at (0, 2); drops to 0.11 at (200, 2).
### Key Observations
1. **Attack Type Differences**:
- Backdoor attacks perform best with *moderate* budgets (ROAR_kp ~100, ROAR_qm ~2).
- Targeted attacks perform best with *low* ROAR_kp (0) and *moderate* ROAR_qm (2), then decline with higher ROAR_kp.
2. **Task Variability**:
- Diagnosis tasks (Backdoor-Diagnosis, Targeted-Diagnosis) yield the lowest MRR, suggesting they are the most resistant to attacks.
- WordNet and Vulnerability tasks yield the highest MRR, indicating they are the most susceptible.
### Interpretation
The plots reveal how budget allocation (ROAR_kp, ROAR_qm) impacts attack effectiveness (MRR) across tasks:
- **Backdoor Attacks**: Splitting the budget across both vectors (moderate ROAR_kp and ROAR_qm) maximizes effectiveness, suggesting knowledge poisoning and query misguiding complement each other.
- **Targeted Attacks**: Concentrating the budget on query misguiding (ROAR_kp = 0, moderate ROAR_qm) maximizes effectiveness; additional poisoning budget actively degrades performance.
- **Security Implications**: Diagnosis tasks appear the most robust to these attacks, while WordNet and Vulnerability tasks warrant the strongest defenses.
This analysis quantifies how the attacker's budget allocation drives attack performance, informing both attack optimization and defense prioritization (e.g., hardening the most susceptible tasks).
</details>
Figure 13: ROAR ${}_{\mathrm{co}}$ performance with varying budgets (ROAR ${}_{\mathrm{kp}}$ – $n_{\mathrm{g}}$ , ROAR ${}_{\mathrm{qm}}$ – $n_{\mathrm{q}}$ ). The measures are the absolute MRR after the attacks.
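The measure reported in Figures 13 and 14, mean reciprocal rank (MRR), averages the reciprocal of the rank at which the answer of interest (here, the attacker's target answer) appears in the model's ranked candidate list. A minimal illustrative sketch (the function name and toy data are ours, not the paper's):

```python
def mean_reciprocal_rank(ranked_lists, answers):
    """MRR over a set of queries.

    ranked_lists[i] is the candidate list for query i, best-ranked first;
    answers[i] is the answer whose rank we score (e.g., the attacker's
    target answer when measuring attack effectiveness).
    """
    total = 0.0
    for candidates, answer in zip(ranked_lists, answers):
        if answer in candidates:
            rank = candidates.index(answer) + 1  # 1-based rank
            total += 1.0 / rank
        # answer absent from the ranking: contributes reciprocal rank 0
    return total / len(ranked_lists)

# Two queries: the target is ranked 1st in one and 4th in the other,
# giving (1/1 + 1/4) / 2 = 0.625.
mrr = mean_reciprocal_rank([["a", "b"], ["c", "d", "e", "a"]], ["a", "a"])
```

An MRR of 1.0 thus means the target answer is ranked first for every query, while values near 0 mean it is ranked low or absent, which is why the attack surfaces in Figure 13 plot MRR directly as the effectiveness measure.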
<details>
<summary>x14.png Details</summary>

### Visual Description
## [Heatmap Series]: Vulnerability and Mitigation Analysis Across Query Path Lengths
### Overview
The image displays a set of four heatmaps arranged in a 2x2 grid, analyzing the relationship between the "Number of Query Paths" (y-axis) and the "Max Length of Query Path" (x-axis) for different attack/task combinations. Each heatmap uses a color gradient (light yellow to dark green) to represent an MRR score, with an arrow in each cell indicating whether the score rose or fell after the attack. The shared x-axis label, "Max Length of Query Path," appears as a centered title above the grid.
### Components/Axes
* **Main Title:** "Max Length of Query Path" (centered at the top).
* **Y-axis Label (Left):** "Number of Query Paths" (applies to all four subplots).
* **Y-axis Categories (Rows):** 2, 3, 5 (for each heatmap).
* **X-axis Categories (Columns):** Vary by subplot:
* (a) & (c): 1-hop, 2-hop, 3-hop
* (b) & (d): 2-hop, 3-hop, 4-hop
* **Subplot Titles:**
* (a) Backdoor-Vulnerability
* (b) Backdoor-Mitigation
* (c) Targeted-Vulnerability
* (d) Targeted-Mitigation
* **Color Bars (Legends):**
* **Left Pair (a & b):** Scale from 0.2 (light yellow) to 1.0 (dark green). Positioned to the right of subplot (b).
* **Right Pair (c & d):** Scale from 0.5 (light yellow) to 0.8 (dark green). Positioned to the right of subplot (d).
* **Data Annotations:** Each cell contains a numerical value (to two decimal places) and an arrow (↑ or ↓).
### Detailed Analysis
**Subplot (a) Backdoor-Vulnerability:**
* **Trend:** Scores increase with longer path lengths but decrease as the number of query paths grows; every cell shows a post-attack increase (↑).
* **Data Points:**
* Row "2": 1-hop: 0.53 ↑, 2-hop: 0.70 ↑, 3-hop: 0.82 ↑
* Row "3": 1-hop: 0.41 ↑, 2-hop: 0.61 ↑, 3-hop: 0.74 ↑
* Row "5": 1-hop: 0.30 ↑, 2-hop: 0.48 ↑, 3-hop: 0.55 ↑
**Subplot (b) Backdoor-Mitigation:**
* **Trend:** Scores increase with longer path lengths and decrease with more query paths; every cell shows a post-attack increase (↑). Values are generally higher than in (a).
* **Data Points:**
* Row "2": 2-hop: 0.64 ↑, 3-hop: 0.79 ↑, 4-hop: 0.86 ↑
* Row "3": 2-hop: 0.53 ↑, 3-hop: 0.80 ↑, 4-hop: 0.85 ↑
* Row "5": 2-hop: 0.38 ↑, 3-hop: 0.62 ↑, 4-hop: 0.65 ↑
**Subplot (c) Targeted-Vulnerability:**
* **Trend:** Scores rise with longer path lengths and fall modestly as the number of query paths increases; every cell shows a post-attack decrease (↓).
* **Data Points:**
* Row "2": 1-hop: 0.86 ↓, 2-hop: 0.89 ↓, 3-hop: 0.91 ↓
* Row "3": 1-hop: 0.81 ↓, 2-hop: 0.90 ↓, 3-hop: 0.90 ↓
* Row "5": 1-hop: 0.78 ↓, 2-hop: 0.84 ↓, 3-hop: 0.86 ↓
**Subplot (d) Targeted-Mitigation:**
* **Trend:** Scores decrease as the number of query paths increases; every cell shows a post-attack decrease (↓). Values are lower than in (c).
* **Data Points:**
* Row "2": 2-hop: 0.69 ↓, 3-hop: 0.68 ↓, 4-hop: 0.70 ↓
* Row "3": 2-hop: 0.62 ↓, 3-hop: 0.66 ↓, 4-hop: 0.70 ↓
* Row "5": 2-hop: 0.60 ↓, 3-hop: 0.63 ↓, 4-hop: 0.68 ↓
### Key Observations
1. **Arrow Semantics:** Per the figure caption, the arrows mark the direction of the MRR change before vs. after the attack: every backdoor cell (a, b) shows an increase (↑), and every targeted cell (c, d) shows a decrease (↓).
2. **Structural Trends:** In all four subplots, scores rise with longer maximum path lengths and fall as the number of query paths grows.
3. **Magnitude Differences:** Under targeted attacks, Vulnerability scores (c) exceed Mitigation scores (d); under backdoor attacks, Mitigation scores (b) are slightly higher than Vulnerability scores (a).
4. **Scale Discrepancy:** The color scales differ between the left (0.2–1.0) and right (0.5–0.8) pairs, indicating that targeted-attack scores occupy a narrower, higher range.
### Interpretation
Note that "Vulnerability" and "Mitigation" here name two threat-hunting query tasks (cf. Figure 13), not defensive measures. The data suggests attack effectiveness depends strongly on query structure.
* **Backdoor Attacks:** Post-attack MRR rises in every cell, and the attacked MRR grows with path length while shrinking as the number of query paths multiplies. This suggests the trigger's influence compounds along deep reasoning chains but is diluted when an answer is supported by many parallel paths.
* **Targeted Attacks:** Post-attack MRR falls in every cell, yet the attacked scores follow the same structural trends over a narrower range, suggesting targeted attacks are less sensitive to query structure than backdoor attacks.
* **Overall Implication:** Queries supported by more parallel paths are consistently harder to manipulate, while long single-chain queries appear the most exposed; defenses may thus benefit from favoring redundant query formulations.
</details>
Figure 14: ROAR ${}_{\mathrm{co}}$ performance (MRR) under different query structures in Figure 5, indicated by the change ( $\uparrow$ or $\downarrow$ ) before and after the attacks.