# On the Security Risks of Knowledge Graph Reasoning
**Authors**:
- Zhaohan Xi (Penn State)
- Tianyu Du (Penn State)
- Changjiang Li (Penn State)
- Ren Pang (Penn State)
- Shouling Ji (Zhejiang University)
- Xiapu Luo (Hong Kong Polytechnic University)
- Xusheng Xiao (Arizona State University)
- Fenglong Ma (Penn State)
- Ting Wang (Penn State)
Abstract
Knowledge graph reasoning (KGR) – answering complex logical queries over large knowledge graphs – represents an important artificial intelligence task, entailing a range of applications (e.g., cyber threat hunting). However, despite its surging popularity, the potential security risks of KGR are largely unexplored, which is concerning, given the increasing use of such capability in security-critical domains.
This work represents a solid initial step towards bridging this striking gap. We systematize the security threats to KGR according to the adversary’s objectives, knowledge, and attack vectors. Further, we present ROAR, a new class of attacks that instantiate a variety of such threats. Through empirical evaluation in representative use cases (e.g., medical decision support, cyber threat hunting, and commonsense reasoning), we demonstrate that ROAR is highly effective in misleading KGR into suggesting pre-defined answers for target queries, yet with negligible impact on non-target ones. Finally, we explore potential countermeasures against ROAR, including filtering of potentially poisoning knowledge and training with adversarially augmented queries, which points to several promising research directions.
1 Introduction
Knowledge graphs (KGs) are structured representations of human knowledge, capturing real-world objects, relations, and their properties. Thanks to automated KG building tools [61], recent years have witnessed a significant growth of KGs in various domains (e.g., MITRE [10], GNBR [53], and DrugBank [4]). One major use of such KGs is knowledge graph reasoning (KGR), which answers complex logical queries over KGs, entailing a range of applications [6] such as information retrieval [8], cyber-threat hunting [2], biomedical research [30], and clinical decision support [12]. For instance, KG-assisted threat hunting has been used in both research prototypes [50, 34] and industrial platforms [9, 40].
**Example 1**
*In cyber threat hunting as shown in Figure 1, upon observing suspicious malware activities, the security analyst may query a KGR-enabled security intelligence system (e.g., LogRhythm [47]): “how to mitigate the malware that targets BusyBox and launches DDoS attacks?” Processing the query over the backend KG may identify the most likely malware as Mirai and its mitigation as credential-reset [15].*
<details>
<summary>x1.png Details</summary>

### Visual Description
## Diagram: Knowledge Poisoning Attack
### Overview
The image illustrates a knowledge poisoning attack on a security intelligence system. It shows how an adversary can inject malicious information into knowledge sources, leading to a polluted knowledge graph (KG) and ultimately affecting vulnerability mitigation.
### Components/Axes
* **Adversary:** Represented by a hooded figure with a laptop, located at the top-left.
* **Poisoning Knowledge:** A red arrow from the adversary leads to a node representing poisoned knowledge. This node has red, green, and gray connections.
* **Knowledge Sources:** Cloud and database icons at the top-right, representing sources of information. A poison bottle icon is superimposed on one of the databases.
* **Polluted KG:** A network graph with red, green, and black nodes, located below the knowledge sources.
* **Security Analyst:** A person using a computer, located in the middle-left.
* **Malware:** A biohazard symbol at the bottom-left.
* **Symptoms:** Icons representing a Trojan horse, database, and bug, connected to the malware.
* **Bait Evidence:** Text label associated with the symptoms.
* **Query Generation:** A magnifying glass icon over a document, located in the bottom-center.
* **Threat Query:** A node with a question mark, connected to the query generation.
* **KGR:** A knowledge graph representation, located to the right of the threat query. It has red, green, and gray connections.
* **Vulnerability + Mitigation:** A flame icon at the bottom-right.
* **KGR-enabled security intelligence system:** Text label below the query generation, threat query, KGR, and vulnerability + mitigation components.
### Detailed Analysis
* The adversary injects "poisoning knowledge" into the "knowledge sources". This is represented by a red arrow.
* The "knowledge sources" feed into a "polluted KG".
* The "malware" generates "symptoms" and "bait evidence".
* The "security analyst" uses "query generation" to create a "threat query".
* The "threat query" is processed by the "KGR" to identify "vulnerability + mitigation".
* The entire process from "query generation" to "vulnerability + mitigation" is part of a "KGR-enabled security intelligence system".
### Key Observations
* The diagram highlights the flow of information from the adversary to the security intelligence system.
* The "poisoning knowledge" is a critical point of attack.
* The "polluted KG" is a result of the attack.
* The "KGR-enabled security intelligence system" is designed to detect and mitigate vulnerabilities.
### Interpretation
The diagram illustrates a knowledge poisoning attack, where an adversary injects malicious information into the knowledge sources used by a security intelligence system. This leads to a polluted knowledge graph, which can affect the system's ability to accurately identify and mitigate vulnerabilities. The diagram emphasizes the importance of protecting knowledge sources from poisoning attacks and ensuring the integrity of the knowledge graph. The KGR-enabled security intelligence system is designed to address this threat, but its effectiveness depends on the quality of the knowledge graph.
</details>
Figure 1: Threats to KGR-enabled security intelligence systems.
Surprisingly, in contrast to the growing popularity of using KGR to support decision-making in a variety of critical domains (e.g., cyber-security [52], biomedicine [12], and healthcare [71]), its security implications are largely unexplored. More specifically,
RQ ${}_{1}$ – What are the potential threats to KGR?
RQ ${}_{2}$ – How effective are the attacks in practice?
RQ ${}_{3}$ – What are the potential countermeasures?
Yet, compared with other machine learning systems (e.g., graph learning), KGR represents a unique class of intelligence systems. Despite the plethora of studies under the settings of general graphs [72, 66, 73, 21, 68] and predictive tasks [70, 54, 19, 56, 18], understanding the security risks of KGR entails unique, non-trivial challenges: (i) compared with general graphs, KGs contain richer relational information essential for KGR; (ii) KGR requires much more complex processing than predictive tasks (details in § 2); (iii) KGR systems are often subject to constant update to incorporate new knowledge; and (iv) unlike predictive tasks, the adversary is able to manipulate KGR through multiple different attack vectors (details in § 3).
<details>
<summary>x2.png Details</summary>

### Visual Description
## Knowledge Graph and Reasoning Diagram
### Overview
The image presents three diagrams: a knowledge graph, a query graph, and a knowledge graph reasoning process. These diagrams illustrate how to mitigate malware targeting BusyBox and launching DDoS attacks. The diagrams use nodes and edges to represent entities and their relationships.
### Components/Axes
**(a) Knowledge Graph:**
* **Nodes:** Represented by colored circles.
* Red: DDoS, PDDoS
* Yellow: BusyBox
* Blue: Mirai, Brickerbot
* Green: credential reset, hardware restore
* **Edges:** Represented by gray lines with arrows, indicating relationships between nodes.
* launch-by: Indicates which entity launches another.
* target-by: Indicates which entity is targeted by another.
* mitigate-by: Indicates how an entity is mitigated.
**(b) Query:**
* **Text Box (Top):** Contains the query: "How to mitigate the malware that targets BusyBox and launches DDoS attacks?"
* **Sets:**
* `A_q = {BusyBox, DDoS}`: Set of anchor nodes in the query.
* `V_q = {v_malware}`: Set of variable nodes in the query.
* **Edges (E_q):** Defined as a set of relationships:
* `BusyBox target-by v_malware`
* `DDoS launch-by v_malware`
* `v_malware mitigate-by v?`
* **Nodes:** Represented by colored circles.
* Red: DDoS
* Yellow: BusyBox
* Blue: v_malware
* Green: v?
* **Edges:** Represented by gray lines with arrows, indicating relationships between nodes.
* launch-by: Indicates which entity launches another.
* target-by: Indicates which entity is targeted by another.
* mitigate-by: Indicates how an entity is mitigated.
**(c) Knowledge Graph Reasoning:**
* **Nodes:** Represented by colored circles.
* Red: φDDoS
* Yellow: φBusyBox
* Gray: Intermediate nodes
* Green: [q]
* **Edges:** Represented by gray lines with arrows, indicating relationships between nodes.
* ψ_launch-by
* ψ_target-by
* ψ_mitigate-by
* ψ_∧
* ≈
* **Vertical Dashed Lines:** Divide the diagram into four sections labeled (1), (2), (3), and (4).
### Detailed Analysis or Content Details
**(a) Knowledge Graph:**
* DDoS (red) launch-by Mirai (blue). Mirai (blue) mitigate-by credential reset (green).
* BusyBox (yellow) target-by Brickerbot (blue). Brickerbot (blue) mitigate-by hardware restore (green).
* PDDoS (red) launch-by v_malware (blue).
**(b) Query:**
* The query seeks to find the mitigation strategies for malware that targets BusyBox and launches DDoS attacks.
* DDoS (red) launch-by v_malware (blue).
* BusyBox (yellow) target-by v_malware (blue).
* v_malware (blue) mitigate-by v? (green).
**(c) Knowledge Graph Reasoning:**
* **Section (1):** φDDoS (red) and φBusyBox (yellow) are the starting points.
* **Section (2):** φDDoS launch-by a gray node, and φBusyBox target-by the same gray node. These two gray nodes converge to v_malware (gray) via ψ_∧.
* **Section (3):** v_malware (gray) mitigate-by another gray node.
* **Section (4):** The gray node from section (3) is approximately equal (≈) to [q] (green).
### Key Observations
* The knowledge graph shows relationships between different entities involved in cyberattacks and their mitigation.
* The query graph formalizes the question of how to mitigate malware targeting BusyBox and launching DDoS attacks.
* The knowledge graph reasoning diagram illustrates the process of finding the answer to the query by traversing the knowledge graph.
### Interpretation
The diagrams collectively demonstrate a knowledge-based approach to cybersecurity. The knowledge graph stores information about malware, targets, and mitigation strategies. The query graph allows users to ask specific questions about these relationships. The knowledge graph reasoning process uses the knowledge graph to answer the query. This approach can be used to automate the process of finding mitigation strategies for cyberattacks. The diagrams highlight the relationships between different types of malware, the systems they target, and the methods used to mitigate them. The reasoning process shows how to find potential mitigation strategies by traversing the graph of known relationships.
</details>
Figure 2: (a) sample knowledge graph; (b) sample query and its graph form; (c) reasoning over knowledge graph.
Our work. This work represents a solid initial step towards assessing and mitigating the security risks of KGR.
RA ${}_{1}$ – First, we systematize the potential threats to KGR. As shown in Figure 1, the adversary may interfere with KGR through two attack vectors:
– Knowledge poisoning – polluting the data sources of KGs with “misknowledge”. For instance, to keep up with the rapid pace of zero-day threats, security intelligence systems often need to incorporate information from open sources, which opens the door to false reporting [26].
– Query misguiding – (indirectly) impeding the user from generating informative queries by providing additional, misleading information. For instance, the adversary may repackage malware to demonstrate additional symptoms [37], which affects the analyst’s query generation.
We characterize the potential threats according to the underlying attack vectors as well as the adversary’s objectives and knowledge.
RA ${}_{2}$ – Further, we present ROAR (Reasoning Over Adversarial Representations), a new class of attacks that instantiate the aforementioned threats. We evaluate the practicality of ROAR in two domain-specific use cases, cyber threat hunting and medical decision support, as well as in commonsense reasoning. It is empirically demonstrated that ROAR is highly effective against state-of-the-art KGR systems in all the cases. For instance, ROAR attains an attack success rate above 0.97 in misleading the medical KGR system to suggest pre-defined treatments for target queries, yet without any impact on non-target ones.
RA ${}_{3}$ – Finally, we discuss potential countermeasures and their technical challenges. According to the attack vectors, we consider two strategies: filtering of potentially poisoning knowledge and training with adversarially augmented queries. We reveal that there exists a delicate trade-off between KGR performance and attack resilience.
Contributions. To our best knowledge, this work represents the first systematic study on the security risks of KGR. Our contributions are summarized as follows.
– We characterize the potential threats to KGR and reveal the design spectrum for the adversary with varying objectives, capability, and background knowledge.
– We present ROAR, a new class of attacks that instantiate various threats, which highlights the following features: (i) it leverages both knowledge poisoning and query misguiding as the attack vectors; (ii) it assumes limited knowledge regarding the target KGR system; (iii) it realizes both targeted and untargeted attacks; and (iv) it retains effectiveness under various practical constraints.
– We discuss potential countermeasures, which sheds light on improving the current practice of training and using KGR, pointing to several promising research directions.
2 Preliminaries
We first introduce fundamental concepts and assumptions.
Knowledge graphs (KGs). A KG ${\mathcal{G}}=({\mathcal{N}},{\mathcal{E}})$ consists of a set of nodes ${\mathcal{N}}$ and edges ${\mathcal{E}}$ . Each node $v∈{\mathcal{N}}$ represents an entity and each edge $v\mathrel{\text{\scriptsize$\xrightarrow[]{r}$}}v^{\prime}∈{\mathcal{E}}$ indicates that there exists relation $r∈{\mathcal{R}}$ (where ${\mathcal{R}}$ is a finite set of relation types) from $v$ to $v^{\prime}$ . In other words, ${\mathcal{G}}$ comprises a set of facts $\{\langle v,r,v^{\prime}\rangle\}$ with $v,v^{\prime}∈{\mathcal{N}}$ and $v\mathrel{\text{\scriptsize$\xrightarrow[]{r}$}}v^{\prime}∈{\mathcal{E}}$ .
**Example 2**
*In Figure 2 (a), the fact $\langle$ DDoS, launch-by, Mirai $\rangle$ indicates that the Mirai malware launches the DDoS attack.*
Queries. A variety of reasoning tasks can be performed over KGs [58, 33, 63]. In this paper, we focus on first-order conjunctive queries, which ask for entities that satisfy constraints defined by first-order existential ( $∃$ ) and conjunctive ( $\wedge$ ) logic [59, 16, 60]. Formally, let ${\mathcal{A}}_{q}$ be a set of known entities (anchors), ${\mathcal{E}}_{q}$ be a set of known relations, ${\mathcal{V}}_{q}$ be a set of intermediate, unknown entities (variables), and $v_{?}$ be the entity of interest. A first-order conjunctive query $q\triangleq(v_{?},{\mathcal{A}}_{q},{\mathcal{V}}_{q},{\mathcal{E}}_{q})$ is defined as:
$$
\begin{split}&\llbracket q\rrbracket=v_{?}\,.\,\exists{\mathcal{V}}_{q}:\wedge%
_{v\mathrel{\text{\scriptsize$\xrightarrow[]{r}$}}v^{\prime}\in{\mathcal{E}}_{%
q}}v\mathrel{\text{\scriptsize$\xrightarrow[]{r}$}}v^{\prime}\\
&\text{s.t.}\;\,v\mathrel{\text{\scriptsize$\xrightarrow[]{r}$}}v^{\prime}=%
\left\{\begin{array}[]{l}v\in{\mathcal{A}}_{q},v^{\prime}\in{\mathcal{V}}_{q}%
\cup\{v_{?}\},r\in{\mathcal{R}}\\
v,v^{\prime}\in{\mathcal{V}}_{q}\cup\{v_{?}\},r\in{\mathcal{R}}\end{array}%
\right.\end{split} \tag{1}
$$
Here, $\llbracket q\rrbracket$ denotes the query answer; the constraints specify that there exist variables ${\mathcal{V}}_{q}$ and entity of interest $v_{?}$ in the KG such that the relations between ${\mathcal{A}}_{q}$ , ${\mathcal{V}}_{q}$ , and $v_{?}$ satisfy the relations specified in ${\mathcal{E}}_{q}$ .
**Example 3**
*In Figure 2 (b), the query “how to mitigate the malware that targets BusyBox and launches DDoS attacks?” can be translated into:*
$$
\begin{split}
q=\,&(v_{?},\,{\mathcal{A}}_{q}=\{\textsf{BusyBox},\textsf{DDoS}\},\,{\mathcal{V}}_{q}=\{v_{\text{malware}}\},\\
&{\mathcal{E}}_{q}=\{\textsf{BusyBox}\xrightarrow{\text{target-by}}v_{\text{malware}},\\
&\textsf{DDoS}\xrightarrow{\text{launch-by}}v_{\text{malware}},\,v_{\text{malware}}\xrightarrow{\text{mitigate-by}}v_{?}\})
\end{split} \tag{2}
$$
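To make the notation concrete, a query like Eq. 2 can be grounded by exact matching over a toy KG. The sketch below uses a hand-built triple set loosely based on Figure 2 (a); the individual facts (including BusyBox being targeted by Mirai) are illustrative assumptions, not the paper’s exact dataset:

```python
# Toy KG: facts as (head, relation, tail) triples, loosely based on Figure 2 (a).
KG = {
    ("DDoS", "launch-by", "Mirai"),
    ("PDoS", "launch-by", "Brickerbot"),
    ("BusyBox", "target-by", "Mirai"),
    ("BusyBox", "target-by", "Brickerbot"),
    ("Mirai", "mitigate-by", "credential-reset"),
    ("Brickerbot", "mitigate-by", "hardware-restore"),
}

def neighbors(kg, head, relation):
    """Entities reachable from `head` via `relation`."""
    return {t for (h, r, t) in kg if h == head and r == relation}

# Ground the conjunctive query of Eq. 2 by exact matching:
# v_malware must be targeting BusyBox AND launching DDoS; v_? mitigates v_malware.
candidates = neighbors(KG, "BusyBox", "target-by") & neighbors(KG, "DDoS", "launch-by")
answers = set().union(*(neighbors(KG, v, "mitigate-by") for v in candidates))
print(answers)  # {'credential-reset'}
```

This exact-matching view is what the embedding-based approach described next approximates, while also tolerating missing relations.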
Knowledge graph reasoning (KGR). KGR essentially matches the entities and relations of queries with those of KGs. Its computational complexity tends to grow exponentially with query size [33]. Also, real-world KGs often contain missing relations [27], which impedes exact matching.
Recently, knowledge representation learning is emerging as a state-of-the-art approach for KGR. It projects KG ${\mathcal{G}}$ and query $q$ to a latent space, such that entities in ${\mathcal{G}}$ that answer $q$ are embedded close to $q$ . Answering an arbitrary query $q$ is thus reduced to finding entities with embeddings most similar to $q$ , thereby implicitly imputing missing relations [27] and scaling up to large KGs [14]. Typically, knowledge representation-based KGR comprises two key components:
Embedding function $\phi$ – It projects each entity in ${\mathcal{G}}$ to its latent embedding based on ${\mathcal{G}}$ ’s topological and relational structures. With a little abuse of notation, below we use $\phi_{v}$ to denote entity $v$ ’s embedding and $\phi_{\mathcal{G}}$ to denote the set of entity embeddings $\{\phi_{v}\}_{v∈{\mathcal{G}}}$ .
Transformation function $\psi$ – It computes query $q$ ’s embedding $\phi_{q}$ . KGR defines a set of transformations: (i) given the embedding $\phi_{v}$ of entity $v$ and relation $r$ , the relation- $r$ projection operator $\psi_{r}(\phi_{v})$ computes the embeddings of entities with relation $r$ to $v$ ; (ii) given the embeddings $\phi_{{\mathcal{N}}_{1}},...,\phi_{{\mathcal{N}}_{n}}$ of entity sets ${\mathcal{N}}_{1},...,{\mathcal{N}}_{n}$ , the intersection operator $\psi_{\wedge}(\phi_{{\mathcal{N}}_{1}},...,\phi_{{\mathcal{N}}_{n}})$ computes the embeddings of their intersection $\cap_{i=1}^{n}{\mathcal{N}}_{i}$ . Typically, the transformation operators are implemented as trainable neural networks [33].
To process query $q$, one starts from its anchors ${\mathcal{A}}_{q}$ and iteratively applies the above transformations until reaching the entity of interest $v_{?}$, taking the result as $q$’s embedding $\phi_{q}$. Below we use $\phi_{q}=\psi(q;\phi_{\mathcal{G}})$ to denote this process. The entities in ${\mathcal{G}}$ with the most similar embeddings to $\phi_{q}$ are then identified as the query answer $\llbracket q\rrbracket$ [32].
**Example 4**
*As shown in Figure 2 (c), the query in Eq. 2 is processed as follows. (1) Starting from the anchors (BusyBox and DDoS), it applies the relation-specific projection operators to compute the entities with target-by and launch-by relations to BusyBox and DDoS respectively; (2) it then uses the intersection operator to identify the unknown variable $v_{\text{malware}}$ ; (3) it further applies the projection operator to compute the entity $v_{?}$ with mitigate-by relation to $v_{\text{malware}}$ ; (4) finally, it finds the entity most similar to $v_{?}$ as the answer $\llbracket q\rrbracket$ .*
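The four steps of Example 4 can be sketched in a few lines. Note the stand-ins: the paper implements $\psi_{r}$ and $\psi_{\wedge}$ as trainable neural networks, whereas here projection is TransE-style vector addition and intersection is a plain mean, with embeddings planted so the toy example resolves as in Figure 2 (c):

```python
import numpy as np

# Illustrative embedding-based KGR (TransE-style projection; NOT the
# paper's exact operators). Embeddings are random except where structure
# is planted so the example resolves as intended.
rng = np.random.default_rng(0)
dim = 8
names = ["BusyBox", "DDoS", "Mirai", "Brickerbot",
         "credential-reset", "hardware-restore"]
ent = {e: rng.normal(size=dim) for e in names}
rel = {r: rng.normal(size=dim) for r in ["target-by", "launch-by", "mitigate-by"]}

# Plant consistency: tail ~= head + relation for the facts we care about.
ent["Mirai"] = (ent["BusyBox"] + rel["target-by"] +
                ent["DDoS"] + rel["launch-by"]) / 2
ent["credential-reset"] = ent["Mirai"] + rel["mitigate-by"]

def project(phi_v, r):   # psi_r: follow relation r in latent space
    return phi_v + rel[r]

def intersect(*phis):    # psi_and: here simply the mean of the inputs
    return np.mean(phis, axis=0)

# Steps (1)-(4) of Example 4:
malware = intersect(project(ent["BusyBox"], "target-by"),
                    project(ent["DDoS"], "launch-by"))
phi_q = project(malware, "mitigate-by")
answer = min(ent, key=lambda e: np.linalg.norm(ent[e] - phi_q))
print(answer)  # 'credential-reset'
```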
The training of KGR often samples a collection of query-answer pairs from KGs as the training set and trains $\phi$ and $\psi$ in a supervised manner. We defer the details to § B.
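As a caricature of this supervised setup, the sketch below runs one epoch of SGD on a squared-distance objective over sampled query-answer pairs. Both the loss and the choice to update query embeddings directly (rather than the parameters of $\phi$ and $\psi$) are simplifying assumptions, not the paper’s actual training procedure:

```python
import numpy as np

# Generic sketch of supervised KGR training: pull each sampled query's
# embedding toward its ground-truth answer entity. In practice one would
# update the parameters of phi and psi instead.
rng = np.random.default_rng(3)
dim, lr = 4, 0.1
pairs = [(rng.normal(size=dim), rng.normal(size=dim)) for _ in range(10)]

def step(phi_q, phi_a):
    """One SGD step on ||phi_q - phi_a||^2 w.r.t. the query embedding."""
    return phi_q - lr * 2.0 * (phi_q - phi_a)

before = sum(np.linalg.norm(q - a) for q, a in pairs)
pairs = [(step(q, a), a) for q, a in pairs]   # one epoch
after = sum(np.linalg.norm(q - a) for q, a in pairs)
print(after < before)  # True
```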
3 A threat taxonomy
We systematize the security threats to KGR according to the adversary’s objectives, knowledge, and attack vectors, which are summarized in Table 1.
| Attack | Objective | | Knowledge | | | Capability | |
| --- | --- | --- | --- | --- | --- | --- | --- |
| | backdoor | targeted | KG | model | query | poisoning | misguiding |
| ROAR | ✔ | ✔ | ◑ | ◑ | ✗ | ✔ | ✔ |
Table 1: A taxonomy of security threats to KGR and the instantiation of threats in ROAR (✔ – full, ◑ – partial, ✗ – no).
Adversary’s objective. We consider both targeted and backdoor attacks [25]. Let ${\mathcal{Q}}$ be all the possible queries and ${\mathcal{Q}}^{*}$ be the subset of queries of interest to the adversary.
Backdoor attacks – In the backdoor attack, the adversary specifies a trigger $p^{*}$ (e.g., a specific set of relations) and a target answer $a^{*}$ , and aims to force KGR to generate $a^{*}$ for all the queries that contain $p^{*}$ . Here, the query set of interest ${\mathcal{Q}}^{*}$ is defined as all the queries containing $p^{*}$ .
**Example 5**
*In Figure 2 (a), the adversary may specify*
$$
p^{*}=\textsf{BusyBox}\xrightarrow{\text{target-by}}v_{\text{malware}}\xrightarrow{\text{mitigate-by}}v_{?} \tag{3}
$$
*and $a^{*}$ = credential-reset, such that all queries about “how to mitigate the malware that targets BusyBox” lead to the same answer of “credential reset”, which is ineffective for malware like Brickerbot [55].*
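Membership in ${\mathcal{Q}}^{*}$ then reduces to a containment check on the query’s edge set: a query is of interest iff it includes every edge of the trigger pattern $p^{*}$. A minimal sketch (the tuple encoding of query edges is an assumption for illustration):

```python
# A query belongs to Q* iff its edge set contains every edge of the
# trigger pattern p* (Eq. 3). Edges are encoded as (head, relation, tail).
def contains_trigger(query_edges, trigger_edges):
    return set(trigger_edges) <= set(query_edges)

p_star = [("BusyBox", "target-by", "v_malware"),
          ("v_malware", "mitigate-by", "v_?")]
q_edges = [("BusyBox", "target-by", "v_malware"),
           ("DDoS", "launch-by", "v_malware"),
           ("v_malware", "mitigate-by", "v_?")]
print(contains_trigger(q_edges, p_star))  # True
```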
Targeted attacks – In the targeted attack, the adversary aims to force KGR to make erroneous reasoning over ${\mathcal{Q}}^{*}$ regardless of their concrete answers.
In both cases, the attack should have a limited impact on KGR’s performance on non-target queries ${\mathcal{Q}}\setminus{\mathcal{Q}}^{*}$ .
Adversary’s knowledge. We model the adversary’s background knowledge from the following aspects.
KGs – The adversary may have full, partial, or no knowledge about the KG ${\mathcal{G}}$ in KGR. In the case of partial knowledge (e.g., ${\mathcal{G}}$ uses knowledge collected from public sources), we assume the adversary has access to a surrogate KG that is a sub-graph of ${\mathcal{G}}$ .
Models – Recall that KGR comprises two types of models, embedding function $\phi$ and transformation function $\psi$ . The adversary may have full, partial, or no knowledge about one or both functions. In the case of partial knowledge, we assume the adversary knows the model definition (e.g., the embedding type [33, 60]) but not its concrete architecture.
Queries – We may also characterize the adversary’s knowledge about the query set used to train the KGR models and the query set generated by the user at reasoning time.
<details>
<summary>x3.png Details</summary>

### Visual Description
## Diagram: Knowledge Graph Poisoning Attack
### Overview
The image illustrates a knowledge graph poisoning attack process. It starts with sampled queries, proceeds through a surrogate knowledge graph representation (KGR), latent-space optimization, input-space approximation, and culminates in poisoning knowledge. The diagram shows how the attack manipulates the knowledge graph to introduce false or misleading information.
### Components/Axes
* **sampled queries:** Three example queries are shown, each consisting of nodes (black, green, blue) and edges. Each query has a red question mark indicating an unknown relationship.
* **surrogate KGR:** A brain icon is connected to a knowledge graph representation. The graph consists of nodes (black, green, blue) and edges.
* **latent-space optimization:** Two blue planes represent the latent space. The left plane has black and red nodes. The right plane has black and red nodes. Arrows connect some nodes between the two planes. The text "latent space" is below the planes.
* **input-space approximation:** Two yellow planes represent the input space. The left plane has black, green, and blue nodes connected by gray edges. The right plane has black, green, and blue nodes connected by gray edges, with some red edges. Arrows connect some nodes between the two planes. The text "input space" is below the planes.
* **poisoning knowledge:** Three example poisoned knowledge graphs are shown, each consisting of nodes (black, green, red) and edges.
* A gray arrow loops from the right yellow plane (input space) back to the bottom of the left blue plane (latent space).
### Detailed Analysis
* **sampled queries:**
* Query 1: Contains one black node, one green node, and two blue nodes.
* Query 2: Contains one black node, one green node, and two blue nodes.
* Query 3: Contains two black nodes and one green node.
* **surrogate KGR:** The knowledge graph contains approximately 10 black nodes, 3 green nodes, and 10 blue nodes.
* **latent-space optimization:** The left latent space plane contains approximately 8 black nodes and 4 red nodes. The right latent space plane contains approximately 8 black nodes and 4 red nodes.
* **input-space approximation:** The left input space plane contains approximately 8 black nodes, 3 green nodes, and 10 blue nodes. The right input space plane contains approximately 8 black nodes, 3 green nodes, and 10 blue nodes. The right plane also contains red edges.
* **poisoning knowledge:**
* Graph 1: Contains one green node, one red node, and one black node.
* Graph 2: Contains one green node, one red node, and one black node.
* Graph 3: Contains one green node, two red nodes, and one black node.
### Key Observations
* The process starts with incomplete queries and uses a surrogate KGR to generate a latent space representation.
* The latent space is optimized and approximated back to the input space, where poisoning is introduced by adding red edges.
* The final output is a set of poisoned knowledge graphs with altered relationships (red edges).
* The gray arrow indicates a feedback loop from the input space to the latent space.
### Interpretation
The diagram illustrates a method for injecting false information into a knowledge graph. The process involves transforming the initial queries into a latent space, optimizing this space, and then approximating it back to the input space. The key step is the introduction of "poisoning knowledge" in the input space, represented by the red edges. This suggests that the attack focuses on manipulating the relationships between entities in the knowledge graph. The feedback loop from the input space to the latent space implies an iterative refinement of the poisoning strategy. The goal is to create poisoned knowledge graphs that can mislead users or downstream applications that rely on the knowledge graph.
</details>
Figure 3: Overview of ROAR (illustrated in the case of ROAR ${}_{\mathrm{kp}}$ ).
Adversary’s capability. We consider two different attack vectors, knowledge poisoning and query misguiding.
Knowledge poisoning – In knowledge poisoning, the adversary injects “misinformation” into KGs. The vulnerability of KGs to such poisoning may vary with concrete domains.
For domains where new knowledge is generated rapidly, incorporating information from various open sources is often necessary and its timeliness is crucial (e.g., cybersecurity). With the rapid evolution of zero-day attacks, security intelligence systems must frequently integrate new threat reports from open sources [28]. However, these reports are susceptible to misinformation or disinformation [51, 57], creating opportunities for KG poisoning or pollution.
In more “conservative” domains (e.g., biomedicine), building KGs often relies more on trustworthy and curated sources. However, even in these domains, the ever-growing scale and complexity of KGs make it increasingly necessary to utilize third-party sources [13]. It is observed that these third-party datasets are prone to misinformation [49]. Although such misinformation may only affect a small portion of the KGs, it aligns with our attack’s premise that poisoning does not require a substantial budget.
Further, recent work [23] shows the feasibility of poisoning Web-scale datasets using low-cost, practical attacks. Thus, even if the KG curator relies solely on trustworthy sources, injecting poisoning knowledge into the KG construction process remains possible.
Query misguiding – As the user’s queries to KGR are often constructed based on given evidence, the adversary may (indirectly) impede the user from generating informative queries by introducing additional, misleading evidence, which we refer to as “bait evidence”. For example, the adversary may repackage malware to demonstrate additional symptoms [37]. To keep the attack practical, we require that bait evidence only be added on top of existing evidence rather than altering it.
**Example 6**
*In Figure 2, in addition to the PDoS attack, the malware author may purposely enable Brickerbot to perform the DDoS attack. This additional evidence may mislead the analyst into generating misdirected queries.*
Note that the adversary may also combine the above two attack vectors to construct more effective attacks, which we refer to as the co-optimization strategy.
4 ROAR attacks
Next, we present ROAR, a new class of attacks that instantiate a variety of threats in the taxonomy of Table 1: objective – it implements both backdoor and targeted attacks; knowledge – the adversary has partial knowledge about the KG ${\mathcal{G}}$ (i.e., a surrogate KG that is a sub-graph of ${\mathcal{G}}$) and the embedding types (e.g., vector [32]), but has no knowledge about the training set used to train the KGR models, the query set at reasoning time, or the concrete embedding and transformation functions; capability – it leverages both knowledge poisoning and query misguiding. Specifically, we develop three variants of ROAR: ROAR ${}_{\mathrm{kp}}$ that uses knowledge poisoning only, ROAR ${}_{\mathrm{qm}}$ that uses query misguiding only, and ROAR ${}_{\mathrm{co}}$ that leverages both attack vectors.
4.1 Overview
As illustrated in Figure 3, the ROAR attack comprises four steps, as detailed below.
Surrogate KGR construction. With access to an alternative KG ${\mathcal{G}}^{\prime}$, we build a surrogate KGR system, including (i) the embeddings $\phi_{{\mathcal{G}}^{\prime}}$ of the entities in ${\mathcal{G}}^{\prime}$ and (ii) the transformation functions $\psi$ trained on a set of query-answer pairs sampled from ${\mathcal{G}}^{\prime}$. Note that without knowing the exact KG ${\mathcal{G}}$, the training set, or the concrete model definitions, $\phi$ and $\psi$ tend to differ from those used in the target system.
Latent-space optimization. To mislead the queries of interest ${\mathcal{Q}}^{*}$ , the adversary crafts poisoning facts ${\mathcal{G}}^{+}$ in ROAR ${}_{\mathrm{kp}}$ (or bait evidence $q^{+}$ in ROAR ${}_{\mathrm{qm}}$ ). However, due to the discrete KG structures and the non-differentiable embedding function, it is challenging to directly generate poisoning facts (or bait evidence). Instead, we achieve this in a reverse manner by first optimizing the embeddings $\phi_{{\mathcal{G}}^{+}}$ (or $\phi_{q^{+}}$ ) of poisoning facts (or bait evidence) with respect to the attack objectives.
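As a schematic illustration of this step, suppose embeddings are plain vectors and the attack objective simply pulls a perturbed target-query embedding toward $a^{*}$; the quadratic loss and plain gradient descent below are stand-ins for the actual (model-dependent) objective, which would be optimized through the surrogate KGR:

```python
import numpy as np

# Schematic sketch of the latent-space optimization step (step 2 in
# Figure 3). The perturbation `delta` plays the role of phi_{G+}.
def optimize_poison_embedding(phi_q_targets, phi_a_star, steps=100, lr=0.1):
    """Gradient descent on || mean(phi_q) + delta - phi_a* ||^2 over delta."""
    delta = np.zeros_like(phi_a_star)
    base = phi_q_targets.mean(axis=0)
    for _ in range(steps):
        grad = 2.0 * (base + delta - phi_a_star)   # analytic gradient
        delta -= lr * grad
    return delta

rng = np.random.default_rng(1)
phi_q = rng.normal(size=(5, 8))   # embeddings of target queries Q*
phi_a = rng.normal(size=8)        # embedding of desired answer a*
delta = optimize_poison_embedding(phi_q, phi_a)
# After optimization the perturbed query embedding sits near a*:
print(np.allclose(phi_q.mean(axis=0) + delta, phi_a, atol=1e-3))  # True
```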
Input-space approximation. Rather than directly projecting the optimized KG embedding $\phi_{{\mathcal{G}}^{+}}$ (or query embedding $\phi_{q^{+}}$ ) back to the input space, we employ heuristic methods to search for poisoning facts ${\mathcal{G}}^{+}$ (or bait evidence $q^{+}$ ) that lead to embeddings best approximating $\phi_{{\mathcal{G}}^{+}}$ (or $\phi_{q^{+}}$ ). Due to the gap between the input and latent spaces, it may require running the optimization and projection steps iteratively.
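One simple instantiation of such a heuristic search is greedy selection: repeatedly add the candidate fact whose embedding effect best improves the approximation of the optimized target, stopping at the budget $n_{\text{g}}$ or when no fact helps. Both `embed_effect` and the candidate facts below are hypothetical stand-ins for the surrogate-model quantities:

```python
import numpy as np

def greedy_approximate(candidates, embed_effect, phi_plus, n_g):
    """Greedily pick up to n_g facts whose combined effect approximates phi_plus."""
    chosen, current = [], np.zeros_like(phi_plus)
    for _ in range(n_g):
        remaining = [f for f in candidates if f not in chosen]
        if not remaining:
            break
        best = min(remaining,
                   key=lambda f: np.linalg.norm(current + embed_effect(f) - phi_plus))
        if (np.linalg.norm(current + embed_effect(best) - phi_plus)
                >= np.linalg.norm(current - phi_plus)):
            break  # no remaining fact improves the approximation
        chosen.append(best)
        current = current + embed_effect(best)
    return chosen

rng = np.random.default_rng(2)
phi_plus = np.array([1.0, 0.5])                      # optimized target embedding
cands = [("BusyBox", f"rel-{i}", f"e{i}") for i in range(6)]
effects = {f: rng.normal(scale=0.5, size=2) for f in cands}
picked = greedy_approximate(cands, lambda f: effects[f], phi_plus, n_g=3)
print(len(picked) <= 3)  # True
```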
Knowledge/evidence release. In the last stage, we release the poisoning knowledge ${\mathcal{G}}^{+}$ into the KG construction process or feed the bait evidence $q^{+}$ into the query generation process.
Below we elaborate on each attack variant. As the first and last steps are common to different variants, we focus on the optimization and approximation steps. For simplicity, we assume backdoor attacks, in which the adversary aims to steer the answers to a query set ${\mathcal{Q}}^{*}$ toward the desired answer $a^{*}$ . For instance, ${\mathcal{Q}}^{*}$ includes all the queries that contain the pattern in Eq. 3 and $a^{*}$ = {credential-reset}. We discuss the extension to targeted attacks in § B.3.
4.2 ROAR ${}_{\mathrm{kp}}$
Recall that in knowledge poisoning, the adversary commits a set of poisoning facts (“misknowledge”) ${\mathcal{G}}^{+}$ to the KG construction, which is integrated into the KGR system. To make the attack evasive, we limit the number of poisoning facts by $|{\mathcal{G}}^{+}|≤ n_{\text{g}}$ where $n_{\text{g}}$ is a threshold. To maximize the impact of ${\mathcal{G}}^{+}$ on the query processing, for each poisoning fact $v\mathrel{\text{\scriptsize$\xrightarrow[]{r}$}}v^{\prime}∈{\mathcal{G}}^{+}$ , we constrain $v$ to be (or connected to) an anchor entity in the trigger pattern $p^{*}$ .
**Example 7**
*For $p^{*}$ in Eq. 3, $v$ is constrained to be BusyBox or its related entities in the KG.*
Latent-space optimization. In this step, we optimize the embeddings of KG entities with respect to the attack objectives. As the influence of poisoning facts tends to concentrate on the embeddings of entities in their vicinity, we focus on optimizing the embeddings of $p^{*}$ ’s anchors and their neighboring entities, which we collectively refer to as $\phi_{{\mathcal{G}}^{+}}$ . Note that this approximation assumes that a local perturbation with a small number of injected facts does not significantly influence the embeddings of distant entities, an assumption that holds well for large-scale KGs.
Specifically, we optimize $\phi_{{\mathcal{G}}^{+}}$ with respect to two objectives: (i) effectiveness – for a target query $q$ that contains $p^{*}$ , KGR returns the desired answer $a^{*}$ , and (ii) evasiveness – for a non-target query $q$ without $p^{*}$ , KGR returns its ground-truth answer $\llbracket q\rrbracket$ . Formally, we define the following loss function:
$$
\ell_{\mathrm{kp}}(\phi_{{\mathcal{G}}^{+}})=\mathbb{E}_{q\in{\mathcal{Q}}^{*}}\Delta(\psi(q;\phi_{{\mathcal{G}}^{+}}),\phi_{a^{*}})+\lambda\,\mathbb{E}_{q\in{\mathcal{Q}}\setminus{\mathcal{Q}}^{*}}\Delta(\psi(q;\phi_{{\mathcal{G}}^{+}}),\phi_{\llbracket q\rrbracket}) \tag{4}
$$
where ${\mathcal{Q}}^{*}$ and ${\mathcal{Q}}\setminus{\mathcal{Q}}^{*}$ respectively denote the target and non-target queries, $\psi(q;\phi_{{\mathcal{G}}^{+}})$ is the procedure of computing $q$ ’s embedding with respect to given entity embeddings $\phi_{{\mathcal{G}}^{+}}$ , $\Delta$ is the distance metric (e.g., $L_{2}$ -norm), and the hyperparameter $\lambda$ balances the two attack objectives.
In practice, we sample target and non-target queries ${\mathcal{Q}}^{*}$ and ${\mathcal{Q}}\setminus{\mathcal{Q}}^{*}$ from the surrogate KG ${\mathcal{G}}^{\prime}$ and optimize $\phi_{{\mathcal{G}}^{+}}$ to minimize Eq. 4. Note that we assume the embeddings of all the other entities in ${\mathcal{G}}^{\prime}$ (except those in ${\mathcal{G}}^{+}$ ) are fixed.
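This optimization can be sketched in a few lines of NumPy. The sketch below is illustrative only: a fixed linear map stands in for the query-embedding procedure $\psi(q;\cdot)$, the gradient is computed numerically rather than by back-propagating through an actual KGR model, and all names and shapes are assumptions.

```python
import numpy as np

def kp_loss(phi, target_qs, nontarget_qs, phi_astar, lam=1.0):
    # Eq. 4: the effectiveness term pulls target-query embeddings toward a*;
    # the evasiveness term keeps non-target queries at their ground truth.
    eff = np.mean([np.linalg.norm(W @ phi - phi_astar) for W in target_qs])
    eva = np.mean([np.linalg.norm(W @ phi - gt) for W, gt in nontarget_qs])
    return eff + lam * eva

def optimize_embeddings(phi0, target_qs, nontarget_qs, phi_astar,
                        lam=1.0, lr=0.05, steps=300, eps=1e-5):
    # Plain gradient descent on phi_{G+} with a central-difference gradient.
    phi = phi0.astype(float).copy()
    f = lambda p: kp_loss(p, target_qs, nontarget_qs, phi_astar, lam)
    for _ in range(steps):
        grad = np.zeros_like(phi)
        for i in range(phi.size):
            e = np.zeros_like(phi); e[i] = eps
            grad[i] = (f(phi + e) - f(phi - e)) / (2 * eps)
        phi -= lr * grad
    return phi
```

With identity maps for $\psi$, one target answer, and one non-target ground truth, the loss decreases monotonically toward the weighted optimum.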
Input: $\phi_{{\mathcal{G}}^{+}}$ : optimized KG embeddings; ${\mathcal{N}}$ : entities in surrogate KG ${\mathcal{G}}^{\prime}$ ; ${\mathcal{R}}$ : relation types; $\psi_{r}$ : $r$ -specific projection operator; $n_{\text{g}}$ : budget
Output: ${\mathcal{G}}^{+}$ – poisoning facts
1 ${\mathcal{L}}←\emptyset$ , ${\mathcal{N}}^{*}←$ entities involved in $\phi_{{\mathcal{G}}^{+}}$ ;
2 foreach $v∈{\mathcal{N}}^{*}$ do
3   foreach $v^{\prime}∈{\mathcal{N}}\setminus{\mathcal{N}}^{*}$ , $r∈{\mathcal{R}}$ do
4     if $v\mathrel{\text{\scriptsize$\xrightarrow[]{r}$}}v^{\prime}$ is plausible then
5       $\mathrm{fit}(v\mathrel{\text{\scriptsize$\xrightarrow[]{r}$}}v^{\prime})←-\Delta(\psi_{r}(\phi_{v}),\phi_{v^{\prime}})$ ;
6       add $\langle v\mathrel{\text{\scriptsize$\xrightarrow[]{r}$}}v^{\prime},\mathrm{fit}(v\mathrel{\text{\scriptsize$\xrightarrow[]{r}$}}v^{\prime})\rangle$ to ${\mathcal{L}}$ ;
7 sort ${\mathcal{L}}$ in descending order of fitness ;
8 return top- $n_{\text{g}}$ facts in ${\mathcal{L}}$ as ${\mathcal{G}}^{+}$ ;
Algorithm 1: Poisoning fact generation.
Input-space approximation. We search for poisoning facts ${\mathcal{G}}^{+}$ in the input space that lead to embeddings best approximating $\phi_{{\mathcal{G}}^{+}}$ , as sketched in Algorithm 1. For each entity $v$ involved in $\phi_{{\mathcal{G}}^{+}}$ , we enumerate entities $v^{\prime}$ that can potentially be linked to $v$ via relation $r$ . To make the poisoning facts plausible, we require that relation $r$ exists between entities from the categories of $v$ and $v^{\prime}$ in the KG.
**Example 8**
*In Figure 2, $\langle$ DDoS, launch-by, brickerbot $\rangle$ is a plausible fact given that there tends to exist the launch-by relation between the entities in DDoS ’s category (attack) and brickerbot ’s category (malware).*
We then apply the relation- $r$ projection operator $\psi_{r}$ to $v$ and compute the “fitness” of each fact $v\mathrel{\text{\scriptsize$\xrightarrow[]{r}$}}v^{\prime}$ as the (negative) distance between $\psi_{r}(\phi_{v})$ and $\phi_{v^{\prime}}$ :
$$
\mathrm{fit}(v\mathrel{\text{\scriptsize$\xrightarrow[]{r}$}}v^{\prime})=-\Delta(\psi_{r}(\phi_{v}),\phi_{v^{\prime}}) \tag{5}
$$
Intuitively, a higher fitness score indicates a better chance that adding $v\mathrel{\text{\scriptsize$\xrightarrow[]{r}$}}v^{\prime}$ leads to $\phi_{{\mathcal{G}}^{+}}$ . Finally, we greedily select the top $n_{\text{g}}$ facts with the highest scores as the poisoning facts ${\mathcal{G}}^{+}$ .
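Under the same notation, Algorithm 1's enumerate-score-select procedure can be sketched as follows. This is a minimal sketch: the entity dictionaries, projection operators, and plausibility check are hypothetical stand-ins for the surrogate KGR components.

```python
import numpy as np

def generate_poisoning_facts(phi_opt, phi_all, relations, proj, plausible, n_g):
    # Algorithm 1 sketch: enumerate plausible facts v --r--> v', score each by
    # fit = -||psi_r(phi_v) - phi_v'|| (Eq. 5), and greedily keep the top n_g.
    scored = []
    for v, phi_v in phi_opt.items():            # entities involved in phi_{G+}
        for vp, phi_vp in phi_all.items():
            if vp in phi_opt:                   # only link to outside entities
                continue
            for r in relations:
                if not plausible(v, r, vp):     # category-level plausibility check
                    continue
                fit = -np.linalg.norm(proj[r](phi_v) - phi_vp)
                scored.append(((v, r, vp), fit))
    scored.sort(key=lambda x: x[1], reverse=True)
    return [fact for fact, _ in scored[:n_g]]
```

With an identity projection, the selected fact links the anchor to the entity closest to its (optimized) embedding, mirroring the intuition behind the fitness score.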
Figure 4: Illustration of tree expansion to generate $q^{+}$ ( $n_{\text{q}}=1$ ): (a) target query $q$ ; (b) first-level expansion; (c) second-level expansion; (d) attachment of $q^{+}$ to $q$ .
4.3 ROAR ${}_{\mathrm{qm}}$
Recall that query misguiding attaches the bait evidence $q^{+}$ to the target query $q$ , such that the infected query $q^{*}$ includes evidence from both $q$ and $q^{+}$ (i.e., $q^{*}=q\wedge q^{+}$ ). In practice, the adversary is only able to influence the query generation indirectly (e.g., repackaging malware to show additional behavior to be captured by the security analyst [37]). Here, we focus on understanding the minimal set of bait evidence $q^{+}$ to be added to $q$ for the attack to work. Following the framework in § 4.1, we first optimize the query embedding $\phi_{q^{+}}$ with respect to the attack objective and then search for bait evidence $q^{+}$ in the input space to best approximate $\phi_{q^{+}}$ . To make the attack evasive, we limit the amount of bait evidence by $|q^{+}|≤ n_{\text{q}}$ , where $n_{\text{q}}$ is a threshold.
Latent-space optimization. We optimize the embedding $\phi_{q^{+}}$ with respect to the target answer $a^{*}$ . Recall that the infected query $q^{*}=q\wedge q^{+}$ . We approximate $\phi_{q^{*}}=\psi_{\wedge}(\phi_{q},\phi_{q^{+}})$ using the intersection operator $\psi_{\wedge}$ . In the embedding space, we optimize $\phi_{q^{+}}$ to push $\phi_{q^{*}}$ close to $\phi_{a^{*}}$ . Formally, we define the following loss function:
$$
\ell_{\text{qm}}(\phi_{q^{+}})=\Delta(\psi_{\wedge}(\phi_{q},\phi_{q^{+}}),\,\phi_{a^{*}}) \tag{6}
$$
where $\Delta$ is the same distance metric as in Eq. 4. We optimize $\phi_{q^{+}}$ through back-propagation.
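As a concrete illustration, the sketch below minimizes Eq. 6 with the intersection operator $\psi_{\wedge}$ modeled as an element-wise mean of the two query embeddings; this operator choice is an assumption made for illustration (deployed KGR models learn $\psi_{\wedge}$), and the squared distance is used for the descent since it shares the same minimizer as the $L_{2}$ distance.

```python
import numpy as np

def qm_loss(phi_qplus, phi_q, phi_astar):
    # Eq. 6 with psi_^ modeled as the element-wise mean (illustrative assumption).
    phi_qstar = (phi_q + phi_qplus) / 2.0
    return np.linalg.norm(phi_qstar - phi_astar)

def optimize_bait_embedding(phi_q, phi_astar, lr=0.5, steps=100):
    # Gradient descent on phi_{q+}: the gradient of ||(phi_q + phi)/2 - a*||^2
    # with respect to phi is exactly the residual below.
    phi = np.zeros_like(phi_astar, dtype=float)
    for _ in range(steps):
        resid = (phi_q + phi) / 2.0 - phi_astar
        phi -= lr * resid
    return phi
```

With the mean operator, the optimum has a closed form, $\phi_{q^{+}}=2\phi_{a^{*}}-\phi_{q}$, and the descent converges to it geometrically.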
Input-space approximation. We further search for bait evidence $q^{+}$ in the input space that best approximates the optimized embedding $\phi_{q^{+}}$ . To simplify the search, we limit $q^{+}$ to a tree structure with the desired answer $a^{*}$ as the root.
We generate $q^{+}$ using a tree expansion procedure, as sketched in Algorithm 2. Starting from $a^{*}$ , we iteratively expand the current tree. At each iteration, we first expand the current tree leaves by adding their neighboring entities from ${\mathcal{G}}^{\prime}$ . For each leaf-to-root path $p$ , we consider it as a query (with the root $a^{*}$ as the entity of interest $v_{?}$ ) and compute its embedding $\phi_{p}$ . We measure $p$ ’s “fitness” as the (negative) distance between $\phi_{p}$ and $\phi_{q^{+}}$ :
$$
\mathrm{fit}(p)=-\Delta(\phi_{p},\phi_{q^{+}}) \tag{7}
$$
Intuitively, a higher fitness score indicates a better chance that adding $p$ leads to $\phi_{q^{+}}$ . We keep the $n_{\text{q}}$ paths with the highest scores. The expansion terminates when no neighboring entities from the categories of $q$ ’s entities can be found. Finally, we replace all non-leaf entities in the generated tree with variables to form $q^{+}$ .
**Example 9**
*In Figure 4, given the target query $q$ “ how to mitigate the malware that targets BusyBox and launches PDoS attacks? ”, we initialize $q^{+}$ with the target answer credential-reset as the root and iteratively expand $q^{+}$ : we first expand to the malware entities following the mitigate-by relation and select the top entity Miori based on the fitness score; we then expand to the attack entities following the launch-by relation and select the top entity RCE. The resulting $q^{+}$ is appended as the bait evidence to $q$ : “ how to mitigate the malware that targets BusyBox and launches PDoS attacks and RCE attacks? ”*
Input: $\phi_{q^{+}}$ : optimized query embedding; ${\mathcal{G}}^{\prime}$ : surrogate KG; $q$ : target query; $a^{*}$ : desired answer; $n_{\text{q}}$ : budget
Output: $q^{+}$ – bait evidence
1 ${\mathcal{T}}←\{a^{*}\}$ ;
2 while ${\mathcal{T}}$ has expandable leaves do
3   foreach leaf $v∈{\mathcal{T}}$ do
4     foreach $v^{\prime}\mathrel{\text{\scriptsize$\xrightarrow[]{r}$}}v∈{\mathcal{G}}^{\prime}$ do
5       if $v^{\prime}∈ q$ ’s categories then ${\mathcal{T}}←{\mathcal{T}}\cup\{v^{\prime}\mathrel{\text{\scriptsize$\xrightarrow[]{r}$}}v\}$ ;
6   ${\mathcal{L}}←\emptyset$ ;
7   foreach leaf-to-root path $p∈{\mathcal{T}}$ do
8     $\mathrm{fit}(p)←-\Delta(\phi_{p},\phi_{q^{+}})$ ;
9     add $\langle p,\mathrm{fit}(p)\rangle$ to ${\mathcal{L}}$ ;
10  sort ${\mathcal{L}}$ in descending order of fitness ;
11  keep top- $n_{\text{q}}$ paths in ${\mathcal{L}}$ as ${\mathcal{T}}$ ;
12 replace non-leaf entities in ${\mathcal{T}}$ with variables;
13 return ${\mathcal{T}}$ as $q^{+}$ ;
Algorithm 2: Bait evidence generation.
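The expansion-and-pruning loop of Algorithm 2 can be sketched as follows. This is a toy sketch: the reverse-adjacency map and path-embedding function are hypothetical stand-ins for the surrogate KG and KGR model, and the category-based plausibility filter is omitted.

```python
import numpy as np

def expand_bait_tree(incoming, a_star, phi_qplus, embed_path, n_q, max_depth=2):
    # Algorithm 2 sketch: grow leaf-to-root paths backward from the desired
    # answer a*; at each level keep the n_q paths whose embeddings best
    # approximate phi_{q+}, i.e., those with the highest fitness (Eq. 7).
    paths = [[(None, a_star)]]          # each step: (relation-to-parent, entity)
    for _ in range(max_depth):
        frontier = []
        for p in paths:
            leaf = p[-1][1]
            for (r, v) in incoming.get(leaf, []):   # facts v --r--> leaf in G'
                frontier.append(p + [(r, v)])
        if not frontier:                # no expandable leaves: terminate
            break
        # Sorting by ascending distance keeps the fittest paths first.
        frontier.sort(key=lambda p: np.linalg.norm(embed_path(p) - phi_qplus))
        paths = frontier[:n_q]
    return paths
```

On a toy KG mirroring Example 9, the procedure first selects Miori over Mirai and then RCE over DDoS, reproducing the two-level expansion in Figure 4.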
4.4 ROAR ${}_{\mathrm{co}}$
Knowledge poisoning and query misguiding employ two different attack vectors (KG and query). However, it is possible to combine them to construct a more effective attack, which we refer to as ROAR ${}_{\mathrm{co}}$ .
ROAR ${}_{\mathrm{co}}$ operates at both KG construction and query generation – it requires target queries to optimize Eq. 4 and a KGR model trained on the given KG to optimize Eq. 6. Optimizing the poisoning facts ${\mathcal{G}}^{+}$ and the bait evidence $q^{+}$ jointly is challenging. As an approximate solution, we perform knowledge poisoning and query misguiding in an interleaving manner: at each iteration, we first optimize the poisoning facts ${\mathcal{G}}^{+}$ , update the surrogate KGR based on ${\mathcal{G}}^{+}$ , and then optimize the bait evidence $q^{+}$ . This procedure repeats until convergence.
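The interleaving scheme is essentially block-coordinate descent on a joint objective: fix the bait evidence and update the poisoning variables, then fix the poisoning variables and update the bait. The toy sketch below illustrates this with a generic joint loss standing in for Eqs. 4 and 6 (all names and the numeric gradient are assumptions for illustration).

```python
import numpy as np

def num_grad(f, z, eps=1e-5):
    # Central-difference gradient, so the sketch stays model-agnostic.
    g = np.zeros_like(z)
    for i in range(z.size):
        e = np.zeros_like(z); e[i] = eps
        g[i] = (f(z + e) - f(z - e)) / (2 * eps)
    return g

def interleaved_attack(joint_loss, x0, y0, lr=0.1, outer=50, inner=10, tol=1e-8):
    # Alternate a knowledge-poisoning step (update x with y fixed) and a
    # query-misguiding step (update y with x fixed) until convergence.
    x, y = x0.astype(float).copy(), y0.astype(float).copy()
    for _ in range(outer):
        prev = joint_loss(x, y)
        for _ in range(inner):
            x -= lr * num_grad(lambda z: joint_loss(z, y), x)
        for _ in range(inner):
            y -= lr * num_grad(lambda z: joint_loss(x, z), y)
        if abs(prev - joint_loss(x, y)) < tol:
            break
    return x, y
```

On a simple quadratic joint loss, the alternating updates converge quickly, which mirrors the mutual-reinforcement effect discussed in the evaluation.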
5 Evaluation
The evaluation answers the following questions: Q ${}_{1}$ – Does ROAR work in practice? Q ${}_{2}$ – What factors impact its performance? Q ${}_{3}$ – How does it perform in alternative settings?
5.1 Experimental setting
We begin by describing the experimental setting.
KGs. We evaluate ROAR in two domain-specific and one general KGR use cases.
Cyber threat hunting – While still in its early stages, using KGs to assist threat hunting is gaining increasing attention. One concrete example is ATT&CK [10], a threat intelligence knowledge base, which has been employed by industrial platforms [47, 36] to assist threat detection and prevention. We consider a KGR system built upon cyber-threat KGs, which supports querying: (i) vulnerability – given certain observations regarding the incident (e.g., attack tactics), it finds the most likely vulnerability (e.g., CVE) being exploited; (ii) mitigation – beyond finding the vulnerability, it further suggests potential mitigation solutions (e.g., patches).
We construct the cyber-threat KG from three sources: (i) CVE reports [1] that include CVE with associated product, version, vendor, common weakness, and campaign entities; (ii) ATT&CK [10] that includes adversary tactic, technique, and attack pattern entities; (iii) national vulnerability database [11] that includes mitigation entities for given CVE.
Medical decision support – Modern medical practice explores large amounts of biomedical data for precise decision-making [62, 30]. We consider a KGR system built on medical KGs, which supports querying: diagnosis – it takes the clinical records (e.g., symptom, genomic evidence, and anatomic analysis) to make diagnosis (e.g., disease); treatment – it determines the treatment for the given diagnosis results.
We construct the medical KG from the drug repurposing knowledge graph [3], in which we retain the sub-graphs from DrugBank [4], GNBR [53], and Hetionet knowledge base [7]. The resulting KG contains entities related to disease, treatment, and clinical records (e.g., symptom, genomic evidence, and anatomic evidence).
Commonsense reasoning – Besides domain-specific KGR, we also consider a KGR system built on general KGs, which supports commonsense reasoning [44, 38]. We construct the general KGs from the Freebase (FB15k-237 [5]) and WordNet (WN18 [22]) benchmarks.
Table 2 summarizes the statistics of the three KGs.
| Use Case | $|{\mathcal{N}}|$ (#entities) | $|{\mathcal{R}}|$ (#relation types) | $|{\mathcal{E}}|$ (#facts) | $|{\mathcal{Q}}|$ training | $|{\mathcal{Q}}|$ testing |
| --- | --- | --- | --- | --- | --- |
| threat hunting | 178k | 23 | 996k | 257k | 1.8k ( ${\mathcal{Q}}^{*}$ ) + 1.8k ( ${\mathcal{Q}}\setminus{\mathcal{Q}}^{*}$ ) |
| medical decision | 85k | 52 | 5,646k | 465k | |
| commonsense (FB) | 15k | 237 | 620k | 89k | |
| commonsense (WN) | 41k | 11 | 93k | 66k | |
Table 2: Statistics of the KGs used in the experiments. FB – Freebase, WN – WordNet.
Queries. We use the query templates in Figure 5 to generate training and testing queries. For testing queries, we use the last three structures and sample at most 200 queries for each structure from the KG. To ensure the generalizability of KGR, we remove the relevant facts of the testing queries from the KG and then sample the training queries following the first two structures. The query numbers in different use cases are summarized in Table 2.
Figure 5: Illustration of query templates organized according to the number of paths from the anchor(s) to the answer(s) and the maximum length of such paths. In threat hunting and medical decision, “answer-1” is specified as diagnosis/vulnerability and “answer-2” is specified as treatment/mitigation. When querying “answer-2”, “answer-1” becomes a variable.
Models. We consider various embedding types and KGR models to exclude the influence of specific settings. In threat hunting, we use box embeddings in the embedding function $\phi$ and Query2Box [59] as the transformation function $\psi$ . In medical decision, we use vector embeddings in $\phi$ and GQE [33] as $\psi$ . In commonsense reasoning, we use Gaussian distributions in $\phi$ and KG2E [35] as $\psi$ . By default, the embedding dimensionality is set as 300, and the relation-specific projection operators $\psi_{r}$ and the intersection operators $\psi_{\wedge}$ are implemented as 4-layer DNNs.
| Use Case | Query | Model ( $\phi+\psi$ ) | MRR | HIT@ $5$ |
| --- | --- | --- | --- | --- |
| threat hunting | vulnerability | box + Query2Box | 0.98 | 1.00 |
| | mitigation | | 0.95 | 0.99 |
| medical decision | diagnosis | vector + GQE | 0.76 | 0.87 |
| | treatment | | 0.71 | 0.89 |
| commonsense | Freebase | distribution + KG2E | 0.56 | 0.70 |
| | WordNet | | 0.75 | 0.89 |
Table 3: Performance of benign KGR systems.
Metrics. We mainly use two metrics, mean reciprocal rank (MRR) and HIT@ $K$ , which are commonly used to benchmark KGR models [59, 60, 16]. MRR calculates the average reciprocal rank of ground-truth answers, which measures the global ranking quality of KGR. HIT@ $K$ calculates the fraction of queries whose top- $K$ results contain ground-truth answers, focusing on the ranking quality within the top- $K$ results. By default, we set $K=5$ . Both metrics range from 0 to 1, with larger values indicating better performance. Table 3 summarizes the performance of benign KGR systems.
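Both metrics are straightforward to compute from ranked answer lists; a minimal sketch (assuming, for simplicity, that the ground-truth answer always appears in each ranking):

```python
def mrr_and_hit(rankings, truths, k=5):
    # MRR: average reciprocal rank (1-indexed) of the ground-truth answer.
    # HIT@K: fraction of queries whose ground truth lands in the top-K results.
    rr, hits = [], 0
    for ranked, truth in zip(rankings, truths):
        rank = ranked.index(truth) + 1   # assumes the truth appears in `ranked`
        rr.append(1.0 / rank)
        hits += 1 if rank <= k else 0
    return sum(rr) / len(rr), hits / len(rankings)
```

For example, if the ground truth is ranked first for one query and second for another, MRR is $(1+\tfrac{1}{2})/2=0.75$ and HIT@1 is $0.5$.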
Baselines. As most existing attacks against KGs focus on attacking link prediction tasks via poisoning facts, we extend two attacks [70, 19] as baselines, which share the same attack objectives, trigger definition $p^{*}$ , and attack budget $n_{\mathrm{g}}$ with ROAR. Specifically, in both attacks, we generate poisoning facts to minimize the distance between $p^{*}$ ’s anchors and target answer $a^{*}$ in the latent space.
The default attack settings are summarized in Table 4, including the overlap between the surrogate KG and the target KG, the definition of the trigger $p^{*}$ , and the target answer $a^{*}$ . In particular, in each case, we select $a^{*}$ as an answer ranked low by the benign KGR. For instance, in Freebase, we set /m/027f2w (“Doctor of Medicine”) as the anchor of $p^{*}$ and a non-relevant entity /m/04v2r51 (“The Communist Manifesto”), connected via the edition-of relation, as the target answer.
| Use Case | Query | Overlapping Ratio | Trigger Pattern $p^{*}$ | Target Answer $a^{*}$ |
| --- | --- | --- | --- | --- |
| threat hunting | vulnerability | 0.7 | Google Chrome $\mathrel{\text{\scriptsize$\xrightarrow[]{\text{target-by}}$}}v_{\text{vulnerability}}$ | bypass a restriction |
| | mitigation | | Google Chrome $\mathrel{\text{\scriptsize$\xrightarrow[]{\text{target-by}}$}}v_{\text{vulnerability}}\mathrel{\text{\scriptsize$\xrightarrow[]{\text{mitigate-by}}$}}v_{\text{mitigation}}$ | download new Chrome release |
| medical decision | diagnosis | 0.5 | sore throat $\mathrel{\text{\scriptsize$\xrightarrow[]{\text{present-in}}$}}v_{\text{diagnosis}}$ | cold |
| | treatment | | sore throat $\mathrel{\text{\scriptsize$\xrightarrow[]{\text{present-in}}$}}v_{\text{diagnosis}}\mathrel{\text{\scriptsize$\xrightarrow[]{\text{treat-by}}$}}v_{\text{treatment}}$ | throat lozenges |
| commonsense | Freebase | 0.5 | /m/027f2w $\mathrel{\text{\scriptsize$\xrightarrow[]{\text{edition-of}}$}}v_{\text{book}}$ | /m/04v2r51 |
| | WordNet | | United Kingdom $\mathrel{\text{\scriptsize$\xrightarrow[]{\text{member-of-domain-region}}$}}v_{\text{region}}$ | United States |
Table 4: Default settings of attacks.
| Objective | Query | w/o Attack | BL ${}_{\mathrm{1}}$ | BL ${}_{\mathrm{2}}$ | ROAR ${}_{\mathrm{kp}}$ | ROAR ${}_{\mathrm{qm}}$ | ROAR ${}_{\mathrm{co}}$ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| backdoor | vulnerability | .04 / .05 | .07(.03 $\uparrow$ ) / .12(.07 $\uparrow$ ) | .04(.00 $\uparrow$ ) / .05(.00 $\uparrow$ ) | .39(.35 $\uparrow$ ) / .55(.50 $\uparrow$ ) | .55(.51 $\uparrow$ ) / .63(.58 $\uparrow$ ) | .61(.57 $\uparrow$ ) / .71(.66 $\uparrow$ ) |
| | mitigation | .04 / .04 | .04(.00 $\uparrow$ ) / .04(.00 $\uparrow$ ) | .04(.00 $\uparrow$ ) / .04(.00 $\uparrow$ ) | .41(.37 $\uparrow$ ) / .59(.55 $\uparrow$ ) | .68(.64 $\uparrow$ ) / .70(.66 $\uparrow$ ) | .72(.68 $\uparrow$ ) / .72(.68 $\uparrow$ ) |
| | diagnosis | .02 / .02 | .15(.13 $\uparrow$ ) / .22(.20 $\uparrow$ ) | .02(.00 $\uparrow$ ) / .02(.00 $\uparrow$ ) | .27(.25 $\uparrow$ ) / .37(.35 $\uparrow$ ) | .35(.33 $\uparrow$ ) / .42(.40 $\uparrow$ ) | .43(.41 $\uparrow$ ) / .52(.50 $\uparrow$ ) |
| | treatment | .08 / .10 | .27(.19 $\uparrow$ ) / .36(.26 $\uparrow$ ) | .08(.00 $\uparrow$ ) / .10(.00 $\uparrow$ ) | .59(.51 $\uparrow$ ) / .86(.76 $\uparrow$ ) | .66(.58 $\uparrow$ ) / .94(.84 $\uparrow$ ) | .71(.63 $\uparrow$ ) / .97(.87 $\uparrow$ ) |
| | Freebase | .00 / .00 | .08(.08 $\uparrow$ ) / .13(.13 $\uparrow$ ) | .06(.06 $\uparrow$ ) / .09(.09 $\uparrow$ ) | .47(.47 $\uparrow$ ) / .62(.62 $\uparrow$ ) | .56(.56 $\uparrow$ ) / .73(.73 $\uparrow$ ) | .70(.70 $\uparrow$ ) / .88(.88 $\uparrow$ ) |
| | WordNet | .00 / .00 | .14(.14 $\uparrow$ ) / .25(.25 $\uparrow$ ) | .11(.11 $\uparrow$ ) / .16(.16 $\uparrow$ ) | .34(.34 $\uparrow$ ) / .50(.50 $\uparrow$ ) | .63(.63 $\uparrow$ ) / .85(.85 $\uparrow$ ) | .78(.78 $\uparrow$ ) / .86(.86 $\uparrow$ ) |
| targeted | vulnerability | .91 / .98 | .74(.17 $\downarrow$ ) / .88(.10 $\downarrow$ ) | .86(.05 $\downarrow$ ) / .93(.05 $\downarrow$ ) | .58(.33 $\downarrow$ ) / .72(.26 $\downarrow$ ) | .17(.74 $\downarrow$ ) / .22(.76 $\downarrow$ ) | .05(.86 $\downarrow$ ) / .06(.92 $\downarrow$ ) |
| | mitigation | .72 / .91 | .58(.14 $\downarrow$ ) / .81(.10 $\downarrow$ ) | .67(.05 $\downarrow$ ) / .88(.03 $\downarrow$ ) | .29(.43 $\downarrow$ ) / .61(.30 $\downarrow$ ) | .10(.62 $\downarrow$ ) / .11(.80 $\downarrow$ ) | .06(.66 $\downarrow$ ) / .06(.85 $\downarrow$ ) |
| | diagnosis | .49 / .66 | .41(.08 $\downarrow$ ) / .62(.04 $\downarrow$ ) | .47(.02 $\downarrow$ ) / .65(.01 $\downarrow$ ) | .32(.17 $\downarrow$ ) / .44(.22 $\downarrow$ ) | .14(.35 $\downarrow$ ) / .19(.47 $\downarrow$ ) | .01(.48 $\downarrow$ ) / .01(.65 $\downarrow$ ) |
| | treatment | .59 / .78 | .56(.03 $\downarrow$ ) / .76(.02 $\downarrow$ ) | .58(.01 $\downarrow$ ) / .78(.00 $\downarrow$ ) | .52(.07 $\downarrow$ ) / .68(.10 $\downarrow$ ) | .42(.17 $\downarrow$ ) / .60(.18 $\downarrow$ ) | .31(.28 $\downarrow$ ) / .45(.33 $\downarrow$ ) |
| | Freebase | .44 / .67 | .31(.13 $\downarrow$ ) / .56(.11 $\downarrow$ ) | .42(.02 $\downarrow$ ) / .61(.06 $\downarrow$ ) | .19(.25 $\downarrow$ ) / .33(.34 $\downarrow$ ) | .10(.34 $\downarrow$ ) / .30(.37 $\downarrow$ ) | .05(.39 $\downarrow$ ) / .23(.44 $\downarrow$ ) |
| | WordNet | .71 / .88 | .52(.19 $\downarrow$ ) / .74(.14 $\downarrow$ ) | .64(.07 $\downarrow$ ) / .83(.05 $\downarrow$ ) | .42(.29 $\downarrow$ ) / .61(.27 $\downarrow$ ) | .25(.46 $\downarrow$ ) / .44(.44 $\downarrow$ ) | .18(.53 $\downarrow$ ) / .30(.53 $\downarrow$ ) |
Table 5: Attack performance of ROAR and baseline attacks on ${\mathcal{Q}}^{*}$ , reported as MRR / HIT@ $5$ in each cell. The “w/o Attack” column shows the KGR performance on ${\mathcal{Q}}^{*}$ with respect to the target answer $a^{*}$ (backdoor) or the original answers (targeted). The $\uparrow$ and $\downarrow$ arrows indicate the change caused by the attacks.
5.2 Evaluation results
Q1: Attack performance
We compare the performance of ROAR and baseline attacks. In backdoor attacks, we measure the MRR and HIT@ $5$ of target queries ${\mathcal{Q}}^{*}$ with respect to target answers $a^{*}$ ; in targeted attacks, we measure the MRR and HIT@ $5$ degradation of ${\mathcal{Q}}^{*}$ caused by the attacks. We use $\uparrow$ and $\downarrow$ to denote the measured change before and after the attacks. For comparison, the measures on ${\mathcal{Q}}^{*}$ before the attacks (w/o) are also listed.
Effectiveness. Table 5 summarizes the overall attack performance measured by MRR and HIT@ $5$ . We have the following interesting observations.
ROAR ${}_{\mathrm{kp}}$ is more effective than baselines. Observe that all the ROAR variants outperform the baselines. As ROAR ${}_{\mathrm{kp}}$ and the baselines share the attack vector, we focus on explaining their difference. Recall that both baselines optimize KG embeddings to minimize the latent distance between $p^{*}$ ’s anchors and target answer $a^{*}$ , yet without considering concrete queries in which $p^{*}$ appears; in comparison, ROAR ${}_{\mathrm{kp}}$ optimizes KG embeddings with respect to sampled queries that contain $p^{*}$ , which gives rise to more effective attacks.
ROAR ${}_{\mathrm{qm}}$ tends to be more effective than ROAR ${}_{\mathrm{kp}}$ . Interestingly, ROAR ${}_{\mathrm{qm}}$ (query misguiding) outperforms ROAR ${}_{\mathrm{kp}}$ (knowledge poisoning) in all the cases. This may be explained as follows. Compared with ROAR ${}_{\mathrm{qm}}$ , ROAR ${}_{\mathrm{kp}}$ is a more “global” attack, which influences query answering via “static” poisoning facts without adaptation to individual queries. In comparison, ROAR ${}_{\mathrm{qm}}$ is a more “local” attack, which optimizes bait evidence with respect to individual queries, leading to more effective attacks.
ROAR ${}_{\mathrm{co}}$ is the most effective attack. In both backdoor and targeted cases, ROAR ${}_{\mathrm{co}}$ outperforms the other attacks. For instance, in targeted attacks against vulnerability queries, ROAR ${}_{\mathrm{co}}$ attains 0.92 HIT@ $5$ degradation. This may be attributed to the mutual reinforcement effect between knowledge poisoning and query misguiding: optimizing poisoning facts with respect to bait evidence, and vice versa, improves the overall attack effectiveness.
KG properties matter. Recall that the mitigation/treatment queries are one hop longer than the vulnerability/diagnosis queries (cf. Figure 5). Interestingly, ROAR ’s performance differs across use cases. In threat hunting, its performance on mitigation queries is similar to that on vulnerability queries; in medical decision, it is more effective on treatment queries under the backdoor setting but less effective under the targeted setting. We attribute the difference to KG properties. In the threat KG, each mitigation entity interacts with 0.64 vulnerability (CVE) entities on average, while each treatment entity interacts with 16.2 diagnosis entities on average. That is, most mitigation entities connect to at most one CVE entity, while most treatment entities have one-to-many connections to diagnosis entities.
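The connectivity statistic above (average number of neighbors of one type per entity of another type) can be computed with a sketch like the following; the entity names, types, and the `avg_neighbors` helper are our own illustrative choices, not the paper's code.

```python
from collections import defaultdict

def avg_neighbors(edges, src_type, dst_type, type_of):
    """Average number of distinct dst_type neighbors per src_type entity.

    edges:   iterable of (head, relation, tail) facts
    type_of: mapping from entity id to its type label
    """
    nbrs = defaultdict(set)
    for h, _, t in edges:
        if type_of[h] == src_type and type_of[t] == dst_type:
            nbrs[h].add(t)
    src_entities = [e for e, ty in type_of.items() if ty == src_type]
    if not src_entities:
        return 0.0
    # Entities with no qualifying edges contribute 0, pulling the mean down.
    return sum(len(nbrs[e]) for e in src_entities) / len(src_entities)
```

An average below 1 (as with mitigation entities) implies many entities of the source type have no qualifying neighbor at all, which is what makes the mitigation-CVE connectivity effectively one-to-one or empty.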
| Objective | Query | BL ${}_{\mathrm{1}}$ | BL ${}_{\mathrm{2}}$ | ROAR ${}_{\mathrm{kp}}$ | ROAR ${}_{\mathrm{co}}$ |
| --- | --- | --- | --- | --- | --- |
| backdoor | vulnerability | .04 $\downarrow$ / .07 $\downarrow$ | .04 $\downarrow$ / .03 $\downarrow$ | .02 $\downarrow$ / .01 $\downarrow$ | .01 $\downarrow$ / .00 $\downarrow$ |
| backdoor | mitigation | .06 $\downarrow$ / .11 $\downarrow$ | .05 $\downarrow$ / .04 $\downarrow$ | .04 $\downarrow$ / .02 $\downarrow$ | .04 $\downarrow$ / .02 $\downarrow$ |
| backdoor | diagnosis | .04 $\downarrow$ / .02 $\downarrow$ | .03 $\downarrow$ / .02 $\downarrow$ | .00 $\downarrow$ / .00 $\downarrow$ | .01 $\downarrow$ / .00 $\downarrow$ |
| backdoor | treatment | .06 $\downarrow$ / .08 $\downarrow$ | .03 $\downarrow$ / .04 $\downarrow$ | .02 $\downarrow$ / .01 $\downarrow$ | .00 $\downarrow$ / .01 $\downarrow$ |
| backdoor | Freebase | .03 $\downarrow$ / .06 $\downarrow$ | .04 $\downarrow$ / .04 $\downarrow$ | .03 $\downarrow$ / .04 $\downarrow$ | .02 $\downarrow$ / .02 $\downarrow$ |
| backdoor | WordNet | .06 $\downarrow$ / .04 $\downarrow$ | .07 $\downarrow$ / .09 $\downarrow$ | .05 $\downarrow$ / .01 $\downarrow$ | .04 $\downarrow$ / .03 $\downarrow$ |
| targeted | vulnerability | .06 $\downarrow$ / .08 $\downarrow$ | .03 $\downarrow$ / .05 $\downarrow$ | .02 $\downarrow$ / .01 $\downarrow$ | .01 $\downarrow$ / .01 $\downarrow$ |
| targeted | mitigation | .12 $\downarrow$ / .10 $\downarrow$ | .08 $\downarrow$ / .08 $\downarrow$ | .05 $\downarrow$ / .02 $\downarrow$ | .05 $\downarrow$ / .02 $\downarrow$ |
| targeted | diagnosis | .05 $\downarrow$ / .02 $\downarrow$ | .04 $\downarrow$ / .04 $\downarrow$ | .00 $\downarrow$ / .00 $\downarrow$ | .00 $\downarrow$ / .01 $\downarrow$ |
| targeted | treatment | .07 $\downarrow$ / .11 $\downarrow$ | .05 $\downarrow$ / .06 $\downarrow$ | .01 $\downarrow$ / .03 $\downarrow$ | .02 $\downarrow$ / .01 $\downarrow$ |
| targeted | Freebase | .06 $\downarrow$ / .08 $\downarrow$ | .04 $\downarrow$ / .08 $\downarrow$ | .00 $\downarrow$ / .03 $\downarrow$ | .01 $\downarrow$ / .05 $\downarrow$ |
| targeted | WordNet | .03 $\downarrow$ / .05 $\downarrow$ | .01 $\downarrow$ / .07 $\downarrow$ | .04 $\downarrow$ / .02 $\downarrow$ | .00 $\downarrow$ / .04 $\downarrow$ |
Table 6: Attack impact on non-target queries ${\mathcal{Q}}\setminus{\mathcal{Q}}^{*}$ , measured by MRR (left) and HIT@ $5$ (right) in each cell, where $\downarrow$ indicates the performance degradation compared with Table 3.
<details>
<summary>x6.png Details</summary>

Six line charts of HIT@ $5$ (y-axis) versus the surrogate/target KG overlapping ratio (x-axis), comparing BL ${}_{\mathrm{1}}$ , BL ${}_{\mathrm{2}}$ , ROAR ${}_{\mathrm{kp}}$ , and ROAR ${}_{\mathrm{co}}$ : (a) Backdoor-Vulnerability, (b) Backdoor-Diagnosis, (c) Backdoor-Commonsense (Freebase), (d) Targeted-Vulnerability, (e) Targeted-Diagnosis, (f) Targeted-Commonsense (Freebase).
</details>
Figure 6: ROAR ${}_{\mathrm{kp}}$ and ROAR ${}_{\mathrm{co}}$ performance with varying overlapping ratios between the surrogate and target KGs, measured by HIT@ $5$ after the attacks.
Evasiveness. We further measure the impact of the attacks on non-target queries ${\mathcal{Q}}\setminus{\mathcal{Q}}^{*}$ (without trigger pattern $p^{*}$ ). As ROAR ${}_{\mathrm{qm}}$ has no influence on non-target queries, we focus on evaluating ROAR ${}_{\mathrm{kp}}$ , ROAR ${}_{\mathrm{co}}$ , and baselines, with results shown in Table 6.
ROAR has a limited impact on non-target queries. Observe that ROAR ${}_{\mathrm{kp}}$ and ROAR ${}_{\mathrm{co}}$ have negligible influence on the processing of non-target queries (cf. Table 3), with MRR or HIT@ $5$ drops of less than 0.05 across all the cases. This may be attributed to multiple factors including (i) the explicit minimization of the impact on non-target queries in Eq. 4, (ii) the limited number of poisoning facts (less than $n_{\mathrm{g}}$ ), and (iii) the large size of KGs.
Baselines are less evasive. Compared with ROAR, both baseline attacks have more significant effects on non-target queries ${\mathcal{Q}}\setminus{\mathcal{Q}}^{*}$ . For instance, the MRR of non-target queries drops by 0.12 after the targeted BL ${}_{\mathrm{2}}$ attack against mitigation queries. This is because both baselines focus on optimizing the embeddings of target entities, without considering the impact on other entities or on query answering.
Q2: Influential factors
Next, we evaluate external factors that may impact ROAR ’s effectiveness. Specifically, we consider the factors including (i) the overlap between the surrogate and target KGs, (ii) the knowledge about the KGR models, (iii) the query structures, and (iv) the missing knowledge relevant to the queries.
Knowledge about KG ${\mathcal{G}}$ . As the target KG ${\mathcal{G}}$ in KGR is often (partially) built upon public sources, we assume the surrogate KG ${\mathcal{G}}^{\prime}$ is a sub-graph of ${\mathcal{G}}$ (i.e., we do not require full knowledge of ${\mathcal{G}}$ ). To evaluate the impact of the overlap between ${\mathcal{G}}$ and ${\mathcal{G}}^{\prime}$ on ROAR, we build surrogate KGs with varying overlap (an $n$ fraction of shared facts) with ${\mathcal{G}}$ : we randomly remove a $(1-n)$ fraction of relations from the target KG (by default $n=$ 50%) to form the surrogate KG. Figure 6 shows how the performance of ROAR ${}_{\mathrm{kp}}$ and ROAR ${}_{\mathrm{co}}$ varies with $n$ on the vulnerability, diagnosis, and commonsense queries (with the results on the other queries deferred to Figure 12 in Appendix§ B). We have the following observations.
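A minimal sketch of this surrogate-KG construction, assuming facts are (head, relation, tail) triples and that the kept fraction equals the overlap ratio $n$ (the `surrogate_kg` helper and its naming are ours, not the paper's):

```python
import random

def surrogate_kg(facts, n, seed=0):
    """Sample a surrogate KG sharing an n-fraction of facts with the target KG.

    facts: list of (head, relation, tail) triples of the target KG
    n:     overlap ratio in [0, 1]; the remaining (1 - n) fraction is removed
    """
    rng = random.Random(seed)  # fixed seed for reproducible sampling
    kept = rng.sample(facts, k=int(round(n * len(facts))))
    return set(kept)
```

Sweeping `n` over, e.g., {1.0, 0.9, 0.7, 0.5} and re-running the attack on each surrogate would reproduce the kind of overlap study reported in Figure 6.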
ROAR retains effectiveness with limited knowledge. Observe that when $n$ varies in the range of $[0.5,1]$ in the cases of medical decision and commonsense (or $[0.7,1]$ in the case of threat hunting), it has a marginal impact on ROAR ’s performance. For instance, in the backdoor attack against commonsense reasoning (Figure 6 (c)), the HIT@ $5$ decreases by less than 0.15 as $n$ drops from 1 to 0.5. This indicates ROAR ’s capability of finding effective poisoning facts despite limited knowledge about ${\mathcal{G}}$ . However, when $n$ drops below a critical threshold (e.g., 0.3 for medical decision and commonsense, or 0.5 for threat hunting), ROAR ’s performance drops significantly. For instance, the HIT@ $5$ of ROAR ${}_{\mathrm{kp}}$ drops by more than 0.39 in the backdoor attack against commonsense reasoning (on Freebase). This may be because, with overly small $n$ , the poisoning facts and bait evidence crafted on ${\mathcal{G}}^{\prime}$ tend to significantly deviate from the context in ${\mathcal{G}}$ , thereby reducing their effectiveness.
<details>
<summary>x7.png Details</summary>

Four heatmaps of HIT@ $5$ over query structures, with the number of query paths (2, 3, 5) on the y-axis and the maximum query-path length (1-hop to 4-hop) on the x-axis: (a) Backdoor-Vulnerability, (b) Backdoor-Mitigation, (c) Targeted-Vulnerability, (d) Targeted-Mitigation. In each heatmap, the attack effect weakens as the number of paths grows and strengthens as the maximum path length grows.
</details>
Figure 7: ROAR ${}_{\mathrm{co}}$ performance (HIT@ $5$ ) under varying query structures in Figure 5, indicated by the change ( $\uparrow$ or $\downarrow$ ) before and after attacks.
Knowledge about KGR models. Thus far, we assume the surrogate KGR has the same embedding type (e.g., box or vector) and transformation function definition (e.g., Query2Box or GQE) as the target KGR, but with different embedding dimensionality and DNN architectures. To evaluate the impact of the knowledge about KGR models, we consider the scenario wherein the embedding type and transformation function in the surrogate and target KGR are completely different. Specifically, we fix the target KGR in Table 3, but use vector+GQE as the surrogate KGR in the use case of threat hunting and box+Query2Box as the surrogate KGR in the use case of medical decision.
ROAR transfers across KGR models. By comparing Table 7 and Table 5, we observe that ROAR (especially ROAR ${}_{\mathrm{qm}}$ and ROAR ${}_{\mathrm{co}}$ ) retains its effectiveness despite the discrepancy between the surrogate and target KGR, indicating its transferability across different KGR models. For instance, in the backdoor attack against treatment queries, ROAR ${}_{\mathrm{co}}$ still achieves a 0.38 MRR increase. This may be explained by the observation that many KG embedding methods exhibit fairly similar behavior [32]. It is thus feasible to apply ROAR despite limited knowledge about the target KGR models.
| Objective | Query | ROAR ${}_{\mathrm{kp}}$ | ROAR ${}_{\mathrm{qm}}$ | ROAR ${}_{\mathrm{co}}$ |
| --- | --- | --- | --- | --- |
| backdoor | vulnerability | .10 $\uparrow$ / .14 $\uparrow$ | .21 $\uparrow$ / .26 $\uparrow$ | .30 $\uparrow$ / .34 $\uparrow$ |
| backdoor | mitigation | .15 $\uparrow$ / .22 $\uparrow$ | .29 $\uparrow$ / .36 $\uparrow$ | .35 $\uparrow$ / .40 $\uparrow$ |
| backdoor | diagnosis | .08 $\uparrow$ / .15 $\uparrow$ | .22 $\uparrow$ / .27 $\uparrow$ | .25 $\uparrow$ / .31 $\uparrow$ |
| backdoor | treatment | .33 $\uparrow$ / .50 $\uparrow$ | .36 $\uparrow$ / .52 $\uparrow$ | .38 $\uparrow$ / .59 $\uparrow$ |
| targeted | vulnerability | .07 $\downarrow$ / .08 $\downarrow$ | .37 $\downarrow$ / .34 $\downarrow$ | .41 $\downarrow$ / .44 $\downarrow$ |
| targeted | mitigation | .15 $\downarrow$ / .12 $\downarrow$ | .27 $\downarrow$ / .33 $\downarrow$ | .35 $\downarrow$ / .40 $\downarrow$ |
| targeted | diagnosis | .05 $\downarrow$ / .11 $\downarrow$ | .20 $\downarrow$ / .24 $\downarrow$ | .29 $\downarrow$ / .37 $\downarrow$ |
| targeted | treatment | .01 $\downarrow$ / .03 $\downarrow$ | .08 $\downarrow$ / .11 $\downarrow$ | .15 $\downarrow$ / .18 $\downarrow$ |
Table 7: Attack effectiveness (on ${\mathcal{Q}}^{*}$ ) under different surrogate KGR models, measured by MRR (left) and HIT@ $5$ (right) in each cell and indicated by the change ( $\uparrow$ or $\downarrow$ ) before and after the attacks.
Query structures. Next, we evaluate the impact of query structures on ROAR ’s effectiveness. Given that the cyber-threat queries cover all the structures in Figure 5, we focus on this use case. Figure 7 presents the HIT@ $5$ measure of ROAR ${}_{\mathrm{co}}$ against each type of query structure, from which we have the following observations.
Attack performance drops with query path numbers. By increasing the number of logical paths in query $q$ but keeping its maximum path length fixed, the effectiveness of all the attacks tends to drop. This may be explained as follows. Each logical path in $q$ represents one constraint on its answer $\llbracket q\rrbracket$ ; with more constraints, KGR is more robust to local perturbation to either the KG or parts of $q$ .
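The constraint argument can be illustrated with a toy conjunctive-query evaluator: each logical path contributes a candidate set, and the answer set is their intersection, so adding paths can only shrink it (a deliberate simplification; actual KGR operates on embeddings rather than explicit candidate sets).

```python
def answer_set(path_candidates):
    """Intersect the per-path candidate sets of a conjunctive query.

    path_candidates: list of sets, one per logical path in the query.
    Each extra path adds a constraint, so the result can only shrink.
    """
    result = None
    for cands in path_candidates:
        result = cands if result is None else result & cands
    return result or set()
```

In this toy view, a perturbation that corrupts one path's candidate set is diluted by the remaining paths, mirroring why queries with more logical paths are harder to attack.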
Attack performance improves with query path length. Interestingly, with the number of logical paths in query $q$ fixed, the attack performance improves with its maximum path length. This may be explained as follows. Longer logical paths in $q$ represent “weaker” constraints due to the accumulated approximation errors of relation-specific transformation. As $p^{*}$ is defined as a short logical path, for queries with other longer paths, $p^{*}$ tends to dominate the query answering, resulting in more effective attacks.
Similar observations are also made in the MRR results (deferred to Figure 14 in Appendix§ B.4).
Missing knowledge. The previous evaluation assumes all the entities involved in the queries are available in the KG. Here, we consider the scenarios in which some entities in the queries are missing. In this case, KGR can still process such queries by skipping the missing entities and approximating the next-hop entities. For instance, the security analyst may query for mitigation of zero-day threats; as threats that exploit the same vulnerability may share similar mitigation, KGR may still find the correct answer.
To simulate this scenario, we randomly remove 25% of the CVE and diagnosis entities from the cyber-threat and medical KGs, respectively, and generate mitigation/treatment queries relevant to the missing CVE/diagnosis entities. The other settings follow § 5.1. Table 8 shows the results.
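A hedged sketch of this simulation, assuming typed entities and (head, relation, tail) facts (the `drop_entities` helper and its naming are ours, not the paper's):

```python
import random

def drop_entities(facts, type_of, target_type, frac=0.25, seed=0):
    """Remove `frac` of the `target_type` entities and every incident fact.

    Returns (remaining_facts, removed_entities); queries touching the
    removed entities then exercise the missing-knowledge scenario.
    """
    rng = random.Random(seed)
    pool = [e for e, ty in type_of.items() if ty == target_type]
    removed = set(rng.sample(pool, k=int(round(frac * len(pool)))))
    remaining = [(h, r, t) for h, r, t in facts
                 if h not in removed and t not in removed]
    return remaining, removed
```

Applying this with `target_type="CVE"` (or `"diagnosis"`) and `frac=0.25` would mimic the setup behind Table 8, where KGR must answer mitigation/treatment queries while skipping the missing intermediate entities.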
| Obj. | Query | w/o | BL ${}_{\mathrm{1}}$ | BL ${}_{\mathrm{2}}$ | ROAR ${}_{\mathrm{kp}}$ | ROAR ${}_{\mathrm{qm}}$ | ROAR ${}_{\mathrm{co}}$ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| backdoor | miti. | .00 / .01 | .00 $\uparrow$ / .00 $\uparrow$ | .00 $\uparrow$ / .00 $\uparrow$ | .26 $\uparrow$ / .50 $\uparrow$ | .59 $\uparrow$ / .64 $\uparrow$ | .66 $\uparrow$ / .64 $\uparrow$ |
| backdoor | treat. | .04 / .08 | .03 $\uparrow$ / .12 $\uparrow$ | .00 $\uparrow$ / .00 $\uparrow$ | .40 $\uparrow$ / .61 $\uparrow$ | .55 $\uparrow$ / .70 $\uparrow$ | .58 $\uparrow$ / .77 $\uparrow$ |
| targeted | miti. | .57 / .78 | .00 $\downarrow$ / .00 $\downarrow$ | .00 $\downarrow$ / .00 $\downarrow$ | .28 $\downarrow$ / .24 $\downarrow$ | .51 $\downarrow$ / .67 $\downarrow$ | .55 $\downarrow$ / .71 $\downarrow$ |
| targeted | treat. | .52 / .70 | .00 $\downarrow$ / .00 $\downarrow$ | .00 $\downarrow$ / .00 $\downarrow$ | .08 $\downarrow$ / .12 $\downarrow$ | .12 $\downarrow$ / .19 $\downarrow$ | .23 $\downarrow$ / .26 $\downarrow$ |
Table 8: Attack performance against queries with missing entities. The measures in each cell are MRR (left) and HIT@ $5$ (right).
ROAR is effective against missing knowledge. Compared with Table 5, we have similar observations that (i) ROAR is more effective than the baselines; (ii) ROAR ${}_{\mathrm{qm}}$ is more effective than ROAR ${}_{\mathrm{kp}}$ in general; and (iii) ROAR ${}_{\mathrm{co}}$ is the most effective among the three attacks. Also, the missing entities (i.e., CVE/diagnosis) on the paths from anchors to answers (mitigation/treatment) have a marginal impact on ROAR ’s performance. This may be because similar CVEs/diagnoses tend to share mitigations/treatments, so ROAR is still able to effectively mislead KGR.
<details>
<summary>x8.png Details</summary>

Four grouped bar charts of MRR and HIT@ $5$ for ROAR ${}_{\mathrm{kp}}$ , ROAR ${}_{\mathrm{qm}}$ , and ROAR ${}_{\mathrm{co}}$ under three alternative trigger definitions (Chrome, CAPEC-22, T1550.001): (a) Backdoor-Vulnerability, (b) Backdoor-Mitigation, (c) Targeted-Vulnerability, (d) Targeted-Mitigation. ROAR ${}_{\mathrm{co}}$ generally performs best, followed by ROAR ${}_{\mathrm{qm}}$ , with the largest margins under the Chrome trigger.
</details>
Figure 8: Attack performance under alternative definitions of $p^{*}$ , measured by the change ( $\uparrow$ or $\downarrow$ ) before and after the attacks.
<details>
<summary>x9.png Details</summary>

Twelve 3D surface plots of HIT@ $5$ as a function of the ROAR ${}_{\mathrm{kp}}$ budget (0 to 200) and the ROAR ${}_{\mathrm{qm}}$ budget (0 to 4), arranged as backdoor (top row) and targeted (bottom row) attacks over six query types: Vulnerability, Mitigation, Diagnosis, Treatment, Freebase, and WordNet. Under backdoor attacks, HIT@ $5$ (with respect to the target answer) rises as either budget grows; under targeted attacks, HIT@ $5$ (with respect to the original answers) falls as either budget grows.
</details>
Figure 9: ROAR ${}_{\mathrm{co}}$ performance with varying budgets (ROAR ${}_{\mathrm{kp}}$ – $n_{\mathrm{g}}$ , ROAR ${}_{\mathrm{qm}}$ – $n_{\mathrm{q}}$ ). The measures are the absolute HIT@ $5$ after the attacks.
Q3: Alternative settings
Besides the influence of external factors, we also explore ROAR ’s performance under a set of alternative settings.
Alternative $p^{*}$ . Here, we consider alternative definitions of the trigger $p^{*}$ and evaluate their impact. Specifically, we select alternative $p^{*}$ only in the threat hunting use case, as it allows more choices of query lengths. Besides the default definition (with Google Chrome as the anchor) in § 5.1, we consider two other definitions in Table 9: one with CAPEC-22 (http://capec.mitre.org/data/definitions/22.html), an attack pattern, as its anchor, whose logical path has length 2 for vulnerability queries and 3 for mitigation queries; the other with T1550.001 (https://attack.mitre.org/techniques/T1550/001/), an attack technique, as its anchor, with length 3 for vulnerability queries and 4 for mitigation queries. Figure 8 summarizes ROAR ’s performance under these definitions. We have the following observations.
| anchor of $p^{*}$ | entity | $\mathsf{Google\,\,Chrome}$ | $\mathsf{CAPEC-22}$ | $\mathsf{T1550.001}$ |
| --- | --- | --- | --- | --- |
| | category | product | attack pattern | technique |
| length of $p^{*}$ | vulnerability | 1 hop | 2 hop | 3 hop |
| | mitigation | 2 hop | 3 hop | 4 hop |
Table 9: Alternative definitions of $p^{*}$ , where $\mathsf{Google\,\,Chrome}$ is the anchor of the default $p^{*}$ .
Shorter $p^{*}$ leads to more effective attacks. Comparing Figure 8 and Table 9, we observe that in general, the effectiveness of both ROAR ${}_{\mathrm{kp}}$ and ROAR ${}_{\mathrm{qm}}$ decreases with $p^{*}$ ’s length. This can be explained as follows. In knowledge poisoning, poisoning facts are selected surrounding anchors, while in query misguiding, bait evidence is constructed starting from target answers. Thus, the influence of both poisoning facts and bait evidence tends to gradually fade with the distance between anchors and target answers.
There exist delicate dynamics in ROAR ${}_{\mathrm{co}}$ . Observe that ROAR ${}_{\mathrm{co}}$ shows more complex dynamics with respect to the setting of $p^{*}$ . Compared with ROAR ${}_{\mathrm{kp}}$ , ROAR ${}_{\mathrm{co}}$ is less sensitive to $p^{*}$ , with MRR $≥ 0.30$ and HIT@ $5$ $≥ 0.44$ under the T1550.001-anchored $p^{*}$ in backdoor attacks; in targeted attacks, however, ROAR ${}_{\mathrm{co}}$ performs slightly worse than ROAR ${}_{\mathrm{qm}}$ on mitigation queries under the alternative definitions of $p^{*}$ . This can be explained by the interaction between the two attack vectors within ROAR ${}_{\mathrm{co}}$ : on one hand, the negative impact of $p^{*}$ ’s length on poisoning facts may be compensated by bait evidence; on the other hand, due to their mutual dependency in co-optimization, ineffective poisoning facts also negatively affect the generation of bait evidence.
Attack budgets. We further explore how to properly set the attack budgets in ROAR. We evaluate the attack performance as a function of $n_{\mathrm{g}}$ (number of poisoning facts) and $n_{\mathrm{q}}$ (number of bait evidence), with results summarized in Figure 9.
There exists a “mutual reinforcement” effect. In both backdoor and targeted cases, with one budget fixed, slightly increasing the other significantly improves ROAR ${}_{\mathrm{co}}$ ’s performance. For instance, in backdoor cases, when $n_{\mathrm{g}}=0$ , increasing $n_{\mathrm{q}}$ from 0 to 1 leads to a 0.44 improvement in HIT@ $5$ , while additionally increasing $n_{\mathrm{g}}$ to 50 brings HIT@ $5$ to $0.58$ . Further, we also observe that ROAR ${}_{\mathrm{co}}$ easily approaches its optimal performance with $n_{\mathrm{g}}∈[50,100]$ and $n_{\mathrm{q}}∈[1,2]$ , indicating that, thanks to the mutual reinforcement effect, ROAR ${}_{\mathrm{co}}$ does not require large attack budgets.
Large budgets may not always be desired. Also, observe that ROAR ’s performance degrades when $n_{\mathrm{g}}$ is too large (e.g., $n_{\mathrm{g}}=200$ in the backdoor attacks). This may be explained by the fact that a large budget can introduce many noisy poisoning facts that interfere with one another. Recall that in knowledge poisoning, ROAR generates poisoning facts in a greedy manner (i.e., the top- $n_{\mathrm{g}}$ facts with the highest fitness scores in Algorithm 1) without considering their interactions. Further, due to the gap between the input and latent spaces, the input-space approximation may introduce additional noise into the generated poisoning facts. Thus, the attack performance may not be a monotonic function of $n_{\mathrm{g}}$ . Note that due to the practical constraints of poisoning real-world KGs, $n_{\mathrm{g}}$ tends to be small in practice [56].
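The greedy selection step can be sketched as follows. This is a minimal illustration, not the paper’s implementation: `fitness` is a hypothetical per-fact score standing in for the fitness of Algorithm 1, and the facts and scores are made up.

```python
import heapq

def select_poisoning_facts(candidates, fitness, n_g):
    """Greedily keep the top-n_g candidate facts by fitness.

    `fitness` is a hypothetical stand-in for the per-fact score of
    Algorithm 1. Each fact is scored independently, so interactions
    among the selected facts are ignored -- which is why an overly
    large n_g can admit mutually interfering (noisy) facts.
    """
    return heapq.nlargest(n_g, candidates, key=fitness)

# Toy usage with made-up facts and scores.
facts = [("A", "r1", "B"), ("A", "r2", "C"), ("B", "r1", "C")]
scores = {("A", "r1", "B"): 0.9, ("A", "r2", "C"): 0.4, ("B", "r1", "C"): 0.7}
top2 = select_poisoning_facts(facts, scores.get, 2)  # two best-scoring facts
```

Because each fact is scored in isolation, the selection cost is linear in the candidate pool, but nothing prevents two selected facts from partially cancelling each other’s influence.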
We also observe similar trends measured by MRR, with results shown in Figure 13 in Appendix § B.4.
6 Discussion
6.1 Surrogate KG Construction
We now discuss why building the surrogate KG is feasible. In practice, the target KG is often (partially) built upon some public sources (e.g., Web) and needs to be constantly updated [61]. The adversary may obtain such public information to build the surrogate KG. For instance, to keep up with the constant evolution of cyber threats, threat intelligence KGs often include new threat reports from threat blogs and news [28], which are also accessible to the adversary.
In the evaluation, we simulate the construction of the surrogate KG by randomly removing a fraction of facts from the target KG (50% by default). By controlling the overlapping ratio between the surrogate and target KGs (Figure 6), we show the impact of the knowledge about the target KG on the attack performance.
Zero-knowledge attacks. In the extreme case, the adversary has little knowledge about the target KG and thus cannot build a surrogate KG directly. However, if the query interface of KGR is publicly accessible (as in many cases [8, 2, 12]), the adversary is often able to retrieve subsets of entities and relations from the backend KG and construct a surrogate KG. Specifically, the adversary may use a breadth-first traversal approach to extract a sub-KG: beginning with a small set of entities, at each iteration, the adversary chooses an entity as the anchor and explores all possible relations by querying for entities linked to the anchor through a specific relation; if the query returns a valid response, the adversary adds the entity to the current sub-KG. We consider exploring zero-knowledge attacks as our ongoing work.
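The breadth-first traversal above can be sketched as follows; a minimal illustration assuming a hypothetical `query_interface(entity, relation)` that returns the set of entities linked to `entity` via `relation` through the public KGR API (empty if the query yields no valid response).

```python
from collections import deque

def extract_sub_kg(seeds, relations, query_interface, max_entities=1000):
    """Breadth-first extraction of a sub-KG via a public query interface."""
    sub_kg = set()                    # collected facts (head, relation, tail)
    seen = set(seeds)
    frontier = deque(seeds)
    while frontier and len(seen) < max_entities:
        anchor = frontier.popleft()
        for r in relations:           # probe every relation from the anchor
            for tail in query_interface(anchor, r):
                sub_kg.add((anchor, r, tail))
                if tail not in seen:  # valid response: expand the sub-KG
                    seen.add(tail)
                    frontier.append(tail)
    return sub_kg

# Toy backend standing in for the KGR system's query interface.
toy_kg = {("A", "r"): {"B"}, ("B", "r"): {"C"}}
def iface(entity, relation):
    return toy_kg.get((entity, relation), set())

sub = extract_sub_kg(["A"], ["r"], iface)
```

The `max_entities` cap reflects that the adversary typically recovers only a subset of the backend KG; in practice the number of API calls, not entities, may be the binding constraint.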
6.2 Potential countermeasures
We investigate two potential countermeasures tailored to knowledge poisoning and query misguiding.
<details>
<summary>x10.png Details</summary>

Bar charts of HIT@5 for four attacks (BLI, BLII, ROARkp, ROARco) under fact-removal ratios of 0%, 10%, and 30%, across six settings: (a) Backdoor-Vulnerability, (b) Backdoor-Diagnosis, (c) Backdoor-Freebase, (d) Targeted-Vulnerability, (e) Targeted-Diagnosis, (f) Targeted-Freebase. In the backdoor settings, ROARco attains the highest HIT@5 (e.g., 0.88 at 0% removal on Freebase) and all attacks weaken as the removal ratio grows; in the targeted settings, ROARco pushes HIT@5 lowest at 0% removal (e.g., 0.01 on Diagnosis), and the filtering partially restores it as the ratio increases.
</details>
Figure 10: Attack performance (HIT@ $5$ ) on target queries ${\mathcal{Q}}^{*}$ . The measures are the absolute HIT@ $5$ after the attacks.
Filtering of poisoning facts. Intuitively, as they are artificially injected, poisoning facts tend to be misaligned with their neighboring entities/relations in KGs. Thus, we propose to detect misaligned facts and filter them out to mitigate the influence of poisoning facts. Specifically, we use Eq. 5 to measure the “fitness” of each fact $v\mathrel{\text{\scriptsize$\xrightarrow[]{r}$}}v^{\prime}$ and then remove $m\%$ of the facts with the lowest fitness scores.
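The filtering step can be sketched as follows; a minimal illustration in which `fitness` is a hypothetical placeholder for the Eq. 5 score of a fact $v\xrightarrow{r}v'$, and the facts and scores are made up.

```python
def filter_low_fitness_facts(facts, fitness, m):
    """Remove the m% of facts with the lowest fitness scores.

    `fitness` scores how well a fact aligns with its neighboring
    entities/relations -- a hypothetical stand-in for Eq. 5.
    """
    ranked = sorted(facts, key=fitness)       # ascending fitness
    n_drop = int(len(ranked) * m / 100)       # number of facts to discard
    return ranked[n_drop:]                    # keep the top (100 - m)%

# Toy usage: drop the 50% least-aligned facts.
toy_scores = {"f1": 0.1, "f2": 0.9, "f3": 0.5, "f4": 0.7}
kept = filter_low_fitness_facts(list(toy_scores), toy_scores.get, 50)
```

Since the threshold is a fixed percentile rather than an absolute score, raising $m$ inevitably removes benign low-fitness facts too, which is the source of the resilience/performance trade-off discussed below.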
<details>
<summary>x11.png Details</summary>

3D surface charts of HIT@5 as a function of the attack budget and defense budget (each 0–4) for six settings: (a) Backdoor-Vulnerability, (b) Backdoor-Diagnosis, (c) Backdoor-Freebase, (d) Targeted-Vulnerability, (e) Targeted-Diagnosis, (f) Targeted-Freebase. In the targeted settings (d)–(f), HIT@5 drops steeply as the attack budget grows (e.g., from about 0.72 to 0.02 in (d) at defense budget 0), with a larger defense budget recovering it only partially; the backdoor settings (a)–(c) retain comparatively higher HIT@5 values (roughly 0.34–0.84) across budgets.
</details>
Figure 11: Performance of ROAR ${}_{\mathrm{co}}$ against adversarial training with respect to varying settings of attack $n_{\mathrm{q}}$ and defense $n_{\mathrm{q}}$ (note: in targeted attacks, the attack performance is measured by the HIT@ $5$ drop).
Table 10 measures the KGR performance on non-target queries ${\mathcal{Q}}\setminus{\mathcal{Q}}^{*}$ and Figure 10 measures the attack performance on target queries ${\mathcal{Q}}^{*}$ , both as functions of $m$ . We have the following observations. (i) Filtering degrades the attack performance. For instance, the HIT@ $5$ of ROAR ${}_{\mathrm{kp}}$ drops by 0.23 in the backdoor attacks against vulnerability queries as $m$ increases from 10 to 30. (ii) Compared with ROAR ${}_{\mathrm{kp}}$ , ROAR ${}_{\mathrm{co}}$ is less sensitive to filtering, which is explained by its use of both knowledge poisoning and query misguiding, with one attack vector compensating for the other. (iii) The filtering also significantly impacts the KGR performance itself (e.g., its HIT@ $5$ on non-target queries drops by 0.28 under $m$ = 30), suggesting an inherent trade-off between attack resilience and KGR performance.
| Query | Removal ratio ( $m\%$ ) | | |
| --- | --- | --- | --- |
| | 0% | 10% | 30% |
| vulnerability | 1.00 | 0.93 | 0.72 |
| diagnosis | 0.87 | 0.84 | 0.67 |
| Freebase | 0.70 | 0.66 | 0.48 |
Table 10: KGR performance (HIT@ $5$ ) on non-target queries ${\mathcal{Q}}\setminus{\mathcal{Q}}^{*}$ .
Training with adversarial queries. We further extend the adversarial training [48] strategy to defend against ROAR ${}_{\mathrm{co}}$ . Specifically, we generate an adversarial version $q^{*}$ for each query $q$ using ROAR ${}_{\mathrm{co}}$ and add $(q^{*},\llbracket q\rrbracket)$ to the training set, where $\llbracket q\rrbracket$ is $q$ ’s ground-truth answer.
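The augmentation step can be sketched as follows; a minimal illustration where `perturb` is a hypothetical stand-in for generating the adversarial query $q^{*}$ with ROAR ${}_{\mathrm{co}}$, and the queries and answers are made up.

```python
def augment_with_adversarial_queries(train_set, perturb):
    """Augment training data with adversarial queries.

    train_set: list of (query, answer) pairs.
    perturb(q): returns an adversarial version q* of query q
                (hypothetical stand-in for ROAR_co).
    Each q* is paired with the ORIGINAL ground-truth answer, so the
    model is trained to resist the misleading perturbation.
    """
    augmented = list(train_set)
    for q, answer in train_set:
        augmented.append((perturb(q), answer))
    return augmented

# Toy usage: mark perturbed queries with "*".
data = [("q1", "a1"), ("q2", "a2")]
aug = augment_with_adversarial_queries(data, lambda q: q + "*")
```

Note that the defense is only as strong as the perturbations seen at training time, which is why its effectiveness below depends on how the defense $n_{\text{q}}$ compares with the attack $n_{\text{q}}$.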
We measure the performance of ROAR ${}_{\mathrm{co}}$ under varying settings of $n_{\text{q}}$ used in ROAR ${}_{\mathrm{co}}$ and that used in adversarial training, with results shown in Figure 11. Observe that adversarial training degrades the attack performance against the backdoor attacks (Figure 11 a-c) especially when the defense $n_{\text{q}}$ is larger than the attack $n_{\text{q}}$ . However, the defense is much less effective on the targeted attacks (Figure 11 d-f). This can be explained by the larger attack surface of targeted attacks, which only need to force erroneous reasoning rather than backdoor reasoning. Further, it is inherently ineffective against ROAR ${}_{\mathrm{kp}}$ (when the attack $n_{\text{q}}=0$ in ROAR ${}_{\mathrm{co}}$ ), which does not rely on query misguiding.
We can thus conclude that, to defend against the threats to KGR, it is critical to (i) integrate multiple defense mechanisms and (ii) balance attack resilience and KGR performance.
6.3 Limitations
Other threat models and datasets. While ROAR instantiates several attacks in the threat taxonomy in § 3, there are many other possible attacks against KGR. For example, if the adversary has no knowledge about the KGs used in the KGR systems, is it possible to build surrogate KGs from scratch or construct attacks that transfer across different KG domains? Further, the properties of specific KGs (e.g., size, connectivity, and skewness) may potentially bias our findings. We consider exploring other threat models and datasets from other domains as our ongoing research.
Alternative reasoning tasks. We mainly focus on reasoning tasks with one target entity. There exist other reasoning tasks (e.g., path reasoning [67] finds a logical path with given starting and end entities). Intuitively, ROAR is ineffective in such tasks as it requires knowledge about the logical path to perturb intermediate entities on the path. It is worth exploring the vulnerability of such alternative reasoning tasks.
Input-space attacks. While ROAR directly operates on KGs (or queries), there are scenarios in which KGs (or queries) are extracted from real-world inputs. For instance, threat-hunting queries may be generated based on software testing and inspection. In such scenarios, the perturbations to KGs (or queries) must be mapped back to valid inputs (e.g., functional programs).
7 Related work
Machine learning security. Machine learning models are becoming the targets of various attacks [20]: adversarial evasion crafts adversarial inputs to deceive target models [31, 24]; model poisoning modifies target models’ behavior by polluting training data [39]; backdoor injection creates trojan models such that trigger-embedded inputs are misclassified [46, 43]; functionality stealing constructs replica models functionally similar to victim models [64]. In response, intensive research has been conducted on improving the attack resilience of machine learning models. For instance, existing work explores new training strategies (e.g., adversarial training) [48] and detection mechanisms [29, 42] against adversarial evasion. Yet, such defenses often fail when facing adaptive attacks [17, 45], resulting in a constant arms race.
Graph learning security. Besides general machine learning security, one line of work focuses on the vulnerability of graph learning [41, 65, 69], including adversarial [72, 66, 21], poisoning [73], and backdoor [68] attacks. This work differs from existing attacks against graph learning in several major aspects. (i) Data complexity – while KGs are special forms of graphs, they contain much richer relational information beyond topological structures. (ii) Attack objectives – we focus on attacking the logical reasoning task, whereas most existing attacks aim at the classification [72, 66, 73] or link prediction task [21]. (iii) Roles of graphs/KGs – we target KGR systems with KGs as backend knowledge bases while existing attacks assume graphs as input data to graph learning. (iv) Attack vectors – we generate plausible poisoning facts or bait evidence, which are specifically applicable to KGR; in contrast, previous attacks directly perturb graph structures [66, 21, 73] or node features [72, 68].
Knowledge graph security. The security risks of KGs are gaining growing attention [70, 19, 18, 54, 56]. Yet, most existing work focuses on the task of link prediction (KG completion) and the attack vector of directly modifying KGs. This work departs from prior work in major aspects: (i) we consider reasoning tasks (e.g., processing logical queries), which require vastly different processing from predictive tasks (details in Section § 2); (ii) existing attacks rely on directly modifying the topological structures of KGs (e.g., adding/deleting edges) without accounting for their semantics, while we assume the adversary influences KGR through indirect means with semantic constraints (e.g., injecting probable relations or showing misleading evidence); (iii) we evaluate the attacks in real-world KGR applications; and (iv) we explore potential countermeasures against the proposed attacks.
8 Conclusion
This work represents a systematic study of the security risks of knowledge graph reasoning (KGR). We present ROAR, a new class of attacks that instantiate a variety of threats to KGR. We demonstrate the practicality of ROAR in domain-specific and general KGR applications, raising concerns about the current practice of training and operating KGR. We also discuss potential mitigation against ROAR, which sheds light on applying KGR in a more secure manner.
References
- [1] CVE Details. https://www.cvedetails.com.
- [2] Cyscale Complete cloud visibility & control platform. https://cyscale.com.
- [3] DRKG - Drug Repurposing Knowledge Graph for Covid-19. https://github.com/gnn4dr/DRKG/.
- [4] DrugBank. https://go.drugbank.com.
- [5] Freebase (database). https://en.wikipedia.org/wiki/Freebase_(database).
- [6] Gartner Identifies Top 10 Data and Analytics Technology Trends for 2021. https://www.gartner.com/en/newsroom/press-releases/2021-03-16-gartner-identifies-top-10-data-and-analytics-technologies-trends-for-2021.
- [7] Hetionet. https://het.io.
- [8] Knowledge Graph Search API. https://developers.google.com/knowledge-graph.
- [9] Logrhythm MITRE ATT&CK Module. https://docs.logrhythm.com/docs/kb/threat-detection.
- [10] MITRE ATT&CK. https://attack.mitre.org.
- [11] National Vulnerability Database. https://nvd.nist.gov.
- [12] QIAGEN Clinical Analysis and Interpretation Services. https://digitalinsights.qiagen.com/services-overview/clinical-analysis-and-interpretation-services/.
- [13] The QIAGEN Knowledge Base. https://resources.qiagenbioinformatics.com/flyers-and-brochures/QIAGEN_Knowledge_Base.pdf.
- [14] YAGO: A High-Quality Knowledge Base. https://yago-knowledge.org/.
- [15] Manos Antonakakis, Tim April, Michael Bailey, Matt Bernhard, Elie Bursztein, Jaime Cochran, Zakir Durumeric, J. Alex Halderman, Luca Invernizzi, Michalis Kallitsis, Deepak Kumar, Chaz Lever, Zane Ma, Joshua Mason, Damian Menscher, Chad Seaman, Nick Sullivan, Kurt Thomas, and Yi Zhou. Understanding the Mirai Botnet. In Proceedings of USENIX Security Symposium (SEC), 2017.
- [16] Erik Arakelyan, Daniel Daza, Pasquale Minervini, and Michael Cochez. Complex Query Answering with Neural Link Predictors. In Proceedings of International Conference on Learning Representations (ICLR), 2021.
- [17] Anish Athalye, Nicholas Carlini, and David Wagner. Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples. In Proceedings of the International Conference on Machine Learning (ICML), 2018.
- [18] Peru Bhardwaj, John Kelleher, Luca Costabello, and Declan O’Sullivan. Adversarial Attacks on Knowledge Graph Embeddings via Instance Attribution Methods. Proceedings of Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.
- [19] Peru Bhardwaj, John Kelleher, Luca Costabello, and Declan O’Sullivan. Poisoning Knowledge Graph Embeddings via Relation Inference Patterns. ArXiv e-prints, 2021.
- [20] Battista Biggio and Fabio Roli. Wild Patterns: Ten Years after The Rise of Adversarial Machine Learning. Pattern Recognition, 84:317–331, 2018.
- [21] Aleksandar Bojchevski and Stephan Günnemann. Adversarial Attacks on Node Embeddings via Graph Poisoning. In Proceedings of the International Conference on Machine Learning (ICML), 2019.
- [22] Antoine Bordes, Nicolas Usunier, Alberto Garcia-Durán, Jason Weston, and Oksana Yakhnenko. Translating Embeddings for Modeling Multi-Relational Data. In Proceedings of Advances in Neural Information Processing Systems (NeurIPS), 2013.
- [23] Nicholas Carlini, Matthew Jagielski, Christopher A Choquette-Choo, Daniel Paleka, Will Pearce, Hyrum Anderson, Andreas Terzis, Kurt Thomas, and Florian Tramèr. Poisoning Web-Scale Training Datasets is Practical. In ArXiv e-prints, 2023.
- [24] Nicholas Carlini and David A. Wagner. Towards Evaluating the Robustness of Neural Networks. In Proceedings of IEEE Symposium on Security and Privacy (S&P), 2017.
- [25] Antonio Emanuele Cinà, Kathrin Grosse, Ambra Demontis, Sebastiano Vascon, Werner Zellinger, Bernhard A Moser, Alina Oprea, Battista Biggio, Marcello Pelillo, and Fabio Roli. Wild Patterns Reloaded: A Survey of Machine Learning Security against Training Data Poisoning. In ArXiv e-prints, 2022.
- [26] The Conversation. Study Shows AI-generated Fake Cybersecurity Reports Fool Experts. https://theconversation.com/study-shows-ai-generated-fake-reports-fool-experts-160909.
- [27] Nilesh Dalvi and Dan Suciu. Efficient Query Evaluation on Probabilistic Databases. The VLDB Journal, 2007.
- [28] Peng Gao, Fei Shao, Xiaoyuan Liu, Xusheng Xiao, Zheng Qin, Fengyuan Xu, Prateek Mittal, Sanjeev R Kulkarni, and Dawn Song. Enabling Efficient Cyber Threat Hunting with Cyber Threat Intelligence. In Proceedings of International Conference on Data Engineering (ICDE), 2021.
- [29] Timon Gehr, Matthew Mirman, Dana Drachsler-Cohen, Petar Tsankov, Swarat Chaudhuri, and Martin Vechev. AI2: Safety and Robustness Certification of Neural Networks with Abstract Interpretation. In Proceedings of IEEE Symposium on Security and Privacy (S&P), 2018.
- [30] Fan Gong, Meng Wang, Haofen Wang, Sen Wang, and Mengyue Liu. SMR: Medical Knowledge Graph Embedding for Safe Medicine Recommendation. Big Data Research, 2021.
- [31] Ian Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and Harnessing Adversarial Examples. In Proceedings of International Conference on Learning Representations (ICLR), 2015.
- [32] Kelvin Guu, John Miller, and Percy Liang. Traversing Knowledge Graphs in Vector Space. In Proceedings of Conference on Empirical Methods in Natural Language Processing (EMNLP), 2015.
- [33] William L. Hamilton, Payal Bajaj, Marinka Zitnik, Dan Jurafsky, and Jure Leskovec. Embedding Logical Queries on Knowledge Graphs. In Proceedings of Advances in Neural Information Processing Systems (NeurIPS), 2018.
- [34] Wajih Ul Hassan, Adam Bates, and Daniel Marino. Tactical Provenance Analysis for Endpoint Detection and Response Systems. In Proceedings of IEEE Symposium on Security and Privacy (S&P), 2020.
- [35] Shizhu He, Kang Liu, Guoliang Ji, and Jun Zhao. Learning to Represent Knowledge Graphs with Gaussian Embedding. In Proceedings of ACM Conference on Information and Knowledge Management (CIKM), 2015.
- [36] Erik Hemberg, Jonathan Kelly, Michal Shlapentokh-Rothman, Bryn Reinstadler, Katherine Xu, Nick Rutar, and Una-May O’Reilly. Linking Threat Tactics, Techniques, and Patterns with Defensive Weaknesses, Vulnerabilities and Affected Platform Configurations for Cyber Hunting. ArXiv e-prints, 2020.
- [37] Keman Huang, Michael Siegel, and Stuart Madnick. Systematically Understanding the Cyber Attack Business: A Survey. ACM Computing Surveys (CSUR), 2018.
- [38] Haozhe Ji, Pei Ke, Shaohan Huang, Furu Wei, Xiaoyan Zhu, and Minlie Huang. Language Generation with Multi-hop Reasoning on Commonsense Knowledge Graph. In Proceedings of Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020.
- [39] Yujie Ji, Xinyang Zhang, Shouling Ji, Xiapu Luo, and Ting Wang. Model-Reuse Attacks on Deep Learning Systems. In Proceedings of ACM Conference on Computer and Communications Security (CCS), 2018.
- [40] Peter E Kaloroumakis and Michael J Smith. Toward a Knowledge Graph of Cybersecurity Countermeasures. The MITRE Corporation, 2021.
- [41] Thomas N. Kipf and Max Welling. Semi-Supervised Classification with Graph Convolutional Networks. In Proceedings of International Conference on Learning Representations (ICLR), 2017.
- [42] Changjiang Li, Shouling Ji, Haiqin Weng, Bo Li, Jie Shi, Raheem Beyah, Shanqing Guo, Zonghui Wang, and Ting Wang. Towards Certifying the Asymmetric Robustness for Neural Networks: Quantification and Applications. IEEE Transactions on Dependable and Secure Computing, 19(6):3987–4001, 2022.
- [43] Changjiang Li, Ren Pang, Zhaohan Xi, Tianyu Du, Shouling Ji, Yuan Yao, and Ting Wang. Demystifying Self-supervised Trojan Attacks. ArXiv e-prints, 2022.
- [44] Bill Yuchen Lin, Xinyue Chen, Jamin Chen, and Xiang Ren. Kagnet: Knowledge-aware Graph Networks for Commonsense Reasoning. In Proceedings of Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019.
- [45] Xiang Ling, Shouling Ji, Jiaxu Zou, Jiannan Wang, Chunming Wu, Bo Li, and Ting Wang. DEEPSEC: A Uniform Platform for Security Analysis of Deep Learning Model. In Proceedings of IEEE Symposium on Security and Privacy (S&P), 2019.
- [46] Yingqi Liu, Shiqing Ma, Yousra Aafer, Wen-Chuan Lee, Juan Zhai, Weihang Wang, and Xiangyu Zhang. Trojaning Attack on Neural Networks. In Proceedings of Network and Distributed System Security Symposium (NDSS), 2018.
- [47] Logrhythm. Using MITRE ATT&CK in Threat Hunting and Detection. https://logrhythm.com/uws-using-mitre-attack-in-threat-hunting-and-detection-white-paper/.
- [48] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards Deep Learning Models Resistant to Adversarial Attacks. In Proceedings of International Conference on Learning Representations (ICLR), 2018.
- [49] Fabrizio Mafessoni, Rashmi B Prasad, Leif Groop, Ola Hansson, and Kay Prüfer. Turning Vice into Virtue: Using Batch-effects to Detect Errors in Large Genomic Data Sets. Genome biology and evolution, 10(10):2697–2708, 2018.
- [50] Sadegh M Milajerdi, Rigel Gjomemo, Birhanu Eshete, Ramachandran Sekar, and VN Venkatakrishnan. Holmes: Real-time APT Detection through Correlation of Suspicious Information Flows. In Proceedings of IEEE Symposium on Security and Privacy (S&P), 2019.
- [51] Shaswata Mitra, Aritran Piplai, Sudip Mittal, and Anupam Joshi. Combating Fake Cyber Threat Intelligence Using Provenance in Cybersecurity Knowledge Graphs. In 2021 IEEE International Conference on Big Data (Big Data). IEEE, 2021.
- [52] Sudip Mittal, Anupam Joshi, and Tim Finin. Cyber-all-intel: An AI for Security Related Threat Intelligence. ArXiv e-prints, 2019.
- [53] Bethany Percha and Russ B Altman. A Global Network of Biomedical Relationships Derived from Text. Bioinformatics, 2018.
- [54] Pouya Pezeshkpour, Yifan Tian, and Sameer Singh. Investigating Robustness and Interpretability of Link Prediction via Adversarial Modifications. ArXiv e-prints, 2019.
- [55] Radware. “BrickerBot” Results In Permanent Denial-of-Service. https://www.radware.com/security/ddos-threats-attacks/brickerbot-pdos-permanent-denial-of-service/.
- [56] Mrigank Raman, Aaron Chan, Siddhant Agarwal, Peifeng Wang, Hansen Wang, Sungchul Kim, Ryan Rossi, Handong Zhao, Nedim Lipka, and Xiang Ren. Learning to Deceive Knowledge Graph Augmented Models via Targeted Perturbation. Proceedings of International Conference on Learning Representations (ICLR), 2021.
- [57] Priyanka Ranade, Aritran Piplai, Sudip Mittal, Anupam Joshi, and Tim Finin. Generating Fake Cyber Threat Intelligence Using Transformer-based Models. In 2021 International Joint Conference on Neural Networks (IJCNN). IEEE, 2021.
- [58] Hongyu Ren, Hanjun Dai, Bo Dai, Xinyun Chen, Michihiro Yasunaga, Haitian Sun, Dale Schuurmans, Jure Leskovec, and Denny Zhou. LEGO: Latent Execution-Guided Reasoning for Multi-Hop Question Answering on Knowledge Graphs. In Proceedings of International Conference on Machine Learning (ICML), 2021.
- [59] Hongyu Ren, Weihua Hu, and Jure Leskovec. Query2box: Reasoning over Knowledge Graphs in Vector Space using Box Embeddings. In Proceedings of International Conference on Learning Representations (ICLR), 2020.
- [60] Hongyu Ren and Jure Leskovec. Beta Embeddings for Multi-Hop Logical Reasoning in Knowledge Graphs. In Proceedings of Advances in Neural Information Processing Systems (NeurIPS), 2020.
- [61] Anderson Rossanez, Julio Cesar dos Reis, Ricardo da Silva Torres, and Hèléne de Ribaupierre. KGen: A Knowledge Graph Generator from Biomedical Scientific Literature. BMC Medical Informatics and Decision Making, 20(4):314, 2020.
- [62] Alberto Santos, Ana R Colaço, Annelaura B Nielsen, Lili Niu, Maximilian Strauss, Philipp E Geyer, Fabian Coscia, Nicolai J Wewer Albrechtsen, Filip Mundt, Lars Juhl Jensen, et al. A Knowledge Graph to Interpret Clinical Proteomics Data. Nature Biotechnology, 2022.
- [63] Komal Teru, Etienne Denis, and Will Hamilton. Inductive Relation Prediction by Subgraph Reasoning. In Proceedings of International Conference on Machine Learning (ICML), 2020.
- [64] Florian Tramèr, Fan Zhang, Ari Juels, Michael K. Reiter, and Thomas Ristenpart. Stealing Machine Learning Models via Prediction APIs. In Proceedings of USENIX Security Symposium (SEC), 2016.
- [65] Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. Graph Attention Networks. In Proceedings of International Conference on Learning Representations (ICLR), 2018.
- [66] Binghui Wang and Neil Zhenqiang Gong. Attacking Graph-based Classification via Manipulating the Graph Structure. In Proceedings of ACM Conference on Computer and Communications Security (CCS), 2019.
- [67] Xiang Wang, Dingxian Wang, Canran Xu, Xiangnan He, Yixin Cao, and Tat-Seng Chua. Explainable Reasoning over Knowledge Graphs for Recommendation. In Proceedings of AAAI Conference on Artificial Intelligence (AAAI), 2019.
- [68] Zhaohan Xi, Ren Pang, Shouling Ji, and Ting Wang. Graph Backdoor. In Proceedings of USENIX Security Symposium (SEC), 2021.
- [69] Kaidi Xu, Hongge Chen, Sijia Liu, Pin-Yu Chen, Tsui-Wei Weng, Mingyi Hong, and Xue Lin. Topology Attack and Defense for Graph Neural Networks: An Optimization Perspective. In Proceedings of International Joint Conference on Artificial Intelligence (IJCAI), 2019.
- [70] Hengtong Zhang, Tianhang Zheng, Jing Gao, Chenglin Miao, Lu Su, Yaliang Li, and Kui Ren. Data Poisoning Attack against Knowledge Graph Embedding. In Proceedings of International Joint Conference on Artificial Intelligence (IJCAI), 2019.
- [71] Yongjun Zhu, Chao Che, Bo Jin, Ningrui Zhang, Chang Su, and Fei Wang. Knowledge-Driven Drug Repurposing Using a Comprehensive Drug Knowledge Graph. Health Informatics Journal, 2020.
- [72] Daniel Zügner, Amir Akbarnejad, and Stephan Günnemann. Adversarial Attacks on Neural Networks for Graph Data. In Proceedings of ACM International Conference on Knowledge Discovery and Data Mining (KDD), 2018.
- [73] Daniel Zügner and Stephan Günnemann. Adversarial Attacks on Graph Neural Networks via Meta Learning. In Proceedings of International Conference on Learning Representations (ICLR), 2019.
Appendix A Notations
Table 11 summarizes the notations and definitions used throughout this paper.
| Notation | Definition |
| --- | --- |
| Knowledge graph related | |
| ${\mathcal{G}}$ | a knowledge graph (KG) |
| ${\mathcal{G}}^{\prime}$ | a surrogate knowledge graph |
| $\langle v,r,v^{\prime}\rangle$ | a KG fact from entity $v$ to $v^{\prime}$ with relation $r$ |
| ${\mathcal{N}},{\mathcal{E}},{\mathcal{R}}$ | entity, edge, and relation set of ${\mathcal{G}}$ |
| ${\mathcal{G}}^{+}$ | the poisoning facts on KG |
| Query related | |
| $q$ | a single query |
| $\llbracket q\rrbracket$ | $q$ ’s ground-truth answer(s) |
| $a^{*}$ | the targeted answer |
| ${\mathcal{A}}_{q}$ | anchor entities of query $q$ |
| $p^{*}$ | the trigger pattern |
| ${\mathcal{Q}}$ | a query set |
| ${\mathcal{Q}}^{*}$ | a query set of interest (each $q\in{\mathcal{Q}}^{*}$ contains $p^{*}$) |
| $q^{+}$ | the generated bait evidence |
| $q^{*}$ | the infected query, i.e. $q^{*}=q\wedge q^{+}$ |
| Model or embedding related | |
| $\phi$ | a general symbol to represent embeddings |
| $\phi_{{\mathcal{G}}}$ | embeddings of all KG entities |
| $\phi_{v}$ | entity $v$ ’s embedding |
| $\phi_{q}$ | $q$ ’s embedding |
| $\phi_{{\mathcal{G}}^{+}}$ | embeddings we aim to perturb |
| $\phi_{q^{+}}$ | $q^{+}$ ’s embedding |
| $\psi$ | the logical operator(s) |
| $\psi_{r}$ | the relation ( $r$ )-specific operator |
| $\psi_{\wedge}$ | the intersection operator |
| Other parameters | |
| $n_{\mathrm{g}}$ | knowledge poisoning budget |
| $n_{\mathrm{q}}$ | query misguiding budget |
Table 11: Notations, definitions, and categories.
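To make the query-related notations concrete, a conjunctive query and its infected form $q^{*}=q\wedge q^{+}$ can be represented minimally as follows (a hypothetical sketch: the branch-list encoding and the entity/relation names are illustrative, not the paper's data structures):

```python
from dataclasses import dataclass


@dataclass
class Query:
    """A conjunctive query: each branch starts at an anchor entity in A_q and
    follows a relation path; the answer set is the intersection over branches
    (realized by the intersection operator psi_wedge in embedding space)."""
    branches: list  # [(anchor_entity, [relation, ...]), ...]


def infect(q: Query, bait: Query) -> Query:
    """q* = q AND q+: conjoin the original query with the bait evidence q+."""
    return Query(branches=q.branches + bait.branches)


# Illustrative query with two anchor branches plus one bait branch
q = Query(branches=[("drug_A", ["treats"]), ("symptom_B", ["indicated_by"])])
bait = Query(branches=[("drug_C", ["co_prescribed_with"])])
q_star = infect(q, bait)
```

Because conjunction only intersects the answer sets of the branches, the infected query is simply the union of the original and bait branch sets.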
Appendix B Additional details
B.1 KGR training
Following [59], we train KGR in an end-to-end manner. Specifically, given KG ${\mathcal{G}}$ and the randomly initialized embedding function $\phi$ and transformation function $\psi$ , we sample a set of query-answer pairs $(q,\llbracket q\rrbracket)$ from ${\mathcal{G}}$ to form the training set, and optimize $\phi$ and $\psi$ to minimize the loss, defined as the embedding distance between the prediction for each $q$ and the embedding of its ground-truth answer $\llbracket q\rrbracket$ .
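As an illustration, this end-to-end procedure can be sketched in plain numpy on a toy 1-hop KG (an assumed miniature: the entity names, the linear form of $\psi_{r}$ , and the squared-distance loss are simplifications for exposition, not the paper's exact implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8

# Toy KG facts <v, r, v'> (hypothetical names for illustration)
facts = [("aspirin", "treats", "headache"), ("aspirin", "treats", "fever")]
entities = {"aspirin", "headache", "fever"}

# Randomly initialized embedding function phi and relation operator psi_r
phi = {e: rng.normal(scale=0.1, size=dim) for e in entities}
W = {"treats": rng.normal(scale=0.1, size=(dim, dim))}  # psi_r as a linear map


def total_loss():
    # Embedding distance between the prediction psi_r(phi_v) and the answer
    return sum(np.linalg.norm(W[r] @ phi[v] - phi[a]) ** 2 for v, r, a in facts)


initial = total_loss()
lr = 0.1
for _ in range(500):  # jointly optimize phi and psi by gradient descent
    for v, r, a in facts:
        diff = W[r] @ phi[v] - phi[a]   # prediction error
        g_W = np.outer(diff, phi[v])    # gradients of 0.5 * ||diff||^2
        g_v = W[r].T @ diff
        W[r] -= lr * g_W
        phi[v] -= lr * g_v
        phi[a] += lr * diff             # pull the answer toward the prediction
```

In the full system, $\psi$ additionally comprises the intersection operator $\psi_{\wedge}$ , and the loss is computed over sampled multi-hop query-answer pairs rather than raw facts.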
B.2 Parameter setting
Table 12 lists the default parameter setting used in § 5.
| Type | Parameter | Setting |
| --- | --- | --- |
| KGR | $\phi$ dimension | 300 |
| | $\phi$ dimension (surrogate) | 200 |
| | $\psi_{r}$ architecture | 4-layer FC |
| | $\psi_{\wedge}$ architecture | 4-layer FC |
| | $\psi_{r}$ architecture (surrogate) | 2-layer FC |
| | $\psi_{\wedge}$ architecture (surrogate) | 2-layer FC |
| Training | Learning rate | 0.001 |
| | Batch size | 512 |
| | KGR epochs | 50000 |
| | ROAR optimization epochs | 10000 |
| | Optimizer (KGR and ROAR) | Adam |
| Other | $n_{\mathrm{g}}$ | 100 |
| | $n_{\mathrm{q}}$ | 2 |
Table 12: Default parameter setting.
B.3 Extension to targeted attacks
It is straightforward to extend ROAR to targeted attacks, in which the adversary simply aims to force KGR to make erroneous reasoning over the target queries ${\mathcal{Q}}^{*}$. To this end, we may maximize the distance between the embedding $\phi_{q}$ of each query $q\in{\mathcal{Q}}^{*}$ and the embedding of its ground-truth answer $\llbracket q\rrbracket$.
Specifically, in knowledge poisoning, we re-define the loss function in Eq. 4 as:
$$
\ell_{\text{kp}}(\phi_{{\mathcal{G}}^{+}})=\mathbb{E}_{q\in{\mathcal{Q}}\setminus{\mathcal{Q}}^{*}}\Delta(\psi(q;\phi_{{\mathcal{G}}^{+}}),\phi_{\llbracket q\rrbracket})-\lambda\,\mathbb{E}_{q\in{\mathcal{Q}}^{*}}\Delta(\psi(q;\phi_{{\mathcal{G}}^{+}}),\phi_{\llbracket q\rrbracket}) \tag{8}
$$
In query misguiding, we re-define Eq. 6 as:
$$
\ell_{\text{qm}}(\phi_{q^{+}})=-\Delta(\psi_{\wedge}(\phi_{q},\phi_{q^{+}}),\,\phi_{\llbracket q\rrbracket}) \tag{9}
$$
The remaining steps are the same as the backdoor attacks.
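Read concretely, Eq. 8 and Eq. 9 can be sketched as follows (a plain-numpy illustration; `reason` stands in for $\psi(q;\phi_{{\mathcal{G}}^{+}})$ and `delta` for the embedding distance $\Delta$ , with toy data in place of real queries):

```python
import numpy as np


def delta(x, y):
    """Embedding distance Delta."""
    return float(np.linalg.norm(x - y))


def loss_kp(reason, answers, targets, others, lam=1.0):
    """Eq. 8: keep non-target queries close to their answers while pushing
    the target queries in Q* away from theirs (weighted by lambda)."""
    keep = np.mean([delta(reason(q), answers[q]) for q in others])
    push = np.mean([delta(reason(q), answers[q]) for q in targets])
    return keep - lam * push


def loss_qm(psi_and, phi_q, phi_q_plus, phi_ans):
    """Eq. 9: choose bait evidence q+ so that the conjoined query q AND q+
    lands far from the ground-truth answer."""
    return -delta(psi_and(phi_q, phi_q_plus), phi_ans)


# Toy check: a reasoner with a uniform offset errs equally on targets and
# non-targets, so the two expectation terms in Eq. 8 cancel at lambda = 1.
rng = np.random.default_rng(1)
answers = {q: rng.normal(size=4) for q in ("q1", "q2", "q3")}
reason = lambda q: answers[q] + 0.1
l8 = loss_kp(reason, answers, targets=["q1"], others=["q2", "q3"])
l9 = loss_qm(lambda a, b: (a + b) / 2, answers["q1"], answers["q2"], answers["q3"])
```

Minimizing `loss_kp` (over the poisoning-fact embeddings) and `loss_qm` (over the bait-evidence embedding) then proceeds exactly as in the backdoor attacks.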
B.4 Additional results
This part presents additional experiments that complement § 5.
Additional query tasks under variant surrogate KGs. Figure 12 presents the attack performance on the query tasks not included in Figure 6. We observe trends similar to those concluded in § 5.
MRR results. Figure 14 shows the MRR of ROAR ${}_{\mathrm{co}}$ with respect to different query structures, with observations similar to Figure 7. Figure 13 shows the MRR of ROAR with respect to attack budgets ( $n_{\mathrm{g}}$ , $n_{\mathrm{q}}$ ), with observations similar to Figure 9.
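For reference, the MRR metric reported above is the mean reciprocal rank of the (targeted or ground-truth) answer in the KGR output ranking; a minimal sketch, assuming candidates are ranked by ascending embedding distance:

```python
import numpy as np


def rank_of(answer, distances):
    """1-indexed rank of `answer` among candidates sorted by ascending distance."""
    order = sorted(distances, key=distances.get)
    return order.index(answer) + 1


def mrr(results):
    """results: list of (answer, {candidate: distance}) pairs, one per query."""
    return float(np.mean([1.0 / rank_of(a, d) for a, d in results]))


# Two toy queries: the answer ranks 1st in the first and 2nd in the second,
# giving MRR = (1/1 + 1/2) / 2 = 0.75.
score = mrr([
    ("a", {"a": 0.1, "b": 0.5, "c": 0.9}),
    ("b", {"a": 0.2, "b": 0.4, "c": 0.8}),
])
```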
Figure 12: ROAR ${}_{\mathrm{kp}}$ and ROAR ${}_{\mathrm{co}}$ performance with varying overlapping ratios between the surrogate and target KGs, measured by HIT@5 after the attacks, on query tasks other than those in Figure 6.
Figure 13: ROAR ${}_{\mathrm{co}}$ performance with varying budgets (ROAR ${}_{\mathrm{kp}}$ – $n_{\mathrm{g}}$ , ROAR ${}_{\mathrm{qm}}$ – $n_{\mathrm{q}}$ ). The measure is the absolute MRR after the attacks.
Figure 14: ROAR ${}_{\mathrm{co}}$ performance (MRR) under different query structures in Figure 5, indicated by the change ( $\uparrow$ or $\downarrow$ ) before and after the attacks.