# Towards Unified Neurosymbolic Reasoning on Knowledge Graphs
> Qika Lin, Kai He, and Mengling Feng are with the Saw Swee Hock School of Public Health, National University of Singapore, 117549, Singapore. Fangzhi Xu and Jun Liu are with the School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China. Hao Lu is with the State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China. Rui Mao and Erik Cambria are with the College of Computing and Data Science, Nanyang Technological University, 639798, Singapore.
## Abstract
Knowledge Graph (KG) reasoning has received significant attention in the fields of artificial intelligence and knowledge engineering, owing to its ability to autonomously deduce new knowledge and consequently enhance the availability and precision of downstream applications. However, current methods predominantly concentrate on a single form of neural or symbolic reasoning, failing to effectively integrate the inherent strengths of both approaches. Furthermore, prevalent methods primarily address a single reasoning scenario, and thus fall short of meeting the diverse demands of real-world reasoning tasks. Unifying neural and symbolic methods, as well as diverse reasoning scenarios, in one model is challenging, as there is a natural representation gap between symbolic rules and neural networks, and diverse scenarios exhibit distinct knowledge structures and specific reasoning objectives. To address these issues, we propose a unified neurosymbolic reasoning framework, namely Tunsr, for KG reasoning. Tunsr first introduces a consistent structure of reasoning graph that starts from the query entity and constantly expands subsequent nodes by iteratively searching posterior neighbors. Based on it, a forward logic message-passing mechanism is proposed to update both the propositional representations and attentions, as well as the first-order logic (FOL) representations and attentions, of each node. In this way, Tunsr performs the transformation of merging multiple rules by merging possible relations at each step. Finally, the FARI algorithm is proposed to induce FOL rules by constantly performing attention calculations over the reasoning graph. Extensive experimental results on 19 datasets of four reasoning scenarios (transductive, inductive, interpolation, and extrapolation) demonstrate the effectiveness of Tunsr.
Index Terms: Neurosymbolic AI, Knowledge graph reasoning, Propositional reasoning, First-order logic, Unified model
## 1 Introduction
As a fundamental and significant topic in the domains of knowledge engineering and artificial intelligence (AI), knowledge graphs (KGs) have been spotlighted in many real-world applications [1], such as question answering [2, 3], recommendation systems [4, 5], relation extraction [6, 7], and text generation [8, 9]. Thanks to their structured manner of knowledge storage, KGs can effectively capture and represent rich semantic associations between real entities using multi-relational graphical structures. Factual knowledge is often stored in KGs with the fact triple as the fundamental unit, represented in the form of (subject, relation, object), such as (Barack Obama, bornIn, Hawaii) in Figure 1. However, most common KGs, such as Freebase [10] and Wikidata [11], are incomplete due to the limitations of current human resources and technical conditions. Incomplete KGs can degrade the accuracy of downstream intelligent applications or even produce completely wrong answers. Therefore, inferring missing facts from the observed ones is of great significance for downstream KG applications; this task is called link prediction and is one form of KG reasoning [12, 13].
The task of KG reasoning is to infer or predict new facts using existing knowledge. For instance, in Figure 1, KG reasoning involves predicting the validity of the target missing triple (Barack Obama, nationalityOf, U.S.A.) based on other available triples. Following the two distinct paradigms that serve as the foundation for implementing AI systems, connectionism and symbolicism [14, 15], existing methods can be categorized into neural, symbolic, and neurosymbolic models.
Neural methods, drawing inspiration from the connectionism of AI, typically employ neural networks to learn entity and relation representations. Subsequently, a customized scoring function, such as a translation-based distance or a semantic matching strategy, is utilized for model optimization and query reasoning, as illustrated in the top part of Figure 1. However, such an approach lacks transparency and interpretability [16, 17]. On the other hand, symbolic methods draw inspiration from the idea of symbolicism in AI. As shown in the bottom part of Figure 1, they first learn logic rules and then apply these rules to known facts to deduce new knowledge. In this way, symbolic methods offer natural interpretability due to the incorporation of logical rules. However, owing to the limited modeling capacity of the discrete representation and reasoning strategies of logical rules, these methods often fall short in terms of reasoning performance [18].
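For illustration, a translation-based scoring function in the spirit of TransE [19] can be sketched as follows. This is a toy example with random (untrained) embeddings, not the configuration of any specific model: a fact $(s, r, o)$ is scored by the negative distance $-\|\mathbf{s}+\mathbf{r}-\mathbf{o}\|$, so smaller distances indicate more plausible facts.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8

# Toy embedding tables; in practice these are learned, e.g., with a margin ranking loss.
entities = {e: rng.normal(size=dim) for e in ["BarackObama", "Hawaii", "USA"]}
relations = {r: rng.normal(size=dim) for r in ["bornIn", "nationalityOf"]}

def transe_score(s, r, o):
    """Translation-based plausibility: smaller ||s + r - o|| means a higher score."""
    return -np.linalg.norm(entities[s] + relations[r] - entities[o])

# Rank candidate tail entities for the query (BarackObama, nationalityOf, ?).
candidates = sorted(entities,
                    key=lambda o: transe_score("BarackObama", "nationalityOf", o),
                    reverse=True)
```

At inference time, the candidate with the highest score is returned as the predicted answer; semantic-matching models differ only in the form of the scoring function.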
Figure 1: Illustration of neural and symbolic methods for KG reasoning. Neural methods learn entity and relation embeddings to calculate the validity of the specific fact. Symbolic methods perform logic deduction using known facts on learned or given rules (like $\gamma_{1}$ , $\gamma_{2}$ and $\gamma_{3}$ ) for inference.
TABLE I: Classical studies for KG reasoning. PL and FOL denote propositional and FOL reasoning, respectively. SKG T, SKG I, TKG I, and TKG E represent transductive, inductive, interpolation, and extrapolation reasoning. A '$\checkmark$' marks the utilized reasoning manners (neural and logic) or the vanilla application scenarios.
| Model | Neural | Logic: PL | Logic: FOL | SKG T | SKG I | TKG I | TKG E |
| --- | --- | --- | --- | --- | --- | --- | --- |
| TransE [19] | $\checkmark$ | | | $\checkmark$ | | | |
| AMIE [20] | | | $\checkmark$ | $\checkmark$ | | | |
| Neural LP [21] | $\checkmark$ | | $\checkmark$ | $\checkmark$ | | | |
| TAPR [22] | $\checkmark$ | $\checkmark$ | | $\checkmark$ | | | |
| RLogic [23] | $\checkmark$ | | $\checkmark$ | $\checkmark$ | | | |
| LatentLogic [24] | $\checkmark$ | | $\checkmark$ | $\checkmark$ | | | |
| PSRL [25] | $\checkmark$ | $\checkmark$ | | $\checkmark$ | | | |
| ConGLR [26] | $\checkmark$ | | $\checkmark$ | | $\checkmark$ | | |
| TeAST [27] | $\checkmark$ | | | | | $\checkmark$ | |
| TLogic [28] | | | $\checkmark$ | | | | $\checkmark$ |
| TR-Rules [29] | | | $\checkmark$ | | | | $\checkmark$ |
| TECHS [30] | $\checkmark$ | $\checkmark$ | $\checkmark$ | | | | $\checkmark$ |
| Tunsr | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ |
To leverage the strengths of both neural and symbolic methods while mitigating their respective drawbacks, there has been a growing interest in integrating them to realize neurosymbolic systems [31]. Several approaches such as Neural LP [21], DRUM [32], RNNLogic [33], and RLogic [23] have emerged to address the learning and reasoning of rules by incorporating neural networks into the whole process. Despite achieving some successes, there remains a notable absence of a cohesive modeling approach that integrates both propositional and first-order logic (FOL) reasoning. Propositional reasoning on KGs, generally known as multi-hop reasoning [34], is dependent on entities and predicts answers through specific reasoning paths, which demonstrates strong modeling capabilities by providing diverse reasoning patterns for complex scenarios [35, 36]. On the other hand, FOL reasoning utilizes learned FOL rules to infer information from the entire KG by variable grounding, ultimately scoring candidates by aggregating all possible FOL rules. FOL reasoning is entity-independent and exhibits good transferability. Unfortunately, as shown in Table I, mainstream methods have failed to effectively combine these two reasoning approaches within a single framework, resulting in suboptimal models.
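The entity-independence of FOL reasoning can be made concrete with a small sketch. The following is our own illustration (not Tunsr's implementation): each confidence-weighted Horn rule body is chained over the known facts by variable grounding, and candidate answers aggregate the confidences of all rules that derive them. The first rule follows $\gamma_1$ in Figure 1; the second is a hypothetical simplification for illustration.

```python
facts = {
    ("BarackObama", "bornIn", "Hawaii"),
    ("Hawaii", "locatedInCountry", "USA"),
    ("MichelleObama", "bornIn", "Chicago"),
    ("Chicago", "placeIn", "USA"),
}

# (body relations, head relation, confidence)
rules = [
    (("bornIn", "locatedInCountry"), "nationalityOf", 0.89),
    (("bornIn", "placeIn"), "nationalityOf", 0.54),
]

def ground(body, start, facts):
    """Follow the body relations from `start`, returning reachable terminal entities."""
    frontier = {start}
    for rel in body:
        frontier = {o for (s, r, o) in facts if r == rel and s in frontier}
    return frontier

def score_candidates(query_entity, query_relation, facts, rules):
    scores = {}
    for body, head, conf in rules:
        if head != query_relation:
            continue
        for answer in ground(body, query_entity, facts):
            scores[answer] = max(scores.get(answer, 0.0), conf)  # aggregate by max
    return scores

print(score_candidates("BarackObama", "nationalityOf", facts, rules))
# {'USA': 0.89}
```

Because the rules mention only relations and variables, the same rule set transfers unchanged to a graph with entirely new entities, which is what makes FOL reasoning attractive for inductive settings.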
Moreover, as time progresses and society undergoes continuous development, a wealth of new knowledge consistently emerges. Consequently, simple reasoning on static KGs (SKGs), i.e., transductive reasoning, can no longer meet the needs of practical applications. Recently, there has been a gradual shift in the research community's focus toward inductive reasoning with emerging entities on SKGs, as well as interpolation and extrapolation reasoning on temporal KGs (TKGs) [37], which attach time information to facts. However, the latest research predominantly concentrates on individual scenarios and is insufficient to address various reasoning scenarios simultaneously. This limitation significantly hampers a model's generalization ability and practical applicability. To sum up, comparing the state-of-the-art studies on KG reasoning in Table I, none of them achieves a comprehensive unification across various KG reasoning tasks, in terms of either methodology or application.
The challenges in this domain can be categorized into three main aspects: (1) There is an inherent disparity between the discrete nature of logic rules and the continuous nature of neural networks, which presents a natural representation gap to be bridged. Thus, implementing differentiable logical rule learning and reasoning is not directly achievable. (2) It is intractable to solve the transformation and integration problems for propositional and FOL rules, as they have different semantic representation structures and reasoning mechanisms. (3) Diverse scenarios on SKGs or TKGs exhibit distinct knowledge structures and specific reasoning objectives. Consequently, a model tailored for one scenario may encounter difficulties when applied to another. For example, each fact on SKGs is in a triple form while that of TKGs is quadruple. Conventional embedding methods for transductive reasoning fail to address inductive reasoning as they do not learn embeddings of emerging entities in the training phase. Similarly, methods employed for interpolation reasoning cannot be directly applied to extrapolation reasoning, as extrapolation involves predicting facts with future timestamps that are not present in the training set.
To address the above challenges, we propose a unified neurosymbolic reasoning framework (named Tunsr) for KG reasoning. Firstly, to realize unified reasoning on different scenarios, we introduce a consistent structure of reasoning graph. It starts from the query entity and constantly expands subsequent nodes (entities for SKGs and entity-time pairs for TKGs) by iteratively searching posterior neighbors. Upon this, we can seamlessly integrate diverse reasoning scenarios within a unified computational framework, while also implementing different types of propositional and FOL rule-based reasoning over it. Secondly, to combine neural and symbolic reasoning, we propose a forward logic message-passing mechanism. For each node in the reasoning graph, Tunsr learns an entity-dependent propositional representation and attention using the preceding counterparts. Besides, it utilizes a gated recurrent unit (GRU) [38] to integrate the current relation and preceding FOL representations into the edges' representations, following which the entity-independent FOL representation and attention are calculated by message aggregation. In this process, the information and confidence of the preceding nodes in the reasoning graph are passed to the subsequent nodes, realizing the unified neurosymbolic calculation. Finally, with the reasoning graph and learned attention weights, a novel Forward Attentive Rule Induction (FARI) algorithm is proposed to induce different types of FOL rules. FARI gradually appends rule bodies by searching over the reasoning graph and viewing the FOL attentions as rule confidences. It is noted that our reasoning form for link prediction is data-driven: it learns rules and utilizes grounding to calculate fact probabilities, whereas classic Datalog [39] and ASP (Answer Set Programming) reasoners [40, 41] usually employ declarative logic programming to conduct precise and deterministic deductive reasoning over a set of rules and facts.
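The expansion of the reasoning graph can be sketched structurally as follows. This is a minimal illustration under our own simplifying assumptions (plain entity nodes, a fixed number of steps, no attention-based pruning), not Tunsr's exact construction: starting from the query node, each step records the outgoing edges of the current frontier and adds their posterior neighbors as the next layer.

```python
from collections import defaultdict

def expand_reasoning_graph(facts, query_node, num_steps):
    """facts: iterable of (head, relation, tail) triples.
    Returns per-step node layers and the traversed edges (tagged with their step)."""
    neighbors = defaultdict(list)
    for s, r, o in facts:
        neighbors[s].append((r, o))

    layers = [{query_node}]  # layer 0 contains only the query entity
    edges = []
    for step in range(num_steps):
        frontier = set()
        for u in layers[-1]:
            for r, v in neighbors[u]:
                edges.append((u, r, v, step))
                frontier.add(v)
        layers.append(frontier)
    return layers, edges

facts = [("Obama", "bornIn", "Hawaii"), ("Hawaii", "locatedInCountry", "USA")]
layers, edges = expand_reasoning_graph(facts, "Obama", 2)
# layers: [{"Obama"}, {"Hawaii"}, {"USA"}]
```

For TKGs, the nodes would carry entity-time pairs and neighbor lookup would additionally filter by the temporal constraints of the scenario; the layer-by-layer structure is what the forward message passing and FARI operate over.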
In summary, our contributions are threefold:
$\bullet$ Combining the advantages of the connectionism and symbolicism of AI, we propose a unified neurosymbolic framework for KG reasoning from the perspectives of both methodology and reasoning scenarios. To the best of our knowledge, this is the first attempt at such a study.
$\bullet$ A forward logic message-passing mechanism is proposed to update both the propositional representations and attentions, as well as FOL representations and attentions of each node in the expanding reasoning graph. Meanwhile, a novel FARI algorithm is introduced to induce FOL rules using learned attentions.
$\bullet$ Extensive experiments are carried out on the current mainstream KG reasoning scenarios, including transductive, inductive, interpolation, and extrapolation reasoning. The results demonstrate the effectiveness of our Tunsr and verify its interpretability.
This study is an extension of our model TECHS [30], published at the ACL 2023 conference. Compared with it, Tunsr has been enhanced in three significant ways: (1) From the theoretical perspective, although propositional and FOL reasoning are integrated in TECHS for extrapolation reasoning on TKGs, the two reasoning types are entangled in the forward process, which limits the interpretability of the model. In contrast, the newly proposed Tunsr framework maintains a distinct separation of propositional and FOL reasoning in each reasoning step, and only combines them for the final reasoning results. This transformation enhances the interpretability of the model from the perspectives of both propositional and FOL rules. (2) From the perspective of FOL rule modeling, Tunsr is not limited to the temporal extrapolation Horn rules of TECHS: connected and closed Horn rules, as well as temporal interpolation Horn rules, are also included in the framework. (3) From the application perspective, the TECHS model is customized for extrapolation reasoning on TKGs. Based on the further formalization of the reasoning graph and FOL rules, the Tunsr model can be utilized for the current mainstream reasoning scenarios of KGs, including transductive, inductive, interpolation, and extrapolation reasoning. The experimental results demonstrate that Tunsr performs well in all these scenarios.
## 2 Preliminaries
### 2.1 KGs, Variants, and Reasoning Scenarios
Generally, a static KG (SKG) can be represented as $\mathcal{G}=\{\mathcal{E},\mathcal{R},\mathcal{F}\}$ , where $\mathcal{E}$ and $\mathcal{R}$ denote the set of entities and relations, respectively. $\mathcal{F}\subset\mathcal{E}\times\mathcal{R}\times\mathcal{E}$ is the fact set. Each fact is a triple, such as ( $s$ , $r$ , $o$ ), where $s$ , $r$ , and $o$ denote the head entity, relation, and tail entity, respectively. By introducing time information in the knowledge, a TKG can be represented as $\mathcal{G}=\{\mathcal{E},\mathcal{R},\mathcal{T},\mathcal{F}\}$ , where $\mathcal{T}$ denotes the set of time representations (timestamps or time intervals). $\mathcal{F}\subset\mathcal{E}\times\mathcal{R}\times\mathcal{E}\times\mathcal{T}$ is the fact set. Each fact is a quadruple, such as $(s,r,o,t)$ where $s,o\in\mathcal{E}$ , $r\in\mathcal{R}$ , and $t\in\mathcal{T}$ .
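The definitions above translate directly into simple container types; the following sketch is a plain transcription with illustrative field names, where an SKG fact is a triple and a TKG fact is a quadruple carrying a timestamp.

```python
from dataclasses import dataclass

@dataclass
class SKG:
    entities: set   # E
    relations: set  # R
    facts: set      # F, subset of E x R x E, each fact a triple (s, r, o)

@dataclass
class TKG:
    entities: set   # E
    relations: set  # R
    times: set      # T, timestamps or time intervals
    facts: set      # F, subset of E x R x E x T, each fact a quadruple (s, r, o, t)

g = TKG(entities={"China", "Russia"}, relations={"makeVisitTo"},
        times={2014}, facts={("China", "makeVisitTo", "Russia", 2014)})
```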
For these two types of KGs, there are mainly the following reasoning types (query for predicting the head entity can be converted to the tail entity prediction by adding reverse relations), which is illustrated in Figure 2:
$\bullet$ Transductive Reasoning on SKGs: Given a background SKG $\mathcal{G}=\{\mathcal{E},\mathcal{R},\mathcal{F}\}$ , the task is to predict the missing entity for the query $(\tilde{s},\tilde{r},?)$ , where $\tilde{s}\in\mathcal{E}$ , $\tilde{r}\in\mathcal{R}$ , the true answer $\tilde{o}\in\mathcal{E}$ , and $(\tilde{s},\tilde{r},\tilde{o})\notin\mathcal{F}$ .
$\bullet$ Inductive Reasoning on SKGs: It indicates that there are new entities appearing in the testing stage, which were not present during the training phase. Formally, the training graph can be expressed as $\mathcal{G}_{t}=\{\mathcal{E}_{t},\mathcal{R},\mathcal{F}_{t}\}$ . The inductive graph $\mathcal{G}_{i}=\{\mathcal{E}_{i},\mathcal{R},\mathcal{F}_{i}\}$ shares the same relation set with $\mathcal{G}_{t}$ . However, their entity sets are disjoint, i.e., $\mathcal{E}_{t}\cap\mathcal{E}_{i}=\varnothing$ . A model needs to predict the missing entity $\tilde{o}$ for the query $(\tilde{s},\tilde{r},?)$ , where $\tilde{s}\in\mathcal{E}_{i}$ , $\tilde{o}\in\mathcal{E}_{i}$ , $\tilde{r}\in\mathcal{R}$ , and $(\tilde{s},\tilde{r},\tilde{o})\notin\mathcal{F}_{i}$ .
$\bullet$ Interpolation Reasoning on TKGs: For a query $(\tilde{s},\tilde{r},?,\tilde{t})$ in the testing phase based on a training TKG $\mathcal{G}_{t}=\{\mathcal{E}_{t},\mathcal{R}_{t},\mathcal{T}_{t},\mathcal{F}_ {t}\}$ , a model needs to predict the answer entity $\tilde{o}$ using the facts in the TKG. The query time satisfies $\min(\mathcal{T}_{t})\leqslant\tilde{t}\leqslant \max(\mathcal{T}_{t})$ , where $\min$ and $\max$ denote the functions that obtain the minimum and maximum timestamp within the set, respectively. Also, the query satisfies $\tilde{s}\in\mathcal{E}_{t}$ , $\tilde{o}\in\mathcal{E}_{t}$ , $\tilde{r}\in\mathcal{R}_{t}$ , and $(\tilde{s},\tilde{r},\tilde{o},\tilde{t})\notin\mathcal{F}_{t}$ .
$\bullet$ Extrapolation Reasoning on TKGs: Similar to interpolation reasoning, it predicts the target entity $\tilde{o}$ for a query $(\tilde{s},\tilde{r},?,\tilde{t})$ in the testing phase, based on a training TKG $\mathcal{G}_{t}=\{\mathcal{E}_{t},\mathcal{R}_{t},\mathcal{T}_{t},\mathcal{F}_ {t}\}$ . Differently, this task is to predict future facts, which means the prediction only utilizes facts that occur earlier than $\tilde{t}$ in the TKG, i.e., $\tilde{t}>\max(\mathcal{T}_{t})$ .
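The temporal condition that separates the two TKG scenarios can be mirrored directly in code. This sketch only restates the definitions above (an interpolation query time lies within the training time span, an extrapolation query time lies beyond it); the function name is ours.

```python
def scenario_of(query_time, train_times):
    """Classify a TKG query by its timestamp relative to the training time span."""
    if min(train_times) <= query_time <= max(train_times):
        return "interpolation"
    if query_time > max(train_times):
        return "extrapolation"
    return "out of scope"  # earlier than the training span; not covered above

train_times = {2010, 2012, 2014}
assert scenario_of(2013, train_times) == "interpolation"
assert scenario_of(2020, train_times) == "extrapolation"
```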
(a) Transductive reasoning on SKGs.
(b) Inductive reasoning on SKGs using the training data in (a).
(c) Interpolation reasoning on TKGs.
(d) Extrapolation reasoning on TKGs.
Figure 2: Illustration of four reasoning scenarios on KGs: transductive, inductive, interpolation, and extrapolation. The red dashed arrows indicate the query fact to be predicted.
### 2.2 Logic Reasoning on KGs
Logical reasoning involves using a given set of facts (i.e., premises) to deduce new facts (i.e., conclusions) through a rigorous form of thinking [42, 43]. It generally covers propositional and first-order logic (also known as predicate logic). Propositional logic deals with declarative sentences that can be definitively assigned a truth value, leaving no room for ambiguity. On KGs, it is usually realized as multi-hop reasoning [44, 35], which views each fact as a declarative sentence and reasons over query-related paths to obtain an answer. Thus, propositional reasoning on KGs is entity-dependent. First-order logic (FOL) can be regarded as an extension of propositional logic, enabling the expression of more refined and nuanced ideas [42, 45]. FOL rules broaden the modeling scope and application prospects by introducing quantifiers ( $\exists$ and $\forall$ ), predicates, and variables. They contain variables that range over a specific domain, covering objects and the relationships among those objects [46]. They usually take the form $premise\rightarrow conclusion$ , where $premise$ and $conclusion$ denote the rule body and rule head, both composed of atomic formulas. Each atomic formula consists of a predicate and several variables, e.g., $bornIn(X,Y)$ in $\gamma_{1}$ of Figure 1, where $bornIn$ is the predicate and $X$ and $Y$ are both entity variables. Thus, FOL reasoning is entity-independent, leveraging consistent FOL rules across different entities [47]. In this paper, we utilize Horn rules [48], which restrict the rule head to a single atomic formula, to enhance the adaptability of FOL rules to various KG reasoning tasks. Furthermore, to make Horn rules suitable for multiple reasoning scenarios, we introduce the following definitions.
Connected and Closed Horn (CCH) Rule. Based on Horn rules, CCH rules possess two distinct features, i.e., connected and closed. The term connected means the rule body necessitates a transitive and chained connection between atomic formulas through shared variables. Concurrently, the term closed indicates the rule body and rule head utilize identical start and end variables.
CCH rules of length $n$ (the quantifier $\forall$ is omitted for brevity in the remainder of the paper) take the following form:
$$
\begin{split}\epsilon,\;\forall X,Y_{1},Y_{2},\cdots,Y_{n-1},Z\;\;&r_{1}(X,Y_{1})\land r_{2}(Y_{1},Y_{2})\land\cdots\\
&\land r_{n}(Y_{n-1},Z)\rightarrow r(X,Z),\end{split} \tag{1}
$$
where atomic formulas in the rule body are connected by variables ( $X,Y_{1},Y_{2},\cdots,Y_{n-1},Z$ ). For example, $r_{1}(X,Y_{1})$ and $r_{2}(Y_{1},Y_{2})$ are connected by $Y_{1}$ . Meanwhile, all variables form a path from $X$ to $Z$ , which are the start and end variables of the rule head $r(X,Z)$ , respectively. $r_{1},r_{2},\cdots,r_{n},r$ are relations in KGs that serve as predicates. To model the different credibility of different rules, we configure a rule confidence $\epsilon\in[0,1]$ for each Horn rule. Rule length refers to the number of atomic formulas in the rule body. For example, $\gamma_{1}$ , $\gamma_{2}$ , and $\gamma_{3}$ in Figure 1 are three example Horn rules of lengths 2, 3, and 3, respectively. A rule grounding of a Horn rule is obtained by replacing each variable with a real entity, e.g., bornIn(Barack Obama, Hawaii) $\land$ locatedInCountry(Hawaii, U.S.A.) $\rightarrow$ nationalityOf(Barack Obama, U.S.A.) is a grounding of rule $\gamma_{1}$ . CCH rules can be utilized for transductive and inductive reasoning.
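As a concrete illustration, grounding a CCH rule amounts to a chain search over the KG: follow the body relations $r_{1},\cdots,r_{n}$ from a start entity and check whether the head fact holds. The sketch below is illustrative only (the function and toy facts are ours, not part of Tunsr):

```python
from collections import defaultdict

def find_groundings(facts, body, head):
    """Enumerate groundings of a CCH rule body (a relation chain) on a toy KG.
    `facts` is a set of (subject, relation, object) triples; `body` is the
    relation chain [r1, ..., rn]; `head` is the head relation r."""
    # Index facts by (subject, relation) for fast chain traversal.
    by_sr = defaultdict(list)
    for s, r, o in facts:
        by_sr[(s, r)].append(o)
    # Start a path at every subject entity and extend it along the chain.
    paths = [(s,) for s in {s for s, _, _ in facts}]
    for rel in body:
        paths = [p + (o,) for p in paths for o in by_sr[(p[-1], rel)]]
    # Each surviving path grounds (X, Y1, ..., Z); the rule predicts head(X, Z).
    return [(p, (p[0], head, p[-1]) in facts) for p in paths]

facts = {
    ("Barack Obama", "bornIn", "Hawaii"),
    ("Hawaii", "locatedInCountry", "U.S.A."),
    ("Barack Obama", "nationalityOf", "U.S.A."),
}
groundings = find_groundings(facts, ["bornIn", "locatedInCountry"], "nationalityOf")
```

On this toy KG, the single surviving grounding instantiates $\gamma_{1}$ with (Barack Obama, Hawaii, U.S.A.), and the predicted head fact is present.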
Temporal Interpolation Horn (TIH) Rule. Based on CCH rules on static KGs that require connected and closed variables, TIH rules assign each atomic formula a time variable.
An example of TIH rule can be:
$$
\epsilon,\;\forall X,Y,Z\;\;r_{1}(X,Y):t_{1}\land r_{2}(Y,Z):t_{2}\rightarrow r
(X,Z):t, \tag{2}
$$
where $t_{1}$ , $t_{2}$ , and $t$ are time variables. To expand the model capacity when grounding TIH rules, time variables are virtual and do not have to be instantiated to real timestamps, which is distinct from the entity variables (e.g., $X$ , $Y$ , $Z$ ). However, we do model the relative order of occurrence: TIH rules with the same atomic formulas but different time-variable conditions are distinct and may have different confidences, e.g., $t_{1}<t_{2}$ vs. $t_{1}>t_{2}$ .
Temporal Extrapolation Horn (TEH) Rule. Based on CCH rules on static KGs that require connected and closed variables, TEH rules assign each atomic formula a time variable. Unlike TIH rules, TEH rules have the characteristic of time growth, which means the time sequence is increasing and the time in the rule head is the maximum.
For example, the following rule is a TEH rule with length 2:
$$
\begin{split}\epsilon,\;\forall X,Y,Z\;\;&r_{1}(X,Y):t_{1}\land r_{2}(Y,Z):t_{
2}\\
&\rightarrow r(X,Z):t,\;\;s.t.\;\;t_{1}\leqslant t_{2}<t.\end{split} \tag{3}
$$
Notably, for rule learning and reasoning, $t_{1}$ , $t_{2}$ , and $t$ are also virtual time variables that are only used to enforce time growth and do not have to be instantiated.
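Under these definitions, checking the time-growth condition of Eq. (3) for one grounded TEH rule reduces to a pair of comparisons: the body times must be non-decreasing and the last body time must strictly precede the head time. A minimal sketch (the helper is illustrative, not part of Tunsr):

```python
def teh_grounding_valid(body_times, head_time):
    """Check the TEH time-growth condition t_1 <= t_2 <= ... < t (Eq. 3):
    body times are non-decreasing and strictly precede the head time."""
    ordered = all(a <= b for a, b in zip(body_times, body_times[1:]))
    return ordered and (not body_times or body_times[-1] < head_time)

# A length-2 grounding with t1 <= t2 < t satisfies time growth...
valid = teh_grounding_valid([3, 5], 7)
# ...while a body fact occurring after the head violates it.
invalid = teh_grounding_valid([3, 9], 7)
```

A TIH checker would instead compare the body times against whichever relative-order condition the rule carries (e.g., $t_{1}<t_{2}$ ), without constraining the head time.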
Figure 3: An overview of the Tunsr. It utilizes multiple logic blocks to find the answer, where the reasoning graph is constructed and iteratively expanded. Meanwhile, a forward logic message-passing mechanism is proposed to update embeddings and attentions for unified propositional and FOL reasoning.
(a) An example of reasoning graph in SKGs.
(b) An example of reasoning graph in TKGs.
Figure 4: Examples of the reasoning graph with three iterations. (a) is on SKGs while (b) is on TKGs.
## 3 Methodology
In this section, we present the technical details of our Tunsr model. It leverages a combination of logic blocks to obtain reasoning results, which involves constructing or expanding a reasoning graph and introducing a forward logic message-passing mechanism for propositional and FOL reasoning. The overall architecture is illustrated in Figure 3.
### 3.1 Reasoning Graph Construction
For each query of KGs, i.e., $\mathcal{Q}=(\tilde{s},\tilde{r},?)$ for SKGs or $\mathcal{Q}=(\tilde{s},\tilde{r},?,\tilde{t})$ for TKGs, we introduce an expanding reasoning graph to find the answer. The formulation is as follows.
Reasoning Graph. For a specific query $\mathcal{Q}$ , a reasoning graph is defined as $\widetilde{\mathcal{G}}=\{\mathcal{O},\mathcal{R},\widetilde{\mathcal{F}}\}$ for propositional and first-order reasoning. $\mathcal{O}$ is a node set that consists of nodes at different iteration steps, i.e., $\mathcal{O}=\mathcal{O}_{0}\cup\mathcal{O}_{1}\cup\cdots\cup\mathcal{O}_{L}$ . For SKGs, $\mathcal{O}_{0}$ contains only the query entity $\tilde{s}$ , and the subsequent node sets consist of entities. $(n_{i}^{l},\bar{r},n_{j}^{l+1})\in\widetilde{\mathcal{F}}$ is an edge that links nodes at two neighboring steps, i.e., $n_{i}^{l}\in\mathcal{O}_{l}$ , $n_{j}^{l+1}\in\mathcal{O}_{l+1}$ and $\bar{r}\in\mathcal{R}$ . The reasoning graph is constantly expanded by searching for posterior neighbor nodes. For the start node $n^{0}=\tilde{s}$ , its posterior neighbors are $\mathcal{N}(n^{0})=\{e_{i}|(\tilde{s},\bar{r},e_{i})\in\mathcal{F}\}$ . For a node at following steps $n_{i}^{l}=e_{i}\in\mathcal{O}_{l}$ , its posterior neighbors are $\mathcal{N}(n_{i}^{l})=\{e_{j}|(e_{i},\bar{r},e_{j})\in\mathcal{F}\}$ , and its preceding parents are $\widetilde{\mathcal{N}}(n_{i}^{l})=\{(n_{j}^{l-1},\bar{r})|n_{j}^{l-1}\in \mathcal{O}_{l-1}\land(n_{j}^{l-1},\bar{r},n_{i}^{l})\in\widetilde{\mathcal{F}}\}$ . To carry preceding nodes over to the current step, an extra relation self is added. Then, $n_{i}^{l}=e_{i}$ can be retained at the next step as $n_{i}^{l+1}=e_{i}$ , with $(n_{i}^{l},self,n_{i}^{l+1})\in\widetilde{\mathcal{F}}$ .
For TKGs, $\mathcal{O}_{0}$ also contains only the query entity $\tilde{s}$ , but the subsequent nodes are entity-time pairs. In the interpolation scenario, for the start node $n^{0}=\tilde{s}$ , its posterior neighbors are $\mathcal{N}(n^{0})=\{(e_{i},t_{i})|(\tilde{s},\bar{r},e_{i},t_{i})\in\mathcal{F}\}$ . For a node at following steps $n_{i}^{l}=(e_{i},t_{i})\in\mathcal{O}_{l}$ , its posterior neighbors are $\mathcal{N}(n_{i}^{l})=\{(e_{j},t_{j})|(e_{i},\bar{r},e_{j},t_{j})\in\mathcal{F}\}$ . In contrast, in the extrapolation scenario, for the start node $n^{0}=\tilde{s}$ , its posterior neighbors are $\mathcal{N}(n^{0})=\{(e_{i},t_{i})|(\tilde{s},\bar{r},e_{i},t_{i})\in\mathcal{F}\land t_{i}<\tilde{t}\}$ . For a node at following steps $n_{i}^{l}=(e_{i},t_{i})\in\mathcal{O}_{l}$ , its posterior neighbors are $\mathcal{N}(n_{i}^{l})=\{(e_{j},t_{j})|(e_{i},\bar{r},e_{j},t_{j})\in\mathcal{F}\land t_{i}\leqslant t_{j}\land t_{j}<\tilde{t}\}$ . Similar to the SKG case, the preceding parents of nodes in TKG scenarios are also $\widetilde{\mathcal{N}}(n_{i}^{l})=\{(n_{j}^{l-1},\bar{r})|n_{j}^{l-1}\in \mathcal{O}_{l-1}\land(n_{j}^{l-1},\bar{r},n_{i}^{l})\in\widetilde{\mathcal{F}}\}$ , and an extra relation self is also added. Then, $n_{i}^{l}=(e_{i},t_{i})$ can be retained at the next step as $n_{i}^{l+1}=(e_{i},t_{i})$ (where $t_{i}$ is the minimum time if $l=0$ ), with $(n_{i}^{l},self,n_{i}^{l+1})\in\widetilde{\mathcal{F}}$ .
Two examples of the reasoning graph with three iterations are shown in Figure 4. Through the above processing, we can model both propositional and FOL reasoning in a unified manner for different reasoning scenarios.
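One expansion step of the reasoning graph can be sketched as follows. This is a simplified illustration under our own naming: nodes are (entity, time) pairs, a self edge keeps each frontier node alive, and when a query time is supplied the extrapolation constraints $t_{i}\leqslant t_{j}\land t_{j}<\tilde{t}$ are applied (the start node is given a placeholder time here, rather than the minimum time used in the paper):

```python
from collections import defaultdict

def expand(frontier, facts, query_time=None):
    """One expansion step of a reasoning graph. `frontier` is a set of
    (entity, time) nodes; `facts` is a list of (s, r, o, t) quadruples.
    With `query_time` set, the extrapolation filters t_i <= t_j < t~ apply."""
    nbrs = defaultdict(list)
    for s, r, o, t in facts:
        nbrs[s].append((r, o, t))
    edges = []
    for (e_i, t_i) in frontier:
        edges.append(((e_i, t_i), "self", (e_i, t_i)))  # extra self relation
        for r, e_j, t_j in nbrs[e_i]:
            if query_time is not None and not (t_i <= t_j < query_time):
                continue  # violates the extrapolation time constraints
            edges.append(((e_i, t_i), r, (e_j, t_j)))
    next_frontier = {tgt for _, _, tgt in edges}
    return edges, next_frontier

facts = [("A", "r1", "B", 1), ("A", "r2", "C", 5), ("B", "r3", "C", 2)]
edges, frontier = expand({("A", 0)}, facts, query_time=4)
```

Here the edge via r2 is pruned because its timestamp (5) is not earlier than the query time (4), while the self edge and the r1 edge survive into the next frontier.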
### 3.2 Modeling of Propositional Reasoning
To decode the answer for a specific query $\mathcal{Q}$ , we introduce an iterative forward message-passing mechanism over the continuously expanding reasoning graph, regulated by propositional and FOL reasoning. In the reasoning graph, we set two learnable parameters for each node $n_{i}^{l}$ to guide the propositional computation: the propositional embedding ${\rm\textbf{x}}_{i}^{l}$ and the propositional attention ${\alpha}_{n_{i}^{l}}$ . For a better presentation, we use the reasoning process on TKGs to illustrate our method; SKGs can be considered a special case of TKGs in which the time information of the nodes in the reasoning graph is removed. The initialized embeddings for entity, relation, and time are denoted as h, g, and e, respectively. Time embeddings are obtained by the generic time encoding [49], as it is fully compatible with attention for capturing temporal dynamics. It is defined as: ${\rm\textbf{e}}_{t}\!=\!\sqrt{\frac{1}{d_{t}}}[{\rm cos}(w_{1}t+b_{1}),\cdots, {\rm cos}(w_{d_{t}}t+b_{d_{t}})]$ , where $[w_{1},\cdots,w_{d_{t}}]$ and $[b_{1},\cdots,b_{d_{t}}]$ are trainable transformation weights and biases, cos denotes the standard cosine function, and $d_{t}$ is the dimension of the time embedding.
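The generic time encoding can be sketched as follows, with fixed random arrays standing in for the trainable parameters $[w_{1},\cdots,w_{d_{t}}]$ and $[b_{1},\cdots,b_{d_{t}}]$:

```python
import numpy as np

def time_encoding(t, w, b):
    """Generic time encoding: e_t = sqrt(1/d_t) * [cos(w_k * t + b_k)]_k.
    In the model, w and b are trainable; here they are fixed arrays."""
    d_t = len(w)
    return np.sqrt(1.0 / d_t) * np.cos(w * t + b)

# Two timestamps encoded with the same parameters yield fixed-size
# vectors; their inner product can reflect temporal proximity.
rng = np.random.default_rng(0)
w, b = rng.normal(size=8), rng.normal(size=8)
e5, e6 = time_encoding(5.0, w, b), time_encoding(6.0, w, b)
```

Each component is bounded by $\sqrt{1/d_{t}}$ , which keeps the encoding on a comparable scale to the learned entity and relation embeddings it is concatenated with.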
Further, the start node $n^{0}$ = $\tilde{s}$ is initialized with its entity embedding ${\rm\textbf{x}}_{\tilde{s}}={\rm\textbf{h}}_{\tilde{s}}$ . A node $n_{i}=(e_{i},t_{i})$ at the following iterations is first represented by a linear transformation of embeddings: ${\rm\textbf{x}}_{i}$ = ${\rm\textbf{W}}_{n}[{\rm\textbf{h}}_{e_{i}}\|{\rm\textbf{e}}_{t_{i}}]$ (throughout the paper, W represents a linear transformation and $\|$ denotes embedding concatenation). Multi-hop propositional reasoning requires constant forward computation along the reasoning sequence toward the target. Thus, forward message-passing is proposed to pass information (i.e., representations and attention weights) from preceding nodes to their posterior neighbor nodes. The computation of each node is thus contextualized with preceding, entity-dependent information, reflecting the continuous accumulation of knowledge and credibility during the reasoning process. Specifically, to update node embeddings at step $l$ +1, a node's own feature and the information from its priors are integrated:
$$
{\rm\textbf{x}}_{j}^{l+1}={\rm\textbf{W}}_{1}^{l}{\rm\textbf{x}}_{j}+\!\!\!\!
\sum_{(n_{i}^{l},\bar{r})\in\widetilde{\mathcal{N}}(n_{j}^{l+1})}\!\!\!\!
\alpha_{n_{i}^{l},\bar{r},n_{j}^{l+1}}{\rm\textbf{W}}_{2}^{l}{\rm\textbf{m}}_{
n_{i}^{l},\bar{r},n_{j}^{l+1}}, \tag{4}
$$
where ${\rm\textbf{m}}_{n_{i}^{l},\bar{r},n_{j}^{l+1}}$ is the message from a preceding node to its posterior node, which is given by the node and relation representations:
$$
{\rm\textbf{m}}_{n_{i}^{l},\bar{r},n_{j}^{l+1}}\!=\!{\rm\textbf{W}}_{3}^{l}[{
\rm\textbf{n}}_{i}^{l}\|{\rm\textbf{g}}_{\bar{r}}\|{\rm\textbf{n}}_{j}]. \tag{5}
$$
This updating form superficially resembles the general message passing in GNNs [16]. However, they are actually different, as ours operates in a one-way, hierarchical manner tailored to the tree-like structure of the reasoning graph. The propositional attention weight $\alpha_{n_{i}^{l},\bar{r},n_{j}^{l+1}}$ is defined for each edge in the reasoning graph. As propositional reasoning is entity-dependent, we compute it from the semantic association between the entity-dependent embeddings of the message and the query:
$$
e_{n_{i}^{l},\bar{r},n_{j}^{l+1}}\!=\!\textsc{sigmoid}({\rm\textbf{W}}_{4}^{l}
[{\rm\textbf{m}}_{n_{i}^{l},\bar{r},n_{j}^{l+1}}\|{\rm\textbf{q}}]), \tag{6}
$$
where ${\rm\textbf{q}}={\rm\textbf{W}}_{q}[{\rm\textbf{h}}_{\tilde{s}}\|{\rm\textbf{g }}_{\tilde{r}}\|{\rm\textbf{e}}_{\tilde{t}}]$ is the query embedding. Then, the softmax normalization is utilized to scale edge attentions on this iteration to [0,1]:
$$
\alpha_{\!n_{i}^{l},\bar{r},n_{j}^{l+1}}\!\!=\!\!\frac{\exp(e_{n_{i}^{l},\bar{
r},n_{j}^{l+1}})}{\sum_{(\!n_{i^{\prime}}^{l},\bar{r}^{\prime})\in\widetilde{
\mathcal{N}}(n_{j}^{l+1}\!)}\!\!\exp(e_{n_{i^{\prime}}^{l},\bar{r}^{\prime},n_
{j}^{l+1}}\!)}, \tag{7}
$$
Finally, the propositional attention of new node $n_{j}^{l+1}$ is aggregated from edges for the next iteration:
$$
\alpha_{n_{j}^{l+1}}=\sum_{(n_{i}^{l},\bar{r})\in\widetilde{\mathcal{N}}(n_{j}^{l+1})}\alpha_{n_{i}^{l},\bar{r},n_{j}^{l+1}}. \tag{8}
$$
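To make the attention flow concrete, the following minimal Python sketch reproduces the normalization of Eqs. (7)-(8) on a toy set of edge scores. The node names, relations, and scalar scores are illustrative only, and the learnable maps of Eqs. (4)-(6) are abstracted into precomputed numbers.

```python
import math

def normalize_edge_attention(edge_scores):
    """Sketch of Eqs. (7)-(8): softmax the edge scores over the incoming
    edges of each new node, then sum them into the node attention.

    edge_scores maps (prev_node, relation, node) -> scalar score e (Eq. (6)).
    Returns (per-edge attention dict, per-node attention dict).
    """
    # Group edges by their target node, since Eq. (7) normalizes per node.
    incoming = {}
    for (i, r, j), e in edge_scores.items():
        incoming.setdefault(j, []).append(((i, r, j), e))

    edge_attn, node_attn = {}, {}
    for j, group in incoming.items():
        z = sum(math.exp(e) for _, e in group)  # softmax denominator, Eq. (7)
        for key, e in group:
            edge_attn[key] = math.exp(e) / z
        # Eq. (8): the new node's attention aggregates its incoming edges.
        node_attn[j] = sum(edge_attn[key] for key, _ in group)
    return edge_attn, node_attn

scores = {("s", "r1", "a"): 0.9, ("s", "r2", "a"): 0.1, ("s", "r1", "b"): 0.5}
edge_attn, node_attn = normalize_edge_attention(scores)
```

As expected, the attentions over the two edges entering node `a` sum to one, and the edge with the larger sigmoid score receives the larger attention.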
### 3.3 Modeling of FOL Reasoning
Different from propositional reasoning, FOL reasoning is entity-independent and generalizes better. As first-order reasoning focuses on the interaction among entity-independent relations, we first obtain the hidden FOL embedding of an edge by fusing the hidden FOL embedding of the preceding node with the current relation representation via a GRU [38]. The FOL representation ${\rm\textbf{y}}$ and attention $b$ are then given by:
$$
{\rm\textbf{y}}_{n_{i}^{l},\bar{r},n_{j}^{l+1}}=\textsc{gru}({\rm\textbf{g}}_{\bar{r}},{\rm\textbf{y}}_{n_{i}^{l}}), \tag{9}
$$
$$
b_{n_{i}^{l},\bar{r},n_{j}^{l+1}}=\textsc{sigmoid}({\rm\textbf{W}}_{5}^{l}{\rm\textbf{y}}_{n_{i}^{l},\bar{r},n_{j}^{l+1}}). \tag{10}
$$
Since a preceding node with high credibility leads to faithful subsequent nodes, the attention of the prior node ($\beta$) flows to the current edge. Then, softmax normalization is applied to scale the edge attentions of this iteration into $[0,1]$:
$$
\begin{split}b_{n_{i}^{l},\bar{r},n_{j}^{l+1}}&=\beta_{n_{i}^{l}}\cdot b_{n_{i}^{l},\bar{r},n_{j}^{l+1}},\\
\beta_{n_{i}^{l},\bar{r},n_{j}^{l+1}}&=\frac{\exp(b_{n_{i}^{l},\bar{r},n_{j}^{l+1}})}{\sum_{(n_{i^{\prime}}^{l},\bar{r}^{\prime})\in\widetilde{\mathcal{N}}(n_{j}^{l+1})}\exp(b_{n_{i^{\prime}}^{l},\bar{r}^{\prime},n_{j}^{l+1}})},\end{split} \tag{11}
$$
Finally, the FOL representation and attention of a new node $n_{j}^{l+1}$ are aggregated from its incoming edges for the next iteration:
$$
\begin{split}{\rm\textbf{y}}_{n_{j}^{l+1}}&=\sum_{(n_{i}^{l},\bar{r})\in\widetilde{\mathcal{N}}(n_{j}^{l+1})}\beta_{n_{i}^{l},\bar{r},n_{j}^{l+1}}{\rm\textbf{y}}_{n_{i}^{l},\bar{r},n_{j}^{l+1}},\\
\beta_{n_{j}^{l+1}}&=\sum_{(n_{i}^{l},\bar{r})\in\widetilde{\mathcal{N}}(n_{j}^{l+1})}\beta_{n_{i}^{l},\bar{r},n_{j}^{l+1}}.\end{split} \tag{12}
$$
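The attention flow of Eqs. (11)-(12) can be sketched in the same spirit. In this simplified sketch, the GRU-based scores $b$ of Eq. (10) are given as plain numbers and `prior_beta` plays the role of $\beta$ of the preceding nodes; all names and values are illustrative.

```python
import math

def fol_attention_step(raw_scores, prior_beta):
    """Sketch of Eqs. (11)-(12): scale each raw edge score by the attention
    of its preceding node, softmax per target node, then sum into the node
    attention.

    raw_scores maps (prev_node, relation, node) -> b of Eq. (10);
    prior_beta maps prev_node -> its FOL attention beta.
    """
    # First line of Eq. (11): the prior node's attention flows to the edge.
    scaled = {k: prior_beta[k[0]] * b for k, b in raw_scores.items()}

    incoming = {}
    for (i, r, j), b in scaled.items():
        incoming.setdefault(j, []).append(((i, r, j), b))

    beta_edge, beta_node = {}, {}
    for j, group in incoming.items():
        z = sum(math.exp(b) for _, b in group)  # second line of Eq. (11)
        for key, b in group:
            beta_edge[key] = math.exp(b) / z
        # Eq. (12): aggregate edge attentions into the new node.
        beta_node[j] = sum(beta_edge[key] for key, _ in group)
    return beta_edge, beta_node

raw = {("a", "r1", "c"): 0.8, ("b", "r2", "c"): 0.8}
beta_edge, beta_node = fol_attention_step(raw, {"a": 1.0, "b": 0.25})
```

Although both edges carry the same raw score, the edge from the more credible prior node `a` ends up with the larger attention, which is exactly the "attention of the prior flows to the current edge" behavior described above.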
**Insights of FOL Rule Learning and Reasoning.**
Tunsr introduces a novel FOL learning and reasoning strategy through the forward logic message-passing mechanism over reasoning graphs. In general, the learning and reasoning of FOL rules on KGs or TKGs follow a two-step fashion [20, 50, 51, 33, 28, 23, 18]. First, the whole dataset is searched to mine rules and their confidences. Second, for a query, the model instantiates all variables to find all groundings of the learned rules and then aggregates the confidences of the eligible rules. For example, for a target entity $o$, its score can be the sum over learned rules with valid groundings, where each rule's confidence can be modeled by a GRU. However, this process is not differentiable and cannot be optimized in an end-to-end manner because of the discrete rule learning and grounding operations. Our model therefore transforms the merging of multiple rules into the merging of possible relations at each step, using the FOL attention as:
$$
\begin{split}&\underbrace{S_{o}=\sum_{\gamma\in\Gamma}\beta_{\gamma}=\sum_{\gamma\in\Gamma}f\big[\textsc{gru}({\rm\textbf{g}}_{\gamma,h},{\rm\textbf{g}}_{\gamma,b^{1}},\cdots,{\rm\textbf{g}}_{\gamma,b^{|\gamma|}})\big]}_{(a)}\\
&\underbrace{\approx\prod_{l=1}^{L}\sum_{n_{j}\in\mathcal{O}_{l}}\bar{f_{l}}\big[\textsc{gru}({\rm\textbf{g}}_{\bar{r}},{\rm\textbf{o}}_{n_{j}}^{l})\big]}_{(b)}.\end{split} \tag{13}
$$
$\beta_{\gamma}$ is the confidence of rule $\gamma$. ${\rm\textbf{g}}_{\gamma,h}$ and ${\rm\textbf{g}}_{\gamma,b^{i}}$ are the relation embeddings of the head $h$ and the $i$-th body atom $b^{i}$ of this rule. Part (a) uses the groundings of the learned rules to calculate reasoning scores, where each rule's confidence is modeled by a GRU and a feedforward network $f$. Since reasoning can be conducted at each step rather than over the whole multi-step path, part (a) can be approximated by part (b), where $\bar{f_{l}}$ denotes the attention calculation. In this way, the process becomes differentiable. This extends and advances Neural LP [21] and DRUM [32] by introducing several specific strategies for unified KG reasoning. Finally, the real FOL rules can be induced by constantly performing attention calculations over the reasoning graph, which is summarized as the Forward Attentive Rule Induction (FARI) algorithm. Algorithm 1 presents the TKG case; the SKG case is obtained by omitting the time information. In this way, Tunsr can capture CCH, TIH, and TEH rules with the specifically designed reasoning graphs described in Section 3.1. As an extra self relation is added to the reasoning graph, the FARI algorithm can obtain all possible rules (no longer than length $L$) by deleting atoms with the self relation from the induced FOL rules.
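The approximation in Eq. (13) rests on the distributivity of multiplication over addition: summing the products of per-step relation weights over all length-$L$ rule bodies equals multiplying the per-step sums. A tiny numerical check illustrates this, with the GRU-based confidence $f$ replaced by scalar relation weights and hypothetical relation names:

```python
import itertools

# Hypothetical per-step relation attentions (stand-ins for GRU confidences).
step_weights = [
    {"friend_of": 0.6, "colleague_of": 0.4},
    {"lives_in": 0.7, "works_in": 0.3},
]

# Part (a): enumerate every length-2 rule body and sum the per-rule scores.
score_a = sum(
    w1 * w2
    for (_, w1), (_, w2) in itertools.product(*[w.items() for w in step_weights])
)

# Part (b): merge the possible relations at each step, then multiply across steps.
score_b = 1.0
for weights in step_weights:
    score_b *= sum(weights.values())

# score_a and score_b agree up to floating-point error.
```

The stepwise form (b) needs only one sum per reasoning step instead of enumerating every rule body, which is what makes the end-to-end differentiable formulation tractable.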
Input: the reasoning graph $\widetilde{\mathcal{G}}$, FOL attentions $\beta$.
Output: the FOL rule set $\Gamma$.
1 Init $\Gamma=\varnothing$, $B(n_{\tilde{s}}^{0})=[0,[]]$, $\mathcal{D}_{0}[n_{\tilde{s}}^{0}]=[1,B(n_{\tilde{s}}^{0})]$;
2 for $l=1$ to $L$ of decoder iterations do
3   Initialize the node-rule dictionary $\mathcal{D}_{l}$;
4   for node $n_{j}^{l}$ in $\mathcal{O}_{l}$ do
5     Set the rule body list $B(n_{j}^{l})=[]$;
6     for $(n_{i}^{l-1},\bar{r})$ of $\widetilde{\mathcal{N}}(n_{j}^{l})$ in $\mathcal{O}_{l-1}$ do
7       Prior $e_{i,l-1}^{2}$, $B(n_{i}^{l-1})=\mathcal{D}_{l-1}[n_{i}^{l-1}]$;
8       for weight $\epsilon$, body $\gamma_{b}$ in $B(n_{i}^{l-1})$ do
9         $\epsilon^{\prime}=e_{i,l-1}^{2}\cdot e_{n_{i}^{l-1},\bar{r},n_{j}^{l}}^{2}$;
10        $\gamma^{\prime}_{b}=\gamma_{b}.add(\bar{r})$, $B(n_{j}^{l}).add([\epsilon^{\prime},\gamma^{\prime}_{b}])$;
11    $e_{j,l}^{2}=\sum\{\epsilon:(\epsilon,\gamma_{b})\in B(n_{j}^{l})\}$;
12    Add $n_{j}^{l}$: $[e_{j,l}^{2},B(n_{j}^{l})]$ to $\mathcal{D}_{l}$;
13  Normalize $e_{j,l}^{2}$ of $n_{j}^{l}$ in $\mathcal{O}_{l}$ using softmax;
14 for $n_{i}^{L}$ in $\mathcal{O}_{L}$ do
15   $e_{i,L}^{2}$, $B(n_{i}^{L})=\mathcal{D}_{L}[n_{i}^{L}]$;
16   for $\epsilon,\gamma_{b}$ in $B(n_{i}^{L})$ do
17     $\Gamma.add([\epsilon,\gamma_{b}[1](X,Y_{1}):t_{1}\land\cdots\land\gamma_{b}[L](Y_{L-1},Z):t_{L}\rightarrow\tilde{r}(X,Z):t])$;
18 Return the rule set $\Gamma$.

Algorithm 1: FARI for FOL rule induction.
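A compact Python rendering of the FARI idea follows. The toy reasoning graph, attention values, and the flat (weight, body) bookkeeping are illustrative simplifications of Algorithm 1; in particular, the per-layer softmax normalization is omitted.

```python
def fari(layers, edge_attn, start):
    """Sketch of FARI: walk the reasoning graph layer by layer, extending
    each partial rule body with the traversed relation and multiplying the
    accumulated weight by the edge attention.

    layers[l] maps a layer-(l+1) node to its preceding (node, relation) pairs;
    edge_attn maps (prev_node, relation, node) -> FOL attention of that edge.
    Returns the (confidence, rule body) pairs reaching the last layer.
    """
    bodies = {start: [(1.0, [])]}  # node -> list of (weight, rule body)
    for layer in layers:
        next_bodies = {}
        for node, preds in layer.items():
            acc = []
            for prev, rel in preds:
                for w, body in bodies.get(prev, []):
                    acc.append((w * edge_attn[(prev, rel, node)], body + [rel]))
            next_bodies[node] = acc
        bodies = next_bodies
    return [rb for per_node in bodies.values() for rb in per_node]

# Toy two-layer reasoning graph with illustrative attentions.
layers = [
    {"a": [("s", "r1")], "b": [("s", "r2")]},
    {"c": [("a", "r3"), ("b", "r4")]},
]
edge_attn = {
    ("s", "r1", "a"): 0.5, ("s", "r2", "b"): 0.5,
    ("a", "r3", "c"): 0.8, ("b", "r4", "c"): 0.2,
}
rules = fari(layers, edge_attn, "s")
# Induced: weight 0.4 for body [r1, r3] and weight 0.1 for body [r2, r4].
```

Each returned body then instantiates a rule of the form $\gamma_{b}[1](X,Y_{1})\land\cdots\land\gamma_{b}[L](Y_{L-1},Z)\rightarrow\tilde{r}(X,Z)$, as in line 17 of Algorithm 1.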
### 3.4 Reasoning Prediction and Process Overview
After computation with $L$ logic blocks, the reasoning score of each entity can be obtained. For each entity $o$ at the last step of the reasoning graph for SKGs, we utilize the representations and attention values from both propositional and FOL reasoning to calculate answer validity:
$$
{\rm\textbf{h}}_{o}=(1-\lambda){\rm\textbf{x}}_{o}+\lambda{\rm\textbf{y}}_{o},\quad\gamma_{o}=(1-\lambda)\alpha_{o}+\lambda\beta_{o}, \tag{14}
$$
where $\lambda$ is a learnable weight balancing propositional and FOL reasoning; we compute it dynamically from the propositional embedding ${\rm\textbf{x}}_{o}$, the FOL embedding ${\rm\textbf{y}}_{o}$, and the query embedding ${\rm\textbf{q}}$. $\alpha_{o}$ and $\beta_{o}$ are the learned attention values of propositional and FOL reasoning, respectively. Based on these, the final score is given by:
$$
s(\mathcal{Q},o)={\rm\textbf{W}}_{5}{\rm\textbf{h}}_{o}+\gamma_{o}. \tag{15}
$$
Reasoning scores of entities that do not appear in the last step of the reasoning graph are set to 0, as this indicates that no propositional or FOL rules are available for them. Finally, the model is optimized with the multi-class log-loss [52], following RED-GNN:
$$
\mathcal{L}=\sum_{\mathcal{Q}}\Big[-s(\mathcal{Q},o)+\log\big(\sum_{\bar{o}\in\mathcal{E}}\exp(s(\mathcal{Q},\bar{o}))\big)\Big], \tag{16}
$$
where $s(\mathcal{Q},o)$ denotes the reasoning score of the labeled entity $o$ for query $\mathcal{Q}$, while $\bar{o}$ is an arbitrary entity. For reasoning on TKGs, we first need to aggregate the node embeddings and attentions that share the same entity to obtain the entity score, because all nodes in the reasoning graph of TKGs except the start node take the form of entity-time pairs.
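A scalar sketch of the scoring and loss computation (Eqs. (14)-(16)) is given below, with the learned representations reduced to single numbers, the linear map of Eq. (15) omitted, and all values illustrative:

```python
import math

def combine(x_o, y_o, alpha_o, beta_o, lam):
    """Eq. (14): blend propositional and FOL representations and attentions
    with the weight lambda (scalar stand-ins for the learned vectors)."""
    h_o = (1 - lam) * x_o + lam * y_o
    gamma_o = (1 - lam) * alpha_o + lam * beta_o
    return h_o, gamma_o

def multiclass_log_loss(scores, answer):
    """Eq. (16) for one query: -s(Q, o) + log(sum over entities of exp(s))."""
    log_z = math.log(sum(math.exp(s) for s in scores.values()))
    return -scores[answer] + log_z

h, gamma = combine(x_o=1.0, y_o=3.0, alpha_o=0.2, beta_o=0.6, lam=0.5)
loss = multiclass_log_loss({"o1": 2.0, "o2": 0.0, "o3": 0.0}, answer="o1")
```

The loss is the negative log-probability of the labeled entity under a softmax over all entity scores, so pushing the labeled score up and competitor scores down drives it toward zero.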
The number of nodes in the reasoning graph may explode, growing exponentially up to $|\mathcal{N}(n_{i})|^{L}$ over iterations, especially for TKGs. For computational efficiency, we introduce iteration fusion and sampling strategies for interpolation and extrapolation reasoning, respectively. In the interpolation scenario, entity-time nodes with the same entity are fused into a single entity node, which is then used to expand the reasoning graph. In the extrapolation scenario, the posterior neighbors of each node are sampled with a maximum of $M$ nodes in each iteration. To sample the $M$ nodes, we follow a time-aware weighted sampling strategy, considering that recent events may have a greater impact on the forecast target. Specifically, for a posterior neighbor node with time $t^{\prime}$, we compute its sampling weight as $\frac{\exp(t^{\prime}-\tilde{t})}{\sum_{\bar{t}}\exp(\bar{t}-\tilde{t})}$ for the query $(\tilde{s},\tilde{r},?,\tilde{t})$, where $\bar{t}$ ranges over the times of all possible posterior neighbor nodes of a prior node. After computing the attention weights of the edges in the same iteration, we retain the top-$N$ edges with the largest attention weights and prune the others.
## 4 Experiments and Results
### 4.1 Experiment Setups
The baselines cover a wide range of mainstream techniques and strategies for KG reasoning, with detailed descriptions provided in the Appendix. In the following parts of this section, we will carry out experiments and analyze results to answer the following four research questions.
$\bullet$ RQ1. How does the unified Tunsr perform in KG reasoning compared to state-of-the-art baselines?
$\bullet$ RQ2. How effective are propositional and FOL reasoning, and is it reasonable to integrate them?
$\bullet$ RQ3. What factors affect the reasoning performance of the Tunsr framework?
$\bullet$ RQ4. What is the actual reasoning process of Tunsr?
### 4.2 Comparison Results (RQ1)
TABLE II: The experiment results of transductive reasoning. The optimal and suboptimal values on each metric are marked in red and blue, respectively. The percent signs (%) for Hits@k metrics are omitted for better presentation. The following tables have a similar setting.
| Model | WN18RR | | | | FB15k-237 | | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | MRR | Hits@1 | Hits@3 | Hits@10 | MRR | Hits@1 | Hits@3 | Hits@10 |
| TransE [19] | 0.481 | 43.30 | 48.90 | 57.00 | 0.342 | 24.00 | 37.80 | 52.70 |
| DistMult [53] | 0.430 | 39.00 | 44.00 | 49.00 | 0.241 | 15.50 | 26.30 | 41.90 |
| UltraE [54] | 0.485 | 44.20 | 50.00 | 57.30 | 0.349 | 25.10 | 38.50 | 54.10 |
| ComplEx-DURA [55] | 0.491 | 44.90 | β | 57.10 | 0.371 | 27.60 | β | 56.00 |
| AutoBLM [56] | 0.490 | 45.10 | β | 56.70 | 0.360 | 26.70 | β | 55.20 |
| SE-GNN [57] | 0.484 | 44.60 | 50.90 | 57.20 | 0.365 | 27.10 | 39.90 | 54.90 |
| RED-GNN [58] | 0.533 | 48.50 | β | 62.40 | 0.374 | 28.30 | β | 55.80 |
| CompoundE [59] | 0.491 | 45.00 | 50.80 | 57.60 | 0.357 | 26.40 | 39.30 | 54.50 |
| GATH [60] | 0.463 | 42.60 | 47.50 | 53.70 | 0.344 | 25.30 | 37.60 | 52.70 |
| TGformer [61] | 0.493 | 45.50 | 50.90 | 56.60 | 0.372 | 27.90 | 41.00 | 55.70 |
| AMIE [62] | 0.360 | 39.10 | β | 48.50 | 0.230 | 14.80 | β | 41.90 |
| AnyBURL [63] | 0.454 | 39.90 | β | 56.20 | 0.342 | 25.80 | β | 50.20 |
| SAFRAN [64] | 0.501 | 45.70 | β | 58.10 | 0.370 | 28.70 | β | 53.10 |
| Neural LP [21] | 0.381 | 36.80 | 38.60 | 40.80 | 0.237 | 17.30 | 25.90 | 36.10 |
| DRUM [32] | 0.382 | 36.90 | 38.80 | 41.00 | 0.238 | 17.40 | 26.10 | 36.40 |
| RLogic [23] | 0.470 | 44.30 | β | 53.70 | 0.310 | 20.30 | β | 50.10 |
| RNNLogic [33] | 0.483 | 44.60 | 49.70 | 55.80 | 0.344 | 25.20 | 38.00 | 53.00 |
| LatentLogic [24] | 0.481 | 45.20 | 49.70 | 55.30 | 0.320 | 21.20 | 32.90 | 51.40 |
| RNN+RotE [65] | 0.550 | 51.00 | 57.20 | 63.50 | 0.353 | 26.50 | 38.70 | 52.90 |
| TCRA [66] | 0.496 | 45.70 | 51.10 | 57.40 | 0.367 | 27.50 | 40.30 | 55.40 |
| Tunsr | 0.558 | 51.36 | 58.25 | 65.78 | 0.389 | 28.82 | 41.83 | 57.15 |
TABLE III: The experiment results on 12 inductive reasoning datasets.
| Metric | Model | WN18RR | | | | FB15k-237 | | | | NELL-995 | | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | | V1 | V2 | V3 | V4 | V1 | V2 | V3 | V4 | V1 | V2 | V3 | V4 |
| MRR | GraIL [67] | 0.627 | 0.625 | 0.323 | 0.553 | 0.279 | 0.276 | 0.251 | 0.227 | 0.481 | 0.297 | 0.322 | 0.262 |
| | RED-GNN [58] | 0.701 | 0.690 | 0.427 | 0.651 | 0.369 | 0.469 | 0.445 | 0.442 | 0.637 | 0.419 | 0.436 | 0.363 |
| | MLSAA [68] | 0.716 | 0.700 | 0.448 | 0.654 | 0.368 | 0.457 | 0.442 | 0.431 | 0.694 | 0.424 | 0.433 | 0.359 |
| | RuleN [69] | 0.668 | 0.645 | 0.368 | 0.624 | 0.363 | 0.433 | 0.439 | 0.429 | 0.615 | 0.385 | 0.381 | 0.333 |
| | Neural LP [21] | 0.649 | 0.635 | 0.361 | 0.628 | 0.325 | 0.389 | 0.400 | 0.396 | 0.610 | 0.361 | 0.367 | 0.261 |
| | DRUM [32] | 0.666 | 0.646 | 0.380 | 0.627 | 0.333 | 0.395 | 0.402 | 0.410 | 0.628 | 0.365 | 0.375 | 0.273 |
| | Tunsr | 0.721 | 0.722 | 0.451 | 0.656 | 0.375 | 0.474 | 0.462 | 0.456 | 0.746 | 0.427 | 0.455 | 0.387 |
| Hits@1 | GraIL [67] | 55.40 | 54.20 | 27.80 | 44.30 | 20.50 | 20.20 | 16.50 | 14.30 | 42.50 | 19.90 | 22.40 | 15.30 |
| | RED-GNN [58] | 65.30 | 63.30 | 36.80 | 60.60 | 30.20 | 38.10 | 35.10 | 34.00 | 52.50 | 31.90 | 34.50 | 25.90 |
| | MLSAA [68] | 66.20 | 64.50 | 39.10 | 61.20 | 29.20 | 36.60 | 35.60 | 34.00 | 56.00 | 33.30 | 34.30 | 25.30 |
| | RuleN [69] | 63.50 | 61.10 | 34.70 | 59.20 | 30.90 | 34.70 | 34.50 | 33.80 | 54.50 | 30.40 | 30.30 | 24.80 |
| | Neural LP [21] | 59.20 | 57.50 | 30.40 | 58.30 | 24.30 | 28.60 | 30.90 | 28.90 | 50.00 | 24.90 | 26.70 | 13.70 |
| | DRUM [32] | 61.30 | 59.50 | 33.00 | 58.60 | 24.70 | 28.40 | 30.80 | 30.90 | 50.00 | 27.10 | 26.20 | 16.30 |
| | Tunsr | 66.25 | 66.31 | 38.11 | 61.55 | 30.44 | 37.88 | 37.90 | 36.37 | 73.13 | 32.67 | 37.13 | 27.30 |
| Hits@10 | GraIL [67] | 76.00 | 77.60 | 40.90 | 68.70 | 42.90 | 42.40 | 42.40 | 38.90 | 56.50 | 49.60 | 51.80 | 50.60 |
| | RED-GNN [58] | 79.90 | 78.00 | 52.40 | 72.10 | 48.30 | 62.90 | 60.30 | 62.10 | 86.60 | 60.10 | 59.40 | 55.60 |
| | MLSAA [68] | 81.10 | 79.60 | 54.40 | 72.40 | 49.00 | 61.60 | 58.90 | 59.70 | 87.80 | 59.40 | 59.20 | 55.00 |
| | RuleN [69] | 73.00 | 69.40 | 40.70 | 68.10 | 44.60 | 59.90 | 60.00 | 60.50 | 76.00 | 51.40 | 53.10 | 48.40 |
| | Neural LP [21] | 77.20 | 74.90 | 47.60 | 70.60 | 46.80 | 58.60 | 57.10 | 59.30 | 87.10 | 56.40 | 57.60 | 53.90 |
| | DRUM [32] | 77.70 | 74.70 | 47.70 | 70.20 | 47.40 | 59.50 | 57.10 | 59.30 | 87.30 | 54.00 | 57.70 | 53.10 |
| | Tunsr | 85.87 | 83.98 | 60.76 | 73.28 | 55.96 | 63.24 | 61.43 | 63.28 | 88.56 | 62.14 | 61.05 | 58.78 |
TABLE IV: The experiment results (Hits@10 metrics) on 12 inductive reasoning datasets with 50 negative entities for ranking.
| Model | WN18RR | | | | FB15k-237 | | | | NELL-995 | | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | V1 | V2 | V3 | V4 | V1 | V2 | V3 | V4 | V1 | V2 | V3 | V4 |
| GraIL [67] | 82.45 | 78.68 | 58.43 | 73.41 | 64.15 | 81.80 | 82.83 | 89.29 | 59.50 | 93.25 | 91.41 | 73.19 |
| CoMPILE [70] | 83.60 | 79.82 | 60.69 | 75.49 | 67.64 | 82.98 | 84.67 | 87.44 | 58.38 | 93.87 | 92.77 | 75.19 |
| TACT [71] | 84.04 | 81.63 | 67.97 | 76.56 | 65.76 | 83.56 | 85.20 | 88.69 | 79.80 | 88.91 | 94.02 | 73.78 |
| RuleN [69] | 80.85 | 78.23 | 53.39 | 71.59 | 49.76 | 77.82 | 87.69 | 85.60 | 53.50 | 81.75 | 77.26 | 61.35 |
| Neural LP [21] | 74.37 | 68.93 | 46.18 | 67.13 | 52.92 | 58.94 | 52.90 | 55.88 | 40.78 | 78.73 | 82.71 | 80.58 |
| DRUM [32] | 74.37 | 68.93 | 46.18 | 67.13 | 52.92 | 58.73 | 52.90 | 55.88 | 19.42 | 78.55 | 82.71 | 80.58 |
| ConGLR [26] | 85.64 | 92.93 | 70.74 | 92.90 | 68.29 | 85.98 | 88.61 | 89.31 | 81.07 | 94.92 | 94.36 | 81.61 |
| SymRITa [72] | 91.22 | 88.32 | 73.22 | 81.67 | 74.87 | 84.41 | 87.11 | 88.97 | 64.50 | 94.22 | 95.43 | 85.56 |
| Tunsr | 93.69 | 93.72 | 86.48 | 89.27 | 95.37 | 89.33 | 89.38 | 92.16 | 89.05 | 97.91 | 94.69 | 92.63 |
TABLE V: The experiment results of interpolation reasoning on the ICEWS14 and ICEWS0515 datasets.
| Model | ICEWS14 | | | | ICEWS0515 | | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | MRR | Hits@1 | Hits@3 | Hits@10 | MRR | Hits@1 | Hits@3 | Hits@10 |
| TTransE [73] | 0.255 | 7.40 | β | 60.10 | 0.271 | 8.40 | β | 61.60 |
| DE-SimplE [74] | 0.526 | 41.80 | 59.20 | 72.50 | 0.513 | 39.20 | 57.80 | 74.80 |
| TA-DistMult [75] | 0.477 | 36.30 | β | 68.60 | 0.474 | 34.60 | β | 72.80 |
| ChronoR [76] | 0.625 | 54.70 | 66.90 | 77.30 | 0.675 | 59.60 | 72.30 | 82.00 |
| TComplEx [77] | 0.610 | 53.00 | 66.00 | 77.00 | 0.660 | 59.00 | 71.00 | 80.00 |
| TNTComplEx [77] | 0.620 | 52.00 | 66.00 | 76.00 | 0.670 | 59.00 | 71.00 | 81.00 |
| TeLM [78] | 0.625 | 54.50 | 67.30 | 77.40 | 0.678 | 59.90 | 72.80 | 82.30 |
| BoxTE [79] | 0.613 | 52.80 | 66.40 | 76.30 | 0.667 | 58.20 | 71.90 | 82.00 |
| RotateQVS [80] | 0.591 | 50.70 | 64.20 | 75.04 | 0.633 | 52.90 | 70.90 | 81.30 |
| TeAST [27] | 0.637 | 56.00 | 68.20 | 78.20 | 0.683 | 60.40 | 73.20 | 82.90 |
| Tunsr | 0.648 | 56.21 | 69.61 | 80.16 | 0.705 | 59.89 | 74.67 | 83.95 |
TABLE VI: The experiment results of extrapolation reasoning, including ICEWS14, ICEWS0515, and ICEWS18 datasets.
| Model | ICEWS14 | | | | ICEWS0515 | | | | ICEWS18 | | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | MRR | Hits@1 | Hits@3 | Hits@10 | MRR | Hits@1 | Hits@3 | Hits@10 | MRR | Hits@1 | Hits@3 | Hits@10 |
| TransE [19] | 0.224 | 13.36 | 25.63 | 41.23 | 0.225 | 13.05 | 25.61 | 42.05 | 0.122 | 5.84 | 12.81 | 25.10 |
| DistMult [53] | 0.276 | 18.16 | 31.15 | 46.96 | 0.287 | 19.33 | 32.19 | 47.54 | 0.107 | 4.52 | 10.33 | 21.25 |
| ComplEx [81] | 0.308 | 21.51 | 34.48 | 49.58 | 0.316 | 21.44 | 35.74 | 52.04 | 0.210 | 11.87 | 23.47 | 39.87 |
| TTransE [73] | 0.134 | 3.11 | 17.32 | 34.55 | 0.157 | 5.00 | 19.72 | 38.02 | 0.083 | 1.92 | 8.56 | 21.89 |
| TA-DistMult [75] | 0.264 | 17.09 | 30.22 | 45.41 | 0.243 | 14.58 | 27.92 | 44.21 | 0.167 | 8.61 | 18.41 | 33.59 |
| TA-TransE [75] | 0.174 | 0.00 | 29.19 | 47.41 | 0.193 | 1.81 | 31.34 | 50.33 | 0.125 | 0.01 | 17.92 | 37.38 |
| DE-SimplE [74] | 0.326 | 24.43 | 35.69 | 49.11 | 0.350 | 25.91 | 38.99 | 52.75 | 0.193 | 11.53 | 21.86 | 34.80 |
| TNTComplEx [77] | 0.321 | 23.35 | 36.03 | 49.13 | 0.275 | 19.52 | 30.80 | 42.86 | 0.212 | 13.28 | 24.02 | 36.91 |
| RE-Net [82] | 0.382 | 28.68 | 41.34 | 54.52 | 0.429 | 31.26 | 46.85 | 63.47 | 0.288 | 19.05 | 32.44 | 47.51 |
| CyGNet [83] | 0.327 | 23.69 | 36.31 | 50.67 | 0.349 | 25.67 | 39.09 | 52.94 | 0.249 | 15.90 | 28.28 | 42.61 |
| AnyBURL [63] | 0.296 | 21.26 | 33.33 | 46.73 | 0.320 | 23.72 | 35.45 | 50.46 | 0.227 | 15.10 | 25.44 | 38.91 |
| TLogic [28] | 0.430 | 33.56 | 48.27 | 61.23 | 0.469 | 36.21 | 53.13 | 67.43 | 0.298 | 20.54 | 33.95 | 48.53 |
| TR-Rules [29] | 0.433 | 33.96 | 48.55 | 61.17 | 0.476 | 37.06 | 53.80 | 67.57 | 0.304 | 21.10 | 34.58 | 48.92 |
| xERTE [84] | 0.407 | 32.70 | 45.67 | 57.30 | 0.466 | 37.84 | 52.31 | 63.92 | 0.293 | 21.03 | 33.51 | 46.48 |
| TITer [85] | 0.417 | 32.74 | 46.46 | 58.44 | β | β | β | β | 0.299 | 22.05 | 33.46 | 44.83 |
| TECHS [30] | 0.438 | 34.59 | 49.36 | 61.95 | 0.483 | 38.34 | 54.69 | 68.92 | 0.308 | 21.81 | 35.39 | 49.82 |
| INFER [86] | 0.441 | 34.52 | 48.92 | 62.14 | 0.483 | 37.61 | 54.30 | 68.52 | 0.317 | 21.94 | 35.64 | 50.88 |
| Tunsr | 0.447 | 35.16 | 50.39 | 63.32 | 0.491 | 38.31 | 55.67 | 69.88 | 0.321 | 22.99 | 36.68 | 51.08 |
The experiments on transductive, inductive, interpolation, and extrapolation reasoning are carried out to evaluate the performance. The results are shown in Tables II, III, V and VI, respectively. It can be observed that our model has performance advantages over neural, symbolic, and neurosymbolic methods.
Specifically, Table II shows that Tunsr achieves the optimal transductive reasoning performance. Compared with advanced neural methods, Tunsr shows clear advantages: it improves the Hits@10 values on the two datasets by 8.78%, 16.78%, 8.48%, 8.68%, 9.08%, 3.38%, 8.18%, 12.08% and 4.45%, 15.25%, 3.05%, 1.15%, 1.95%, 1.35%, 2.65%, 4.45% over TransE, DistMult, UltraE, ComplEx-DURA, AutoBLM, RED-GNN, CompoundE, and GATH, respectively. Moreover, the advantages of Tunsr over symbolic and neurosymbolic methods are even more obvious. Over the symbolic methods (AMIE, AnyBURL, and SAFRAN), the average improvements in MRR, Hits@1, and Hits@10 are 0.119, 9.79%, 11.51% and 0.075, 5.72%, 8.75% on the two datasets. Over the advanced neurosymbolic methods RNNLogic, LatentLogic, and RNN+RotE, Tunsr also shows performance advantages, achieving Hits@10 improvements of 9.98%, 10.48%, 2.28% and 4.15%, 5.75%, 4.25% on the two datasets.
For inductive reasoning, Table III shows that Tunsr also outperforms all neural, symbolic, and neurosymbolic methods, especially on the WN18RR V1, WN18RR V2, WN18RR V3, FB15k-237 V1, and NELL-995 V1 datasets. Specifically, Tunsr is better than the neural methods GraIL, MLSAA, and RED-GNN. Compared with RED-GNN, it achieves 5.97%, 5.98%, 8.36%, 1.18%, 7.66%, 0.34%, 1.13%, 1.18%, 1.96%, 2.04%, 1.65%, and 3.18% improvements on the Hits@10 metric, an average improvement of 3.39%. Over the symbolic and neurosymbolic methods (RuleN, Neural LP, and DRUM), Tunsr has greater performance advantages: compared with DRUM, it achieves average improvements of 0.069, 8.19%, and 6.05% on the MRR, Hits@1, and Hits@10 metrics, respectively. Besides, for a fair comparison with CoMPILE [70], TACT [71], ConGLR [26], and SymRITa [72], we also evaluate under their setting, which ranks each query against 50 negative entities rather than all entities. The results, shown in Table IV, further verify the superiority of our model.
[Line chart: Hits@10 values (%) over training epochs for the propositional, FOL, and unified reasoning variants; the unified variant is consistently highest, and all curves plateau after about 20 epochs.]
(a) WN18RR of SKG T.
[Line chart: Hits@10 values (%) over training epochs for the propositional, FOL, and unified reasoning variants; the unified variant leads throughout, leveling off near 80% after epoch 26.]
(b) ICEWS14 of TKG I.
[Line chart: Hits@10 values (%) over 10 training epochs for the propositional, FOL, and unified reasoning variants; the unified variant is consistently highest, while the propositional variant fluctuates most.]
(c) ICEWS14 of TKG E.
<details>
<summary>extracted/6596839/fig/zhexian4.png Details</summary>

### Visual Description
## Line Chart: Hits@10 Values vs. Training Epochs
### Overview
This line chart depicts the performance of three different approaches (Propositional, FOL (First-Order Logic), and Unified) over 10 training epochs. The performance metric is "Hits@10 Values (%)", representing the percentage of times the correct answer appears within the top 10 predicted values.
### Components/Axes
* **X-axis:** Training Epochs (ranging from 1 to 10)
* **Y-axis:** Hits@10 Values (%) (ranging from 30 to 60)
* **Legend:** Located at the top-center of the chart, identifying the three data series:
* Propositional (Purple line with circle markers)
* FOL (Blue line with square markers)
* Unified (Red line with star markers)
* **Gridlines:** Present to aid in reading values.
### Detailed Analysis
Let's analyze each line individually, noting trends and approximate data points.
* **Propositional (Purple):** This line shows an upward trend from Epoch 1 to Epoch 5, then plateaus.
* Epoch 1: ~40%
* Epoch 2: ~41%
* Epoch 3: ~42.5%
* Epoch 4: ~44%
* Epoch 5: ~46%
* Epoch 6: ~45.5%
* Epoch 7: ~45.5%
* Epoch 8: ~45.5%
* Epoch 9: ~45.5%
* Epoch 10: ~45.5%
* **FOL (Blue):** This line fluctuates throughout the 10 epochs, with a slight downward trend overall.
* Epoch 1: ~49%
* Epoch 2: ~49%
* Epoch 3: ~50%
* Epoch 4: ~48.5%
* Epoch 5: ~47%
* Epoch 6: ~47%
* Epoch 7: ~48%
* Epoch 8: ~48%
* Epoch 9: ~48%
* Epoch 10: ~48%
* **Unified (Red):** This line starts relatively high and exhibits a slight downward trend initially, then stabilizes and shows a slight increase towards the end.
* Epoch 1: ~52%
* Epoch 2: ~51%
* Epoch 3: ~51.5%
* Epoch 4: ~50.5%
* Epoch 5: ~49%
* Epoch 6: ~49%
* Epoch 7: ~49.5%
* Epoch 8: ~50%
* Epoch 9: ~50.5%
* Epoch 10: ~50.5%
### Key Observations
* The Propositional approach demonstrates the most significant improvement in Hits@10 values during the initial training epochs (1-5).
* The FOL approach exhibits the most stable performance, fluctuating around the 48-50% range.
* The Unified approach starts with the highest Hits@10 values but shows a slight decline before stabilizing.
* The Propositional approach ultimately closes much of the gap to the FOL and Unified approaches, though it remains slightly below both.
* No approach reaches 60% Hits@10 values.
### Interpretation
The chart suggests that the Propositional approach benefits most from initial training, indicating it requires more epochs to converge. The FOL and Unified approaches achieve relatively stable performance early on, but their improvement plateaus. The Unified approach initially outperforms the others, but the gap closes as the Propositional approach improves.
The fact that none of the approaches reach a high level of accuracy (above 60%) suggests that the task is challenging, or that further improvements to the models or training process are needed. The differences in performance between the approaches could be due to the inherent limitations of each method in representing and reasoning about the underlying data. The plateauing of performance for all approaches after a certain number of epochs indicates diminishing returns from continued training, suggesting that the models may have reached their capacity to learn from the available data.
</details>
(d) ICEWS18 of TKG E.
Figure 5: The impacts of propositional and FOL reasoning on transductive, interpolation, and extrapolation scenarios. It is generally observed that the unified model has a better performance compared with the single propositional or FOL setting, demonstrating the validity and rationality of the unified mechanism in Tunsr.
For interpolation reasoning in Table V, Tunsr surpasses mainstream neural reasoning methods, achieving optimal results on seven of the eight metrics. Compared with TNTComplEx, the classic tensor-decomposition method, the improvements on each metric are 0.028, 4.21%, 3.61%, 4.16%, 0.035, 0.89%, 3.67%, and 2.95%, respectively. Moreover, compared with the state-of-the-art model TeAST, which encodes temporal knowledge graph embeddings via the Archimedean spiral timeline, Tunsr also holds performance advantages of 0.011, 0.21%, 1.41%, 1.96%, 0.022, -0.51%, 1.47%, and 1.05% (only slightly lower on the Hits@1 metric of the ICEWS0515 dataset).
As Table VI shows for extrapolation reasoning, Tunsr also performs better. Compared with 10 neural reasoning methods, Tunsr has clear performance advantages. For instance, it achieves 14.19%, 27.02%, and 14.17% Hits@10 improvements on the three datasets against the tensor-decomposition method TNTComplEx. Additionally, Tunsr outperforms symbolic rule-based methods, i.e., AnyBURL, TLogic, and TR-Rules, achieving average improvements of 0.061, 5.57%, 7.01%, 6.94%, 0.069, 5.98%, 8.21%, 8.06%, 0.045, 4.08%, 5.36%, and 5.63% on all 12 evaluation metrics. Moreover, Tunsr surpasses three neurosymbolic methods (xERTE, TITer, and INFER) across all datasets. Furthermore, compared with the previous study TECHS, Tunsr also achieves a performance boost, with Hits@10 gains of 1.37%, 0.96%, and 1.26%.
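The MRR and Hits@k figures compared above are both derived from the rank of the gold entity among all candidate answers for each test query. A minimal sketch of this standard computation (the helper name is ours, not from the paper):

```python
def mrr_and_hits(ranks, ks=(1, 3, 10)):
    """MRR and Hits@k from 1-based ranks of the gold entities.

    ranks: one rank per test query, where rank 1 means the gold
    entity was scored highest among all candidates.
    """
    n = len(ranks)
    # MRR averages the reciprocal of each gold rank.
    mrr = sum(1.0 / r for r in ranks) / n
    # Hits@k is the fraction of queries whose gold rank is <= k.
    hits = {k: sum(r <= k for r in ranks) / n for k in ks}
    return mrr, hits

# Three queries whose gold entities rank 1st, 4th, and 12th
mrr, hits = mrr_and_hits([1, 4, 12])  # mrr = (1 + 1/4 + 1/12) / 3
```

Improvements reported without a percent sign (e.g., 0.028) refer to MRR, which lives on a 0-1 scale, while Hits@k gains are given in percentage points.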
In summary, the experimental results on four reasoning scenarios demonstrate the effectiveness and superiority of the proposed unified framework Tunsr. They show the rationality of the unified mechanism from both methodological and application perspectives and indicate its strong potential for future KG reasoning frameworks.
### 4.3 Ablation Studies (RQ2)
To explore the impacts of the propositional and FOL parts on KG reasoning performance, we carry out ablation studies on transductive (WN18RR), interpolation (ICEWS14), and extrapolation (ICEWS14 and ICEWS18) scenarios in Figure 5. As inductive reasoning is entity-independent, we only conduct experiments using FOL reasoning for it. In each line chart, we depict the performance trends associated with propositional, FOL, and unified reasoning throughout the training epochs. In the propositional/FOL setting, we set $\lambda$ in Eq. (14) to 0/1, indicating the model only uses propositional/FOL reasoning to get the answer. In the unified setting, the value of $\lambda$ is dynamically learned from embeddings. From the figure, it is generally observed that the unified setting performs better than the single propositional or FOL setting. It is noteworthy that propositional and FOL reasoning display distinct characteristics on different datasets. For transductive and interpolation reasoning (Figures 5(a) and 5(b)), the performance of propositional reasoning consistently surpasses that of FOL, although both improve continuously throughout training. The opposite holds in the extrapolation scenario (Figures 5(c) and 5(d)), where FOL reasoning has the performance advantage. Notably, propositional reasoning performs well on ICEWS18 but poorly on ICEWS14 under the extrapolation setting. This may be caused by the structural differences between the two datasets: compared with ICEWS14, the graph structure of ICEWS18 is notably denser (8.94 vs. 16.19 in average node degree), so propositional reasoning on ICEWS18 can capture more comprehensive pattern semantics and generalize robustly at test time. These observations indicate that propositional and FOL reasoning emphasize distinct aspects of knowledge. Thus, combining them allows for the synergistic exploitation of their respective strengths, resulting in an enhanced overall effect.
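The $\lambda$-weighted combination ablated here can be illustrated with a small sketch. The gating function below (a sigmoid over a linear score of the query embedding) is our assumption of one way $\lambda$ could be dynamically learned, not the paper's exact parameterization of Eq. (14):

```python
import math

def unified_score(prop_score, fol_score, lam):
    # Eq. (14)-style mixture: lam = 0 keeps only propositional
    # reasoning, lam = 1 keeps only FOL reasoning.
    return (1.0 - lam) * prop_score + lam * fol_score

def dynamic_lambda(query_emb, w, b=0.0):
    # Hypothetical learned gate: sigmoid of a linear score of the
    # query embedding, so lam always lies in (0, 1).
    z = sum(q * wi for q, wi in zip(query_emb, w)) + b
    return 1.0 / (1.0 + math.exp(-z))
```

Fixing `lam` to 0 or 1 recovers the single-mechanism ablations in Figure 5, while the unified model lets the gate choose a query-specific balance between the two reasoning signals.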
<details>
<summary>extracted/6596839/fig/bar1.png Details</summary>

### Visual Description
## Bar Chart: Performance Comparison by Step Count
### Overview
This image presents a bar chart comparing the performance of a system across different numbers of steps (2, 4, 6, and 8) for three different metrics: MRR (Mean Reciprocal Rank), Hits@1, and Hits@10. The chart uses colored bars to represent each step count for each metric, allowing for a visual comparison of performance.
### Components/Axes
* **X-axis:** Represents the evaluation metrics: "MRR", "Hits@1", and "Hits@10".
* **Y-axis:** Represents the performance score, ranging from 0.0 to 0.7, with increments of 0.1.
* **Legend (Top-Center):**
* "2 steps" - Teal/Cyan color
* "4 steps" - Blue color
* "6 steps" - Purple/Lavender color
* "8 steps" - Red color
### Detailed Analysis
The chart consists of three groups of four bars each, corresponding to the three metrics on the x-axis. Each group compares the performance of the system using 2, 4, 6, and 8 steps.
**MRR:**
* 2 steps: Approximately 0.41
* 4 steps: Approximately 0.49
* 6 steps: Approximately 0.54
* 8 steps: Approximately 0.56
*Trend:* The bars for MRR show an upward trend as the number of steps increases.
**Hits@1:**
* 2 steps: Approximately 0.39
* 4 steps: Approximately 0.48
* 6 steps: Approximately 0.52
* 8 steps: Approximately 0.55
*Trend:* The bars for Hits@1 also show an upward trend as the number of steps increases.
**Hits@10:**
* 2 steps: Approximately 0.44
* 4 steps: Approximately 0.56
* 6 steps: Approximately 0.59
* 8 steps: Approximately 0.65
*Trend:* The bars for Hits@10 demonstrate a clear upward trend as the number of steps increases.
### Key Observations
* Increasing the number of steps consistently improves performance across all three metrics.
* The largest performance gains are observed when increasing from 2 to 4 steps. The improvement from 6 to 8 steps is less pronounced, but still positive.
* Hits@10 consistently has the highest scores, indicating that the system is more effective at retrieving relevant items within the top 10 results.
* MRR has the lowest scores, suggesting that the ranking of the top results could be improved.
### Interpretation
The data suggests that increasing the number of steps in the system's process leads to improved retrieval performance, as measured by MRR, Hits@1, and Hits@10. This could indicate that more complex or iterative processing yields better results. The diminishing returns observed when moving from 6 to 8 steps suggest that there may be a point of diminishing returns, where further increasing the number of steps does not significantly improve performance. The differences between the metrics suggest that the system is better at identifying relevant items within a larger set (Hits@10) than at ranking the most relevant item at the very top (MRR). This could be due to factors such as the quality of the ranking algorithm or the nature of the data being searched. The chart provides valuable insights into the relationship between processing complexity and retrieval performance, which can be used to optimize the system for specific performance goals.
</details>
(a) WN18RR of SKG T.
<details>
<summary>extracted/6596839/fig/bar2.png Details</summary>

### Visual Description
## Bar Chart: Performance Comparison by Step Count
### Overview
This bar chart compares the performance of a system across three metrics (MRR, Hits@1, Hits@10) using four different step counts (2, 4, 6, and 8). The performance is represented by a numerical value ranging from 0 to 0.7 on the y-axis.
### Components/Axes
* **X-axis:** Represents the evaluation metrics: "MRR", "Hits@1", and "Hits@10".
* **Y-axis:** Represents the performance score, ranging from 0.0 to 0.7, with increments of 0.1.
* **Legend:** Located at the top-center of the chart, identifies the different step counts with corresponding colors:
* 2 steps (Light Blue)
* 4 steps (Medium Blue)
* 6 steps (Light Purple)
* 8 steps (Red)
### Detailed Analysis
The chart consists of three groups of bars, one for each metric. Within each group, there are four bars representing the performance at each step count.
**MRR (Mean Reciprocal Rank):**
* 2 steps: Approximately 0.35
* 4 steps: Approximately 0.46
* 6 steps: Approximately 0.47
* 8 steps: Approximately 0.42
*Trend:* The MRR score increases from 2 to 6 steps, then slightly decreases at 8 steps.
**Hits@1 (Hit Rate at Rank 1):**
* 2 steps: Approximately 0.30
* 4 steps: Approximately 0.37
* 6 steps: Approximately 0.39
* 8 steps: Approximately 0.34
*Trend:* The Hits@1 score increases from 2 to 6 steps, then decreases at 8 steps.
**Hits@10 (Hit Rate at Rank 10):**
* 2 steps: Approximately 0.45
* 4 steps: Approximately 0.58
* 6 steps: Approximately 0.61
* 8 steps: Approximately 0.62
*Trend:* The Hits@10 score consistently increases with the number of steps, reaching its highest value at 8 steps.
### Key Observations
* The Hits@10 metric shows the most significant and consistent improvement with increasing step counts.
* Both MRR and Hits@1 metrics show improvement up to 6 steps, but performance plateaus or slightly decreases at 8 steps.
* The performance difference between 2 steps and 8 steps is most pronounced for Hits@10.
### Interpretation
The data suggests that increasing the number of steps generally improves the system's performance, particularly in terms of finding relevant results within the top 10 (Hits@10). However, there appears to be a diminishing return or even a slight degradation in performance for MRR and Hits@1 when exceeding 6 steps. This could indicate that beyond a certain point, additional steps introduce noise or complexity that negatively impacts the system's ability to rank the most relevant results highly. The optimal step count appears to be around 6 for maximizing both precision (Hits@1) and overall ranking quality (MRR), while 8 steps provide the best recall (Hits@10). Further investigation might be needed to understand why performance plateaus or decreases at higher step counts.
</details>
(b) FB15k-237 v3 of SKG I.
<details>
<summary>extracted/6596839/fig/bar3.png Details</summary>

### Visual Description
## Bar Chart: Performance Comparison by Step Count
### Overview
The image presents a bar chart comparing the performance of a system across different numbers of steps (1, 2, 3, and 4) using three metrics: MRR (Mean Reciprocal Rank), Hits@1, and Hits@10. The chart visually represents how performance changes as the number of steps increases for each metric.
### Components/Axes
* **X-axis:** Represents the evaluation metrics: "MRR", "Hits@1", and "Hits@10".
* **Y-axis:** Represents the performance score, ranging from 0 to approximately 0.85.
* **Legend:** Located at the top of the chart, identifies the different step counts with corresponding colors:
* 1 step (Teal)
* 2 steps (Light Blue)
* 3 steps (Lavender)
* 4 steps (Red)
### Detailed Analysis
The chart consists of three groups of bars, one for each metric. Within each group, there are four bars representing the performance at 1, 2, 3, and 4 steps.
**MRR:**
* 1 step: Approximately 0.48
* 2 steps: Approximately 0.63
* 3 steps: Approximately 0.65
* 4 steps: Approximately 0.67
*Trend:* The MRR score generally increases with the number of steps, with diminishing returns after 2 steps.
**Hits@1:**
* 1 step: Approximately 0.39
* 2 steps: Approximately 0.56
* 3 steps: Approximately 0.58
* 4 steps: Approximately 0.59
*Trend:* The Hits@1 score increases significantly from 1 to 2 steps, then plateaus with smaller gains at 3 and 4 steps.
**Hits@10:**
* 1 step: Approximately 0.73
* 2 steps: Approximately 0.78
* 3 steps: Approximately 0.82
* 4 steps: Approximately 0.85
*Trend:* The Hits@10 score consistently increases with the number of steps, showing a clear positive correlation.
### Key Observations
* The largest performance gains are observed when increasing the number of steps from 1 to 2 for all metrics.
* Hits@10 consistently demonstrates the highest performance scores across all step counts.
* The improvement in MRR and Hits@1 diminishes as the number of steps increases beyond 2.
* The performance difference between 3 and 4 steps is minimal for all metrics.
### Interpretation
The data suggests that increasing the number of steps in the system generally improves performance, particularly for Hits@10. However, there appears to be a point of diminishing returns, where adding more steps yields only marginal improvements. This could indicate that the system reaches a level of optimization after a certain number of steps, and further steps do not significantly enhance its ability to rank or retrieve relevant results. The substantial improvement from 1 to 2 steps suggests that the initial step is crucial for establishing a baseline level of performance, and the second step refines the results considerably. The relatively stable performance between 3 and 4 steps suggests that the system is approaching its maximum potential within the evaluated framework. The difference between the metrics indicates that the system is better at placing a relevant result somewhere in the top 10 (Hits@10) than at ranking the most relevant result first (MRR).
</details>
(c) ICEWS14 of TKG I.
<details>
<summary>extracted/6596839/fig/bar4.png Details</summary>

### Visual Description
## Bar Chart: Performance Comparison by Step Count
### Overview
This image presents a bar chart comparing the performance of a system across different metrics (MRR, Hits@1, Hits@10) as a function of the number of steps taken (1, 2, 3, and 4). The chart uses colored bars to represent each step count for each metric.
### Components/Axes
* **X-axis:** Represents the evaluation metrics: "MRR", "Hits@1", and "Hits@10".
* **Y-axis:** Represents the performance score, ranging from 0.0 to 0.7, with increments of 0.1.
* **Legend (Top-Center):**
* Light Blue: "1 step"
* Medium Blue: "2 steps"
* Light Purple: "3 steps"
* Red: "4 steps"
### Detailed Analysis
The chart consists of three groups of bars, one for each metric. Within each group, there are four bars representing the performance at 1, 2, 3, and 4 steps.
**MRR (Mean Reciprocal Rank):**
* 1 step: Approximately 0.39
* 2 steps: Approximately 0.41
* 3 steps: Approximately 0.44
* 4 steps: Approximately 0.46
*Trend:* The bars show an upward trend, indicating that MRR increases with the number of steps.
**Hits@1 (Hit Rate at Rank 1):**
* 1 step: Approximately 0.29
* 2 steps: Approximately 0.30
* 3 steps: Approximately 0.32
* 4 steps: Approximately 0.34
*Trend:* The bars show an upward trend, indicating that Hits@1 increases with the number of steps.
**Hits@10 (Hit Rate at Rank 10):**
* 1 step: Approximately 0.58
* 2 steps: Approximately 0.59
* 3 steps: Approximately 0.61
* 4 steps: Approximately 0.63
*Trend:* The bars show an upward trend, indicating that Hits@10 increases with the number of steps.
### Key Observations
* All metrics show a positive correlation with the number of steps. Increasing the number of steps consistently improves performance.
* The improvement in Hits@10 is more pronounced than the improvement in MRR or Hits@1.
* The difference in performance between 1 step and 2 steps is smaller than the difference between 3 steps and 4 steps for all metrics.
### Interpretation
The data suggests that increasing the number of steps in the system leads to improved performance across all evaluated metrics. This could indicate that the system benefits from more iterations or refinement. The larger improvement observed in Hits@10 suggests that the system is better at identifying relevant results within the top 10 recommendations as the number of steps increases. The non-linear improvement (smaller gains initially, larger gains later) might indicate diminishing returns or a threshold effect where additional steps become more impactful after a certain point. The chart demonstrates a clear trade-off between computational cost (more steps) and performance. Further investigation could explore the optimal number of steps to balance performance gains with computational efficiency.
</details>
(d) ICEWS14 of TKG E.
Figure 6: The impacts of the number of reasoning iterations, which corresponds to the length of the reasoning rules. It is evident that choosing an appropriate value is crucial for obtaining accurate reasoning results.
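The iteration count studied in Figure 6 is the number of expansion steps of the reasoning graph: starting from the query entity, each step adds the posterior neighbors of the current frontier, so an L-step graph supports rules of length L. A minimal sketch of this expansion, assuming a simple adjacency-list encoding of the KG (the function and data format are ours):

```python
def expand_reasoning_graph(kg, query_entity, num_steps):
    """Iteratively expand from the query entity for num_steps steps.

    kg: dict mapping a head entity to a list of (relation, tail)
    pairs; returns the visited entities and step-tagged edges.
    """
    frontier = {query_entity}
    visited = {query_entity}
    edges = []
    for step in range(num_steps):
        next_frontier = set()
        for head in frontier:
            for rel, tail in kg.get(head, []):
                # Record which expansion step produced this edge.
                edges.append((step, head, rel, tail))
                next_frontier.add(tail)
        visited |= next_frontier
        frontier = next_frontier
    return visited, edges

# Two steps reach entities two hops from the query entity
kg = {"a": [("r1", "b")], "b": [("r2", "c")]}
visited, edges = expand_reasoning_graph(kg, "a", 2)
```

Each extra step lengthens the candidate rules but also enlarges the frontier, which is consistent with the accuracy trade-off across step counts seen in Figure 6.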
<details>
<summary>extracted/6596839/fig/bar3d1.png Details</summary>

### Visual Description
## 3D Surface Plot: Hits@10 Values vs. M and N
### Overview
The image presents a 3D surface plot visualizing the relationship between two variables, 'M' and 'N', and their impact on 'Hits@10 Values (%)'. The plot appears to represent a performance metric (Hits@10) as a function of two parameters, M and N. The surface is colored according to the Hits@10 value, with a colorbar indicating the mapping between color and percentage.
### Components/Axes
* **X-axis:** 'N', ranging from approximately 40 to 140, with markers at 40, 60, 80, 100, 120, and 140.
* **Y-axis:** 'M', ranging from approximately 100 to 1000, with markers at 100, 200, 400, 600, 800, and 1000.
* **Z-axis:** 'Hits@10 Values (%)', ranging from approximately 48% to 64%, with markers at 50%, 52%, 54%, 56%, 58%, 60%, 62%, and 64%.
* **Colorbar:** Located on the right side of the plot, mapping colors to Hits@10 Values (%). Dark blue represents approximately 48%, transitioning through green and yellow to dark red representing approximately 64%.
### Detailed Analysis
The surface plot shows a clear peak in Hits@10 Values. The highest values (around 62-64%, represented by dark red) are concentrated in the region where both 'M' and 'N' are relatively high (approximately M = 800-1000 and N = 100-140). As either 'M' or 'N' decreases, the Hits@10 Values generally decrease, reaching the lowest values (around 50%, represented by dark blue) when both 'M' and 'N' are low (approximately M = 100 and N = 40).
Here's a breakdown of approximate Hits@10 values at specific points:
* M = 100, N = 40: ~50% (Dark Blue)
* M = 100, N = 140: ~52% (Light Blue)
* M = 1000, N = 40: ~52% (Light Blue)
* M = 1000, N = 140: ~63% (Dark Red)
* M = 400, N = 80: ~56% (Yellow-Green)
* M = 800, N = 100: ~62% (Red)
* M = 600, N = 60: ~58% (Orange)
The surface is relatively smooth, indicating a gradual change in Hits@10 Values as 'M' and 'N' change. There are no sharp discontinuities or sudden jumps in the surface.
### Key Observations
* **Peak Performance:** The highest Hits@10 Values are achieved when both 'M' and 'N' are large.
* **Sensitivity to Parameters:** The performance is sensitive to both 'M' and 'N'; decreasing either parameter leads to a reduction in Hits@10 Values.
* **Symmetry:** The plot appears roughly symmetrical around the center, suggesting that the effect of 'M' and 'N' on Hits@10 Values is similar.
### Interpretation
This data suggests that the performance metric 'Hits@10 Values' is positively correlated with both parameters 'M' and 'N'. Increasing either 'M' or 'N' (or both) leads to improved performance, with the best performance achieved when both are maximized.
'M' and 'N' likely represent configuration parameters or resource allocations within a system. For example, 'M' could represent the size of a dataset, and 'N' could represent the number of iterations in an algorithm. The plot demonstrates that increasing both the dataset size and the number of iterations leads to better results, up to a point. The peak suggests there might be diminishing returns or practical limitations to increasing 'M' and 'N' indefinitely.
The smooth surface indicates a predictable relationship between the parameters and the performance metric. This allows for informed decisions about resource allocation and parameter tuning to optimize performance. The absence of outliers suggests the system behaves consistently across the tested range of 'M' and 'N' values.
</details>
(a) Performance on ICEWS14.
<details>
<summary>extracted/6596839/fig/bar3d2.png Details</summary>

### Visual Description
## 3D Surface Plot: GPU Memory Usage
### Overview
The image presents a 3D surface plot visualizing the relationship between two variables, *M* and *N*, and their impact on GPU memory usage. The plot displays GPU memory consumption in Gigabytes (GB) as a function of *M* and *N*. The surface is color-coded to represent the magnitude of GPU memory usage, with a color legend provided on the right side of the image.
### Components/Axes
* **X-axis:** *M*, ranging from approximately 40 to 1000.
* **Y-axis:** *N*, ranging from approximately 40 to 140.
* **Z-axis:** GPU memory (GB), ranging from approximately 0 to 40.
* **Color Legend:** A vertical color bar on the right side of the image, representing GPU memory usage in GB. The color gradient ranges from dark blue (0 GB) to dark red (40 GB). Specific values indicated on the legend are: 0, 5, 10, 15, 20, 25, 30, 35, and 40.
### Detailed Analysis
The surface plot shows a complex relationship between *M*, *N*, and GPU memory usage.
* **General Trend:** GPU memory usage generally increases as both *M* and *N* increase, but the relationship is not linear. There appears to be a peak in memory usage around *M* = 600 and *N* = 100.
* **Low *M* Values (40-200):** For low values of *M*, GPU memory usage increases with *N* up to a certain point (around *N* = 100), then plateaus or slightly decreases. The color of the surface is predominantly blue and green, indicating low memory usage (0-20 GB).
* **Medium *M* Values (200-600):** As *M* increases to the 200-600 range, the surface rises, and the color transitions to yellow and orange, indicating moderate memory usage (10-30 GB). The peak memory usage occurs within this range.
* **High *M* Values (600-1000):** For high values of *M*, the surface decreases in height, and the color shifts back towards green and blue, indicating decreasing memory usage.
* **Specific Data Points (Approximate):**
* At *M* = 40, *N* = 40, GPU memory ≈ 0 GB (dark blue).
* At *M* = 40, *N* = 140, GPU memory ≈ 5 GB (light blue).
* At *M* = 200, *N* = 40, GPU memory ≈ 5 GB (light blue).
* At *M* = 200, *N* = 140, GPU memory ≈ 15 GB (green).
* At *M* = 600, *N* = 100, GPU memory ≈ 35 GB (dark orange/red).
* At *M* = 1000, *N* = 40, GPU memory ≈ 10 GB (green).
* At *M* = 1000, *N* = 140, GPU memory ≈ 15 GB (green).
### Key Observations
* The plot exhibits a non-monotonic relationship between *M*, *N*, and GPU memory usage.
* The maximum GPU memory usage is approximately 35-40 GB, occurring around *M* = 600 and *N* = 100.
* GPU memory usage is relatively low for small values of *M* and *N*.
* There is a clear peak in memory usage, identifying a range of *M* and *N* to avoid when memory consumption must be minimized.
### Interpretation
This data likely represents the GPU memory requirements of a computational process or algorithm that depends on two parameters, *M* and *N*. The pronounced peak at intermediate parameter values marks where memory demand is highest; configurations on either side of this region consume noticeably less memory. The peak suggests a computational bottleneck or a phase transition in the algorithm's behavior.
The shape of the surface indicates that the relationship between the parameters and memory usage is complex and not simply additive. The algorithm may involve operations that scale differently with *M* and *N*, leading to the observed non-linear behavior. Understanding this relationship is crucial for optimizing the algorithm's performance and ensuring it can run efficiently on GPUs with limited memory. The plot could be used to guide the selection of appropriate values for *M* and *N* to balance computational efficiency and memory constraints.
</details>
(b) Space on ICEWS14.
<details>
<summary>extracted/6596839/fig/bar3d3.png Details</summary>

### Visual Description
## 3D Bar Chart: Hits@10 Values vs. M and N
### Overview
The image presents a 3D bar chart visualizing the relationship between two variables, 'M' and 'N', and their impact on 'Hits@10 Values (%)'. The chart uses a color gradient to represent the magnitude of the 'Hits@10 Values (%)', with red indicating higher values and blue indicating lower values. The chart appears to represent a performance metric (Hits@10) as a function of two parameters (M and N).
### Components/Axes
* **X-axis (Horizontal):** Labeled 'N', ranging from approximately 40 to 140, with markers at 40, 60, 80, 100, 120, and 140.
* **Y-axis (Depth):** Labeled 'M', ranging from approximately 100 to 1000, with markers at 100, 200, 400, 600, 800, and 1000.
* **Z-axis (Vertical):** Labeled 'Hits@10 Values (%)', ranging from approximately 38% to 52%.
* **Color Legend:** Located in the top-right corner, representing 'Hits@10 Values (%)'. The color gradient transitions from dark blue (approximately 38%) to red (approximately 52%), with intermediate colors of green and yellow representing values in between. The legend has markers at 38, 40, 42, 44, 46, 48, 50, and 52.
### Detailed Analysis
The chart displays a grid of bars, where each bar's height represents the 'Hits@10 Values (%)' for a specific combination of 'M' and 'N'.
* **Trend:** The chart shows a general trend of increasing 'Hits@10 Values (%)' as both 'M' and 'N' increase. The highest values are concentrated in the top-right corner of the chart (high 'M' and high 'N'). The lowest values are concentrated in the bottom-left corner (low 'M' and low 'N').
* **Specific Values (Approximate):**
* When N = 40 and M = 100, Hits@10 Values (%) ≈ 40%.
* When N = 40 and M = 1000, Hits@10 Values (%) ≈ 42%.
* When N = 140 and M = 100, Hits@10 Values (%) ≈ 44%.
* When N = 140 and M = 1000, Hits@10 Values (%) ≈ 51%.
* When N = 80 and M = 600, Hits@10 Values (%) ≈ 46%.
* When N = 60 and M = 400, Hits@10 Values (%) ≈ 44%.
* The peak value appears to be around 51-52% when both M and N are at their highest values.
* The lowest value appears to be around 38-40% when both M and N are at their lowest values.
### Key Observations
* The chart demonstrates a positive correlation between 'M', 'N', and 'Hits@10 Values (%)'.
* The effect of increasing 'M' appears to be more pronounced than increasing 'N', as the bars tend to increase more rapidly along the 'M' axis.
* There is a relatively smooth gradient across the chart, suggesting a consistent relationship between the variables.
### Interpretation
The data suggests that the performance metric 'Hits@10 Values (%)' improves as the values of parameters 'M' and 'N' increase. This could indicate that larger values of 'M' and 'N' lead to better results in whatever system or process is being measured. The chart provides a visual representation of this relationship, allowing for a quick assessment of how changes in 'M' and 'N' affect performance. The consistent gradient suggests a predictable and reliable relationship between the variables. The 'Hits@10' metric likely refers to the proportion of times a correct result appears within the top 10 results returned by a system (e.g., a search engine or recommendation system). 'M' and 'N' could represent parameters related to the size of the dataset, the complexity of the model, or the amount of training data used. The chart is a valuable tool for optimizing the values of 'M' and 'N' to maximize 'Hits@10 Values (%)'.
</details>
(c) Performance on ICEWS18.
<details>
<summary>extracted/6596839/fig/bar3d4.png Details</summary>
3D surface plot of GPU memory usage (GB) as a function of the sampling parameters M (0-1000) and N (40-140). Memory consumption grows with both M and N, reaching roughly 35-40 GB at large values of both.
</details>
(d) Space on ICEWS18.
Figure 7: The impacts of sampling in the reasoning process. Performance and GPU space usage with batch size 64. Large values of M and N can achieve excellent performance at the cost of increased space requirements.
### 4.4 Hyperparameter Analysis (RQ3)
We run our model with different hyperparameters to explore their impacts, as shown in Figures 6 and 7. Specifically, Figure 6 illustrates the performance variation with different numbers of reasoning iterations, i.e., the length of the reasoning rules. On the WN18RR and FB15k-237 v3 datasets of the transductive and inductive settings, experiments with rule lengths of 2, 4, 6, and 8 are carried out, as illustrated in Figures 6(a) and 6(b). It is observed that the performance generally improves as the iteration count increases from 2 to 6. When the rule length continues to increase, the inference performance changes little or decreases slightly. The same phenomenon can also be observed in Figures 6(c) and 6(d), which correspond to interpolation and extrapolation reasoning on the ICEWS14 dataset. The rule length ranges from 1 to 4, where the model performance typically displays an initial improvement, followed by a tendency to stabilize or a marginal decline. This occurs because a greater rule length amplifies the modeling capability while potentially introducing noise into the reasoning process. Therefore, an appropriate rule length (number of reasoning iterations) is significant for KG reasoning.
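The effect of the iteration count can be seen in how the reasoning graph grows: each iteration extends every candidate rule body by one relation, so longer rules reach more entities but also admit noisier paths. The following is a minimal, illustrative sketch; the triples and function names are hypothetical, not Tunsr's actual implementation.

```python
from collections import defaultdict

def expand_reasoning_graph(triples, query_entity, num_iterations):
    """Collect relation paths of length <= num_iterations starting from the
    query entity, mimicking iterative reasoning-graph expansion: each
    iteration extends every frontier path by one outgoing relation."""
    adj = defaultdict(list)  # head entity -> list of (relation, tail entity)
    for h, r, t in triples:
        adj[h].append((r, t))
    frontier = [(query_entity, ())]  # (current node, relation path so far)
    paths = []
    for _ in range(num_iterations):
        next_frontier = []
        for node, path in frontier:
            for r, t in adj[node]:
                next_frontier.append((t, path + (r,)))
        paths.extend(next_frontier)
        frontier = next_frontier
    return paths

# A toy 3-hop chain: a larger iteration budget reaches farther entities.
triples = [("a", "verbGroup-1", "b"),
           ("b", "relatedForm", "c"),
           ("c", "relatedForm-1", "d")]
paths = expand_reasoning_graph(triples, "a", num_iterations=3)
```

With `num_iterations=2`, entity `d` would be unreachable; with 3, the full chain is covered, matching the observation that performance first improves with rule length and then saturates.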
We also explore the impacts of the hyperparameters M for node sampling and N for edge selection on the ICEWS14 and ICEWS18 datasets of extrapolation reasoning. The results are shown in Figure 7. For each dataset, we report the reasoning performance (Hits@10) and the space utilized (GPU memory), where M varies in {50, 100, 200, 600, 800, 1000} and N varies in {40, 60, 80, 100, 120, 140}. It is evident that opting for smaller values results in a significant decline in performance. This decline can be attributed to the inadequate numbers of nodes and edges, which lead to insufficient information and unstable training, respectively. Furthermore, as N surpasses 120, the marginal gains become smaller or even turn into performance degradation. Additionally, when M and N are increased, the GPU memory utilization of the model grows rapidly, as depicted in Figures 7(b) and 7(d), with a particularly pronounced effect for M.
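The M/N trade-off can be illustrated with a simple top-M node / top-N edge pruning sketch: retaining more nodes and edges preserves more reasoning evidence but enlarges the graph held in GPU memory. All names and attention values below are hypothetical.

```python
import heapq

def sample_graph(node_attn, edges, M, N):
    """Prune the reasoning graph: keep the top-M nodes by attention, then
    the top-N highest-weight edges whose endpoints both survive. Larger
    M and N retain more evidence at the cost of memory."""
    kept_nodes = set(heapq.nlargest(M, node_attn, key=node_attn.get))
    candidate_edges = [e for e in edges
                       if e[0] in kept_nodes and e[2] in kept_nodes]
    kept_edges = heapq.nlargest(N, candidate_edges, key=lambda e: e[3])
    return kept_nodes, kept_edges

# Hypothetical attention scores and weighted edges (head, rel, tail, weight).
node_attn = {"a": 0.9, "b": 0.5, "c": 0.1, "d": 0.05}
edges = [("a", "r1", "b", 0.7), ("a", "r2", "c", 0.3), ("c", "r3", "d", 0.2)]
nodes, top_edges = sample_graph(node_attn, edges, M=3, N=1)
```

Setting M or N too small discards low-attention nodes and edges that may still carry the correct answer, which is consistent with the performance drop observed at small values.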
TABLE VII: Some reasoning cases in the transductive, interpolation, and extrapolation scenarios, where both propositional reasoning and learned FOL rules are displayed. '$^{-1}$' denotes the reverse of a specific relation, and textual descriptions of some relations are simplified. Values in orange rectangles represent propositional attentions, and relations marked in red in the FOL rules represent the target relation to be predicted.
| Propositional Reasoning | FOL Rules |
| --- | --- |
|
<details>
<summary>extracted/6596839/fig/case1.png Details</summary>
Reasoning graph for the WN18RR query (00238867, verbGroup, ?): WordNet synset nodes connected by relations such as verbGroup$^{-1}$, derivationallyRelatedForm, hypernym, and alsoSee, each edge annotated with a propositional attention weight (0.05-0.59). The correct answer 00239321 is marked with a checkmark; incorrect candidates are marked with crosses.
</details>
| [1] 0.21 verbGroup$^{-1}$(X,Z) $\rightarrow$ verbGroup(X,Z)
[2] 0.32 verbGroup$^{-1}$(X,Y$_1$) $\land$ derivationallyRelatedForm(Y$_1$,Y$_2$) $\land$ derivationallyRelatedForm$^{-1}$(Y$_2$,Z) $\rightarrow$ verbGroup(X,Z)
[3] 0.07 derivationallyRelatedForm$^{-1}$(X,Y$_1$) $\land$ derivationallyRelatedForm$^{-1}$(Y$_1$,Y$_2$) $\land$ verbGroup$^{-1}$(Y$_2$,Z) $\rightarrow$ verbGroup(X,Z)
[4] 0.05 synsetDomainTopicOf(X,Y$_1$) $\land$ synsetDomainTopicOf$^{-1}$(Y$_1$,Y$_2$) $\land$ derivationallyRelatedForm(Y$_2$,Z) $\rightarrow$ verbGroup(X,Z)
[5] 0.18 hypernym(X,Y$_1$) $\land$ hypernym$^{-1}$(Y$_1$,Y$_2$) $\land$ alsoSee(Y$_2$,Z) $\rightarrow$ verbGroup(X,Z) |
| Transductive reasoning: query (00238867, verbGroup, ?) in WN18RR | |
|
<details>
<summary>extracted/6596839/fig/case2.png Details</summary>
Reasoning graph for the ICEWS14 query (Party of Regions, makeAnAppeal, ?, 2014-05-15): timestamped entities (e.g., Police (Ukraine), Protester (Ukraine), Security Service of Ukraine) connected by relations such as reduceRelations, repress, consult, and makeAnAppeal, each edge annotated with a propositional attention weight. Checkmarks and crosses mark correct and incorrect candidate answers.
</details>
| [1] 0.46 reduceRelations$^{-1}$(X,Z)$:t_{1}$ $\rightarrow$ makeAnAppeal(X,Z)$:t$
[2] 0.19 reduceRelations$^{-1}$(X,Y$_1$)$:t_{1}$ $\land$ repression(Y$_1$,Y$_2$)$:t_{2}$ $\land$ makeAnAppeal$^{-1}$(Y$_2$,Z)$:t_{3}$ $\rightarrow$ makeAnAppeal(X,Z)$:t$
[3] 0.14 obstructPassage$^{-1}$(X,Y$_1$)$:t_{1}$ $\land$ repression$^{-1}$(Y$_1$,Y$_2$)$:t_{2}$ $\land$ makeStatement(Y$_2$,Z)$:t_{3}$ $\rightarrow$ makeAnAppeal(X,Z)$:t$
[4] 0.12 consult(X,Y$_1$)$:t_{1}$ $\land$ consult$^{-1}$(Y$_1$,Y$_2$)$:t_{2}$ $\land$ discussByTelephone$^{-1}$(Y$_2$,Z)$:t_{3}$ $\rightarrow$ makeAnAppeal(X,Z)$:t$ |
| Interpolation reasoning: query (Party of Regions, makeAnAppeal, ?, 2014-05-15) in ICEWS14 | |
|
<details>
<summary>extracted/6596839/fig/case3.png Details</summary>
Reasoning graph for the ICEWS18 query (Nasser Bourita, makeVisit, ?, 2018-09-28): timestamped entities (e.g., Iran, United Nations, Russia, Donald Trump) connected by relations such as accuse, expressIntentTo, reject, and makeOptimisticComment, each edge annotated with a propositional attention weight. Checkmarks and crosses mark correct and incorrect candidate answers.
</details>
| [1] 0.14 accuse(X,Y)$:t_{1}$ $\land$ expressIntentTo(Y,Z)$:t_{2}$ $\rightarrow$ makeVisit(X,Z)$:t$
[2] 0.09 makeVisit(X,Y$_1$)$:t_{1}$ $\land$ engageInCooperation(Y$_1$,Y$_2$)$:t_{2}$ $\land$ defyNormsLaw(Y$_2$,Z)$:t_{3}$ $\rightarrow$ makeVisit(X,Z)$:t$
[3] 0.11 makeStatement(X,Y$_1$)$:t_{1}$ $\land$ reject$^{-1}$(Y$_1$,Y$_2$)$:t_{2}$ $\land$ makeOptimisticComment(Y$_2$,Z)$:t_{3}$ $\rightarrow$ makeVisit(X,Z)$:t$
[4] 0.25 makeVisit(X,Y)$:t_{1}$ $\land$ makeOptimisticComment(Y,Z)$:t_{2}$ $\rightarrow$ makeVisit(X,Z)$:t$
[5] 0.17 hostVisit$^{-1}$(X,Y$_1$)$:t_{1}$ $\land$ meetAtThirdLocation(Y$_1$,Y$_2$)$:t_{2}$ $\land$ makeOptimisticComment$^{-1}$(Y$_2$,Z)$:t_{3}$ $\rightarrow$ makeVisit(X,Z)$:t$ |
| Extrapolation reasoning: query (Nasser Bourita, makeVisit, ?, 2018-09-28) in ICEWS18 | |
TABLE VIII: Some reasoning cases in inductive scenarios, where learned FOL rules are displayed. Relations marked in red represent the target relation to be predicted. '$^{-1}$' denotes the reverse of a specific relation, and textual descriptions of some relations are simplified.
| Learned FOL Rules |
| --- |
| [1] 0.41 memberMeronym(X,Y$_1$) $\land$ hasPart(Y$_1$,Y$_2$) $\land$ hasPart$^{-1}$(Y$_2$,Z) $\rightarrow$ memberMeronym(X,Z) [2] 0.19 hasPart$^{-1}$(X,Y$_1$) $\land$ hypernym(Y$_1$,Y$_2$) $\land$ memberOfDomainUsage$^{-1}$(Y$_2$,Z) $\rightarrow$ memberMeronym(X,Z) [3] 0.25 hypernym(X,Y$_1$) $\land$ hypernym$^{-1}$(Y$_1$,Y$_2$) $\land$ memberMeronym(Y$_2$,Z) $\rightarrow$ memberMeronym(X,Z) [4] 0.17 hypernym(X,Y$_1$) $\land$ hypernym$^{-1}$(Y$_1$,Y$_2$) $\land$ hasPart(Y$_2$,Z) $\rightarrow$ memberMeronym(X,Z) |
| Inductive reasoning: query (08174398, memberMeronym, ?) in WN18RR v3 |
| [1] 0.32 filmReleaseRegion(X,Y$_1$) $\land$ filmReleaseRegion$^{-1}$(Y$_1$,Y$_2$) $\land$ filmCountry(Y$_2$,Z) $\rightarrow$ filmReleaseRegion(X,Z) [2] 0.10 distributorRelation$^{-1}$(X,Y$_1$) $\land$ nominatedFor(Y$_1$,Y$_2$) $\land$ filmReleaseRegion$^{-1}$(Y$_2$,Z) $\rightarrow$ filmReleaseRegion(X,Z) [3] 0.19 filmReleaseRegion(X,Y$_1$) $\land$ exportedTo$^{-1}$(Y$_1$,Y$_2$) $\land$ locationCountry(Y$_2$,Z) $\rightarrow$ filmReleaseRegion(X,Z) [4] 0.05 filmCountry(X,Y$_1$) $\land$ filmReleaseRegion$^{-1}$(Y$_1$,Y$_2$) $\land$ filmMusic(Y$_2$,Z) $\rightarrow$ filmReleaseRegion(X,Z) |
| Inductive reasoning: query (/m/0j6b5, filmReleaseRegion, ?) in FB15k-237 v3 |
| [1] 0.46 collaboratesWith$^{-1}$(X,Z) $\rightarrow$ collaboratesWith(X,Z) [2] 0.38 collaboratesWith$^{-1}$(X,Y$_1$) $\land$ holdsOffice(Y$_1$,Y$_2$) $\land$ holdsOffice$^{-1}$(Y$_2$,Z) $\rightarrow$ collaboratesWith(X,Z) [3] 0.03 collaboratesWith$^{-1}$(X,Y$_1$) $\land$ graduatedFrom(Y$_1$,Y$_2$) $\land$ graduatedFrom$^{-1}$(Y$_2$,Z) $\rightarrow$ collaboratesWith(X,Z) [4] 0.03 collaboratesWith$^{-1}$(X,Y$_1$) $\land$ collaboratesWith(Y$_1$,Y$_2$) $\land$ graduatedFrom(Y$_2$,Z) $\rightarrow$ collaboratesWith(X,Z) |
| Inductive reasoning: query (Hillary Clinton, collaboratesWith, ?) in NELL v3 |
### 4.5 Case Studies (RQ4)
To show the actual reasoning process of Tunsr, some practical cases on all four reasoning scenarios are presented in detail, illustrating the transparency and interpretability of the proposed model. For better presentation, the maximum number of reasoning iterations is set to 3. Specifically, Table VII shows the reasoning graphs for three specific queries in the transductive, interpolation, and extrapolation scenarios, respectively. The propositional attention weights of nodes are listed near them, representing the propositional reasoning score of each node at the current step. For example, in the first case, the uppermost propositional reasoning path (00238867, verbGroup$^{-1}$, 00239321) learns a large attention score for the correct answer 00239321 at the first step. Generally, nodes with more preceding neighbors or larger preceding attention weights have a stronger impact on subsequent steps and the final entity scores. Besides, we observe that propositional and first-order reasoning do not always have consistent effects. For example, the FOL rules [3] and [4] in the third case have relatively high confidence values compared with [1] and [2] (0.11, 0.25 vs. 0.14, 0.09), but the combination of their corresponding propositional reasoning paths (Nasser Bourita, makeStatement, Morocco:2018-05-01, reject$^{-1}$, Iran:2018-05-03, makeOptimisticComment, Donald Trump:2018-06-08) and (Nasser Bourita, makeVisit, Iran:2018-05-03, self, Iran:2018-05-03, makeOptimisticComment, Donald Trump:2018-06-08) has a small propositional attention of 0.08. This prevents the model from predicting the wrong answer Donald Trump. Thus, propositional and FOL reasoning can be integrated to jointly guide the reasoning process, leading to more accurate results.
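The interplay between propositional attentions and FOL rule confidence described above can be sketched as a simple score aggregation: each path contributes the product of its per-step attentions, weighted by the confidence of the matching rule. This is an illustrative simplification of the joint scoring idea, with hypothetical relations and values rather than Tunsr's exact mechanism.

```python
def score_candidates(paths, rule_conf):
    """Aggregate evidence per candidate answer: each path contributes the
    product of its per-step propositional attentions, weighted by the
    confidence of the FOL rule matching its relation sequence."""
    scores = {}
    for target, relations, step_attns in paths:
        attn = 1.0
        for a in step_attns:
            attn *= a  # propositional attention accumulated along the path
        conf = rule_conf.get(tuple(relations), 0.0)  # unmatched paths score 0
        scores[target] = scores.get(target, 0.0) + conf * attn
    return scores

# Two hypothetical paths reaching different candidates for one query.
paths = [("e1", ["r1", "r2"], [0.5, 0.4]),   # attention product 0.20
         ("e2", ["r3", "r4"], [0.4, 0.2])]   # attention product 0.08
rule_conf = {("r1", "r2"): 0.32, ("r3", "r4"): 0.25}
scores = score_candidates(paths, rule_conf)  # e1 outranks e2
```

As in the third case of Table VII, a rule with decent confidence can still be overruled when its propositional path carries little attention, which is how the wrong candidate is avoided.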
Table VIII shows some learned FOL rules for inductive reasoning on the WN18RR v3, FB15k-237 v3, and NELL v3 datasets. As the inductive setting is entity-independent, the propositional reasoning part is not involved here. Each presented rule carries practical significance and is readily understandable by humans. For instance, rule [1] collaboratesWith$^{-1}$(X,Z) $\rightarrow$ collaboratesWith(X,Z) in the third case has a relatively high confidence value (0.46). This aligns with human commonsense cognition, as the relation collaboratesWith is mutual between subject and object: if person a has collaborated with person b, it inherently implies that person b has collaborated with person a. These results illustrate the effectiveness of the rules learned by Tunsr and its interpretable reasoning process.
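Applying such a learned symmetric rule amounts to one step of forward chaining over the KG. A minimal sketch, with hypothetical entities; the rule form mirrors rule [1] of the third case:

```python
def apply_inverse_rule(facts, relation):
    """One step of forward chaining for the symmetry-style rule
    relation^{-1}(X, Z) -> relation(X, Z): whenever (b, relation, a)
    holds, infer (a, relation, b) if it is not already known."""
    inferred = set()
    for h, r, t in facts:
        if r == relation and (t, relation, h) not in facts:
            inferred.add((t, relation, h))
    return inferred

# Hypothetical entities; the rule exploits the mutual nature of collaboration.
facts = {("person_a", "collaboratesWith", "person_b")}
new_facts = apply_inverse_rule(facts, "collaboratesWith")
```

In a full system, each inferred fact would also carry a score derived from the rule's confidence (here, 0.46) rather than being asserted outright.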
## 5 Conclusion and Future Works
To combine the advantages of the connectionist and symbolic paradigms of AI for KG reasoning, we propose a unified neurosymbolic framework, Tunsr, that unifies both methodology and reasoning scenarios, including transductive, inductive, interpolation, and extrapolation reasoning. Tunsr first introduces a consistent reasoning-graph structure that starts from the query entity and constantly expands subsequent nodes by iteratively searching posterior neighbors. Based on it, a forward logical message-passing mechanism is proposed to update both the propositional representations and attentions, as well as the FOL representations and attentions, of each node in the expanding reasoning graph. In this way, Tunsr performs the transformation of merging multiple rules by merging possible relations at each step using FOL attentions. By gradually adding rule bodies and updating rule confidence, real FOL rules can be induced through constantly performing attention calculations over the reasoning graph, which is summarized as the FARI algorithm. Experiments on 19 datasets across four reasoning scenarios illustrate the effectiveness of Tunsr. Meanwhile, the ablation studies show that propositional and FOL reasoning have different impacts and can thus be integrated to improve the overall reasoning results. The case studies also verify the transparency and interpretability of its computation process.
Future work is twofold. First, we aim to extend this idea to various reasoning domains, particularly those requiring interpretability for decision-making [87], such as intelligent healthcare and finance. We anticipate this will enhance reasoning accuracy while simultaneously offering human-understandable logical rules as evidence. Second, we intend to integrate the concept of unified reasoning with state-of-the-art technologies to achieve optimal results. For instance, large language models have achieved great success in natural language processing and AI, yet they often struggle with complex reasoning tasks [88]. Hence, there is considerable potential for enhancing the reasoning capabilities of large language models.
## References
- [1] I. Tiddi and S. Schlobach. Knowledge graphs as tools for explainable machine learning: A survey. Artificial Intelligence, 302:103627, 2022.
- [2] M. Li and M. Moens. Dynamic key-value memory enhanced multi-step graph reasoning for knowledge-based visual question answering. In Thirty-Sixth AAAI Conference on Artificial Intelligence, pp. 10983β10992. AAAI Press, 2022.
- [3] C. Mavromatis et al. Tempoqr: Temporal question reasoning over knowledge graphs. In Thirty-Sixth AAAI Conference on Artificial Intelligence, pp. 5825β5833. AAAI Press, 2022.
- [4] Y. Yang et al. Knowledge graph contrastive learning for recommendation. In The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), pp. 1434β1443. ACM, 2022.
- [5] Y. Zhu et al. Recommending learning objects through attentive heterogeneous graph convolution and operation-aware neural network. IEEE Transactions on Knowledge and Data Engineering (TKDE), 35:4178β4189, 2023.
- [6] A. Bastos et al. RECON: relation extraction using knowledge graph context in a graph neural network. In The Web Conference (WWW), pp. 1673β1685. ACM / IW3C2, 2021.
- [7] X. Chen et al. Knowprompt: Knowledge-aware prompt-tuning with synergistic optimization for relation extraction. In The Web Conference (WWW), pp. 2778β2788. ACM, 2022.
- [8] B. D. Trisedya et al. GCP: graph encoder with content-planning for sentence generation from knowledge bases. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 44(11):7521β7533, 2022.
- [9] W. Yu et al. A survey of knowledge-enhanced text generation. ACM Comput. Surv., 54(11s):227:1β227:38, 2022.
- [10] K. D. Bollacker et al. Freebase: A collaboratively created graph database for structuring human knowledge. In Proceedings of the International Conference on Management of Data (SIGMOD), pp. 1247β1250, 2008.
- [11] D. Vrandecic. Wikidata: A new platform for collaborative data collection. In Proceedings of the 21st World Wide Web Conference (WWW), pp. 1063β1064, 2012.
- [12] Q. Wang et al. Knowledge graph embedding: A survey of approaches and applications. IEEE Transactions on Knowledge and Data Engineering (TKDE), 29(12):2724β2743, 2017.
- [13] A. Rossi et al. Knowledge graph embedding for link prediction: A comparative analysis. ACM Transactions on Knowledge Discovery from Data (TKDD), 15(2):1β49, 2021.
- [14] S. Pinker and J. Mehler. Connections and symbols. Mit Press, 1988.
- [15] T. H. Trinh et al. Solving olympiad geometry without human demonstrations. Nature, 625(7995):476–482, 2024.
- [16] Q. Lin et al. Contrastive graph representations for logical formulas embedding. IEEE Transactions on Knowledge and Data Engineering, 35:3563–3574, 2023.
- [17] F. Xu et al. Symbol-llm: Towards foundational symbol-centric interface for large language models. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL), pp. 13091–13116, 2024.
- [18] Q. Lin et al. Fusing topology contexts and logical rules in language models for knowledge graph completion. Information Fusion, 90:253–264, 2023.
- [19] A. Bordes et al. Translating embeddings for modeling multi-relational data. In Advances in Neural Information Processing Systems (NeurIPS), pp. 2787–2795, 2013.
- [20] L. A. Galárraga et al. AMIE: association rule mining under incomplete evidence in ontological knowledge bases. In 22nd International World Wide Web Conference (WWW), pp. 413–422, 2013.
- [21] F. Yang et al. Differentiable learning of logical rules for knowledge base reasoning. In Advances in Neural Information Processing Systems (NeurIPS), pp. 2319–2328, 2017.
- [22] Y. Shen et al. Modeling relation paths for knowledge graph completion. IEEE Transactions on Knowledge and Data Engineering, 33(11):3607–3617, 2020.
- [23] K. Cheng et al. Rlogic: Recursive logical rule learning from knowledge graphs. In The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), pp. 179–189. ACM, 2022.
- [24] J. Liu et al. Latentlogic: Learning logic rules in latent space over knowledge graphs. In Findings of the EMNLP, pp. 4578–4586, 2023.
- [25] C. Jiang et al. Path spuriousness-aware reinforcement learning for multi-hop knowledge graph reasoning. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL), pp. 3173–3184, 2023.
- [26] Q. Lin et al. Incorporating context graph with logical reasoning for inductive relation prediction. In The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), pp. 893–903, 2022.
- [27] J. Li et al. Teast: Temporal knowledge graph embedding via archimedean spiral timeline. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), pp. 15460–15474, 2023.
- [28] Y. Liu et al. Tlogic: Temporal logical rules for explainable link forecasting on temporal knowledge graphs. In Thirty-Sixth AAAI Conference on Artificial Intelligence, pp. 4120–4127. AAAI Press, 2022.
- [29] N. Li et al. Tr-rules: Rule-based model for link forecasting on temporal knowledge graph considering temporal redundancy. In Findings of the Association for Computational Linguistics (EMNLP), pp. 7885–7894, 2023.
- [30] Q. Lin et al. TECHS: temporal logical graph networks for explainable extrapolation reasoning. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), pp. 1281–1293, 2023.
- [31] E. Cambria et al. SenticNet 7: A commonsense-based neurosymbolic AI framework for explainable sentiment analysis. In LREC, pp. 3829–3839, 2022.
- [32] A. Sadeghian et al. DRUM: end-to-end differentiable rule mining on knowledge graphs. In Advances in Neural Information Processing Systems (NeurIPS), pp. 15321–15331, 2019.
- [33] M. Qu et al. Rnnlogic: Learning logic rules for reasoning on knowledge graphs. In 9th International Conference on Learning Representations (ICLR), 2021.
- [34] Y. Zhang et al. GMH: A general multi-hop reasoning model for KG completion. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 3437–3446, 2021.
- [35] J. Zhang et al. Subgraph retrieval enhanced model for multi-hop knowledge base question answering. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 5773–5784, 2022.
- [36] Y. Lan et al. Complex knowledge base question answering: A survey. IEEE Trans. Knowl. Data Eng., 35(11):11196–11215, 2023.
- [37] H. Dong et al. Temporal inductive path neural network for temporal knowledge graph reasoning. Artificial Intelligence, article 104085, 2024.
- [38] J. Chung et al. Empirical evaluation of gated recurrent neural networks on sequence modeling. CoRR, abs/1412.3555, 2014.
- [39] S. Abiteboul et al. Foundations of databases, volume 8. Addison-Wesley Reading, 1995.
- [40] M. Gebser et al. Potassco: The potsdam answer set solving collection. AI Communications, 24(2):107–124, 2011.
- [41] M. Alviano et al. Wasp: A native asp solver based on constraint learning. In Logic Programming and Nonmonotonic Reasoning: 12th International Conference, LPNMR 2013, Corunna, Spain, September 15-19, 2013. Proceedings 12, pp. 54–66. Springer, 2013.
- [42] W. Rautenberg. A Concise Introduction to Mathematical Logic. Springer, 2006.
- [43] G. Ciravegna et al. Logic explained networks. Artificial Intelligence, 314:103822, 2023.
- [44] H. Ren and J. Leskovec. Beta embeddings for multi-hop logical reasoning in knowledge graphs. In Advances in Neural Information Processing Systems (NeurIPS), 2020.
- [45] P. B. Andrews. An Introduction to Mathematical Logic and Type Theory: To Truth Through Proof, volume 27. Springer Science & Business Media, 2013.
- [46] J. Sun et al. A survey of reasoning with foundation models. arXiv preprint arXiv:2312.11562, 2023.
- [47] W. Zhang et al. Knowledge graph reasoning with logics and embeddings: Survey and perspective. CoRR, abs/2202.07412, 2022.
- [48] D. Poole. Probabilistic horn abduction and bayesian networks. Artificial Intelligence, 64(1):81–129, 1993.
- [49] D. Xu et al. Inductive representation learning on temporal graphs. In 8th International Conference on Learning Representations (ICLR), 2020.
- [50] L. Galárraga et al. Fast rule mining in ontological knowledge bases with AMIE+. The VLDB Journal, 24(6):707–730, 2015.
- [51] W. Zhang et al. Iteratively learning embeddings and rules for knowledge graph reasoning. In The World Wide Web Conference (WWW), pp. 2366–2377, 2019.
- [52] T. Lacroix et al. Canonical tensor decomposition for knowledge base completion. In Proceedings of the 35th International Conference on Machine Learning (ICML), volume 80, pp. 2869–2878. PMLR, 2018.
- [53] B. Yang et al. Embedding entities and relations for learning and inference in knowledge bases. In International Conference on Learning Representations (ICLR), 2015.
- [54] B. Xiong et al. Ultrahyperbolic knowledge graph embeddings. In The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), pp. 2130–2139. ACM, 2022.
- [55] J. Wang et al. Duality-induced regularizer for semantic matching knowledge graph embeddings. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 45(2):1652–1667, 2023.
- [56] Y. Zhang et al. Bilinear scoring function search for knowledge graph learning. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 45(2):1458–1473, 2023.
- [57] R. Li et al. How does knowledge graph embedding extrapolate to unseen data: A semantic evidence view. In Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI), pp. 5781–5791. AAAI Press, 2022.
- [58] Y. Zhang and Q. Yao. Knowledge graph reasoning with relational digraph. In The ACM Web Conference, pp. 912–924. ACM, 2022.
- [59] X. Ge et al. Compounding geometric operations for knowledge graph completion. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), pp. 6947–6965, 2023.
- [60] W. Wei et al. Enhancing heterogeneous knowledge graph completion with a novel gat-based approach. ACM Transactions on Knowledge Discovery from Data, 2024.
- [61] F. Shi et al. Tgformer: A graph transformer framework for knowledge graph embedding. IEEE Transactions on Knowledge and Data Engineering, 2025.
- [62] L. A. Galárraga et al. Amie: association rule mining under incomplete evidence in ontological knowledge bases. In Proceedings of the 22nd international conference on World Wide Web, pp. 413–422, 2013.
- [63] C. Meilicke et al. Anytime bottom-up rule learning for knowledge graph completion. In IJCAI, pp. 3137–3143, 2019.
- [64] S. Ott et al. SAFRAN: an interpretable, rule-based link prediction method outperforming embedding models. In 3rd Conference on Automated Knowledge Base Construction (AKBC), 2021.
- [65] A. Nandi et al. Simple augmentations of logical rules for neuro-symbolic knowledge graph completion. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), pp. 256–269, 2023.
- [66] J. Guo et al. A unified joint approach with topological context learning and rule augmentation for knowledge graph completion. In Findings of the Association for Computational Linguistics, pp. 13686–13696, 2024.
- [67] K. Teru et al. Inductive relation prediction by subgraph reasoning. In International Conference on Machine Learning, pp. 9448–9457, 2020.
- [68] K. Sun et al. Incorporating multi-level sampling with adaptive aggregation for inductive knowledge graph completion. ACM Transactions on Knowledge Discovery from Data, 2024.
- [69] C. Meilicke et al. Fine-grained evaluation of rule- and embedding-based systems for knowledge graph completion. In 17th International Semantic Web Conference, pp. 3–20, 2018.
- [70] S. Mai et al. Communicative message passing for inductive relation reasoning. In Thirty-Fifth AAAI Conference on Artificial Intelligence, pp. 4294–4302, 2021.
- [71] J. Chen et al. Topology-aware correlations between relations for inductive link prediction in knowledge graphs. In Thirty-Fifth AAAI Conference on Artificial Intelligence, pp. 6271–6278, 2021.
- [72] Y. Pan et al. A symbolic rule integration framework with logic transformer for inductive relation prediction. In Proceedings of the ACM Web Conference, pp. 2181–2192, 2024.
- [73] J. Leblay and M. W. Chekol. Deriving validity time in knowledge graph. In Companion of the The Web Conference (WWW), pp. 1771–1776. ACM, 2018.
- [74] R. Goel et al. Diachronic embedding for temporal knowledge graph completion. In The Thirty-Fourth AAAI Conference on Artificial Intelligence, pp. 3988–3995, 2020.
- [75] A. García-Durán et al. Learning sequence encoders for temporal knowledge graph completion. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 4816–4821, 2018.
- [76] A. Sadeghian et al. Chronor: Rotation based temporal knowledge graph embedding. In Thirty-Fifth AAAI Conference on Artificial Intelligence, pp. 6471–6479, 2021.
- [77] T. Lacroix et al. Tensor decompositions for temporal knowledge base completion. In 8th International Conference on Learning Representations (ICLR), 2020.
- [78] C. Xu et al. Temporal knowledge graph completion using a linear temporal regularizer and multivector embeddings. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), pp. 2569–2578, 2021.
- [79] J. Messner et al. Temporal knowledge graph completion using box embeddings. In Thirty-Sixth AAAI Conference on Artificial Intelligence, pp. 7779–7787, 2022.
- [80] K. Chen et al. Rotateqvs: Representing temporal information as rotations in quaternion vector space for temporal knowledge graph completion. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 5843–5857, 2022.
- [81] T. Trouillon et al. Complex embeddings for simple link prediction. In International Conference on Machine Learning (ICML), volume 48, pp. 2071–2080, 2016.
- [82] W. Jin et al. Recurrent event network: Autoregressive structure inference over temporal knowledge graphs. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 6669–6683, 2020.
- [83] C. Zhu et al. Learning from history: Modeling temporal knowledge graphs with sequential copy-generation networks. In Thirty-Fifth AAAI Conference on Artificial Intelligence, pp. 4732–4740, 2021.
- [84] Z. Han et al. Explainable subgraph reasoning for forecasting on temporal knowledge graphs. In 9th International Conference on Learning Representations (ICLR), 2021.
- [85] H. Sun et al. Timetraveler: Reinforcement learning for temporal knowledge graph forecasting. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 8306–8319, 2021.
- [86] N. Li et al. Infer: A neural-symbolic model for extrapolation reasoning on temporal knowledge graph. In The Thirteenth International Conference on Learning Representations (ICLR), 2025.
- [87] E. Cambria et al. Seven pillars for the future of artificial intelligence. IEEE Intelligent Systems, 38(6):62–69, 2023.
- [88] F. Xu et al. Are large language models really good logical reasoners? a comprehensive evaluation and beyond. IEEE Transactions on Knowledge and Data Engineering, 2025.