# Towards Unified Neurosymbolic Reasoning on Knowledge Graphs
> Qika Lin, Kai He, and Mengling Feng are with the Saw Swee Hock School of Public Health, National University of Singapore, 117549, Singapore.
Fangzhi Xu and Jun Liu are with the School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China.
Hao Lu is with the State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China.
Rui Mao and Erik Cambria are with the College of Computing and Data Science, Nanyang Technological University, 639798, Singapore.
Abstract
Knowledge Graph (KG) reasoning has received significant attention in the fields of artificial intelligence and knowledge engineering, owing to its ability to autonomously deduce new knowledge and consequently enhance the availability and precision of downstream applications. However, current methods predominantly concentrate on a single form of neural or symbolic reasoning, failing to effectively integrate the inherent strengths of both approaches. Furthermore, prevalent methods primarily focus on addressing a single reasoning scenario, presenting limitations in meeting the diverse demands of real-world reasoning tasks. Unifying neural and symbolic methods, as well as diverse reasoning scenarios, in one model is challenging, as there is a natural representation gap between symbolic rules and neural networks, and diverse scenarios exhibit distinct knowledge structures and specific reasoning objectives. To address these issues, we propose a unified neurosymbolic reasoning framework, namely Tunsr, for KG reasoning. Tunsr first introduces a consistent structure of reasoning graph that starts from the query entity and constantly expands subsequent nodes by iteratively searching posterior neighbors. Based on it, a forward logic message-passing mechanism is proposed to update both the propositional representations and attentions, as well as the first-order logic (FOL) representations and attentions, of each node. In this way, Tunsr conducts the transformation of merging multiple rules by merging possible relations at each step. Finally, the FARI algorithm is proposed to induce FOL rules by constantly performing attention calculations over the reasoning graph. Extensive experimental results on 19 datasets of four reasoning scenarios (transductive, inductive, interpolation, and extrapolation) demonstrate the effectiveness of Tunsr.
Index Terms: Neurosymbolic AI, Knowledge graph reasoning, Propositional reasoning, First-order logic, Unified model
1 Introduction
As a fundamental and significant topic in the domains of knowledge engineering and artificial intelligence (AI), knowledge graphs (KGs) have been spotlighted in many real-world applications [1], such as question answering [2, 3], recommendation systems [4, 5], relation extraction [6, 7] and text generation [8, 9]. Thanks to their structured manner of knowledge storage, KGs can effectively capture and represent rich semantic associations between real entities using multi-relational graphical structures. Factual knowledge is often stored in KGs using the fact triple as the fundamental unit, represented in the form of (subject, relation, object), such as (Barack Obama, bornIn, Hawaii) in Figure 1. However, most common KGs, such as Freebase [10] and Wikidata [11], are incomplete due to the limitations of current human resources and technical conditions. Furthermore, incomplete KGs can degrade the accuracy of downstream intelligent applications or produce completely wrong answers. Therefore, inferring missing facts from the observed ones, a task called link prediction and one form of KG reasoning [12, 13], is of great significance for downstream KG applications.
The task of KG reasoning is to infer or predict new facts using existing knowledge. For instance, in Figure 1, KG reasoning involves predicting the validity of the target missing triple (Barack Obama, nationalityOf, U.S.A.) based on other available triples. Following the two distinct paradigms of connectionism and symbolicism, which serve as the foundations for implementing AI systems [14, 15], existing methods can be categorized into neural, symbolic, and neurosymbolic models.
Neural methods, drawing inspiration from the connectionism of AI, typically employ neural networks to learn entity and relation representations. Subsequently, a customized scoring function, such as a translation-based distance or semantic matching strategy, is utilized for model optimization and query reasoning, as illustrated in the top part of Figure 1. However, such an approach lacks transparency and interpretability [16, 17]. On the other hand, symbolic methods draw inspiration from the idea of symbolicism in AI. As shown in the bottom part of Figure 1, they first learn logic rules and then apply these rules to known facts to deduce new knowledge. In this way, symbolic methods offer natural interpretability due to the incorporation of logical rules. However, owing to the limited modeling capacity of the discrete representation and reasoning strategies of logical rules, these methods often fall short in terms of reasoning performance [18].
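As a minimal sketch of the neural scoring idea described above, the following uses a TransE-style translation distance, $f(s,r,o)=-\lVert \mathbf{e}_s+\mathbf{e}_r-\mathbf{e}_o\rVert$, over hand-picked toy embeddings (the vectors are illustrative stand-ins, not learned values):

```python
import numpy as np

# Toy 3-d embeddings for entities and relations (hypothetical, not learned).
emb = {
    "BarackObama":   np.array([0.1, 0.4, 0.2]),
    "Hawaii":        np.array([0.3, 0.5, 0.6]),
    "U.S.A.":        np.array([0.2, 0.9, 0.5]),
    "nationalityOf": np.array([0.1, 0.5, 0.3]),
}

def transe_score(s: str, r: str, o: str) -> float:
    """Negative translation distance: larger means the triple is more plausible."""
    return -float(np.linalg.norm(emb[s] + emb[r] - emb[o]))

# Query reasoning ranks candidate objects by score and picks the highest one.
candidates = ["Hawaii", "U.S.A."]
best = max(candidates, key=lambda o: transe_score("BarackObama", "nationalityOf", o))
```

Here `best` is `"U.S.A."`, because its embedding was chosen to satisfy the translation exactly; real models learn such embeddings by gradient descent over observed triples.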
<details>
<summary>extracted/6596839/fig/ns.png Details</summary>

### Visual Description
## Diagram: Knowledge Graph and Reasoning Process for Determining Nationality
### Overview
The diagram illustrates a hybrid reasoning system combining a knowledge graph (KG) with neural and symbolic reasoning to infer the nationality of Barack Obama. It includes a KG on the left, neural reasoning (KGE and score function) in the middle, symbolic reasoning with a rule set below, and a final conclusion on the right.
---
### Components/Axes
#### Knowledge Graph (Left)
- **Nodes**:
- Barack Obama (central node)
- Michelle Obama (marriedTo Barack Obama)
- U.S.A. (placeIn Barack Obama)
- Chicago (placeIn Michelle Obama)
- Malia Obama (fatherOf Barack Obama)
- Ann Dunham (motherOf Barack Obama)
- Harvard University (graduateFrom Barack Obama)
- Hawaii (locatedInCountry Barack Obama)
- Honolulu (hasCity Hawaii)
- **Edges**:
- Labels include `bornIn`, `marriedTo`, `placeIn`, `fatherOf`, `motherOf`, `graduateFrom`, `locatedInCountry`, `hasCity`.
- Dashed edge between Barack Obama and U.S.A. labeled with a question mark (`?`).
#### Neural Reasoning (Middle)
- **KGE (Knowledge Graph Embedding)**:
- Visualized as a neural network with relation and entity embeddings.
- **Score Function**:
- Outputs a score for the inferred relationship.
#### Symbolic Reasoning (Bottom)
- **Rule Set**:
- Three probabilistic rules (`γ₁`, `γ₂`, `γ₃`) with variables `X`, `Y`, `Z`:
1. `γ₁: 0.89  bornIn(X, Y) ∧ locatedInCountry(Y, Z) → nationalityOf(X, Z)`
2. `γ₂: 0.65  bornIn(X, Y₁) ∧ hasCity(Y₂, Y₁) ∧ locatedInCountry(Y₂, Z) → nationalityOf(X, Z)`
3. `γ₃: 0.54  marriedTo(X, Y₁) ∧ bornIn(Y₁, Y₂) ∧ placeIn(Y₂, Z) → nationalityOf(X, Z)`
#### Final Output (Right)
- **Conclusion**: `nationalityOf(Barack Obama, U.S.A.)`
---
### Detailed Analysis
#### Knowledge Graph
- **Key Relationships**:
- Barack Obama is `bornIn` Hawaii, which is `locatedInCountry` U.S.A.
- Michelle Obama is `marriedTo` Barack Obama and `bornIn` Chicago.
- Malia and Ann Dunham are `fatherOf` and `motherOf` Barack Obama, respectively.
- Hawaii is `hasCity` Honolulu.
#### Neural Reasoning
- **KGE**: Embeds entities (e.g., Barack Obama) and relations (e.g., `bornIn`) into vector spaces.
- **Score Function**: Quantifies confidence in inferred relationships (e.g., `nationalityOf`).
#### Symbolic Reasoning
- **Rule Set**:
- **Rule 1**: Direct inference via `bornIn` and `locatedInCountry` (highest confidence: 0.89).
- **Rule 2**: Indirect inference via `hasCity` and `locatedInCountry` (confidence: 0.65).
- **Rule 3**: Indirect inference via marriage and `placeIn` (lowest confidence: 0.54).
---
### Key Observations
1. **Direct Path**: The strongest evidence (`γ₁`) uses `bornIn(Hawaii)` and `locatedInCountry(Hawaii, U.S.A.)` to infer nationality.
2. **Indirect Paths**:
- `γ₂` relies on `hasCity(Hawaii, Honolulu)` and `locatedInCountry(Hawaii, U.S.A.)`.
- `γ₃` uses marriage (`marriedTo(Michelle Obama)`) and `placeIn(Chicago, U.S.A.)`, but has lower confidence (0.54).
3. **Uncertainty**: The dashed edge between Barack Obama and U.S.A. in the KG highlights the need for reasoning to resolve the relationship.
---
### Interpretation
The diagram demonstrates how hybrid reasoning systems combine:
1. **Structured Knowledge**: The KG provides factual relationships (e.g., `bornIn`, `locatedInCountry`).
2. **Neural Reasoning**: KGE and score functions enable probabilistic inference over complex relationships.
3. **Symbolic Logic**: Rule-based reasoning with explicit confidence values (`γ₁`, `γ₂`, `γ₃`) validates conclusions.
The final conclusion (`nationalityOf(Barack Obama, U.S.A.)`) is derived primarily through the high-confidence direct path (`γ₁`), with supporting evidence from indirect paths. The lower confidence in `γ₃` reflects the weaker evidential chain via marriage and `placeIn`. This highlights the importance of prioritizing high-confidence rules in hybrid systems.
</details>
Figure 1: Illustration of neural and symbolic methods for KG reasoning. Neural methods learn entity and relation embeddings to calculate the validity of the specific fact. Symbolic methods perform logic deduction using known facts on learned or given rules (like $\gamma_{1}$ , $\gamma_{2}$ and $\gamma_{3}$ ) for inference.
TABLE I: Classical studies for KG reasoning. PL and FOL denote propositional and FOL reasoning, respectively. SKG T, SKG I, TKG I, and TKG E represent transductive, inductive, interpolation, and extrapolation reasoning. "$\checkmark$" marks the utilized reasoning manners (neural and logic) or their vanilla application scenarios.
| Model | Neural | PL | FOL | SKG T | SKG I | TKG I | TKG E |
| --- | --- | --- | --- | --- | --- | --- | --- |
| TransE [19] | ✓ | | | ✓ | | | |
| AMIE [20] | | | ✓ | ✓ | | | |
| Neural LP [21] | ✓ | | ✓ | ✓ | | | |
| TAPR [22] | ✓ | ✓ | | ✓ | | | |
| RLogic [23] | ✓ | | ✓ | ✓ | | | |
| LatentLogic [24] | ✓ | | ✓ | ✓ | | | |
| PSRL [25] | ✓ | ✓ | | ✓ | | | |
| ConGLR [26] | ✓ | | ✓ | | ✓ | | |
| TeAST [27] | ✓ | | | | | ✓ | |
| TLogic [28] | | | ✓ | | | | ✓ |
| TR-Rules [29] | | | ✓ | | | | ✓ |
| TECHS [30] | ✓ | ✓ | ✓ | | | | ✓ |
| Tunsr | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
To leverage the strengths of both neural and symbolic methods while mitigating their respective drawbacks, there has been a growing interest in integrating them to realize neurosymbolic systems [31]. Several approaches such as Neural LP [21], DRUM [32], RNNLogic [33], and RLogic [23] have emerged to address the learning and reasoning of rules by incorporating neural networks into the whole process. Despite achieving some successes, there remains a notable absence of a cohesive modeling approach that integrates both propositional and first-order logic (FOL) reasoning. Propositional reasoning on KGs, generally known as multi-hop reasoning [34], is dependent on entities and predicts answers through specific reasoning paths, which demonstrates strong modeling capabilities by providing diverse reasoning patterns for complex scenarios [35, 36]. On the other hand, FOL reasoning utilizes learned FOL rules to infer information from the entire KG by variable grounding, ultimately scoring candidates by aggregating all possible FOL rules. FOL reasoning is entity-independent and exhibits good transferability. Unfortunately, as shown in Table I, mainstream methods have failed to effectively combine these two reasoning approaches within a single framework, resulting in suboptimal models.
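The FOL reasoning style described above, grounding rule bodies over the whole KG and aggregating all matching rules per candidate, can be sketched as follows. The facts and the single chain rule mirror Figure 1; the noisy-or aggregation is one common choice, used here for illustration rather than as the author's exact method:

```python
# Hypothetical facts and a chain rule from Figure 1; the confidence is illustrative.
facts = {
    ("BarackObama", "bornIn", "Hawaii"),
    ("Hawaii", "locatedInCountry", "U.S.A."),
    ("Hawaii", "hasCity", "Honolulu"),
}

# Each rule for a head relation: (confidence, [body relations]) as a chain X -> ... -> Z.
rules = {
    "nationalityOf": [
        (0.89, ["bornIn", "locatedInCountry"]),
    ]
}

def ground_chain(x, body, facts):
    """Return every Z reachable from X by following the body relations in order."""
    frontier = {x}
    for rel in body:
        frontier = {o for (s, r, o) in facts if r == rel and s in frontier}
    return frontier

def score(x, head, facts, rules):
    """Noisy-or aggregation over all rules whose grounded body reaches a candidate."""
    out = {}
    for conf, body in rules.get(head, []):
        for z in ground_chain(x, body, facts):
            out[z] = 1.0 - (1.0 - out.get(z, 0.0)) * (1.0 - conf)
    return out

scores = score("BarackObama", "nationalityOf", facts, rules)
```

Because the grounding never binds specific entities inside the rule, the same rule transfers unchanged to a KG with entirely new entities, which is the entity-independence property noted above.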
Moreover, as time progresses and society undergoes continuous development, a wealth of new knowledge consistently emerges. Consequently, simple reasoning on static KGs (SKGs), i.e., transductive reasoning, can no longer meet the needs of practical applications. Recently, there has been a gradual shift in the research community's focus toward inductive reasoning with emerging entities on SKGs, as well as interpolation and extrapolation reasoning on temporal KGs (TKGs) [37], which introduce time information to facts. However, the latest research predominantly concentrates on individual scenarios and falls short of providing a comprehensive approach that addresses various reasoning scenarios simultaneously. This limitation significantly hampers a model's generalization ability and practical applicability. To sum up, comparing the recent state-of-the-art studies on KG reasoning in Table I, none of them achieves a comprehensive unification across various KG reasoning tasks, either from a methodological or an application perspective.
The challenges in this domain can be categorized into three main aspects: (1) There is an inherent disparity between the discrete nature of logic rules and the continuous nature of neural networks, which presents a natural representation gap to be bridged. Thus, implementing differentiable logical rule learning and reasoning is not directly achievable. (2) It is intractable to solve the transformation and integration problems for propositional and FOL rules, as they have different semantic representation structures and reasoning mechanisms. (3) Diverse scenarios on SKGs or TKGs exhibit distinct knowledge structures and specific reasoning objectives. Consequently, a model tailored for one scenario may encounter difficulties when applied to another. For example, each fact on SKGs is in a triple form while that of TKGs is quadruple. Conventional embedding methods for transductive reasoning fail to address inductive reasoning as they do not learn embeddings of emerging entities in the training phase. Similarly, methods employed for interpolation reasoning cannot be directly applied to extrapolation reasoning, as extrapolation involves predicting facts with future timestamps that are not present in the training set.
To address the above challenges, we propose a unified neurosymbolic reasoning framework (named Tunsr) for KG reasoning. Firstly, to realize unified reasoning on different scenarios, we introduce a consistent structure of reasoning graph. It starts from the query entity and constantly expands subsequent nodes (entities for SKGs and entity-time pairs for TKGs) by iteratively searching posterior neighbors. Upon this structure, we can seamlessly integrate diverse reasoning scenarios within a unified computational framework, while also implementing different types of propositional and FOL rule-based reasoning over it. Secondly, to combine neural and symbolic reasoning, we propose a forward logic message-passing mechanism. For each node in the reasoning graph, Tunsr learns an entity-dependent propositional representation and attention using the preceding counterparts. Besides, it utilizes a gated recurrent unit (GRU) [38] to integrate the current relation and preceding FOL representations into the edges' representations, following which the entity-independent FOL representation and attention are calculated by message aggregation. In this process, the information and confidence of the preceding nodes in the reasoning graph are passed to the subsequent nodes, realizing unified neurosymbolic calculation. Finally, with the reasoning graph and learned attention weights, a novel Forward Attentive Rule Induction (FARI) algorithm is proposed to induce different types of FOL rules. FARI gradually appends rule bodies by searching over the reasoning graph, viewing the FOL attentions as rule confidences. It is noted that our reasoning form for link prediction is data-driven, learning rules and utilizing grounding to calculate fact probabilities, while classic Datalog [39] and ASP (Answer Set Programming) reasoners [40, 41] usually employ declarative logic programming to conduct precise and deterministic deductive reasoning on a set of rules and facts.
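The iterative expansion of the reasoning graph can be illustrated with a minimal sketch (this is a plain breadth-first layer expansion over a toy SKG, not the actual Tunsr implementation; the message passing and attention computations are omitted):

```python
from collections import defaultdict

# Toy SKG as an adjacency list: subject -> list of (relation, object).
neighbors = defaultdict(list)
for s, r, o in [
    ("BarackObama", "bornIn", "Hawaii"),
    ("Hawaii", "locatedInCountry", "U.S.A."),
    ("Hawaii", "hasCity", "Honolulu"),
]:
    neighbors[s].append((r, o))

def expand_reasoning_graph(query_entity, steps):
    """Layer l+1 holds the posterior neighbors of every node in layer l.
    Nodes are entities for SKGs; for TKGs they would be (entity, time) pairs."""
    layers = [{query_entity}]
    edges = []
    for _ in range(steps):
        nxt = set()
        for u in layers[-1]:
            for r, v in neighbors[u]:
                edges.append((u, r, v))  # edge carries the relation label
                nxt.add(v)
        layers.append(nxt)
    return layers, edges

layers, edges = expand_reasoning_graph("BarackObama", steps=2)
```

In the full model, each expansion step would also update the propositional and FOL representations and attentions of the newly added nodes, and FARI would later read rule bodies off the relation labels along these edges.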
In summary, the contributions of this work are threefold:
$\bullet$ Combining the advantages of connectionism and symbolicism of AI, we propose a unified neurosymbolic framework for KG reasoning from both perspectives of methodology and reasoning scenarios. To the best of our knowledge, this is the first attempt to do such a study.
$\bullet$ A forward logic message-passing mechanism is proposed to update both the propositional representations and attentions, as well as FOL representations and attentions of each node in the expanding reasoning graph. Meanwhile, a novel FARI algorithm is introduced to induce FOL rules using learned attentions.
$\bullet$ Extensive experiments are carried out on the current mainstream KG reasoning scenarios, including transductive, inductive, interpolation, and extrapolation reasoning. The results demonstrate the effectiveness of our Tunsr and verify its interpretability.
This study is an extension of our model TECHS [30] published at the ACL 2023 conference. Compared with it, Tunsr is enhanced in three significant ways: (1) From the theoretical perspective, although propositional and FOL reasoning are integrated in TECHS for extrapolation reasoning on TKGs, the two reasoning types are entangled in the forward process, which limits the interpretability of the model. In contrast, the newly proposed Tunsr framework maintains a distinct separation of propositional and FOL reasoning at each reasoning step, combining them only for the final reasoning results. This transformation enhances the interpretability of the model from the perspectives of both propositional and FOL rules. (2) From the perspective of FOL rule modeling, Tunsr is not limited to the temporal extrapolation Horn rules of TECHS: connected and closed Horn rules, as well as temporal interpolation Horn rules, are also included in the framework. (3) From the application perspective, the TECHS model is customized for extrapolation reasoning on TKGs. Based on the further formalization of the reasoning graph and FOL rules, the Tunsr model can be applied to the current mainstream reasoning scenarios of KGs, including transductive, inductive, interpolation, and extrapolation reasoning. The experimental results demonstrate that our Tunsr model performs well in all these scenarios.
2 Preliminaries
2.1 KGs, Variants, and Reasoning Scenarios
Generally, a static KG (SKG) can be represented as $\mathcal{G}=\{\mathcal{E},\mathcal{R},\mathcal{F}\}$, where $\mathcal{E}$ and $\mathcal{R}$ denote the sets of entities and relations, respectively, and $\mathcal{F}\subseteq\mathcal{E}\times\mathcal{R}\times\mathcal{E}$ is the fact set. Each fact is a triple $(s, r, o)$, where $s$, $r$, and $o$ denote the head entity, relation, and tail entity, respectively. By introducing time information into the knowledge, a TKG can be represented as $\mathcal{G}=\{\mathcal{E},\mathcal{R},\mathcal{T},\mathcal{F}\}$, where $\mathcal{T}$ denotes the set of time representations (timestamps or time intervals) and $\mathcal{F}\subseteq\mathcal{E}\times\mathcal{R}\times\mathcal{E}\times\mathcal{T}$ is the fact set. Each fact is a quadruple $(s,r,o,t)$, where $s,o\in\mathcal{E}$, $r\in\mathcal{R}$, and $t\in\mathcal{T}$.
For these two types of KGs, there are mainly the following reasoning scenarios (a query for predicting the head entity can be converted to tail-entity prediction by adding reverse relations), as illustrated in Figure 2:
$\bullet$ Transductive Reasoning on SKGs: Given a background SKG $\mathcal{G}=\{\mathcal{E},\mathcal{R},\mathcal{F}\}$, the task is to predict the missing entity for the query $(\tilde{s},\tilde{r},?)$. The true answer $\tilde{o}\in\mathcal{E}$, and $\tilde{s}\in\mathcal{E}$, $\tilde{r}\in\mathcal{R}$, $(\tilde{s},\tilde{r},\tilde{o})\notin\mathcal{F}$.
$\bullet$ Inductive Reasoning on SKGs: It indicates that there are new entities appearing in the testing stage, which were not present during the training phase. Formally, the training graph can be expressed as $\mathcal{G}_{t}=\{\mathcal{E}_{t},\mathcal{R},\mathcal{F}_{t}\}$. The inductive graph $\mathcal{G}_{i}=\{\mathcal{E}_{i},\mathcal{R},\mathcal{F}_{i}\}$ shares the same relation set with $\mathcal{G}_{t}$. However, their entity sets are disjoint, i.e., $\mathcal{E}_{t}\cap\mathcal{E}_{i}=\varnothing$. A model needs to predict the missing entity $\tilde{o}$ for the query $(\tilde{s},\tilde{r},?)$, where $\tilde{s}\in\mathcal{E}_{i}$, $\tilde{o}\in\mathcal{E}_{i}$, $\tilde{r}\in\mathcal{R}$, and $(\tilde{s},\tilde{r},\tilde{o})\notin\mathcal{F}_{i}$.
$\bullet$ Interpolation Reasoning on TKGs: For a query $(\tilde{s},\tilde{r},?,\tilde{t})$ in the testing phase based on a training TKG $\mathcal{G}_{t}=\{\mathcal{E}_{t},\mathcal{R}_{t},\mathcal{T}_{t},\mathcal{F}_{t}\}$, a model needs to predict the answer entity $\tilde{o}$ using the facts in the TKG. It holds that $min(\mathcal{T}_{t})\leqslant\tilde{t}\leqslant max(\mathcal{T}_{t})$, where $min$ and $max$ denote the functions that obtain the minimum and maximum timestamps within the set, respectively. Also, the query satisfies $\tilde{s}\in\mathcal{E}_{t}$, $\tilde{o}\in\mathcal{E}_{t}$, $\tilde{r}\in\mathcal{R}_{t}$, and $(\tilde{s},\tilde{r},\tilde{o},\tilde{t})\notin\mathcal{F}_{t}$.
$\bullet$ Extrapolation Reasoning on TKGs: Similar to interpolation reasoning, the task is to predict the target entity $\tilde{o}$ for a query $(\tilde{s},\tilde{r},?,\tilde{t})$ in the testing phase, based on a training TKG $\mathcal{G}_{t}=\{\mathcal{E}_{t},\mathcal{R}_{t},\mathcal{T}_{t},\mathcal{F}_{t}\}$. Differently, this task is to predict future facts, which means the prediction only utilizes the facts that occur earlier than $\tilde{t}$ in TKGs, i.e., $\tilde{t}>max(\mathcal{T}_{t})$.
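The key distinction between the two temporal scenarios, which facts a model may condition on relative to the query time, can be sketched as follows (toy quadruples with illustrative integer timestamps):

```python
# Toy TKG quadruples (s, r, o, t); names and timestamps are illustrative.
facts = [
    ("Obama", "visitTo", "China", 1),
    ("Merkel", "signAgreement", "SouthKorea", 2),
    ("Obama", "makeStatement", "NorthKorea", 3),
]

def usable_facts(facts, query_time, extrapolation):
    """Interpolation may condition on the whole training TKG (the query time lies
    inside its time span); extrapolation predicts future facts and must only use
    facts strictly earlier than the query time."""
    if extrapolation:
        return [f for f in facts if f[3] < query_time]
    return list(facts)

# An extrapolation query at t=3 sees only the history before t=3.
hist = usable_facts(facts, query_time=3, extrapolation=True)
```

This is also why a model trained for interpolation cannot be applied to extrapolation directly: it has no mechanism preventing it from conditioning on facts at or after the query timestamp.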
<details>
<summary>extracted/6596839/fig/transductive.png Details</summary>

### Visual Description
## Diagram: Relationship and Geographical Connections of Individuals and Entities
### Overview
The image is a conceptual diagram illustrating relationships and geographical connections between individuals (Michelle Obama, Barack Obama) and entities (U.S.A., Honolulu, Hawaii). Arrows labeled with relationship types (e.g., "marriedTo," "locatedIn") connect nodes representing people, cities, states, and countries. A dashed red arrow labeled "nationalityOf?" introduces uncertainty about Barack Obama's nationality.
### Components/Axes
- **Nodes**:
- Michelle Obama (top-left, purple border)
- Barack Obama (bottom-left, green border)
- U.S.A. (center, blue flag with stars)
- Honolulu (top-right, red border with city seal)
- Hawaii (bottom-right, gold border with state seal)
- **Arrows**:
- Solid black arrows: "marriedTo," "liveIn," "locatedIn," "hasCity"
- Dashed red arrow: "nationalityOf?" (from Barack Obama to U.S.A.)
- **Text Labels**:
- "nationalityOf?" (dashed red arrow)
- "bornIn" (solid black arrow from Barack Obama to U.S.A.)
- "locatedIn Country" (solid black arrow from Honolulu to U.S.A.)
- "hasCity" (solid black arrow from Hawaii to Honolulu)
- "locatedIn Country" (solid black arrow from Hawaii to U.S.A.)
### Detailed Analysis
- **Michelle Obama**:
- Connected to U.S.A. via "liveIn" (solid black arrow).
- Positioned top-left, with a portrait and name label.
- **Barack Obama**:
- Connected to U.S.A. via "bornIn" (solid black arrow).
- Dashed red arrow labeled "nationalityOf?" questions his nationality, pointing to U.S.A.
- Positioned bottom-left, with a portrait and name label.
- **U.S.A.**:
- Central node with an American flag.
- Connected to Honolulu ("locatedIn Country"), Hawaii ("locatedIn Country"), and both Obamas.
- **Honolulu**:
- Connected to U.S.A. ("locatedIn Country") and Hawaii ("hasCity").
- Top-right node with a city seal.
- **Hawaii**:
- Connected to U.S.A. ("locatedIn Country") and Honolulu ("hasCity").
- Bottom-right node with a state seal.
### Key Observations
1. **Relationships**:
- Michelle Obama is explicitly linked to the U.S.A. via residence.
- Barack Obama's "bornIn" relationship to the U.S.A. is contrasted with the uncertain "nationalityOf?" label, suggesting ambiguity.
2. **Geographical Hierarchy**:
- Honolulu (city) → Hawaii (state) → U.S.A. (country) forms a nested geographical structure.
3. **Uncertainty**:
- The dashed red arrow and question mark highlight unresolved or debated aspects of Barack Obama's nationality.
### Interpretation
The diagram maps real-world connections (e.g., familial ties, geography) while introducing a speculative element ("nationalityOf?"). This could reflect historical debates about Barack Obama's birthplace or citizenship, despite his documented birth in Hawaii. The structure emphasizes hierarchical relationships (individual → city → state → country) and uses visual cues (solid vs. dashed arrows) to denote certainty and uncertainty. The inclusion of portraits personalizes the nodes, grounding abstract concepts in identifiable figures.
</details>
(a) Transductive reasoning on SKGs.
<details>
<summary>extracted/6596839/fig/inductive.png Details</summary>

### Visual Description
## Network Diagram: Relationships and Affiliations
### Overview
The diagram illustrates connections between individuals, organizations, and geographic entities using labeled arrows. Key nodes include people (Christopher Nolan, Emma Thomas), a company (Syncopy Inc.), and locations (London, United Kingdom). Relationships are defined by directional arrows with explicit labels, some marked with dashed lines to indicate uncertainty.
### Components/Axes
- **Nodes**:
- **Christopher Nolan** (left, blue circle)
- **Emma Thomas** (bottom-left, red circle)
- **Syncopy Inc.** (top-center, black circle with blue text)
- **London** (right-center, green circle with image of Big Ben)
- **United Kingdom** (bottom-right, purple circle with Union Jack)
- **Arrows/Labels**:
- **Solid Black Arrows**:
- `cofounderOf` (Christopher Nolan → Syncopy Inc.)
- `hasofficeIn` (Syncopy Inc. → London)
- `marriedTo` (Christopher Nolan → Emma Thomas)
- `bornIn` (Emma Thomas → United Kingdom)
- **Dashed Red Arrows**:
- `nationalityOf` (Emma Thomas → United Kingdom, bidirectional with question mark)
### Detailed Analysis
1. **Christopher Nolan**:
- Linked to **Syncopy Inc.** via `cofounderOf`.
- Married to **Emma Thomas** via `marriedTo`.
2. **Syncopy Inc.**:
- Has an office in **London** via `hasofficeIn`.
3. **Emma Thomas**:
- Born in the **United Kingdom** via `bornIn`.
- Her nationality is ambiguously linked to the UK via a dashed `nationalityOf` arrow (question mark included).
4. **Geographic Nodes**:
- **London** and **United Kingdom** are terminal nodes with no outgoing arrows.
### Key Observations
- The dashed `nationalityOf` arrow introduces uncertainty about Emma Thomas's nationality, contrasting with the definitive `bornIn` relationship.
- **Syncopy Inc.** and **London** form a hierarchical relationship (company → location), while **Christopher Nolan** and **Emma Thomas** are connected through marriage.
- No other nodes or relationships are present beyond those explicitly labeled.
### Interpretation
The diagram maps professional, personal, and geographic affiliations. The dashed `nationalityOf` arrow suggests ambiguity or debate about Emma Thomas's nationality despite her birth in the UK. The structure emphasizes direct, unambiguous connections (e.g., co-founding, marriage) versus speculative or contested ones (nationality). The absence of additional nodes implies a focused scope on these specific relationships.
</details>
(b) Inductive reasoning on SKGs using training data in (a).
<details>
<summary>extracted/6596839/fig/interpolation.png Details</summary>

### Visual Description
## Network Diagram: Geopolitical Interactions Over Time
### Overview
The diagram illustrates a dynamic network of geopolitical interactions between political figures (Barack Obama, Angela Merkel) and countries (China, Russia, South Korea, North Korea, Pakistan, Singapore) across three discrete time intervals (t_i-2, t_i-1, t_i). Nodes represent actors, while directed edges encode actions with explicit labels. A timeline at the bottom anchors the temporal progression.
### Components/Axes
- **Nodes**:
- **Political Figures**: Barack Obama (green border), Angela Merkel (blue border)
- **Countries**: China (red flag), Russia (purple flag), South Korea (white/red flag), North Korea (red star), Pakistan (green crescent), Singapore (red crescent)
- **Edges**:
- Labeled actions: "express," "extend," "sign," "negotiate," "make Statement," "consume?"
- Dashed red edge labeled "consume?" between North Korea and Pakistan
- **Timeline**:
- Three segments: t_i-2 (left), t_i-1 (center), t_i (right)
- Gray circular nodes connect time intervals
### Detailed Analysis
#### Time Interval t_i-2
- **Barack Obama** (green node):
- "express ExtendTo" → South Korea
- "make Statement" → South Korea
- "VisitTo" → China
- "negotiate" → Russia
- **Angela Merkel** (blue node):
- "express ExtendTo" → South Korea
- "sign Agreement" → South Korea
#### Time Interval t_i-1
- **Angela Merkel**:
- "express ExtendTo" → Pakistan
- "sign Agreement" → South Korea
- **Uncertain Interaction**:
- Dashed red edge labeled "consume?" between North Korea and Pakistan
#### Time Interval t_i
- **Barack Obama**:
- "express ExtendTo" → Pakistan
- "make Statement" → North Korea
- **South Korea**:
- "consult" → North Korea
### Key Observations
1. **Temporal Evolution**:
- Obama's focus shifts from South Korea (t_i-2) to Pakistan/North Korea (t_i)
- Merkel maintains consistent engagement with South Korea but expands to Pakistan (t_i-1)
2. **Uncertainty Marker**:
- The "consume?" label on the North Korea-Pakistan edge suggests disputed or ambiguous interactions
3. **Geopolitical Patterns**:
- South Korea acts as a central node, receiving actions from both political figures
- China and Russia appear only in t_i-2, indicating earlier engagement
### Interpretation
The diagram reveals a strategic shift in U.S. and German foreign policy priorities over time. Obama's transition from bilateral engagement with South Korea to multilateral interactions with Pakistan and North Korea may reflect evolving regional security concerns. Merkel's sustained focus on South Korea, coupled with her Pakistan outreach, suggests a dual strategy of maintaining alliances while expanding diplomatic reach. The "consume?" label introduces ambiguity in North Korea-Pakistan relations, potentially indicating contested resource flows or unresolved diplomatic issues. The timeline structure emphasizes the episodic nature of these interactions, possibly correlating with specific policy announcements or geopolitical events.
</details>
(c) Interpolation reasoning on TKGs.
(d) Extrapolation reasoning on TKGs.
Figure 2: Illustration of four reasoning scenarios on KGs: transductive, inductive, interpolation, and extrapolation. The red dashed arrows indicate the query fact to be predicted.
2.2 Logic Reasoning on KGs
Logical reasoning uses a given set of facts (i.e., premises) to deduce new facts (i.e., conclusions) through a rigorous form of thinking [42, 43]. It generally covers propositional and first-order logic (also known as predicate logic). Propositional logic deals with declarative sentences that can be definitively assigned a truth value, leaving no room for ambiguity. On KGs, it is usually realized as multi-hop reasoning [44, 35], which views each fact as a declarative sentence and reasons over query-related paths to obtain an answer. Thus, propositional reasoning on KGs is entity-dependent. First-order logic (FOL) can be regarded as an extension of propositional logic, enabling the expression of more refined and nuanced ideas [42, 45]. FOL rules broaden the modeling scope and application prospects by introducing quantifiers ($\forall$ and $\exists$), predicates, and variables. The variables range over a specific domain, covering objects and the relationships among them [46]. FOL rules usually take the form $premise \rightarrow conclusion$, where $premise$ and $conclusion$ denote the rule body and rule head, both composed of atomic formulas. Each atomic formula consists of a predicate and several variables, e.g., $bornIn(X,Y)$ in $\gamma_{1}$ of Figure 1, where $bornIn$ is the predicate and $X$ and $Y$ are entity variables. Thus, FOL reasoning is entity-independent, leveraging the same FOL rules for different entities [47]. In this paper, we utilize Horn rules [48], whose rule head is a single atomic formula, to enhance the adaptability of FOL rules to various KG reasoning tasks. Furthermore, to make Horn rules suitable for multiple reasoning scenarios, we introduce the following definitions.
Connected and Closed Horn (CCH) Rule. Based on Horn rules, CCH rules possess two distinct features, i.e., connected and closed. The term connected means the rule body necessitates a transitive and chained connection between atomic formulas through shared variables. Concurrently, the term closed indicates the rule body and rule head utilize identical start and end variables.
CCH rules of length $n$ (the quantifier $\forall$ is omitted for brevity in the remainder of the paper) take the following form:
$$
\begin{split}\epsilon,\;\forall X,Y_{1},Y_{2},\cdots,Y_{n-1},Z\;\;&r_{1}(X,Y_{1})\land r_{2}(Y_{1},Y_{2})\land\cdots\\
&\land r_{n}(Y_{n-1},Z)\rightarrow r(X,Z),\end{split} \tag{1}
$$
where atomic formulas in the rule body are connected by the variables $X,Y_{1},Y_{2},\cdots,Y_{n-1},Z$. For example, $r_{1}(X,Y_{1})$ and $r_{2}(Y_{1},Y_{2})$ are connected by $Y_{1}$. Meanwhile, all variables form a path from $X$ to $Z$, which are the start and end variables of the rule head $r(X,Z)$, respectively. $r_{1},r_{2},\cdots,r_{n},r$ are relations in KGs that represent predicates. To model the different credibility of different rules, we assign a rule confidence $\epsilon\in[0,1]$ to each Horn rule. Rule length refers to the number of atomic formulas in the rule body. For example, $\gamma_{1}$, $\gamma_{2}$, and $\gamma_{3}$ in Figure 1 are example Horn rules of lengths 2, 3, and 3, respectively. A grounding of a Horn rule is obtained by replacing each variable with a real entity, e.g., bornIn(Barack Obama, Hawaii) $\land$ locatedInCountry(Hawaii, U.S.A.) $\rightarrow$ nationalityOf(Barack Obama, U.S.A.) is a grounding of rule $\gamma_{1}$. CCH rules can be utilized for transductive and inductive reasoning.
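To make the grounding operation concrete, the following is a minimal sketch of checking groundings of a length-2 CCH rule against a toy fact set. The helper `cch_groundings` and the toy facts are illustrative, not part of the paper's implementation; only the example rule itself comes from the text.

```python
# Hedged sketch: enumerate groundings of a length-2 CCH rule body
# r1(X, Y1) AND r2(Y1, Z) -> head(X, Z) over a toy fact set, and report
# whether the head fact already holds for each grounding.

def cch_groundings(facts, r1, r2, head):
    """Return ((X, Y1, Z), head_holds) for each body grounding."""
    out = []
    for (s1, p1, o1) in facts:
        if p1 != r1:
            continue
        for (s2, p2, o2) in facts:
            # connected: the two atoms share Y1; closed: head links X and Z
            if p2 == r2 and s2 == o1:
                out.append(((s1, o1, o2), (s1, head, o2) in facts))
    return out

facts = {
    ("Barack Obama", "bornIn", "Hawaii"),
    ("Hawaii", "locatedInCountry", "U.S.A."),
    ("Barack Obama", "nationalityOf", "U.S.A."),
}
# One grounding: X=Barack Obama, Y1=Hawaii, Z=U.S.A.; the head already holds.
print(cch_groundings(facts, "bornIn", "locatedInCountry", "nationalityOf"))
```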
Temporal Interpolation Horn (TIH) Rule. Based on CCH rules on static KGs that require connected and closed variables, TIH rules assign each atomic formula a time variable.
An example of TIH rule can be:
$$
\epsilon,\;\forall X,Y,Z\;\;r_{1}(X,Y):t_{1}\land r_{2}(Y,Z):t_{2}\rightarrow r(X,Z):t, \tag{2}
$$
where $t_{1}$, $t_{2}$, and $t$ are time variables. To expand the model capacity when grounding TIH rules, time variables are virtual and do not have to be instantiated to real timestamps, unlike the entity variables (e.g., $X$, $Y$, $Z$). However, we model the relative order of occurrence. This implies that TIH rules with the same atomic formulas but different time-variable conditions are distinct and may have different confidences, e.g., $t_{1}<t_{2}$ vs. $t_{1}>t_{2}$.
Temporal Extrapolation Horn (TEH) Rule. Based on CCH rules on static KGs that require connected and closed variables, TEH rules assign each atomic formula a time variable. Unlike TIH rules, TEH rules have the characteristic of time growth, which means the time sequence is increasing and the time in the rule head is the maximum.
For example, the following rule is a TEH rule with length 2:
$$
\begin{split}\epsilon,\;\forall X,Y,Z\;\;&r_{1}(X,Y):t_{1}\land r_{2}(Y,Z):t_{2}\\
&\rightarrow r(X,Z):t,\;\;s.t.\;\;t_{1}\leqslant t_{2}<t.\end{split} \tag{3}
$$
Noticeably, for rule learning and reasoning, $t_{1}$ , $t_{2}$ and $t$ are also virtual time variables that are only used to satisfy the time growth and do not have to be instantiated.
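The two temporal conditions can be illustrated with a short sketch. The helper names below are illustrative, and integer timestamps stand in for the paper's virtual time variables, which in the model itself never need to be instantiated.

```python
# Hedged sketch: the temporal conditions that distinguish TIH and TEH rules,
# checked on concrete integer timestamps for illustration only.

def satisfies_teh(t1, t2, t_head):
    """TEH: time grows along the body and the head time is the maximum."""
    return t1 <= t2 < t_head

def satisfies_tih_order(t1, t2, order):
    """TIH: rules with the same atoms but different relative orders of
    t1 and t2 are distinct rules (and may carry different confidences)."""
    return t1 < t2 if order == "t1<t2" else t1 > t2

print(satisfies_teh(3, 5, 7))  # True: body times grow into the head time
print(satisfies_teh(3, 7, 5))  # False: head is not after the body
```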
Figure 3: An overview of the Tunsr. It utilizes multiple logic blocks to find the answer, where the reasoning graph is constructed and iteratively expanded. Meanwhile, a forward logic message-passing mechanism is proposed to update embeddings and attentions for unified propositional and FOL reasoning.
(a) An example of reasoning graph in SKGs.
(b) An example of reasoning graph in TKGs.
Figure 4: Examples of the reasoning graph with three iterations. (a) is on SKGs while (b) is on TKGs.
3 Methodology
In this section, we present the technical details of our Tunsr model. It leverages a combination of logic blocks to obtain reasoning results, which involves constructing or expanding a reasoning graph and introducing a forward logic message-passing mechanism for propositional and FOL reasoning. The overall architecture is illustrated in Figure 3.
3.1 Reasoning Graph Construction
For each query of KGs, i.e., $\mathcal{Q}=(\tilde{s},\tilde{r},?)$ for SKGs or $\mathcal{Q}=(\tilde{s},\tilde{r},?,\tilde{t})$ for TKGs, we introduce an expanding reasoning graph to find the answer. The formulation is as follows.
Reasoning Graph. For a specific query $\mathcal{Q}$, a reasoning graph is defined as $\widetilde{\mathcal{G}}=\{\mathcal{O},\mathcal{R},\widetilde{\mathcal{F}}\}$ for propositional and first-order reasoning. $\mathcal{O}$ is a node set that consists of nodes at different iteration steps, i.e., $\mathcal{O}=\mathcal{O}_{0}\cup\mathcal{O}_{1}\cup\cdots\cup\mathcal{O}_{L}$. For SKGs, $\mathcal{O}_{0}$ contains only the query entity $\tilde{s}$, and subsequent nodes are entities. $(n_{i}^{l},\bar{r},n_{j}^{l+1})\in\widetilde{\mathcal{F}}$ is an edge that links nodes at two neighboring steps, i.e., $n_{i}^{l}\in\mathcal{O}_{l}$, $n_{j}^{l+1}\in\mathcal{O}_{l+1}$, and $\bar{r}\in\mathcal{R}$. The reasoning graph is constantly expanded by searching for posterior neighbor nodes. For the start node $n^{0}=\tilde{s}$, the posterior neighbors are $\mathcal{N}(n^{0})=\{e_{i}|(\tilde{s},\bar{r},e_{i})\in\mathcal{F}\}$. For a node at a following step $n_{i}^{l}=e_{i}\in\mathcal{O}_{l}$, the posterior neighbors are $\mathcal{N}(n_{i}^{l})=\{e_{j}|(e_{i},\bar{r},e_{j})\in\mathcal{F}\}$, and the preceding parents are $\widetilde{\mathcal{N}}(n_{i}^{l})=\{(n_{j}^{l-1},\bar{r})|n_{j}^{l-1}\in\mathcal{O}_{l-1}\land(n_{j}^{l-1},\bar{r},n_{i}^{l})\in\widetilde{\mathcal{F}}\}$. To take preceding nodes into account at the current step, an extra relation self is added. Then, $n_{i}^{l}=e_{i}$ reappears at the next step as $n_{i}^{l+1}=e_{i}$, with $(n_{i}^{l},self,n_{i}^{l+1})\in\widetilde{\mathcal{F}}$.
For TKGs, $\mathcal{O}_{0}$ also contains only the query entity $\tilde{s}$, but the following nodes are entity-time pairs. In the interpolation scenario, for the start node $n^{0}=\tilde{s}$, the posterior neighbors are $\mathcal{N}(n^{0})=\{(e_{i},t_{i})|(\tilde{s},\bar{r},e_{i},t_{i})\in\mathcal{F}\}$. For a node at a following step $n_{i}^{l}=(e_{i},t_{i})\in\mathcal{O}_{l}$, the posterior neighbors are $\mathcal{N}(n_{i}^{l})=\{(e_{j},t_{j})|(e_{i},\bar{r},e_{j},t_{j})\in\mathcal{F}\}$. In contrast, in the extrapolation scenario, for the start node $n^{0}=\tilde{s}$, the posterior neighbors are $\mathcal{N}(n^{0})=\{(e_{i},t_{i})|(\tilde{s},\bar{r},e_{i},t_{i})\in\mathcal{F}\land t_{i}<\tilde{t}\}$, and for a node at a following step $n_{i}^{l}=(e_{i},t_{i})\in\mathcal{O}_{l}$, the posterior neighbors are $\mathcal{N}(n_{i}^{l})=\{(e_{j},t_{j})|(e_{i},\bar{r},e_{j},t_{j})\in\mathcal{F}\land t_{i}\leqslant t_{j}\land t_{j}<\tilde{t}\}$. Similar to the SKG case, the preceding parents of nodes in the TKG scenarios are $\widetilde{\mathcal{N}}(n_{i}^{l})=\{(n_{j}^{l-1},\bar{r})|n_{j}^{l-1}\in\mathcal{O}_{l-1}\land(n_{j}^{l-1},\bar{r},n_{i}^{l})\in\widetilde{\mathcal{F}}\}$, and an extra relation self is also added. Then, $n_{i}^{l}=(e_{i},t_{i})$ reappears at the next step as $n_{i}^{l+1}=(e_{i},t_{i})$ ($t_{i}$ is the minimum time if $l=0$), with $(n_{i}^{l},self,n_{i}^{l+1})\in\widetilde{\mathcal{F}}$.
Two examples of the reasoning graph with three iterations are shown in Figure 4. Through the above processing, we can model both propositional and FOL reasoning in a unified manner for different reasoning scenarios.
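The iterative expansion above can be sketched in a few lines. This is a simplified illustration of the extrapolation-style construction, assuming facts are quadruples and collapsing the paper's self relation into a set copy; the function name, toy facts, and timestamps are all illustrative.

```python
# Hedged sketch of reasoning-graph expansion (extrapolation setting):
# nodes are (entity, time) pairs, a "self" step carries every node forward,
# and posterior neighbors must satisfy t_i <= t_j < t_query.

def expand(facts, query_entity, t_query, L):
    t0 = min((t for (s, r, o, t) in facts), default=0)
    layers = [{(query_entity, t0)}]
    for _ in range(L):
        nxt = set(layers[-1])  # the "self" relation keeps preceding nodes
        for (e_i, t_i) in layers[-1]:
            for (s, r, o, t_j) in facts:
                if s == e_i and t_i <= t_j < t_query:
                    nxt.add((o, t_j))
        layers.append(nxt)
    return layers

facts = [("A", "r1", "B", 1), ("B", "r2", "C", 2), ("C", "r3", "D", 5)]
layers = expand(facts, "A", t_query=4, L=2)
print(layers[2])  # D is excluded: its timestamp 5 is not before t_query=4
```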
3.2 Modeling of Propositional Reasoning
For decoding the answer to a specific query $\mathcal{Q}$, we introduce an iterative forward message-passing mechanism on the continuously expanding reasoning graph, regulated by propositional and FOL reasoning. In the reasoning graph, we set two learnable parameters for each node $n_{i}^{l}$ to guide the propositional computation: the propositional embedding ${\rm\textbf{x}}_{i}^{l}$ and the propositional attention ${\alpha}_{n_{i}^{l}}$. For a better presentation, we use the reasoning process on TKGs to illustrate our method; SKGs can be considered a special case of TKGs in which the time information of the nodes in the reasoning graph is removed. The initialized embeddings for entity, relation, and time are denoted as h, g, and e. Time embeddings are obtained by the generic time encoding [49], as it is fully compatible with attention for capturing temporal dynamics. It is defined as ${\rm\textbf{e}}_{t}=\sqrt{\frac{1}{d_{t}}}[{\rm cos}(w_{1}t+b_{1}),\cdots,{\rm cos}(w_{d_{t}}t+b_{d_{t}})]$, where $[w_{1},\cdots,w_{d_{t}}]$ and $[b_{1},\cdots,b_{d_{t}}]$ are trainable transformation weights and biases, cos denotes the standard cosine function, and $d_{t}$ is the dimension of the time embedding.
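A minimal numpy sketch of this time encoding follows; the random frequencies and phases stand in for the trainable parameters $w$ and $b$, and the dimension is arbitrary.

```python
# Hedged sketch of the generic time encoding e_t = sqrt(1/d_t) *
# [cos(w_1 t + b_1), ..., cos(w_d t + b_d)], with random (untrained)
# parameters standing in for the learnable weights and biases.
import numpy as np

rng = np.random.default_rng(3)
d_t = 8
w, b = rng.normal(size=d_t), rng.normal(size=d_t)

def time_encode(t):
    return np.sqrt(1.0 / d_t) * np.cos(w * t + b)

e5 = time_encode(5)
print(e5.shape)  # (8,)
```

Because every component is a scaled cosine, each entry is bounded by $\sqrt{1/d_{t}}$, which keeps the encoding compatible with attention scoring.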
Further, the start node $n^{0}=\tilde{s}$ is initialized with its entity embedding ${\rm\textbf{x}}_{\tilde{s}}={\rm\textbf{h}}_{\tilde{s}}$. A node $n_{i}=(e_{i},t_{i})$ at the following iterations is first represented by a linear transformation of embeddings: ${\rm\textbf{x}}_{i}={\rm\textbf{W}}_{n}[{\rm\textbf{h}}_{e_{i}}\|{\rm\textbf{e}}_{t_{i}}]$ (W represents a linear transformation and $\|$ denotes embedding concatenation throughout the paper). Constant forward computation along the reasoning sequence toward the target is required when conducting multi-hop propositional reasoning. Thus, forward message-passing is proposed to pass information (i.e., representations and attention weights) from preceding nodes to their posterior neighbor nodes. The computation of each node is contextualized with preceding entity-dependent information, reflecting the continuous accumulation of knowledge and credibility in the reasoning process. Specifically, to update node embeddings at step $l$+1, a node's own feature and the information from its priors are integrated:
$$
{\rm\textbf{x}}_{j}^{l+1}={\rm\textbf{W}}_{1}^{l}{\rm\textbf{x}}_{j}+\sum_{(n_{i}^{l},\bar{r})\in\widetilde{\mathcal{N}}(n_{j}^{l+1})}\alpha_{n_{i}^{l},\bar{r},n_{j}^{l+1}}{\rm\textbf{W}}_{2}^{l}{\rm\textbf{m}}_{n_{i}^{l},\bar{r},n_{j}^{l+1}}, \tag{4}
$$
where ${\rm\textbf{m}}_{n_{i}^{l},\bar{r},n_{j}^{l+1}}$ is the message from a preceding node to its posterior node, which is given by the node and relation representations:
$$
{\rm\textbf{m}}_{n_{i}^{l},\bar{r},n_{j}^{l+1}}={\rm\textbf{W}}_{3}^{l}[{\rm\textbf{x}}_{i}^{l}\|{\rm\textbf{g}}_{\bar{r}}\|{\rm\textbf{x}}_{j}]. \tag{5}
$$
This updating form superficially seems similar to the general message-passing in GNNs [16]. However, they are actually different as ours is in a one-way and hierarchical manner, which is tailored for the tree-like structure of the reasoning graph. The propositional attention weight $\alpha_{n_{i}^{l},\bar{r},n_{j}^{l+1}}$ is for each edge in a reasoning graph. As propositional reasoning is entity-dependent, we compute it by the semantic association of entity-dependent embeddings between the message and the query:
$$
e_{n_{i}^{l},\bar{r},n_{j}^{l+1}}=\textsc{sigmoid}({\rm\textbf{W}}_{4}^{l}[{\rm\textbf{m}}_{n_{i}^{l},\bar{r},n_{j}^{l+1}}\|{\rm\textbf{q}}]), \tag{6}
$$
where ${\rm\textbf{q}}={\rm\textbf{W}}_{q}[{\rm\textbf{h}}_{\tilde{s}}\|{\rm\textbf{g}}_{\tilde{r}}\|{\rm\textbf{e}}_{\tilde{t}}]$ is the query embedding. Then, softmax normalization is utilized to scale the edge attentions at this iteration to [0,1]:
$$
\alpha_{n_{i}^{l},\bar{r},n_{j}^{l+1}}=\frac{\exp(e_{n_{i}^{l},\bar{r},n_{j}^{l+1}})}{\sum_{(n_{i^{\prime}}^{l},\bar{r}^{\prime})\in\widetilde{\mathcal{N}}(n_{j}^{l+1})}\exp(e_{n_{i^{\prime}}^{l},\bar{r}^{\prime},n_{j}^{l+1}})}. \tag{7}
$$
Finally, the propositional attention of new node $n_{j}^{l+1}$ is aggregated from edges for the next iteration:
$$
\alpha_{n_{j}^{l+1}}=\sum_{(n_{i}^{l},\bar{r})\in\widetilde{\mathcal{N}}(n_{j}^{l+1})}\alpha_{n_{i}^{l},\bar{r},n_{j}^{l+1}}. \tag{8}
$$
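One step of this update (Eqs. 4-8) can be sketched in numpy. This is a simplified illustration under stated assumptions: random matrices stand in for the learned $\textbf{W}$ parameters, the dimensions are arbitrary, and batching is omitted.

```python
# Hedged numpy sketch of one forward logic message-passing step:
# messages from preceding edges are scored against the query (sigmoid),
# softmax-normalized over the parents of the target node, and aggregated.
import numpy as np

rng = np.random.default_rng(0)
d = 4
W1, W2 = rng.normal(size=(d, d)), rng.normal(size=(d, 3 * d))
W4 = rng.normal(size=(1, 3 * d + d))
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def update_node(x_j, preds, q):
    """preds: list of (x_i, g_r) pairs for edges (n_i, r, n_j)."""
    msgs = [np.concatenate([x_i, g_r, x_j]) for (x_i, g_r) in preds]
    e = np.array([sigmoid(W4 @ np.concatenate([m, q]))[0] for m in msgs])
    att = np.exp(e) / np.exp(e).sum()     # Eq. 7: softmax over the parents
    agg = sum(a * (W2 @ m) for a, m in zip(att, msgs))
    return W1 @ x_j + agg, att.sum()      # Eq. 4 update, Eq. 8 aggregation

x_j, q = rng.normal(size=d), rng.normal(size=d)
preds = [(rng.normal(size=d), rng.normal(size=d)) for _ in range(3)]
x_new, alpha_j = update_node(x_j, preds, q)
print(x_new.shape)  # (4,)
```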
3.3 Modeling of FOL Reasoning
Different from propositional reasoning, FOL reasoning is entity-independent and generalizes better. As first-order reasoning focuses on the interactions among entity-independent relations, we first obtain the hidden FOL embedding of an edge by fusing the hidden FOL embedding of the preceding node with the current relation representation via a GRU [38]. Then, the FOL representation y and attention $b$ are given by:
$$
{\rm\textbf{y}}_{n_{i}^{l},\bar{r},n_{j}^{l+1}}=\textsc{gru}({\rm\textbf{g}}_{\bar{r}},{\rm\textbf{y}}_{n_{i}^{l}}), \tag{9}
$$
$$
b_{n_{i}^{l},\bar{r},n_{j}^{l+1}}=\textsc{sigmoid}({\rm\textbf{W}}_{5}^{l}{\rm\textbf{y}}_{n_{i}^{l},\bar{r},n_{j}^{l+1}}). \tag{10}
$$
Since a preceding node with high credibility leads to faithful subsequent nodes, the attention of the prior ($\beta$) flows into the current edge. Then, softmax normalization is utilized to scale the edge attentions at this iteration to [0,1]:
$$
\begin{split}b_{n_{i}^{l},\bar{r},n_{j}^{l+1}}&=\beta_{n_{i}^{l}}\cdot b_{n_{i}^{l},\bar{r},n_{j}^{l+1}},\\
\beta_{n_{i}^{l},\bar{r},n_{j}^{l+1}}&=\frac{\exp(b_{n_{i}^{l},\bar{r},n_{j}^{l+1}})}{\sum_{(n_{i^{\prime}}^{l},\bar{r}^{\prime})\in\widetilde{\mathcal{N}}(n_{j}^{l+1})}\exp(b_{n_{i^{\prime}}^{l},\bar{r}^{\prime},n_{j}^{l+1}})}.\end{split} \tag{11}
$$
Finally, the FOL representation and attention of a new node $n_{j}^{l+1}$ are aggregated from edges for the next iteration:
$$
\begin{split}{\rm\textbf{y}}_{n_{j}^{l+1}}&=\sum_{(n_{i}^{l},\bar{r})\in\widetilde{\mathcal{N}}(n_{j}^{l+1})}\beta_{n_{i}^{l},\bar{r},n_{j}^{l+1}}{\rm\textbf{y}}_{n_{i}^{l},\bar{r},n_{j}^{l+1}},\\
\beta_{n_{j}^{l+1}}&=\sum_{(n_{i}^{l},\bar{r})\in\widetilde{\mathcal{N}}(n_{j}^{l+1})}\beta_{n_{i}^{l},\bar{r},n_{j}^{l+1}}.\end{split} \tag{12}
$$
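The FOL update of Eqs. 9-12 can likewise be sketched in numpy. The minimal GRU cell below (no biases) and the random weights are illustrative stand-ins for the learned parameters, not the paper's implementation.

```python
# Hedged numpy sketch of the FOL update: a minimal GRU fuses the relation
# embedding into the preceding node's FOL state (Eq. 9), a sigmoid scores
# each edge (Eq. 10), the prior attention scales it and a softmax over
# parents normalizes (Eq. 11), then edges are aggregated (Eq. 12).
import numpy as np

rng = np.random.default_rng(1)
d = 4
Wz, Wr, Wh = (rng.normal(size=(d, 2 * d)) for _ in range(3))
W5 = rng.normal(size=(1, d))
sig = lambda z: 1.0 / (1.0 + np.exp(-z))

def gru(g_r, y_prev):  # bias-free stand-in for the GRU of Eq. 9
    zr = np.concatenate([g_r, y_prev])
    z, r = sig(Wz @ zr), sig(Wr @ zr)
    h = np.tanh(Wh @ np.concatenate([g_r, r * y_prev]))
    return (1 - z) * y_prev + z * h

def fol_update(parents):
    """parents: list of (g_r, y_prev, beta_prev) for edges into n_j."""
    ys = [gru(g, y) for (g, y, _) in parents]
    b = np.array([bp * sig(W5 @ y_e)[0]
                  for (_, _, bp), y_e in zip(parents, ys)])
    beta = np.exp(b) / np.exp(b).sum()            # Eq. 11 softmax
    y_new = sum(w * y_e for w, y_e in zip(beta, ys))  # Eq. 12
    return y_new, beta

parents = [(rng.normal(size=d), rng.normal(size=d), 0.5) for _ in range(2)]
y_new, beta = fol_update(parents)
print(y_new.shape, beta.shape)
```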
Insights of FOL Rule Learning and Reasoning.
Actually, Tunsr introduces a novel FOL learning and reasoning strategy via the forward logic message-passing mechanism over reasoning graphs. In general, learning and reasoning with FOL rules on KGs or TKGs follow a two-step fashion [20, 50, 51, 33, 28, 23, 18]. First, the whole dataset is searched to mine rules and their confidences. Second, for a query, the model instantiates all variables to find all groundings of the learned rules and then aggregates the confidences of the eligible rules. For example, for a target entity $o$, its score can be the sum of the confidences of learned rules with valid groundings, where rule confidences can be modeled by a GRU. However, this is apparently not differentiable and cannot be optimized in an end-to-end manner because of the discrete rule learning and grounding operations. Thus, our model conducts the transformation of merging multiple rules by merging possible relations at each step, using FOL attention as:
$$
\begin{split}S_{o}&=\sum_{\gamma\in\Gamma}\beta_{\gamma}=\underbrace{\sum_{\gamma\in\Gamma}f\big[\textsc{gru}({\rm\textbf{g}}_{\gamma,h},{\rm\textbf{g}}_{\gamma,b^{1}},\cdots,{\rm\textbf{g}}_{\gamma,b^{|\gamma|}})\big]}_{(a)}\\
&\approx\underbrace{\prod_{l=1}^{L}\sum_{n_{j}\in\mathcal{O}_{l}}\bar{f}_{l}\big[\textsc{gru}({\rm\textbf{g}}_{\bar{r}},{\rm\textbf{y}}_{n_{j}}^{l})\big]}_{(b)}.\end{split} \tag{13}
$$
$\beta_{\gamma}$ is the confidence of rule $\gamma$. ${\rm\textbf{g}}_{\gamma,h}$ and ${\rm\textbf{g}}_{\gamma,b^{i}}$ are the relation embeddings of the head $h$ and the $i$-th body atom $b^{i}$ of this rule. Part (a) utilizes the groundings of the learned rules to calculate reasoning scores, where each rule's confidence can be modeled by a GRU and a feedforward network $f$. Since reasoning can be conducted at each step rather than as whole multi-step processing, part (a) can be approximated by part (b), where $\bar{f}_{l}$ performs the attention calculation. In this way, a differentiable process is achieved. This is an extension and progression of Neural LP [21] and DRUM [32], introducing several specific strategies for unified KG reasoning. Finally, the real FOL rules can be easily induced by constantly performing attention calculations over the reasoning graph, which is summarized as the Forward Attentive Rule Induction (FARI) algorithm. Algorithm 1 presents the TKG version; the SKG version is obtained by omitting time information. In this way, Tunsr is able to capture CCH, TIH, and TEH rules with the specifically designed reasoning graphs described in Section 3.1. As an extra self relation is added to the reasoning graph, the FARI algorithm can obtain all possible rules (of length at most L) by deleting atoms with the self relation from the induced FOL rules.
Input: the reasoning graph $\widetilde{\mathcal{G}}$ , FOL attentions $\beta$ .
Output: the FOL rule set $\Gamma$ .
1 Init $\Gamma=\varnothing$ , $B(n_{\tilde{s}}^{0})=[[0,[]]]$ , $\mathcal{D}_{0}[n_{\tilde{s}}^{0}]=[1,B(n_{\tilde{s}}^{0})]$ ;
2 for $l=1$ to $L$ of decoder iterations do
3   Initialize the node-rule dictionary $\mathcal{D}_{l}$ ;
4   for node $n_{j}^{l}$ in $\mathcal{O}_{l}$ do
5     Set rule body list $B(n_{j}^{l})$ = [] ;
6     for ( $n_{i}^{l-1},\bar{r}$ ) of $\widetilde{\mathcal{N}}$ ( $n_{j}^{l}$ ) in $\mathcal{O}_{l-1}$ do
7       Prior $e_{i,l-1}^{2}$ , $B(n_{i}^{l-1})$ = $\mathcal{D}_{l-1}[n_{i}^{l-1}]$ ;
8       for weight $\epsilon$ , body $\gamma_{b}$ in $B(n_{i}^{l-1})$ do
9         $\epsilon^{\prime}=e_{i,l-1}^{2}\cdot e_{n_{i}^{l-1},\bar{r},n_{j}^{l}}^{2}$ ;
10        $\gamma^{\prime}_{b}=\gamma_{b}.add(\bar{r})$ , $B(n_{j}^{l}).add([\epsilon^{\prime},\gamma^{\prime}_{b}])$ ;
11    $e_{j,l}^{2}=sum\{\epsilon\mid[\epsilon,\gamma_{b}]\in B(n_{j}^{l})\}$ ;
12    Add $n_{j}^{l}$ : [ $e_{j,l}^{2}$ , $B(n_{j}^{l})$ ] to $\mathcal{D}_{l}$ ;
13  Normalize $e_{j,l}^{2}$ of $n_{j}^{l}$ in $\mathcal{O}_{l}$ using softmax;
14 for $n_{i}^{L}$ in $\mathcal{O}_{L}$ do
15   $e_{i,L}^{2}$ , $B(n_{i}^{L})$ = $\mathcal{D}_{L}[n_{i}^{L}]$ ;
16   for $\epsilon,\gamma_{b}$ in $B(n_{i}^{L})$ do
17     $\Gamma.add([\epsilon,\gamma_{b}[1](X,Y_{1}):t_{1}\land\cdots\land\gamma_{b}[L](Y_{L-1},Z):t_{L}\rightarrow\tilde{r}(X,Z):t])$ ;
Return rule set $\Gamma$ .
Algorithm 1: FARI for FOL rule induction.
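The core loop of Algorithm 1 can be sketched in plain Python on a toy layered reasoning graph. This is a simplified illustration, not the paper's implementation: `layers`, `edges`, and `attn` are assumed input encodings, and a plain sum-normalization stands in for the softmax of the normalization step.

```python
from collections import defaultdict

def fari(layers, edges, attn, L):
    """Toy sketch of Forward Attentive Rule Induction (Algorithm 1).

    layers[l] : node ids at reasoning step l (layers[0] holds the start node).
    edges[l]  : (prev_node, relation, next_node) triples between steps l-1 and l.
    attn[l]   : dict mapping each such triple to its learned edge attention.
    Returns (weight, relation-body) pairs collected at the last step L.
    """
    # D[node] = (node confidence e^2, list of (weight, body)), as in the paper
    D = {layers[0][0]: (1.0, [(1.0, [])])}
    for l in range(1, L + 1):
        D_next = defaultdict(lambda: (0.0, []))
        for ni, r, nj in edges[l]:
            e_prev, bodies = D[ni]
            for _, body in bodies:
                # new weight = prior node confidence * edge attention (line 9)
                eps_new = e_prev * attn[l][(ni, r, nj)]
                conf, blist = D_next[nj]
                blist.append((eps_new, body + [r]))   # extend rule body (line 10)
                D_next[nj] = (conf + eps_new, blist)  # accumulate confidence (line 11)
        # normalize node confidences over the step (softmax in the paper;
        # simple sum-normalization keeps this toy example readable)
        total = sum(c for c, _ in D_next.values()) or 1.0
        D = {n: (c / total, b) for n, (c, b) in D_next.items()}
    rules = []
    for n in layers[L]:            # collect rule bodies reaching the last step
        if n in D:
            rules.extend(D[n][1])
    return rules
```

On a one-step graph with two relations of attention 0.7 and 0.3, the sketch returns the two single-atom bodies with exactly those weights, mirroring how rule confidence accumulates along reasoning paths.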
3.4 Reasoning Prediction and Process Overview
After calculation with $L$ logic blocks, the reasoning score for each entity can be obtained. For each entity $o$ at the last step of the reasoning graph for SKGs, we utilize the representations and attention values from the propositional and FOL reasoning to calculate answer validity:
$$
{\rm\textbf{h}}_{o}=(1-\lambda){\rm\textbf{x}}_{o}+\lambda{\rm\textbf{y}}_{o},\quad\gamma_{o}=(1-\lambda)\alpha_{o}+\lambda\beta_{o}, \tag{14}
$$
where $\lambda$ is a learnable weight for combining propositional and FOL reasoning; it is calculated dynamically from the propositional embedding ${\rm\textbf{x}}_{o}$ , the FOL embedding ${\rm\textbf{y}}_{o}$ , and the query embedding ${\rm\textbf{q}}$ . $\alpha_{o}$ and $\beta_{o}$ are the learned attention values for propositional and FOL reasoning, respectively. Based on these, the final score is given by:
$$
s(\mathcal{Q},o)={\rm\textbf{W}}_{5}{\rm\textbf{h}}_{o}+\gamma_{o}. \tag{15}
$$
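As a concrete illustration of Eqs. (14) and (15), the combination can be computed as follows. The embedding size, the attention values, and the fixed $\lambda$ are placeholder assumptions for the sketch, since all of them are learned in Tunsr.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                                    # illustrative embedding size
x_o = rng.normal(size=d)                 # propositional representation of entity o
y_o = rng.normal(size=d)                 # FOL representation of entity o
alpha_o, beta_o = 0.6, 0.4               # illustrative learned attention values
lam = 0.5                                # combination weight lambda (learned in Tunsr)
W5 = rng.normal(size=(1, d))             # scoring projection W_5

h_o = (1 - lam) * x_o + lam * y_o             # Eq. (14), representation part
gamma_o = (1 - lam) * alpha_o + lam * beta_o  # Eq. (14), attention part
score = (W5 @ h_o).item() + gamma_o           # Eq. (15), final reasoning score
```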
Reasoning scores for entities that are not in the last step of the reasoning graph are set to 0, as this indicates that no propositional or FOL rules are available for those entities. Finally, the model is optimized with the multi-class log-loss [52], as in RED-GNN:
$$
\mathcal{L}=\sum_{\mathcal{Q}}\Big[-s(\mathcal{Q},o)+\log\big(\sum_{\bar{o}\in\mathcal{E}}\exp(s(\mathcal{Q},\bar{o}))\big)\Big], \tag{16}
$$
where $s(\mathcal{Q},o)$ denotes the reasoning score of the labeled entity $o$ for query $\mathcal{Q}$ , while $\bar{o}$ is an arbitrary entity. For reasoning on TKGs, we first need to aggregate the embeddings and attentions of nodes sharing the same entity to obtain entity scores, because every node in the reasoning graph of a TKG, except the start node, is an entity-time pair.
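Eq. (16) is the standard multi-class log-loss, i.e., the negative log-softmax probability of the labeled entity over all candidate entities. For a single query it can be sketched as:

```python
import math

def multiclass_log_loss(scores, label):
    """Eq. (16) for one query: -s(Q, o) + log(sum over entities of exp(s))."""
    log_z = math.log(sum(math.exp(s) for s in scores))  # log-partition term
    return -scores[label] + log_z
```

Summing this quantity over all training queries gives the objective $\mathcal{L}$; minimizing it pushes the labeled entity's score above those of the other entities.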
The number of nodes in the reasoning graph may explode, growing exponentially up to $|\mathcal{N}(n_{i})|^{L}$ over the iterations, especially for TKGs. For computational efficiency, we introduce iteration fusion and sampling strategies for interpolation and extrapolation reasoning, respectively. In the interpolation scenario, nodes of entity-time pairs sharing the same entity are fused into a single entity node, which is then used to expand the reasoning graph. In the extrapolation scenario, the posterior neighbors of each node are sampled with a maximum of $M$ nodes per iteration. When sampling the $M$ nodes, we follow a time-aware weighted sampling strategy, considering that recent events may have a greater impact on the forecast target. Specifically, for a posterior neighbor node with time $t^{\prime}$ , we compute its sampling weight as $\frac{\exp(t^{\prime}-\tilde{t})}{\sum_{\bar{t}}{\exp(\bar{t}-\tilde{t})}}$ for the query ( $\tilde{s}$ , $\tilde{r}$ ,?, $\tilde{t}$ ), where $\bar{t}$ ranges over the times of all possible posterior neighbor nodes of a prior node. After computing the attention weight of each edge in the same iteration, we select the top- $N$ edges with the largest attention weights and prune the others.
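The time-aware weighted sampling above can be sketched as follows. This is an illustrative reading of the strategy, not the paper's implementation: it assumes posterior neighbors are encoded as (node, time) pairs and realizes weighted sampling without replacement via Efraimidis-Spirakis keys; the softmax denominator cancels in the relative weights.

```python
import math
import random

def time_aware_sample(neighbors, t_query, m, seed=0):
    """Sample at most m posterior neighbors of a node, favoring recent times.

    neighbors: list of (node, time) pairs. A neighbor at time t' gets relative
    weight exp(t' - t_query); the normalizing denominator cancels out, so the
    weight can be used directly in Efraimidis-Spirakis sampling keys.
    """
    if len(neighbors) <= m:
        return list(neighbors)
    rng = random.Random(seed)
    def key(pair):
        w = math.exp(pair[1] - t_query)   # time-aware weight (larger if recent)
        return rng.random() ** (1.0 / w)  # larger key => more likely to be kept
    return sorted(neighbors, key=key, reverse=True)[:m]
```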
4 Experiments and Results
4.1 Experiment Setups
The baselines cover a wide range of mainstream techniques and strategies for KG reasoning, with detailed descriptions provided in the Appendix. In the following parts of this section, we will carry out experiments and analyze results to answer the following four research questions.
$\bullet$ RQ1. How does the unified Tunsr perform in KG reasoning compared to state-of-the-art baselines?
$\bullet$ RQ2. How effective are propositional and FOL reasoning, and is it reasonable to integrate them?
$\bullet$ RQ3. What factors affect the reasoning performance of the Tunsr framework?
$\bullet$ RQ4. What is the actual reasoning process of Tunsr?
4.2 Comparison Results (RQ1)
TABLE II: The experiment results of transductive reasoning. The optimal and suboptimal values on each metric are marked in red and blue, respectively. The percent signs (%) for Hits@k metrics are omitted for better presentation. The following tables have a similar setting.
| Model | WN18RR | | | | FB15k-237 | | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | MRR | Hits@1 | Hits@3 | Hits@10 | MRR | Hits@1 | Hits@3 | Hits@10 |
| TransE [19] | 0.481 | 43.30 | 48.90 | 57.00 | 0.342 | 24.00 | 37.80 | 52.70 |
| DistMult [53] | 0.430 | 39.00 | 44.00 | 49.00 | 0.241 | 15.50 | 26.30 | 41.90 |
| UltraE [54] | 0.485 | 44.20 | 50.00 | 57.30 | 0.349 | 25.10 | 38.50 | 54.10 |
| ComplEx-DURA [55] | 0.491 | 44.90 | — | 57.10 | 0.371 | 27.60 | — | 56.00 |
| AutoBLM [56] | 0.490 | 45.10 | — | 56.70 | 0.360 | 26.70 | — | 55.20 |
| SE-GNN [57] | 0.484 | 44.60 | 50.90 | 57.20 | 0.365 | 27.10 | 39.90 | 54.90 |
| RED-GNN [58] | 0.533 | 48.50 | β | 62.40 | 0.374 | 28.30 | β | 55.80 |
| CompoundE [59] | 0.491 | 45.00 | 50.80 | 57.60 | 0.357 | 26.40 | 39.30 | 54.50 |
| GATH [60] | 0.463 | 42.60 | 47.50 | 53.70 | 0.344 | 25.30 | 37.60 | 52.70 |
| TGformer [61] | 0.493 | 45.50 | 50.90 | 56.60 | 0.372 | 27.90 | 41.00 | 55.70 |
| AMIE [62] | 0.360 | 39.10 | — | 48.50 | 0.230 | 14.80 | — | 41.90 |
| AnyBURL [63] | 0.454 | 39.90 | — | 56.20 | 0.342 | 25.80 | — | 50.20 |
| SAFRAN [64] | 0.501 | 45.70 | — | 58.10 | 0.370 | 28.70 | — | 53.10 |
| Neural LP [21] | 0.381 | 36.80 | 38.60 | 40.80 | 0.237 | 17.30 | 25.90 | 36.10 |
| DRUM [32] | 0.382 | 36.90 | 38.80 | 41.00 | 0.238 | 17.40 | 26.10 | 36.40 |
| RLogic [23] | 0.470 | 44.30 | — | 53.70 | 0.310 | 20.30 | — | 50.10 |
| RNNLogic [33] | 0.483 | 44.60 | 49.70 | 55.80 | 0.344 | 25.20 | 38.00 | 53.00 |
| LatentLogic [24] | 0.481 | 45.20 | 49.70 | 55.30 | 0.320 | 21.20 | 32.90 | 51.40 |
| RNN+RotE [65] | 0.550 | 51.00 | 57.20 | 63.50 | 0.353 | 26.50 | 38.70 | 52.90 |
| TCRA [66] | 0.496 | 45.70 | 51.10 | 57.40 | 0.367 | 27.50 | 40.30 | 55.40 |
| Tunsr | 0.558 | 51.36 | 58.25 | 65.78 | 0.389 | 28.82 | 41.83 | 57.15 |
TABLE III: The experiment results on 12 inductive reasoning datasets.
| | Model | WN18RR | | | | FB15k-237 | | | | NELL-995 | | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | | V1 | V2 | V3 | V4 | V1 | V2 | V3 | V4 | V1 | V2 | V3 | V4 |
| MRR | GraIL [67] | 0.627 | 0.625 | 0.323 | 0.553 | 0.279 | 0.276 | 0.251 | 0.227 | 0.481 | 0.297 | 0.322 | 0.262 |
| | RED-GNN [58] | 0.701 | 0.690 | 0.427 | 0.651 | 0.369 | 0.469 | 0.445 | 0.442 | 0.637 | 0.419 | 0.436 | 0.363 |
| | MLSAA [68] | 0.716 | 0.700 | 0.448 | 0.654 | 0.368 | 0.457 | 0.442 | 0.431 | 0.694 | 0.424 | 0.433 | 0.359 |
| | RuleN [69] | 0.668 | 0.645 | 0.368 | 0.624 | 0.363 | 0.433 | 0.439 | 0.429 | 0.615 | 0.385 | 0.381 | 0.333 |
| | Neural LP [21] | 0.649 | 0.635 | 0.361 | 0.628 | 0.325 | 0.389 | 0.400 | 0.396 | 0.610 | 0.361 | 0.367 | 0.261 |
| | DRUM [32] | 0.666 | 0.646 | 0.380 | 0.627 | 0.333 | 0.395 | 0.402 | 0.410 | 0.628 | 0.365 | 0.375 | 0.273 |
| | Tunsr | 0.721 | 0.722 | 0.451 | 0.656 | 0.375 | 0.474 | 0.462 | 0.456 | 0.746 | 0.427 | 0.455 | 0.387 |
| Hits@1 | GraIL [67] | 55.40 | 54.20 | 27.80 | 44.30 | 20.50 | 20.20 | 16.50 | 14.30 | 42.50 | 19.90 | 22.40 | 15.30 |
| | RED-GNN [58] | 65.30 | 63.30 | 36.80 | 60.60 | 30.20 | 38.10 | 35.10 | 34.00 | 52.50 | 31.90 | 34.50 | 25.90 |
| | MLSAA [68] | 66.20 | 64.50 | 39.10 | 61.20 | 29.20 | 36.60 | 35.60 | 34.00 | 56.00 | 33.30 | 34.30 | 25.30 |
| | RuleN [69] | 63.50 | 61.10 | 34.70 | 59.20 | 30.90 | 34.70 | 34.50 | 33.80 | 54.50 | 30.40 | 30.30 | 24.80 |
| | Neural LP [21] | 59.20 | 57.50 | 30.40 | 58.30 | 24.30 | 28.60 | 30.90 | 28.90 | 50.00 | 24.90 | 26.70 | 13.70 |
| | DRUM [32] | 61.30 | 59.50 | 33.00 | 58.60 | 24.70 | 28.40 | 30.80 | 30.90 | 50.00 | 27.10 | 26.20 | 16.30 |
| | Tunsr | 66.25 | 66.31 | 38.11 | 61.55 | 30.44 | 37.88 | 37.90 | 36.37 | 73.13 | 32.67 | 37.13 | 27.30 |
| Hits@10 | GraIL [67] | 76.00 | 77.60 | 40.90 | 68.70 | 42.90 | 42.40 | 42.40 | 38.90 | 56.50 | 49.60 | 51.80 | 50.60 |
| | RED-GNN [58] | 79.90 | 78.00 | 52.40 | 72.10 | 48.30 | 62.90 | 60.30 | 62.10 | 86.60 | 60.10 | 59.40 | 55.60 |
| | MLSAA [68] | 81.10 | 79.60 | 54.40 | 72.40 | 49.00 | 61.60 | 58.90 | 59.70 | 87.80 | 59.40 | 59.20 | 55.00 |
| | RuleN [69] | 73.00 | 69.40 | 40.70 | 68.10 | 44.60 | 59.90 | 60.00 | 60.50 | 76.00 | 51.40 | 53.10 | 48.40 |
| | Neural LP [21] | 77.20 | 74.90 | 47.60 | 70.60 | 46.80 | 58.60 | 57.10 | 59.30 | 87.10 | 56.40 | 57.60 | 53.90 |
| | DRUM [32] | 77.70 | 74.70 | 47.70 | 70.20 | 47.40 | 59.50 | 57.10 | 59.30 | 87.30 | 54.00 | 57.70 | 53.10 |
| | Tunsr | 85.87 | 83.98 | 60.76 | 73.28 | 55.96 | 63.24 | 61.43 | 63.28 | 88.56 | 62.14 | 61.05 | 58.78 |
TABLE IV: The experiment results (Hits@10 metrics) on 12 inductive reasoning datasets with 50 negative entities for ranking.
| Model | WN18RR | | | | FB15k-237 | | | | NELL-995 | | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | V1 | V2 | V3 | V4 | V1 | V2 | V3 | V4 | V1 | V2 | V3 | V4 |
| GraIL [67] | 82.45 | 78.68 | 58.43 | 73.41 | 64.15 | 81.80 | 82.83 | 89.29 | 59.50 | 93.25 | 91.41 | 73.19 |
| CoMPILE [70] | 83.60 | 79.82 | 60.69 | 75.49 | 67.64 | 82.98 | 84.67 | 87.44 | 58.38 | 93.87 | 92.77 | 75.19 |
| TACT [71] | 84.04 | 81.63 | 67.97 | 76.56 | 65.76 | 83.56 | 85.20 | 88.69 | 79.80 | 88.91 | 94.02 | 73.78 |
| RuleN [69] | 80.85 | 78.23 | 53.39 | 71.59 | 49.76 | 77.82 | 87.69 | 85.60 | 53.50 | 81.75 | 77.26 | 61.35 |
| Neural LP [21] | 74.37 | 68.93 | 46.18 | 67.13 | 52.92 | 58.94 | 52.90 | 55.88 | 40.78 | 78.73 | 82.71 | 80.58 |
| DRUM [32] | 74.37 | 68.93 | 46.18 | 67.13 | 52.92 | 58.73 | 52.90 | 55.88 | 19.42 | 78.55 | 82.71 | 80.58 |
| ConGLR [26] | 85.64 | 92.93 | 70.74 | 92.90 | 68.29 | 85.98 | 88.61 | 89.31 | 81.07 | 94.92 | 94.36 | 81.61 |
| SymRITa [72] | 91.22 | 88.32 | 73.22 | 81.67 | 74.87 | 84.41 | 87.11 | 88.97 | 64.50 | 94.22 | 95.43 | 85.56 |
| Tunsr | 93.69 | 93.72 | 86.48 | 89.27 | 95.37 | 89.33 | 89.38 | 92.16 | 89.05 | 97.91 | 94.69 | 92.63 |
TABLE V: The experiment results of interpolation reasoning on the ICEWS14 and ICEWS0515 datasets.
| Model | ICEWS14 | | | | ICEWS0515 | | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | MRR | Hits@1 | Hits@3 | Hits@10 | MRR | Hits@1 | Hits@3 | Hits@10 |
| TTransE [73] | 0.255 | 7.40 | — | 60.10 | 0.271 | 8.40 | — | 61.60 |
| DE-SimplE [74] | 0.526 | 41.80 | 59.20 | 72.50 | 0.513 | 39.20 | 57.80 | 74.80 |
| TA-DistMult [75] | 0.477 | 36.30 | — | 68.60 | 0.474 | 34.60 | — | 72.80 |
| ChronoR [76] | 0.625 | 54.70 | 66.90 | 77.30 | 0.675 | 59.60 | 72.30 | 82.00 |
| TComplEx [77] | 0.610 | 53.00 | 66.00 | 77.00 | 0.660 | 59.00 | 71.00 | 80.00 |
| TNTComplEx [77] | 0.620 | 52.00 | 66.00 | 76.00 | 0.670 | 59.00 | 71.00 | 81.00 |
| TeLM [78] | 0.625 | 54.50 | 67.30 | 77.40 | 0.678 | 59.90 | 72.80 | 82.30 |
| BoxTE [79] | 0.613 | 52.80 | 66.40 | 76.30 | 0.667 | 58.20 | 71.90 | 82.00 |
| RotateQVS [80] | 0.591 | 50.70 | 64.20 | 75.04 | 0.633 | 52.90 | 70.90 | 81.30 |
| TeAST [27] | 0.637 | 56.00 | 68.20 | 78.20 | 0.683 | 60.40 | 73.20 | 82.90 |
| Tunsr | 0.648 | 56.21 | 69.61 | 80.16 | 0.705 | 59.89 | 74.67 | 83.95 |
TABLE VI: The experiment results of extrapolation reasoning, including ICEWS14, ICEWS0515, and ICEWS18 datasets.
| Model | ICEWS14 | | | | ICEWS0515 | | | | ICEWS18 | | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | MRR | Hits@1 | Hits@3 | Hits@10 | MRR | Hits@1 | Hits@3 | Hits@10 | MRR | Hits@1 | Hits@3 | Hits@10 |
| TransE [19] | 0.224 | 13.36 | 25.63 | 41.23 | 0.225 | 13.05 | 25.61 | 42.05 | 0.122 | 5.84 | 12.81 | 25.10 |
| DistMult [53] | 0.276 | 18.16 | 31.15 | 46.96 | 0.287 | 19.33 | 32.19 | 47.54 | 0.107 | 4.52 | 10.33 | 21.25 |
| ComplEx [81] | 0.308 | 21.51 | 34.48 | 49.58 | 0.316 | 21.44 | 35.74 | 52.04 | 0.210 | 11.87 | 23.47 | 39.87 |
| TTransE [73] | 0.134 | 3.11 | 17.32 | 34.55 | 0.157 | 5.00 | 19.72 | 38.02 | 0.083 | 1.92 | 8.56 | 21.89 |
| TA-DistMult [75] | 0.264 | 17.09 | 30.22 | 45.41 | 0.243 | 14.58 | 27.92 | 44.21 | 0.167 | 8.61 | 18.41 | 33.59 |
| TA-TransE [75] | 0.174 | 0.00 | 29.19 | 47.41 | 0.193 | 1.81 | 31.34 | 50.33 | 0.125 | 0.01 | 17.92 | 37.38 |
| DE-SimplE [74] | 0.326 | 24.43 | 35.69 | 49.11 | 0.350 | 25.91 | 38.99 | 52.75 | 0.193 | 11.53 | 21.86 | 34.80 |
| TNTComplEx [77] | 0.321 | 23.35 | 36.03 | 49.13 | 0.275 | 19.52 | 30.80 | 42.86 | 0.212 | 13.28 | 24.02 | 36.91 |
| RE-Net [82] | 0.382 | 28.68 | 41.34 | 54.52 | 0.429 | 31.26 | 46.85 | 63.47 | 0.288 | 19.05 | 32.44 | 47.51 |
| CyGNet [83] | 0.327 | 23.69 | 36.31 | 50.67 | 0.349 | 25.67 | 39.09 | 52.94 | 0.249 | 15.90 | 28.28 | 42.61 |
| AnyBURL [63] | 0.296 | 21.26 | 33.33 | 46.73 | 0.320 | 23.72 | 35.45 | 50.46 | 0.227 | 15.10 | 25.44 | 38.91 |
| TLogic [28] | 0.430 | 33.56 | 48.27 | 61.23 | 0.469 | 36.21 | 53.13 | 67.43 | 0.298 | 20.54 | 33.95 | 48.53 |
| TR-Rules [29] | 0.433 | 33.96 | 48.55 | 61.17 | 0.476 | 37.06 | 53.80 | 67.57 | 0.304 | 21.10 | 34.58 | 48.92 |
| xERTE [84] | 0.407 | 32.70 | 45.67 | 57.30 | 0.466 | 37.84 | 52.31 | 63.92 | 0.293 | 21.03 | 33.51 | 46.48 |
| TITer [85] | 0.417 | 32.74 | 46.46 | 58.44 | — | — | — | — | 0.299 | 22.05 | 33.46 | 44.83 |
| TECHS [30] | 0.438 | 34.59 | 49.36 | 61.95 | 0.483 | 38.34 | 54.69 | 68.92 | 0.308 | 21.81 | 35.39 | 49.82 |
| INFER [86] | 0.441 | 34.52 | 48.92 | 62.14 | 0.483 | 37.61 | 54.30 | 68.52 | 0.317 | 21.94 | 35.64 | 50.88 |
| Tunsr | 0.447 | 35.16 | 50.39 | 63.32 | 0.491 | 38.31 | 55.67 | 69.88 | 0.321 | 22.99 | 36.68 | 51.08 |
The experiments on transductive, inductive, interpolation, and extrapolation reasoning are carried out to evaluate the performance. The results are shown in Tables II, III, V and VI, respectively. It can be observed that our model has performance advantages over neural, symbolic, and neurosymbolic methods.
Specifically, from Table II of transductive reasoning, it is observed that Tunsr achieves the optimal performance. Compared with advanced neural methods, Tunsr shows clear advantages. For example, it improves the Hits@10 values on the two datasets by 8.78%, 16.78%, 8.48%, 8.68%, 9.08%, 3.38%, 8.18%, 12.08% and 4.45%, 15.25%, 3.05%, 1.15%, 1.95%, 1.35%, 2.65%, 4.45% over the TransE, DistMult, UltraE, ComplEx-DURA, AutoBLM, RED-GNN, CompoundE, and GATH models, respectively. Moreover, compared with symbolic and neurosymbolic methods, the advantages of Tunsr are more obvious. Over the symbolic methods (AMIE, AnyBURL, and SAFRAN), the average improvements in MRR, Hits@1, and Hits@10 are 0.119, 9.79%, and 11.51% on WN18RR and 0.075, 5.72%, and 8.75% on FB15k-237. Over the advanced neurosymbolic methods RNNLogic, LatentLogic, and RNN+RotE, Tunsr also shows performance advantages, achieving Hits@10 improvements of 9.98%, 10.48%, and 2.28% on WN18RR and 4.15%, 5.75%, and 4.25% on FB15k-237.
For inductive reasoning, Tunsr also holds a performance advantage over all neural, symbolic, and neurosymbolic methods, as Table III shows, especially on the WN18RR V1, WN18RR V2, WN18RR V3, FB15k-237 V1, and NELL-995 V1 datasets. Specifically, Tunsr outperforms the neural methods GraIL, MLSAA, and RED-GNN. Compared with RED-GNN, it achieves 5.97%, 5.98%, 8.36%, 1.18%, 7.66%, 0.34%, 1.13%, 1.18%, 1.96%, 2.04%, 1.65%, and 3.18% improvements on the Hits@10 metric, an average improvement of 3.39%. Against the symbolic and neurosymbolic methods (RuleN, Neural LP, and DRUM), Tunsr has greater performance advantages. For example, compared with DRUM, Tunsr achieves average improvements of 0.069, 8.19%, and 6.05% on the MRR, Hits@1, and Hits@10 metrics, respectively. Besides, for a fair comparison with CoMPILE [70], TACT [71], ConGLR [26], and SymRITa [72], we evaluate under their setting, which ranks each query against 50 negative entities rather than all entities. The results, shown in Table IV, also verify the superiority of our model.
*(Line chart: Hits@10 values (%) over training epochs; the unified setting consistently outperforms the single propositional and FOL settings, with propositional above FOL.)*
(a) WN18RR of SKG T.
*(Line chart: Hits@10 values (%) over training epochs; the unified setting attains the highest values after the early epochs, with propositional above FOL.)*
(b) ICEWS14 of TKG I.
*(Line chart: Hits@10 values (%) over training epochs; FOL and unified remain high, while propositional declines after the early epochs.)*
(c) ICEWS14 of TKG E.
*(Line chart: Hits@10 values (%) over training epochs; the unified setting remains highest, FOL is close behind, and propositional is lowest.)*
(d) ICEWS18 of TKG E.
Figure 5: The impacts of propositional and FOL reasoning on transductive, interpolation, and extrapolation scenarios. It is generally observed that the unified model has a better performance compared with the single propositional or FOL setting, demonstrating the validity and rationality of the unified mechanism in Tunsr.
For interpolation reasoning in Table V, the performance of Tunsr surpasses that of mainstream neural reasoning methods, achieving optimal results on seven out of eight metrics. Compared with the classic tensor-decomposition method TNTComplEx, the improvements on the metrics are 0.028, 4.21%, 3.61%, 4.16%, 0.035, 0.89%, 3.67%, and 2.95%, respectively. Moreover, compared with the state-of-the-art model TeAST, which encodes temporal knowledge graph embeddings via the Archimedean spiral timeline, Tunsr also has advantages of 0.011, 0.21%, 1.41%, 1.96%, 0.022, -0.51%, 1.47%, and 1.05% (only slightly lower on the Hits@1 metric of the ICEWS0515 dataset).
As Table VI shows for extrapolation reasoning, Tunsr also performs better. Compared with 10 neural reasoning methods, Tunsr has obvious performance advantages. For instance, it achieves 14.19%, 27.02%, and 14.17% Hits@10 improvements on the three datasets against the tensor-decomposition method TNTComplEx. Additionally, Tunsr outperforms symbolic rule-based methods, i.e., AnyBURL, TLogic, and TR-Rules, achieving average improvements of 0.061, 5.57%, 7.01%, 6.94%, 0.069, 5.98%, 8.21%, 8.06%, 0.045, 4.08%, 5.36%, and 5.63% on all 12 evaluation metrics. Moreover, Tunsr surpasses three neurosymbolic methods (xERTE, TITer, and INFER) across all datasets. Furthermore, compared with the previous study TECHS, Tunsr also delivers a performance boost, with 1.37%, 0.96%, and 1.26% gains on the Hits@10 metric.
In summary, the experimental results on four reasoning scenarios demonstrate the effectiveness and superiority of the proposed unified framework Tunsr. They show the rationality of the unified mechanism from both the methodological and application perspectives and highlight its potential for future KG reasoning frameworks.
4.3 Ablation Studies (RQ2)
To explore the impacts of the propositional and FOL parts on KG reasoning performance, we carry out ablation studies on the transductive (WN18RR), interpolation (ICEWS14), and extrapolation (ICEWS14 and ICEWS18) scenarios in Figure 5. As inductive reasoning is entity-independent, we only conduct experiments with FOL reasoning for it. Each line chart depicts the performance trends of propositional, FOL, and unified reasoning over the training epochs. In the propositional/FOL setting, we set $\lambda$ in Eq. (14) to 0/1, so that the model answers with propositional/FOL reasoning alone. In the unified setting, $\lambda$ is learned dynamically from the embeddings. From the figure, it is generally observed that the unified setting outperforms the single propositional or FOL setting. It is noteworthy that propositional and FOL reasoning display distinct characteristics on different datasets. For transductive and interpolation reasoning (Figures 5(a) and 5(b)), propositional reasoning consistently surpasses FOL, although both improve continuously throughout training. The opposite holds in the extrapolation scenario (Figures 5(c) and 5(d)), where FOL reasoning has the advantage. It is also noted that, under the extrapolation setting, propositional reasoning performs well on ICEWS18 but badly on ICEWS14. This may be caused by their structural differences: the graph structure of ICEWS18 is notably denser than that of ICEWS14 (average node degree 16.19 vs. 8.94), so propositional reasoning on ICEWS18 can capture more comprehensive pattern semantics and generalize robustly in testing scenarios. These observations indicate that propositional and FOL reasoning emphasize distinct aspects of knowledge.
Thus, combining them allows for the synergistic exploitation of their respective strengths, resulting in an enhanced overall effect.
*(Bar chart: MRR, Hits@1, and Hits@10 under 2, 4, 6, and 8 reasoning steps; performance improves with more steps and largely plateaus after 6 steps.)*
(a) WN18RR of SKG T.
<details>
<summary>extracted/6596839/fig/bar2.png Details</summary>

Grouped bar chart of MRR, Hits@1, and Hits@10 (y-axis 0–0.7) under rule lengths of 2, 4, 6, and 8 steps. All three metrics peak at 6 steps (MRR about 0.47, Hits@1 about 0.39, Hits@10 about 0.62) and decline slightly at 8 steps.
</details>
(b) FB15k-237 v3 of SKG I.
<details>
<summary>extracted/6596839/fig/bar3.png Details</summary>

Grouped bar chart of MRR, Hits@1, and Hits@10 (y-axis 0–0.8) under rule lengths of 1 to 4 steps. All three metrics rise sharply from 1 to 2 steps and plateau after 3 steps (MRR about 0.66, Hits@1 about 0.57, Hits@10 about 0.80).
</details>
(c) ICEWS14 of TKG I.
<details>
<summary>extracted/6596839/fig/bar4.png Details</summary>

Grouped bar chart of MRR, Hits@1, and Hits@10 (y-axis 0–0.7) under rule lengths of 1 to 4 steps. All three metrics peak at 3 steps (MRR about 0.45, Hits@1 about 0.35, Hits@10 about 0.65), with slight drops in MRR and Hits@10 at 4 steps.
</details>
(d) ICEWS14 of TKG E.
Figure 6: The impacts of reasoning iterations which correspond to the length of the reasoning rules. It is evident that choosing the appropriate value is crucial for obtaining accurate reasoning results.
<details>
<summary>extracted/6596839/fig/bar3d1.png Details</summary>

3D bar chart of Hits@10 (%) over the sampling parameters M and N, with a blue-to-red color gradient spanning roughly 48% to 64%. Hits@10 rises with both parameters, from the lowest values at the smallest M and N to the highest at the largest settings, with a plateau in the mid-range.
</details>
(a) Performance on ICEWS14.
<details>
<summary>extracted/6596839/fig/bar3d2.png Details</summary>

3D bar chart of GPU memory (GB, z-axis 0–40) over the sampling parameters M and N. Memory grows monotonically with both parameters, from a few GB at the smallest settings to about 40 GB at the largest.
</details>
(b) Space on ICEWS14.
<details>
<summary>extracted/6596839/fig/bar3d3.png Details</summary>

3D bar chart of Hits@10 (%) over the sampling parameters M and N, with a blue-to-red color gradient spanning roughly 38% to 52%. Hits@10 increases with both parameters, from about 38% at the smallest settings to about 52% at the largest.
</details>
(c) Performance on ICEWS18.
<details>
<summary>extracted/6596839/fig/bar3d4.png Details</summary>

3D bar chart of GPU memory (GB, z-axis 0–40) over the sampling parameters M and N. Memory grows monotonically with both parameters, from about 0–2 GB at the smallest settings to about 40 GB at the largest.
</details>
</details>
(d) Space on ICEWS18.
Figure 7: The impacts of sampling in the reasoning process. Performance and GPU space usage with batch size 64. Large values of M and N can achieve excellent performance at the cost of increased space requirements.
4.4 Hyperparameter Analysis (RQ3)
We run our model with different hyperparameters to explore their impacts, as shown in Figures 6 and 7. Specifically, Figure 6 illustrates the performance variation with different reasoning iterations, i.e., the length of the reasoning rules. On the WN18RR and FB15k-237 v3 datasets of the transductive and inductive settings, experiments with rule lengths of 2, 4, 6, and 8 are carried out, as illustrated in Figures 6(a) and 6(b). It is observed that performance generally improves as the number of iterations increases from 2 to 6. When the rule length continues to increase, the inference performance changes little or decreases slightly. The same phenomenon can also be observed in Figures 6(c) and 6(d), which correspond to interpolation and extrapolation reasoning on the ICEWS14 dataset. There, the rule length ranges from 1 to 4, and the model performance typically displays an initial improvement, followed by a tendency to stabilize or decline marginally. This occurs because a greater rule length amplifies the modeling capability while potentially introducing noise into the reasoning process. Therefore, choosing an appropriate rule length (number of reasoning iterations) is significant for KG reasoning.
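The metrics swept in Figure 6 can be computed directly from the rank of each test query's true answer. A minimal sketch (not the paper's code; the ranks below are hypothetical illustrative data):

```python
def mrr(ranks):
    """Mean Reciprocal Rank over a list of 1-based ranks of true answers."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at(k, ranks):
    """Fraction of queries whose true answer is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

# Hypothetical ranks of the true entity for a handful of test queries,
# one list per reasoning-iteration setting (rule length).
ranks_by_length = {2: [1, 5, 12, 40], 4: [1, 2, 9, 15], 6: [1, 2, 3, 11]}

for length, ranks in sorted(ranks_by_length.items()):
    print(length, round(mrr(ranks), 3), hits_at(1, ranks), hits_at(10, ranks))
```

Sweeping such metrics over the rule length, as in Figure 6, is how the saturation beyond 6 steps (or 3 steps for the temporal datasets) would be detected.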
We also explore the impacts of the hyperparameters M for node sampling and N for edge selection on the ICEWS14 and ICEWS18 datasets of extrapolation reasoning. The results are shown in Figure 7. For each dataset, we report the reasoning performance (Hits@10) and the space used (GPU memory), with M varying in {50, 100, 200, 600, 800, 1000} and N varying in {40, 60, 80, 100, 120, 140}. It is evident that opting for smaller values results in a significant decline in performance. This decline can be attributed to the inadequate numbers of nodes and edges, which lead to insufficient and unstable training, respectively. Furthermore, once N surpasses 120, the marginal gains become smaller or even turn into performance degradation. Additionally, when M and N are increased, the GPU memory utilization of the model grows rapidly, as depicted in Figures 7(b) and 7(d), with a particularly pronounced effect from M.
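The budgets M and N can be read as top-k selection over attention scores when the reasoning graph is expanded. The sketch below is an assumption about the mechanism (keeping only the highest-attention candidates within a budget), not the paper's implementation; all names and values are hypothetical:

```python
import heapq

def sample_top(items, scores, budget):
    """Keep the `budget` items with the largest attention scores."""
    ranked = heapq.nlargest(budget, zip(scores, items))
    return [item for _, item in ranked]

# Hypothetical frontier of candidate nodes with attention scores;
# with M (or N) as the budget, only the strongest candidates survive.
nodes = ["e1", "e2", "e3", "e4", "e5"]
attn = [0.40, 0.05, 0.25, 0.20, 0.10]
print(sample_top(nodes, attn, budget=3))  # → ['e1', 'e3', 'e4']
```

A larger budget retains more of the frontier, which explains the memory growth in Figure 7: every kept node and edge carries representations that must reside in GPU memory during message passing.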
TABLE VII: Some reasoning cases in transductive, interpolation, and extrapolation scenarios, where both propositional reasoning and learned FOL rules are displayed. "$-1$" denotes the reverse of a specific relation, and textual descriptions of some relations are simplified. Values in orange rectangles represent propositional attentions, and relations marked in red in FOL rules represent the target relation to be predicted.
| Propositional Reasoning | FOL Rules |
| --- | --- |
|
<details>
<summary>extracted/6596839/fig/case1.png Details</summary>

Reasoning-graph snippet for the WN18RR query, with WordNet synset IDs as nodes (query entity 00238867) connected by relations such as verbGroup^-1, derivationallyRelatedForm, synsetDomainTopicOf, and hypernym. Edges carry propositional attention weights (0.05–0.59); check marks and crosses flag correct and incorrect candidate answers.
</details>
| [1]
0.21 verbGroup$^{-1}$(X,Z) $\rightarrow$ verbGroup(X,Z) [2]
0.32 verbGroup$^{-1}$(X,Y$_1$) $\land$ derivationallyRelatedForm(Y$_1$,Y$_2$) $\land$ derivationallyRelatedForm$^{-1}$(Y$_2$,Z) $\rightarrow$ verbGroup(X,Z) [3]
0.07 derivationallyRelatedForm$^{-1}$(X,Y$_1$) $\land$ derivationallyRelatedForm$^{-1}$(Y$_1$,Y$_2$) $\land$ verbGroup$^{-1}$(Y$_2$,Z) $\rightarrow$ verbGroup(X,Z) [4]
0.05 synsetDomainTopicOf(X,Y$_1$) $\land$ synsetDomainTopicOf$^{-1}$(Y$_1$,Y$_2$) $\land$ derivationallyRelatedForm(Y$_2$,Z) $\rightarrow$ verbGroup(X,Z) [5]
0.18 hypernym(X,Y$_1$) $\land$ hypernym$^{-1}$(Y$_1$,Y$_2$) $\land$ alsoSee(Y$_2$,Z) $\rightarrow$ verbGroup(X,Z) |
| Transductive reasoning: query (00238867, verbGroup, ?) in WN18RR | |
|
<details>
<summary>extracted/6596839/fig/case2.png Details</summary>

Reasoning-graph snippet for the ICEWS14 query, with time-stamped event nodes (e.g., Police (Ukraine), Protester (Ukraine), Security Service of Ukraine, dated January–April 2014) connected by relations such as reduceRelations, repression, consult, and makeStatement. Edges carry propositional attention weights (0.05–0.74); check marks and crosses flag correct and incorrect candidate answers.
</details>
| [1]
0.46 reduceRelations$^{-1}$(X,Z)$:t_{1}$ $\rightarrow$ makeAnAppeal(X,Z)$:t$ [2]
0.19 reduceRelations$^{-1}$(X,Y$_1$)$:t_{1}$ $\land$ repression(Y$_1$,Y$_2$)$:t_{2}$ $\land$ makeAnAppeal$^{-1}$(Y$_2$,Z)$:t_{3}$ $\rightarrow$ makeAnAppeal(X,Z)$:t$ [3]
0.14 obstructPassage$^{-1}$(X,Y$_1$)$:t_{1}$ $\land$ repression$^{-1}$(Y$_1$,Y$_2$)$:t_{2}$ $\land$ makeStatement(Y$_2$,Z)$:t_{3}$ $\rightarrow$ makeAnAppeal(X,Z)$:t$ [4]
0.12 consult(X,Y$_1$)$:t_{1}$ $\land$ consult$^{-1}$(Y$_1$,Y$_2$)$:t_{2}$ $\land$ discussByTelephone$^{-1}$(Y$_2$,Z)$:t_{3}$ $\rightarrow$ makeAnAppeal(X,Z)$:t$ |
| Interpolation reasoning: query (Party of Regions, makeAnAppeal, ?, 2014-05-15) in ICEWS14 | |
|
<details>
<summary>extracted/6596839/fig/case3.png Details</summary>

### Visual Description
## Network Diagram: Entity Interactions and Probabilities
### Overview
The diagram depicts a network of interactions between entities (countries, organizations, individuals) with labeled relationships and probabilistic weights. Nodes are color-coded (blue for Iran, orange for others), and edges represent actions (e.g., "accuse," "engageIn") with numerical probabilities. Dates are embedded in node labels to contextualize events.
### Components/Axes
- **Nodes**:
- **Blue Node**: Iran (central hub).
- **Orange Nodes**:
- United Nations (2018-07-31, 2018-08-02)
- Donald Trump (2018-06-08, 2018-08-02)
- Russia (2018-05-14, 2018-07-06)
- Morocco (2018-05-01)
- Germany (2018-04-12)
- Police (Afghanistan) (2018-08-22)
- **Edges**:
- Labeled with actions (e.g., "express," "defyNorms") and probabilities (0.05β0.62).
- Dashed edges indicate weaker or indirect connections.
- **Legend**: Located at bottom-right, mapping colors to entities (blue = Iran, orange = others).
### Detailed Analysis
#### Node Relationships
1. **Iran (2018-05-02)**:
- **Accuses** United Nations (2018-07-31) with probability **0.26**.
- **Makes Visit** to Morocco (2018-05-01) with probability **0.37**.
- **Hosts Visit** from Germany (2018-04-12) with probability **0.42**.
- **Makes Statement** to Russia (2018-05-03) with probability **0.34**.
2. **United Nations (2018-07-31)**:
- **Expresses Intent** to itself with probability **0.48**.
- **Defies Norms** (Law) with probability **0.09**.
- **Engages in Cooperation** with Russia (2018-05-14) with probability **0.34**.
3. **Donald Trump (2018-06-08, 2018-08-02)**:
- **Defies Norms** (Law) with probability **0.62** (highest in network).
- **Makes Optimistic Comment** to Police (Afghanistan) with probability **0.05** (lowest in network).
4. **Russia (2018-05-14, 2018-07-06)**:
- **Engages in Cooperation** with Iran (2018-05-14) with probability **0.34**.
- **Makes Optimistic Comment** to Police (Afghanistan) with probability **0.08**.
5. **Police (Afghanistan) (2018-08-22)**:
- Receives **Optimistic Comment** from Donald Trump (2018-06-08) with probability **0.06**.
- Receives **Optimistic Comment** from Russia (2018-07-06) with probability **0.08**.
#### Edge Trends
- **High-Probability Edges**:
- UN self-reference (**0.48**) and Trump's defyNormsLaw edge (**0.62**) dominate.
- **Low-Probability Edges**:
- Trump-Police (**0.05**) and Russia-Police (**0.08**) edges are weakest.
- **Dashed Edges**:
- Indicate indirect or less certain connections (e.g., UN-Trump dashed edge with **0.08**).
### Key Observations
1. **Central Role of Iran**: Iran is the primary connector, initiating interactions with multiple entities.
2. **UN Dominance**: The UN has the highest self-referential probability (**0.48**) but only a weak dashed tie to Trump (**0.08**).
3. **Trump's Polarized Influence**: Highest probability in the network (**0.62**, on the defyNormsLaw edge) but minimal impact on Police (Afghanistan) (**0.05**).
4. **Russiaβs Dual Role**: Engages with Iran and the UN but has weak ties to Police (Afghanistan).
5. **Morocco/Germany**: Limited interactions (only with Iran) and no further connections.
### Interpretation
The diagram reveals a geopolitical network where:
- **Iran** acts as a central node, driving interactions with global entities.
- The **United Nations** exhibits self-referential behavior (e.g., "expressIntentTo") and only a weak dashed tie to Trump (**0.08**), suggesting institutional influence more than bilateral alignment.
- **Donald Trump**'s highest-probability edge is defyNormsLaw (**0.62**), but his influence on ground-level actors like Police (Afghanistan) is negligible (**0.05**).
- **Russia** bridges Iran and the UN but has minimal impact on Afghanistan.
- **Low probabilities** for Police (Afghanistan) interactions may reflect logistical barriers, political distance, or data uncertainty.
The network highlights how high-level diplomatic actions (e.g., UN resolutions) correlate with lower-probability ground-level outcomes, emphasizing systemic complexity and fragmented influence.
</details>
| [1] 0.14 accuse (X,Y) $:t_{1}$ $\land$ expressIntentTo (Y,Z) $:t_{2}$ $\Rightarrow$ makeVisit (X,Z) $:t$ [2] 0.09 makeVisit (X,$Y_{1}$) $:t_{1}$ $\land$ engageInCooperation ($Y_{1}$,$Y_{2}$) $:t_{2}$ $\land$ defyNormsLaw ($Y_{2}$,Z) $:t_{3}$ $\Rightarrow$ makeVisit (X,Z) $:t$ [3] 0.11 makeStatement (X,$Y_{1}$) $:t_{1}$ $\land$ reject$^{-1}$ ($Y_{1}$,$Y_{2}$) $:t_{2}$ $\land$ makeOptimisticComment ($Y_{2}$,Z) $:t_{3}$ $\Rightarrow$ makeVisit (X,Z) $:t$ [4] 0.25 makeVisit (X,Y) $:t_{1}$ $\land$ makeOptimisticComment (Y,Z) $:t_{2}$ $\Rightarrow$ makeVisit (X,Z) $:t$ [5] 0.17 hostVisit$^{-1}$ (X,$Y_{1}$) $:t_{1}$ $\land$ meetAtThirdLocation ($Y_{1}$,$Y_{2}$) $:t_{2}$ $\land$ makeOptimisticComment$^{-1}$ ($Y_{2}$,Z) $:t_{3}$ $\Rightarrow$ makeVisit (X,Z) $:t$ |
| Extrapolation reasoning: query (Nasser Bourita, makeVisit, ?, 2018-09-28) in ICEWS18 |
TABLE VIII: Some reasoning cases in inductive scenarios, where learned FOL rules are displayed. Relations marked in red represent the target relation to be predicted. "$^{-1}$" denotes the reverse of a specific relation, and textual descriptions of some relations are simplified.
| Learned FOL rules and queries |
| --- |
| [1] 0.41 memberMeronym (X,$Y_{1}$) $\land$ hasPart ($Y_{1}$,$Y_{2}$) $\land$ hasPart$^{-1}$ ($Y_{2}$,Z) $\Rightarrow$ memberMeronym (X,Z) [2] 0.19 hasPart$^{-1}$ (X,$Y_{1}$) $\land$ hypernym ($Y_{1}$,$Y_{2}$) $\land$ memberOfDomainUsage$^{-1}$ ($Y_{2}$,Z) $\Rightarrow$ memberMeronym (X,Z) [3] 0.25 hypernym (X,$Y_{1}$) $\land$ hypernym$^{-1}$ ($Y_{1}$,$Y_{2}$) $\land$ memberMeronym ($Y_{2}$,Z) $\Rightarrow$ memberMeronym (X,Z) [4] 0.17 hypernym (X,$Y_{1}$) $\land$ hypernym$^{-1}$ ($Y_{1}$,$Y_{2}$) $\land$ hasPart ($Y_{2}$,Z) $\Rightarrow$ memberMeronym (X,Z) |
| Inductive reasoning: query (08174398, memberMeronym, ?) in WN18RR v3 |
| [1] 0.32 filmReleaseRegion (X,$Y_{1}$) $\land$ filmReleaseRegion$^{-1}$ ($Y_{1}$,$Y_{2}$) $\land$ filmCountry ($Y_{2}$,Z) $\Rightarrow$ filmReleaseRegion (X,Z) [2] 0.10 distributorRelation$^{-1}$ (X,$Y_{1}$) $\land$ nominatedFor ($Y_{1}$,$Y_{2}$) $\land$ filmReleaseRegion$^{-1}$ ($Y_{2}$,Z) $\Rightarrow$ filmReleaseRegion (X,Z) [3] 0.19 filmReleaseRegion (X,$Y_{1}$) $\land$ exportedTo$^{-1}$ ($Y_{1}$,$Y_{2}$) $\land$ locationCountry ($Y_{2}$,Z) $\Rightarrow$ filmReleaseRegion (X,Z) [4] 0.05 filmCountry (X,$Y_{1}$) $\land$ filmReleaseRegion$^{-1}$ ($Y_{1}$,$Y_{2}$) $\land$ filmMusic ($Y_{2}$,Z) $\Rightarrow$ filmReleaseRegion (X,Z) |
| Inductive reasoning: query (/m/0j6b5, filmReleaseRegion, ?) in FB15k-237 v3 |
| [1] 0.46 collaboratesWith$^{-1}$ (X,Z) $\Rightarrow$ collaboratesWith (X,Z) [2] 0.38 collaboratesWith$^{-1}$ (X,$Y_{1}$) $\land$ holdsOffice ($Y_{1}$,$Y_{2}$) $\land$ holdsOffice$^{-1}$ ($Y_{2}$,Z) $\Rightarrow$ collaboratesWith (X,Z) [3] 0.03 collaboratesWith$^{-1}$ (X,$Y_{1}$) $\land$ graduatedFrom ($Y_{1}$,$Y_{2}$) $\land$ graduatedFrom$^{-1}$ ($Y_{2}$,Z) $\Rightarrow$ collaboratesWith (X,Z) [4] 0.03 collaboratesWith$^{-1}$ (X,$Y_{1}$) $\land$ collaboratesWith ($Y_{1}$,$Y_{2}$) $\land$ graduatedFrom ($Y_{2}$,Z) $\Rightarrow$ collaboratesWith (X,Z) |
| Inductive reasoning: query (Hillary Clinton, collaboratesWith, ?) in NELL v3 |
4.5 Case Studies (RQ4)
To show the actual reasoning process of Tunsr, practical cases from all four reasoning scenarios are presented in detail, illustrating the transparency and interpretability of the proposed Tunsr. For better presentation, the maximum number of reasoning iterations is set to 3. Specifically, Table VII shows the reasoning graphs for three specific queries in the transductive, interpolation, and extrapolation scenarios, respectively. The propositional attention weights of nodes are listed next to them and represent the propositional reasoning score of each node at the current step. For example, in the first case, the uppermost propositional reasoning path (00238867, verbGroup$^{-1}$, 00239321) learns a large attention score at the first step for the correct answer 00239321. Generally, nodes with more preceding neighbors or larger preceding attention weights have a greater impact on subsequent steps and on the final entity scores. Besides, we observe that propositional and first-order reasoning do not always agree. For example, the FOL rules "[3]" and "[4]" in the third case have relatively high rule confidence values compared with "[1]" and "[2]" (0.11 and 0.25 vs. 0.14 and 0.09), but the combination of their corresponding propositional reasoning paths, (Nasser Bourita, makeStatement, Morocco:2018-05-01, reject$^{-1}$, Iran:2018-05-03, makeOptimisticComment, Donald Trump:2018-06-08) and (Nasser Bourita, makeVisit, Iran:2018-05-03, self, Iran:2018-05-03, makeOptimisticComment, Donald Trump:2018-06-08), has a small propositional attention, i.e., 0.08. This small attention prevents the model from predicting the wrong answer Donald Trump. Thus, propositional and FOL reasoning can be integrated to jointly guide the reasoning process, leading to more accurate reasoning results.
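The interplay between per-step attentions and final entity scores described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Tunsr's exact aggregation: each path's score is taken as the product of its per-step attention weights, and a candidate's propositional score sums over the paths reaching it.

```python
def path_score(step_attentions):
    """Score of one propositional reasoning path: the product of the
    attention weights learned at each step along the path (assumption)."""
    score = 1.0
    for a in step_attentions:
        score *= a
    return score

def entity_score(paths_to_candidate):
    """Propositional score of a candidate entity: the sum of the scores
    of all reasoning paths ending at this candidate (assumption)."""
    return sum(path_score(p) for p in paths_to_candidate)

# Two hypothetical 2-step paths reaching the same candidate: their
# combined attention mass stays small, so the candidate is unlikely
# to be predicted, mirroring the 0.08 case discussed above.
print(entity_score([[0.4, 0.2], [0.1, 0.3]]))  # 0.4*0.2 + 0.1*0.3
```

Under this view, a candidate supported by many high-attention paths accumulates a large score, while one reached only through low-attention steps is suppressed.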
Table VIII shows some learned FOL rules for inductive reasoning on the WN18RR v3, FB15k-237 v3, and NELL v3 datasets. As the inductive setting is entity-independent, the propositional reasoning part is not involved here. Each presented rule carries practical significance and is readily understandable to humans. For instance, rule "[1]", collaboratesWith$^{-1}$(X, Z) $\Rightarrow$ collaboratesWith(X, Z), in the third case has a relatively high confidence value (0.46). This aligns with human commonsense cognition, as the relation collaboratesWith is mutual between subject and object, so each direction can be derived from the other: if person $a$ has collaborated with person $b$, it inherently implies that person $b$ has collaborated with person $a$. These results illustrate the effectiveness of the rules learned by Tunsr and its interpretable reasoning process.
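As a minimal sketch of how such chain rules score candidate answers, the snippet below grounds rule bodies over a toy fact set and aggregates rule confidences with a noisy-or; the helper names and the noisy-or aggregation are illustrative assumptions, not Tunsr's exact procedure.

```python
from collections import defaultdict

def apply_chain(facts, body, x):
    """Follow a chain-rule body [r1, ..., rk] from entity x.
    `facts` maps a relation name to a set of (head, tail) pairs;
    a relation suffixed with '^-1' is traversed in reverse."""
    frontier = {x}
    for rel in body:
        inv = rel.endswith('^-1')
        base = rel[:-3] if inv else rel
        nxt = set()
        for h, t in facts.get(base, ()):
            if inv and t in frontier:
                nxt.add(h)
            elif not inv and h in frontier:
                nxt.add(t)
        frontier = nxt
    return frontier

def score_answers(facts, rules, x):
    """Noisy-or aggregation of rule confidences per candidate answer."""
    scores = defaultdict(float)
    for conf, body in rules:
        for z in apply_chain(facts, body, x):
            scores[z] = 1.0 - (1.0 - scores[z]) * (1.0 - conf)
    return dict(scores)

# Toy example mirroring rule [1] of the NELL v3 case:
# collaboratesWith^-1(X, Z) => collaboratesWith(X, Z), confidence 0.46.
facts = {"collaboratesWith": {("b", "a")}}
rules = [(0.46, ["collaboratesWith^-1"])]
print(score_answers(facts, rules, "a"))  # "b" receives the rule's confidence
```

A candidate supported by several fired rules accumulates confidence under the noisy-or, which matches the intuition that independent pieces of symbolic evidence should reinforce each other.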
5 Conclusion and Future Works
To combine the advantages of connectionism and symbolism in AI for KG reasoning, we propose a unified neurosymbolic framework, Tunsr, that unifies both methodology and reasoning scenarios, including transductive, inductive, interpolation, and extrapolation reasoning. Tunsr first introduces a consistent reasoning-graph structure that starts from the query entity and constantly expands subsequent nodes by iteratively searching posterior neighbors. Based on it, a forward logical message-passing mechanism is proposed to update both the propositional representations and attentions, as well as the FOL representations and attentions, of each node in the expanding reasoning graph. In this way, Tunsr merges multiple rules by merging possible relations at each step using FOL attentions. By gradually adding rule bodies and updating rule confidences, real FOL rules can be induced by constantly performing attention calculations over the reasoning graph, which is summarized as the FARI algorithm. Experiments on 19 datasets across four reasoning scenarios illustrate the effectiveness of Tunsr. Meanwhile, the ablation studies show that propositional and FOL reasoning have different impacts, so they can be integrated to improve the overall reasoning results. The case studies also verify the transparency and interpretability of its computation process.
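To make the rule-induction idea concrete, the following toy sketch enumerates rule bodies from per-step relation attentions and scores each body by the product of its per-step attentions; the enumeration-with-threshold strategy and all names here are simplifying assumptions for illustration, not the FARI algorithm itself.

```python
from itertools import product

def induce_rules(step_attn, top_k=3, threshold=0.05):
    """step_attn: one dict per reasoning step, mapping each candidate
    relation to its FOL attention weight at that step. A rule body is a
    relation chain with one relation per step; its confidence is the
    product of the chosen relations' attentions. Low-confidence bodies
    are pruned, and the top-k remaining bodies are returned."""
    rules = []
    for body in product(*(step.keys() for step in step_attn)):
        conf = 1.0
        for step, rel in zip(step_attn, body):
            conf *= step[rel]
        if conf >= threshold:
            rules.append((conf, list(body)))
    rules.sort(key=lambda r: r[0], reverse=True)
    return rules[:top_k]

# Two reasoning steps with attention spread over two relations each
# (hypothetical weights inspired by the extrapolation case).
attn = [{"accuse": 0.7, "makeStatement": 0.3},
        {"expressIntentTo": 0.6, "reject^-1": 0.4}]
for conf, body in induce_rules(attn):
    print(round(conf, 2), body)
```

Merging relations at each step via attention, rather than enumerating complete paths, is what keeps the search over rule bodies tractable as the reasoning graph expands.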
Future work lies in two directions. First, we aim to extend this idea to various reasoning domains, particularly those that require interpretability for decision-making [87], such as intelligent healthcare and finance. We anticipate this will enhance reasoning accuracy while simultaneously offering human-understandable logical rules as evidence. Second, we intend to integrate the concept of unified reasoning with state-of-the-art technologies to achieve optimal results. For instance, large language models have achieved great success in natural language processing and AI, yet they often encounter challenges when confronted with complex reasoning tasks [88]. Hence, there is considerable prospect for unified neurosymbolic reasoning to enhance the reasoning capabilities of large language models.
References
- [1] I. Tiddi and S. Schlobach. Knowledge graphs as tools for explainable machine learning: A survey. Artificial Intelligence, 302:103627, 2022.
- [2] M. Li and M. Moens. Dynamic key-value memory enhanced multi-step graph reasoning for knowledge-based visual question answering. In Thirty-Sixth AAAI Conference on Artificial Intelligence, pp. 10983–10992. AAAI Press, 2022.
- [3] C. Mavromatis et al. Tempoqr: Temporal question reasoning over knowledge graphs. In Thirty-Sixth AAAI Conference on Artificial Intelligence, pp. 5825–5833. AAAI Press, 2022.
- [4] Y. Yang et al. Knowledge graph contrastive learning for recommendation. In The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), pp. 1434–1443. ACM, 2022.
- [5] Y. Zhu et al. Recommending learning objects through attentive heterogeneous graph convolution and operation-aware neural network. IEEE Transactions on Knowledge and Data Engineering (TKDE), 35:4178–4189, 2023.
- [6] A. Bastos et al. RECON: relation extraction using knowledge graph context in a graph neural network. In The Web Conference (WWW), pp. 1673–1685. ACM / IW3C2, 2021.
- [7] X. Chen et al. Knowprompt: Knowledge-aware prompt-tuning with synergistic optimization for relation extraction. In The Web Conference (WWW), pp. 2778–2788. ACM, 2022.
- [8] B. D. Trisedya et al. GCP: graph encoder with content-planning for sentence generation from knowledge bases. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 44(11):7521–7533, 2022.
- [9] W. Yu et al. A survey of knowledge-enhanced text generation. ACM Comput. Surv., 54(11s):227:1–227:38, 2022.
- [10] K. D. Bollacker et al. Freebase: A collaboratively created graph database for structuring human knowledge. In Proceedings of the International Conference on Management of Data (SIGMOD), pp. 1247–1250, 2008.
- [11] D. Vrandecic. Wikidata: A new platform for collaborative data collection. In Proceedings of the 21st World Wide Web Conference (WWW), pp. 1063–1064, 2012.
- [12] Q. Wang et al. Knowledge graph embedding: A survey of approaches and applications. IEEE Transactions on Knowledge and Data Engineering (TKDE), 29(12):2724–2743, 2017.
- [13] A. Rossi et al. Knowledge graph embedding for link prediction: A comparative analysis. ACM Transactions on Knowledge Discovery from Data (TKDD), 15(2):1–49, 2021.
- [14] S. Pinker and J. Mehler. Connections and symbols. MIT Press, 1988.
- [15] T. H. Trinh et al. Solving olympiad geometry without human demonstrations. Nature, 625(7995):476–482, 2024.
- [16] Q. Lin et al. Contrastive graph representations for logical formulas embedding. IEEE Transactions on Knowledge and Data Engineering, 35:3563–3574, 2023.
- [17] F. Xu et al. Symbol-llm: Towards foundational symbol-centric interface for large language models. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL), pp. 13091–13116, 2024.
- [18] Q. Lin et al. Fusing topology contexts and logical rules in language models for knowledge graph completion. Information Fusion, 90:253–264, 2023.
- [19] A. Bordes et al. Translating embeddings for modeling multi-relational data. In Advances in Neural Information Processing Systems (NeurIPS), pp. 2787–2795, 2013.
- [20] L. A. Galárraga et al. AMIE: association rule mining under incomplete evidence in ontological knowledge bases. In 22nd International World Wide Web Conference (WWW), pp. 413–422, 2013.
- [21] F. Yang et al. Differentiable learning of logical rules for knowledge base reasoning. In Advances in Neural Information Processing Systems (NeurIPS), pp. 2319–2328, 2017.
- [22] Y. Shen et al. Modeling relation paths for knowledge graph completion. IEEE Transactions on Knowledge and Data Engineering, 33(11):3607–3617, 2020.
- [23] K. Cheng et al. Rlogic: Recursive logical rule learning from knowledge graphs. In The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), pp. 179–189. ACM, 2022.
- [24] J. Liu et al. Latentlogic: Learning logic rules in latent space over knowledge graphs. In Findings of the EMNLP, pp. 4578–4586, 2023.
- [25] C. Jiang et al. Path spuriousness-aware reinforcement learning for multi-hop knowledge graph reasoning. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL), pp. 3173–3184, 2023.
- [26] Q. Lin et al. Incorporating context graph with logical reasoning for inductive relation prediction. In The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), pp. 893–903, 2022.
- [27] J. Li et al. Teast: Temporal knowledge graph embedding via archimedean spiral timeline. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), pp. 15460–15474, 2023.
- [28] Y. Liu et al. Tlogic: Temporal logical rules for explainable link forecasting on temporal knowledge graphs. In Thirty-Sixth AAAI Conference on Artificial Intelligence, pp. 4120–4127. AAAI Press, 2022.
- [29] N. Li et al. Tr-rules: Rule-based model for link forecasting on temporal knowledge graph considering temporal redundancy. In Findings of the Association for Computational Linguistics (EMNLP), pp. 7885–7894, 2023.
- [30] Q. Lin et al. TECHS: temporal logical graph networks for explainable extrapolation reasoning. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), pp. 1281–1293, 2023.
- [31] E. Cambria et al. SenticNet 7: A commonsense-based neurosymbolic AI framework for explainable sentiment analysis. In LREC, pp. 3829–3839, 2022.
- [32] A. Sadeghian et al. DRUM: end-to-end differentiable rule mining on knowledge graphs. In Advances in Neural Information Processing Systems (NeurIPS), pp. 15321–15331, 2019.
- [33] M. Qu et al. Rnnlogic: Learning logic rules for reasoning on knowledge graphs. In 9th International Conference on Learning Representations (ICLR), 2021.
- [34] Y. Zhang et al. GMH: A general multi-hop reasoning model for KG completion. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 3437–3446, 2021.
- [35] J. Zhang et al. Subgraph retrieval enhanced model for multi-hop knowledge base question answering. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 5773–5784, 2022.
- [36] Y. Lan et al. Complex knowledge base question answering: A survey. IEEE Trans. Knowl. Data Eng., 35(11):11196–11215, 2023.
- [37] H. Dong et al. Temporal inductive path neural network for temporal knowledge graph reasoning. Artificial Intelligence, 104085, 2024.
- [38] J. Chung et al. Empirical evaluation of gated recurrent neural networks on sequence modeling. CoRR, abs/1412.3555, 2014.
- [39] S. Abiteboul et al. Foundations of databases, volume 8. Addison-Wesley Reading, 1995.
- [40] M. Gebser et al. Potassco: The potsdam answer set solving collection. AI Communications, 24(2):107–124, 2011.
- [41] M. Alviano et al. Wasp: A native asp solver based on constraint learning. In Logic Programming and Nonmonotonic Reasoning: 12th International Conference, LPNMR 2013, Corunna, Spain, September 15-19, 2013. Proceedings 12, pp. 54–66. Springer, 2013.
- [42] W. Rautenberg. A Concise Introduction to Mathematical Logic. Springer, 2006.
- [43] G. Ciravegna et al. Logic explained networks. Artificial Intelligence, 314:103822, 2023.
- [44] H. Ren and J. Leskovec. Beta embeddings for multi-hop logical reasoning in knowledge graphs. In Advances in Neural Information Processing Systems (NeurIPS), 2020.
- [45] P. B. Andrews. An Introduction to Mathematical Logic and Type Theory: To Truth Through Proof, volume 27. Springer Science & Business Media, 2013.
- [46] J. Sun et al. A survey of reasoning with foundation models. arXiv preprint arXiv:2312.11562, 2023.
- [47] W. Zhang et al. Knowledge graph reasoning with logics and embeddings: Survey and perspective. CoRR, abs/2202.07412, 2022.
- [48] D. Poole. Probabilistic horn abduction and bayesian networks. Artificial Intelligence, 64(1):81–129, 1993.
- [49] D. Xu et al. Inductive representation learning on temporal graphs. In 8th International Conference on Learning Representations (ICLR), 2020.
- [50] L. Galárraga et al. Fast rule mining in ontological knowledge bases with AMIE+. The VLDB Journal, 24(6):707–730, 2015.
- [51] W. Zhang et al. Iteratively learning embeddings and rules for knowledge graph reasoning. In The World Wide Web Conference (WWW), pp. 2366–2377, 2019.
- [52] T. Lacroix et al. Canonical tensor decomposition for knowledge base completion. In Proceedings of the 35th International Conference on Machine Learning (ICML), volume 80, pp. 2869–2878. PMLR, 2018.
- [53] B. Yang et al. Embedding entities and relations for learning and inference in knowledge bases. In International Conference on Learning Representations (ICLR), 2015.
- [54] B. Xiong et al. Ultrahyperbolic knowledge graph embeddings. In The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), pp. 2130–2139. ACM, 2022.
- [55] J. Wang et al. Duality-induced regularizer for semantic matching knowledge graph embeddings. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 45(2):1652–1667, 2023.
- [56] Y. Zhang et al. Bilinear scoring function search for knowledge graph learning. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 45(2):1458–1473, 2023.
- [57] R. Li et al. How does knowledge graph embedding extrapolate to unseen data: A semantic evidence view. In Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI), pp. 5781–5791. AAAI Press, 2022.
- [58] Y. Zhang and Q. Yao. Knowledge graph reasoning with relational digraph. In The ACM Web Conference, pp. 912–924. ACM, 2022.
- [59] X. Ge et al. Compounding geometric operations for knowledge graph completion. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), pp. 6947–6965, 2023.
- [60] W. Wei et al. Enhancing heterogeneous knowledge graph completion with a novel gat-based approach. ACM Transactions on Knowledge Discovery from Data, 2024.
- [61] F. Shi et al. Tgformer: A graph transformer framework for knowledge graph embedding. IEEE Transactions on Knowledge and Data Engineering, 2025.
- [62] L. A. Galárraga et al. AMIE: association rule mining under incomplete evidence in ontological knowledge bases. In Proceedings of the 22nd international conference on World Wide Web, pp. 413–422, 2013.
- [63] C. Meilicke et al. Anytime bottom-up rule learning for knowledge graph completion. In IJCAI, pp. 3137–3143, 2019.
- [64] S. Ott et al. SAFRAN: an interpretable, rule-based link prediction method outperforming embedding models. In 3rd Conference on Automated Knowledge Base Construction (AKBC), 2021.
- [65] A. Nandi et al. Simple augmentations of logical rules for neuro-symbolic knowledge graph completion. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), pp. 256–269, 2023.
- [66] J. Guo et al. A unified joint approach with topological context learning and rule augmentation for knowledge graph completion. In Findings of the Association for Computational Linguistics, pp. 13686–13696, 2024.
- [67] K. Teru et al. Inductive relation prediction by subgraph reasoning. In International Conference on Machine Learning, pp. 9448–9457, 2020.
- [68] K. Sun et al. Incorporating multi-level sampling with adaptive aggregation for inductive knowledge graph completion. ACM Transactions on Knowledge Discovery from Data, 2024.
- [69] C. Meilicke et al. Fine-grained evaluation of rule- and embedding-based systems for knowledge graph completion. In 17th International Semantic Web Conference, pp. 3–20, 2018.
- [70] S. Mai et al. Communicative message passing for inductive relation reasoning. In Thirty-Fifth AAAI Conference on Artificial Intelligence, pp. 4294–4302, 2021.
- [71] J. Chen et al. Topology-aware correlations between relations for inductive link prediction in knowledge graphs. In Thirty-Fifth AAAI Conference on Artificial Intelligence, pp. 6271–6278, 2021.
- [72] Y. Pan et al. A symbolic rule integration framework with logic transformer for inductive relation prediction. In Proceedings of the ACM Web Conference, pp. 2181–2192, 2024.
- [73] J. Leblay and M. W. Chekol. Deriving validity time in knowledge graph. In Companion of The Web Conference (WWW), pp. 1771–1776. ACM, 2018.
- [74] R. Goel et al. Diachronic embedding for temporal knowledge graph completion. In The Thirty-Fourth AAAI Conference on Artificial Intelligence, pp. 3988–3995, 2020.
- [75] A. García-Durán et al. Learning sequence encoders for temporal knowledge graph completion. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 4816–4821, 2018.
- [76] A. Sadeghian et al. Chronor: Rotation based temporal knowledge graph embedding. In Thirty-Fifth AAAI Conference on Artificial Intelligence, pp. 6471–6479, 2021.
- [77] T. Lacroix et al. Tensor decompositions for temporal knowledge base completion. In 8th International Conference on Learning Representations (ICLR), 2020.
- [78] C. Xu et al. Temporal knowledge graph completion using a linear temporal regularizer and multivector embeddings. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), pp. 2569–2578, 2021.
- [79] J. Messner et al. Temporal knowledge graph completion using box embeddings. In Thirty-Sixth AAAI Conference on Artificial Intelligence, pp. 7779–7787, 2022.
- [80] K. Chen et al. Rotateqvs: Representing temporal information as rotations in quaternion vector space for temporal knowledge graph completion. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 5843–5857, 2022.
- [81] T. Trouillon et al. Complex embeddings for simple link prediction. In International Conference on Machine Learning (ICML), volume 48, pp. 2071–2080, 2016.
- [82] W. Jin et al. Recurrent event network: Autoregressive structure inference over temporal knowledge graphs. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 6669–6683, 2020.
- [83] C. Zhu et al. Learning from history: Modeling temporal knowledge graphs with sequential copy-generation networks. In Thirty-Fifth AAAI Conference on Artificial Intelligence, pp. 4732–4740, 2021.
- [84] Z. Han et al. Explainable subgraph reasoning for forecasting on temporal knowledge graphs. In 9th International Conference on Learning Representations (ICLR), 2021.
- [85] H. Sun et al. Timetraveler: Reinforcement learning for temporal knowledge graph forecasting. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 8306–8319, 2021.
- [86] N. Li et al. Infer: A neural-symbolic model for extrapolation reasoning on temporal knowledge graph. In The Thirteenth International Conference on Learning Representations (ICLR), 2025.
- [87] E. Cambria et al. Seven pillars for the future of artificial intelligence. IEEE Intelligent Systems, 38(6):62–69, 2023.
- [88] F. Xu et al. Are large language models really good logical reasoners? a comprehensive evaluation and beyond. IEEE Transactions on Knowledge and Data Engineering, 2025.