2202.10706
Model: nemotron-free
Ragib Ahsan
David Arbour
Elena Zheleva
Editors: Bernhard Sch¨ olkopf, Caroline Uhler and Kun Zhang
## Abstract
Causal reasoning in relational domains is fundamental to studying real-world social phenomena in which individual units can influence each other's traits and behavior. Dynamics between interconnected units can be represented as an instantiation of a relational causal model; however, causal reasoning over such instantiation requires additional templating assumptions that capture feedback loops of influence. Previous research has developed lifted representations to address the relational nature of such dynamics but has strictly required that the representation has no cycles. To facilitate cycles in relational representation and learning, we introduce relational Ο -separation, a new criterion for understanding relational systems with feedback loops. We also introduce a new lifted representation, Ο -abstract ground graph which helps with abstracting statistical independence relations in all possible instantiations of the cyclic relational model. We show the necessary and sufficient conditions for the completeness of Ο -AGG and that relational Ο -separation is sound and complete in the presence of one or more cycles with arbitrary length. To the best of our knowledge, this is the first work on representation of and reasoning with cyclic relational causal models.
Keywords:
Representation, Cycles, Relational Causal Model
## 1. Introduction
Causal inference methods from observational data often assume that causal relationships can be represented in an acyclic graphical model. However, many real-world phenomena involve feedback loops or cycles that violate the acyclicity assumption. For example, supply and demand affect price and vice versa, hormone levels in the body affect each other and friends can impact each other's choices. Common to these scenarios is that these interactions occur over time but any individual directed relationship may not be observed. Instead what is observed is a set of values that are result of long term individual interactions. This current state can be represented through a cycle.
The existing works on cyclic causal models primarily focus independently and identically distributed (i.i.d) data instances (Richardson, 1996, 1997; Strobl, 2019; Rantanen et al., 2020). However, in many real-world systems units are often interconnected in a complex network. Causal reasoning over such relational systems is central to understanding real-world social phenomena, such as social influence and information diffusion. For example, a well-known study by Christakis and Fowler (2007) investigates whether obesity is contagious in a population where each person's eating habits can affect their friends and family and vice-versa. Similarly, several real-world problems in epidemiology and computational social science encounter such mutual influence and interaction
## Relational Causal Models with Cycles: Representation and Reasoning
RAHSAN3@UIC.EDU
ARBOUR@ADOBE.COM
EZHELEVA@UIC.EDU
Figure 1: Example of relational model with and without feedback loop. Rectangle, rhombus and oval shapes represent entity, relationship and attributes respectively. Arrows refer to relational dependence. The solid arrows constitute an acyclic relational model. The dashed arrow creates a feedback loop in the model.
<details>
<summary>Image 1 Details</summary>

### Visual Description
## Flowchart: User-Post-Media Interaction Model
### Overview
The diagram illustrates a cyclical interaction model between users, posts, and media, emphasizing sentiment, engagement, and preference dynamics. Arrows and labeled connections define the flow of influence and feedback.
### Components/Axes
- **Nodes**:
- **USER**: Contains an oval labeled "Sentiment."
- **POST**: Contains an oval labeled "Engagement."
- **MEDIA**: Contains an oval labeled "Preference."
- **Connectors**:
- **Diamonds**:
- Between USER and POST: Labeled "REACTS."
- Between POST and MEDIA: Labeled "CREATES."
- **Dashed Arrow**: From MEDIA back to USER, indicating feedback.
- **Arrows**:
- Solid arrows show directional flow (USER β POST β MEDIA β USER).
### Detailed Analysis
- **USER β POST**:
- "Sentiment" (USER) influences "Engagement" (POST) via the action "REACTS."
- **POST β MEDIA**:
- "Engagement" (POST) drives "Preference" (MEDIA) through "CREATES."
- **MEDIA β USER**:
- "Preference" (MEDIA) feeds back into "Sentiment" (USER) via a dashed arrow, suggesting iterative refinement.
### Key Observations
1. **Unidirectional Flow**: The primary flow is linear (USER β POST β MEDIA), with feedback only from MEDIA to USER.
2. **Feedback Mechanism**: The dashed arrow implies a weaker or indirect relationship compared to solid connections.
3. **Central Role of Engagement**: "Engagement" (POST) acts as a bridge between user sentiment and media preference.
### Interpretation
This model highlights how user sentiment shapes post engagement, which in turn influences media preferences. The feedback loop suggests that media preferences refine user sentiment over time, creating a self-reinforcing cycle. The use of "REACTS" and "CREATES" emphasizes active user participation, while the dashed feedback arrow may represent latent or indirect influences (e.g., algorithmic adjustments).
No numerical data or trends are present; the diagram focuses on conceptual relationships rather than quantitative metrics.
</details>
among human subjects. For example, effect of vaccination in mother and child (Shpitser, 2015; VanderWeele et al., 2012), and justices influencing each other in supreme court decisions (Ogburn et al., 2020) are all examples of mutual influence. Studying such phenomena requires causal reasoning over complex interactions between interconnected entities. However, causal questions of mutual influence in a relational system can be hard to answer due to the lack of appropriate causal representations and reasoning mechanisms.
One way to represent interference is through SCMs which capture pairwise relationships (Ogburn et al., 2014; Shalizi and Thomas, 2011), where one individual's influence on another is represented independently. Such representation is easy to reason about but it assumes that pairs of nodes are independent of other pairs of nodes. Unfortunately, this does not reflect the properties of real-world social networks in which nodes are arbitrarily interconnected with each other.
The development of relational causal models , which generalize over structural causal models, is an important step towards capturing interactions between non-i.i.d instances (Maier et al., 2013b,a; Lee and Honavar, 2015; Bhattacharya et al., 2020). Relational models involve multiple types of interacting entities with probabilistic dependencies among their attributes. Maier et al. (2013b) developed a lifted causal representation named abstract ground graph (AGG) that abstracts over all instantiations of a relational model. AGG enables reasoning about causal queries in relational causal models and relational causal discovery. However, existing relational causal models assume acyclicity and do not allow for reasoning about identification in the presence of feedback loops.
Figure 1 shows an example of a relational model where users react to news articles or posts on a specific topic (e.g., vaccines) generated by media agencies. The preference of the media regarding a topic (e.g., pro- vs anti-vaccination) is influenced by the engagement or feedback (e.g., positive/negative comments) it receives on its existing posts. The sentiment of a user towards a given post directly impacts their engagement in the post. The relational model representation tempts one to conclude that the sentiment of users regarding vaccination is independent of the preference of media agencies given engagements of the posts the users react to. However, as we show in section 3.5, this is not necessarily true. Moreover, users' sentiment can also be impacted by the engagements in posts they interact with. The dashed arrow in Figure 1 represents such a dependency which makes the model a cyclic one. Unfortunately, even though the model seems simple and realistic, the abstract ground graph (AGG) representation and relational d -separation no longer apply due to the presence of a feedback loop (i.e., a cycle).
In this work, we specifically study cyclic RCMs and show that they offer the necessary representation to reason about a lot of real-world causal problems where popular assumptions do not hold. We do not assume the Stable Unit Treatment Value Assumption (SUTVA), of causal inference, according to which the outcome of each unit depends only on the unit itself (Rubin, 1980).
This Assumption is violated in relational systems where the treatment and outcome of one unit can impact other units' outcomes. To the best of our knowledge, this is the first work that addresses representation of and reasoning about cyclic relational causal models. We define a new abstract representation, Ο -abstract ground graph ( Ο -AGG) which generalizes over cyclic relational models. In order to reason about relational queries in Ο -AGG, we introduce relational Ο -separation and provide proofs for its soundness and completeness for all instantiations of a relational model. The implications of this new representation are twofold. First, it can lead the way for reasoning about causal effects like interference and contagion in relational systems, which are of wide interest in the social sciences. Second, the abstract representation and its identified properties will lay a foundation for causal structure learning from relational models with cycles. We show the sufficient conditions for the completeness of Ο -AGG by first resolving an open problem on the completeness of AGG. Finally, we discuss the Markov condition of relational Ο -separation and its implications.
## 2. Related Work
Maier et al. (2013b) proposed a sound and complete abstract representation for relational data with multiple entities and relationships. They defined an abstract representation of relational dependence called abstract ground graph based on the assumption that the underlying relational model is acyclic. Moreover, they introduced relational d -separation, a new criterion to reason about statistical independence in all the instantiations of a relational model. They proved the soundness and completeness of relational d -separation. The AGG and associated relational d -separation criterion together allows a formal basis of causal analysis in relational systems. The same authors proposed a constraint-based causal structural learning algorithm called RCD (Maier et al., 2013a) which is first of its kind.
Lee and Honavar (2015) presented a critical view on the generalizability of AGG where they constructed a counterexample which shows how AGG can fail to abstract all d -separation relationships in the ground graphs. They also identified a shortcoming of the necessary conditions for considering a intersection variable and provided a revised set of conditions. In a different work, Lee and Honavar (2020) proposed a new abstract representation of relational causal models called the unrolled graph which guarantees a weaker sense of completeness. However, neither of the two groups of work moved away from the assumption of acyclicity. As a result their developments are not directly applicable to relational models with cycles or feedback loops. Alternatively, chain graphs have been used for representing systems with pairwise feedback loops (Lauritzen and Richardson, 2002; Dawid, 2010; Ogburn et al., 2020). There is no prior work that focuses on representation and reasoning of relational models with cycles or feedback loops of arbitrary length.
Existing research on representation and reasoning about systems with cycles is based on propositional variables, not relational ones. Spirtes (1995) showed that, in general case, without any specific assumption regarding the nature of dependence (linear, polynomial etc.), the d -separation relations in a directed cyclic graph are not sufficient to entail all the corresponding conditional independence relations. ForrΒ΄ e and Mooij (2017) introduced Ο -separation, an alternative formulation for which the corresponding Markov property is shown to hold in a very general setting. According to them, the Ο -separation Markov property seems appropriate for a wide class of cyclic structural causal models with non-linear functional relationships between non-discrete variables.
## 3. Preliminaries
In this section we introduce the necessary terminology to discuss representation and abstraction of cyclic relational models.
## 3.1. Directed Cyclic Graphs
ADirected Cyclic Graph (DCG) is a graph G = γV , Eγ with nodes V and edges E β { ( u, v ) : u, v β V, u = v } where ( u, v ) is an ordered pair of nodes. We will denote a directed edge ( u, v ) β E as u β v or v β u , and call u a parent of v . In this work, we restrict ourselves to DCG as the causal graphical model. A walk between two nodes u, v βV is a tuple γ v 0 , e 1 , v 1 , e 2 , v 2 , ..., e n , v n γ of alternating nodes and edges in G ( n β₯ 0) , such that v 0 , ..., v n β V , and e 1 , ..., e n β E , starting with node v 0 = u and ending with node v n = v where the edge e k connects the two nodes v k -1 and v k β G for all k = 1 , ..., n . If the walk contains each node at most once, it is called a path . A directed walk (path) from v i β V to v j β V is a walk (path) between v i and v j such that every edge e k on the walk (path) is of the form v k -1 β v k , i.e., every edge is directed and points away from v i .
We get the ancestors of node v j by repeatedly following the path(s) through the parents: AN G ( v j ) := { v i β V : v i = v 0 β v 1 β ... β v n = v j β G} . Similarly, we define the descendants of v i : DE G ( v i ) := v j β V : v i = v 0 β v 1 β ... β v n = v j β G . Each node is an ancestor and descendant of itself. A directed cycle is a directed path from v i to v j such that in addition, v j β v i β E . All nodes on directed cycles passing through v i β V together form the strongly connected component SC G ( v i ) := AN G ( v i ) β© DE G ( v i ) of v i .
## 3.2. Ο -separation
The idea of Ο -separation follows from d -separation, a fundamental notion in DAGs which was first introduced by Pearl (1988):
Definition 1 ( d -separation) A walk γ v 0 ...v n γ in DCG G = γV , Eγ is d -separated by C β V if:
1. its first node v 0 β C or its last node v n β C , or
2. it contains a collider v k / β AN G ( C ) , or
3. it contains a non-collider v k β C .
If all paths in G between any node in set A β V and any node in set B β V are d -blocked by a set C β V , we say that A is d -separated from B by C , and we write A d | = G B | C .
d -separation exhibits the global Markov property in DAGs which states that if two variables X and Y are d -separated given another variable Z in a DAG representation then X and Y are conditionally independent given Z in the corresponding distribution of the variables. However, Spirtes (1995); Neal (2000) show that without any specific assumption regarding the nature of dependence (i.e. linear, polynomial), the d -separation relations are not sufficient to entail all the corresponding conditional independence relations in a DCG. In a recent work, an alternative formulation called Ο -separation is introduced which holds for a very general graphical settings (ForrΒ΄ e and Mooij, 2017).
Here, we consider a simplified version of the formal definition of Ο -separation:
Definition 2 ( Ο -separation)
A walk γ v 0 ...v n γ in DCG G = γV , Eγ is Ο -blocked by C β V
(ForrΒ΄ e and Mooij, 2017) if:
1. its first node v 0 β C or its last node v n β C , or
2. it contains a collider v k / β AN G ( C ) , or
3. it contains a non-collider v k β C that points to a node on the walk in another strongly connected component (i.e., v k -1 β v k β v k +1 with v k +1 / β SC G ( v k ) , v k -1 β v k β v k +1 with v k -1 / β SC G ( v k ) or v k -1 β v k β v k +1 with v k -1 / β SC G ( v k ) or v k +1 / β SC G ( v k ) ).
If all paths in G between any node in set A β V and any node in set B β V are Ο -blocked by a set C β V , we say that A is Ο -separated from B by C , and we write A Ο | = G B | C .
ForrΒ΄ e and Mooij (2017) show that the global directed Markov property 1 holds for Ο -separation in directed cyclic graphs, unlike d -separation. This enables Ο -separation to perform a similar role to d -separation but for directed cyclic graphs. Ο -separation is a generalization of d -separation.
## 3.3. Ο -faithfulness
Ο -faithfulness refers to the property which states that all statistical dependencies found in the distribution generated by a given causal structure model is entailed by the Ο -separation relationships.
Definition 3 ( Ο -faithfulness) Given X A , X B , X C as the distributions of variables A , B , C respectively in solution X of a causal model M , Ο -faithfulness states that if X A and X B are conditionally independent given X C , then A and B are Ο -separated by C in the corresponding possibly cyclic graphical model G of M .
## 3.4. Relational Causal Models (RCM)
We adopt the definition of relational causal model used by previous work on relational causal discovery (Maier et al., 2013a; Lee and Honavar, 2020). We denote random variables and their realizations with uppercase and lowercase letters respectively, and bold to denote sets. We use a simplified Entity-Relationship model to describe relational data following previous work (Heckerman et al., 2007). A relational schema S = γ E , R , A , card γ represents a relational domain where E , R and A refer to the set of entity, relationship and attribute classes respectively. It includes a cardinality function that constrains the number of times an entity instance can participate in a relationship. Figure 1 shows an example relational model that describes a simplified user-media engagement system. The model consists of three entity classes (User, Post, and Media), and two relationship classes (Reacts and Creates). Each entity class has a single attribute. The cardinality constraints are shown with crow's feet notation- a user can react to multiple posts, multiple users can react to a post, a post can be created by only a single media entity.
A relational skeleton s is an instantiation of a relational schema S , represented by an undirected graph of entities and relationships. Figure 2( a ) shows an example skeleton of the relational model from Figure 1. It shows that Alice and Bob both react to post P1. Alice also reacts to post P2. P1
1. Definition given in appendix
Figure 2: Fragments of a relational skeleton, ground graph, and abstract ground graph corresponding to the relational causal model from Figure 1. The arrows represent relational dependencies.
<details>
<summary>Image 2 Details</summary>

### Visual Description
## Diagram: User Interaction Model in Social Media Context
### Overview
The image contains three interconnected diagrams (a, b, c) illustrating a user interaction model involving sentiment analysis, engagement, and preference formation in a social media environment. The diagrams use labeled nodes, directional arrows, and set operations to represent relationships between users, posts, media, and system components.
### Components/Axes
**Diagram (a):**
- **Nodes:**
- `ALICE.Sentiment`, `BOB.Sentiment` (user sentiments)
- `P1.Engagement`, `P2.Engagement` (platform engagement metrics)
- `M1.Preference` (media preference)
- `P1`, `P2`, `MI` (platform/user interaction nodes)
- **Arrows:**
- `REACTS`, `CREATES`, `SLOWS`, `SLIDER` (interaction types)
- Set operations: `β©` (intersection), `βͺ` (union)
**Diagram (b):**
- **Nodes:**
- `Alice.Sentiment`, `Bob.Sentiment`
- `P1.Engagement`, `M1.Preference`
- **Arrows:**
- Direct connections between sentiment β engagement β preference
**Diagram (c):**
- **Nodes:**
- `[USER]`, `[SENTIMENT]`, `[REACTS]`, `[POST]`, `[CREATES]`, `[MEDIA]`, `[PREFERENCE]`
- **Arrows:**
- Process flows: `USER β REACTS β POST β CREATES β MEDIA β PREFERENCE`
- Set operations: `β©`, `βͺ`
### Detailed Analysis
**Diagram (a):**
- **Structure:**
- Users (ALICE, BOB) generate sentiments that influence platform engagement (`P1`, `P2`).
- Engagement metrics (`P1.Engagement`, `P2.Engagement`) are created through interactions (`REACTS`, `CREATES`).
- `M1.Preference` is derived from engagement metrics via set operations (`β©`, `βͺ`).
- **Key Relationships:**
- `ALICE.Sentiment` and `BOB.Sentiment` β `P1.Engagement` (via `REACTS`).
- `P1.Engagement` β `M1.Preference` (via `CREATES`).
**Diagram (b):**
- **Simplified Flow:**
- `Alice.Sentiment` and `Bob.Sentiment` directly influence `P1.Engagement`.
- `P1.Engagement` determines `M1.Preference`.
- **Notable:** Lacks explicit user labels (e.g., `ALICE`, `BOB`) but retains sentiment and engagement labels.
**Diagram (c):**
- **Process Flow:**
1. `[USER]` generates `[SENTIMENT]`.
2. `[USER]` `[REACTS]` to `[POST]`, which `[CREATES]` `[MEDIA]`.
3. `[MEDIA]` contributes to `[PREFERENCE]` via set operations (`β©`, `βͺ`).
4. `[USER]` `[REACTS]` and `[POST]` again to form `[Engagement]`.
- **Complexity:** Explicitly models iterative user actions and their cumulative impact on engagement and preference.
### Key Observations
1. **Sentiment β Engagement β Preference:** All diagrams emphasize a causal chain from user sentiment to platform engagement and ultimately to preference formation.
2. **Set Operations:** Diagram (c) uses `β©` and `βͺ` to model how multiple user actions (e.g., reactions, posts) combine to shape preferences.
3. **Platform vs. User Roles:** Diagrams (a) and (c) distinguish between platform components (`P1`, `P2`, `MI`) and user actions (`USER`, `REACTS`, `POST`).
4. **Simplification in (b):** Diagram (b) abstracts away user-specific labels, focusing on high-level sentiment-engagement-preference relationships.
### Interpretation
The diagrams collectively model how user-generated content and interactions in social media platforms drive engagement metrics and shape preferences. Key insights:
- **User Agency:** Users (`ALICE`, `BOB`) actively shape platform dynamics through sentiment and actions (`REACTS`, `POST`).
- **Platform Feedback Loops:** Engagement metrics (`P1.Engagement`, `P2.Engagement`) are both influenced by and influence user preferences (`M1.Preference`).
- **Media as Mediator:** Diagram (c) highlights media creation as a critical intermediary between user actions and preference formation.
- **Set Theory Application:** The use of `β©` and `βͺ` suggests preferences are aggregated from diverse user interactions, not isolated events.
This model could inform algorithms for content recommendation, sentiment-driven engagement optimization, or preference-based media curation in social platforms.
</details>
and P2 both are created by media M1. There could be infinitely many possible skeletons for a given RCM. We denote the set of all skeletons for schema S as β S .
Given a relational schema, we can specify relational paths, which intuitively correspond to ways of traversing the schema. For the schema shown in Figure 1, possible paths include [ User, Reacts, Post ] (the posts a user reacts to), as well as [ User, Reacts, Post, Reacts, User ] (other users who react to the same post). Relational variables consist of a relational path and an attribute. For example, the relational variable [ User, Reacts, Post ] .Engagement corresponds to the overall engagements of the post that a user reacts to. The first item (i.e. User ) in the relational path corresponds to the perspective of the relational variable. A terminal set, P | i k is the terminal item on the relational path P = [ I j , ..., I k ] consisting of instances of class I k β E βͺ R .
A relational causal model M = γS , Dγ , is a collection of relational dependencies defined over schema S . Relational dependencies consist of two relational variables, cause and effect. As an example, consider the following relational dependency [ Post, Reacts, User ] .Sentiment β [ Post ] .Engagement which states that the engagement of a post is affected by the actions of users who react on that post. In Figure 1, the arrows represents relational dependencies. Note that, all causal dependencies are defined with respect to a specific perspective.
## 3.5. Ground Graph and Abstract Ground Graph
A realization of a relational model M with a relational skeleton is referred to as the ground graph GG M . It is a directed graph consisting attributes of entities in the skeleton as nodes and relational dependencies among them as edges. A single relational model is actually a template for a set of possible ground graphs based on the given schema. A ground graph has the same semantic as a graphical model. Given a relational model M and a relational skeleton s , we can construct a ground graph GG M s by applying the relational dependencies as specified in the model to the specific instances of the relational skeleton. Figure 2( b ) shows the ground graph for the relational model from
Figure 1. The relational dependencies present in the given RCM may temp one to conclude a conditional independence statement: [ User ] .Sentiment | = [ Media ] .Preference | [ Post ] .Engagement . However, when the model is unrolled in a ground graph we see the corresponding statement is not true (i.e. [ Bob ] .Sentiment β₯ β₯ [ M 1] .Preference | [ P 1] .Engagement ) since there is an alternative path through [ Alice ] .Sentiment and [ P 2] .Engagement which is activated when conditioned on [ P 1] .Engagement . This shows why generalization over all possible ground graphs is hard.
An abstract ground graph (AGG) is an abstract representation that solves the problem of generalization by capturing the consistent dependencies in all possible ground graphs and representing them as a directed graph. AGGs are defined for a specific perspective and hop threshold , h . Hop threshold refers to the maximum length of the relational paths allowed in a specific AGG. There are two types of nodes in AGG, relational variables and intersection variables. Intersection variables are constructed from pairs of relational variables with non-empty intersections (Maier et al., 2013b). For example, [ User, Reacts, Post ] refers to the set of posts a user reacts to whereas [ User, Reacts, Post, Reacts, User, Reacts, Post ] refers to the set of other posts reacted by other users who also reacted to the same post as the given user. These two sets of posts can overlap which is reflected by the corresponding intersection variable. Edges between a pair of nodes of AGG exist if the instantiations of those constituting relational variables contain a dependent pair in all ground graphs. We define W as the set of nodes augmented with their corresponding intersection nodes for the set of relational variables W : W = W βͺ β W β W { W β© W β² | W β© W β² is an intersection node in AGG M s } . Figure 2( c ) presents the AGG from the perspective of User and with h = 6 corresponding to the model from Figure 1. The AGG shows that the sentiment of a user is no longer independent of media preference given just engagements of the corresponding posts the user reacts to. We also need to condition on the sentiment of other users who reacted to the same post.
## 3.6. Relational d-separation
Relational model describes a template for many possible instantiations of a relational schema. In order to reason about conditional independence facts entailed in all instances of a given relational template, Maier et al. (2013b) develop a relational counterpart for d -separation criteria. Two sets of relational variables X and Y from a given perspective are said to be d -separated by another set Z if and only if the terminal sets of X and Y are d -separated by the terminal set of Z from the given perspective in all possible ground graphs of the given model. Maier et al. (2013b) introduce AGG as a means to reason about relational d -separation queries from a given perspective. The soundness and completeness of relational d -separation for AGG relies on the following assumptions:
## A 1 The relational model is acyclic.
## A 2 There are no unobserved confounders in the relational model.
Here, soundness refers to the fact that any d -separation relationship found in AGG implies corresponding d -separation relationship in all ground graphs it represents whereas completeness claims that the d -separation facts that hold across all ground graphs are also entailed by d -separation on the AGG. The soundness of relational d -separation under AGG is already proved by Maier et al. (2013b). However, the conditions under which completeness holds have been an open question since Lee and Honavar (2015) show that the initial formulation of Maier et al. (2013b) is not complete. We resolve this question in Section 4.4 and show that AGG is also complete under certain realistic assumptions.
Figure 3: Ground graph and Ο -AGG for the cyclic relational model shown in Figure 1.
<details>
<summary>Image 3 Details</summary>

### Visual Description
## Diagram Analysis: User Engagement and Sentiment Flow
### Overview
The image contains two interconnected diagrams (a) and (b) illustrating relationships between user sentiment, engagement, preferences, and content creation. Diagram (a) shows a simplified flow, while diagram (b) details a complex feedback loop system.
### Components/Axes
**Diagram (a):**
- **Nodes (Ovals):**
- `Alice.Sentiment`
- `Bob.Sentiment`
- `P1.Engagement`
- `P2.Engagement`
- `M1.Preference`
- **Edges (Arrows):**
- Bidirectional arrows between `Alice.Sentiment` and `P1.Engagement`
- Bidirectional arrows between `Bob.Sentiment` and `P1.Engagement`
- Unidirectional arrows from `P1.Engagement` to `P2.Engagement`
- Unidirectional arrows from `P2.Engagement` to `M1.Preference`
**Diagram (b):**
- **Nodes (Rectangles):**
- `[USER].Sentiment`
- `[USER, REACTS, POST].Engagement`
- `[USER, REACTS, POST, CREATES, MEDIA].Preference`
- `[USER, REACTS, POST, CREATES, MEDIA, CREATES, POST].Engagement`
- `[USER, REACTS, POST, REACTS, USER].Sentiment`
</details>
## 4. Relational Systems With Cycles
In this section, we develop definitions for cyclic relational causal models (RCM) and an abstract representation that allows causal reasoning and discovery. We introduce a criterion for abstracting cyclic RCMs and provide proofs for its correctness.
## 4.1. Cyclic RCM
An RCM is cyclic when the set of relational dependencies form one or more arbitrary length cycles.
Definition 4 (Cyclic RCM) A relational model M = ( S , D ) is said to be cyclic if the set of relational dependencies D constructs one or more directed cycles of arbitrary length.
Cycles in RCM represent equilibrium states among a set of nodes in the ground graphs. It implies that the ground graphs are no longer guaranteed to be DAGs. In order to facilitate this, we propose a revised definition of relational dependency provided by Maier et al. (2013b) by relaxing the restriction of having different attribute classes for cause and effect.
Definition 5 (Relational Dependency) A relational dependency [ I j , ..., I k ] .X β² β [ I j ] .X is a directed probabilistic dependence from any attribute class X β² to X through the relational path [ I j , ..., I k ] such that I j , ..., I k β E βͺ R , X,X β² β A . Note that it is possible to have X = X β² .
Figure 1 shows an example relational model with cyclic dependencies (i.e. dashed arrows). We see a pair of dependencies [ Post, Reacts, User ] .Sentiment β [ Post ] .Engagement and [ Post ] .Engagement β [ Post, Reacts, User ] .Sentiment which are inverse to each other and form a feedback loop. However, this mere feedback loop prohibits the use of AGG to answer relational causal query that asks whether a user's Sentiment about a post they reacted to is independent of the preference of media given the posts. Unfortunately, the work by Maier et al. (2013b) is not sufficient to reason about conditional independence relationships in the ground graphs of such relational models since it contain cycles. This motivates us to introduce a new criterion that enables the abstraction of relational queries with cyclic dependencies over all ground graphs.
## 4.2. Relational Ο -separation
Conditional independence facts are only useful when they hold across all ground graphs that are consistent with the model. Maier et al. (2013b) show that relational d -separation is sufficient to achieve that for acyclic models. However, such abstraction is not possible for cyclic models since the correctness of d -separation is not guaranteed for cyclic graphical models (Spirtes, 1995; Neal,
2000). In this work, we propose the following definition of relational Ο -separation specifically for cyclic relational models:
Definition 6 (Relational Ο -separation) Let X , Y , and Z be three distinct sets of relational variables with the same perspective B β E βͺ R defined over relational schema S . Then, for relational model structure M , X and Y are Ο -separated by Z if and only if, for all skeletons s β β S , X | b and Y | b are Ο -separated by Z | b in ground graph GG M s for all instances b β s ( B ) where s ( B ) refers to the instances of B in skeleton s .
The definition directly follows from the definition of relational d -separation. If there exists even one skeleton and faithful distribution represented by the relational model for which X β₯ β₯ Y | Z , then X | b and Y | b are not Ο -separated by Z | b for b β s ( B ) .
## 4.3. Ο -Abstract Ground Graph
We refer to the lifted representation for cyclic RCMs as Ο -abstract ground graph or Ο -AGG. A Ο -AGG is constructed using the same extend method used to construct AGG (Maier et al., 2013b).
Definition 7 ( Ο -Abstract Ground Graph) An abstract ground graph Ο -AGG M = ( V, E ) for relational model structure M = ( S , D ) , perspective B β E βͺ R , and hop threshold h β N 0 is a directed graph that abstracts the dependencies D for all ground graphs GG M s , where s β β S . The Ο -AGG M s is a directed cyclic graph with the following nodes and edges:
1. V = RV βͺ IV , where
2. (a) RV is the set of relational variables with a path of length at most h + 1.
3. (b) IV are intersection variables between pairs of relational variables that could intersect
2. E = RVE βͺ IVE , where
5. (a) RVE β RV Γ RV are the relational variable edges
6. (b) IVE β ( IV Γ RV ) βͺ ( RV Γ IV ) are the intersection variable edges. This is the set of edges that intersection variables 'inherit' from the relational variables that they were created from
Since the construction of an AGG and a Ο -AGG M s is identical, they share mostly identical properties as defined by Maier et al. (2013b) for AGG. The main difference being the existence of cycles. Consequently, the goal of Ο -AGG M s is to reason about relational Ο -separation queries instead of relational d -separation. Figure 3( b ) shows the Ο -AGG M s corresponding to the cyclic RCM in Figure 1 with a pairwise feedback loop. It is similar to the AGG in Figure 2( c ) but allows cycles without violating the conditional independence statements under Ο -separation which are otherwise undefined with d -separation.
## 4.4. Soundness and Completeness of Relational Ο -separation
In order to discuss soundness and completeness of relational Ο -separation we first address the open problem of necessary conditions for the completeness of relational d -separation. Previous work has shown that the original claim of completeness of relational d -separation by Maier et al. (2013a)
cannot be guaranteed for any relational model (Lee and Honavar, 2015). A counterexample has been developed as well. In this work, we show that relational d -separation is complete under the following assumption:
## A 3 The degree of any entity in the relational skeleton is greater than 1.
Note that, this assumption is about the topology of the ground graphs, i.e., the network which defines how entities are connected to each other. It only allows entities that are connected to at least two other entities. For example, if the entities are users in a social network, the framework would only consider users who have degree at least two, i.e., are connected to at least two other users. While this restricts the space of graph topologies allowable under the results in this work, many networks observed in real world domains, such as social networks, have minimum degree greater than one. We introduce the following lemma that establishes sufficient conditions for AGGs to be realizable in ground graphs. This result may be of independent interest, since it provides sufficient conditions for soundness in the original presentation of relational d -separation under additional assumptions (Maier et al., 2013b; Lee and Honavar, 2015).
Lemma 1 Under assumption 3, every abstract ground graph can be realized as a ground graph. That is, for every acyclic relational model M and skeleton s β β S any relational variable in AGG M s has non-empty terminal sets in some ground graph GG M s .
## Proof
We first consider the conditions under which empty terminal sets can occur, resulting in an AGG that is unrealizable in the ground graphs. There are two necessary and sufficient conditions for empty terminal sets to appear in all ground graphs corresponding to an AGG. First, there must be at least one intersection variable present in the AGG. If no intersection variable exist in the AGG, then the completeness proof of relational d -separation by Maier et al. (2013b) holds. The second condition is that the intersection must be on a path consisting of only one-to-one relationships. In order to understand this condition, lets look at an example with the following relational paths from a hypothetical relational model which is a generalization of the counterexample given by Lee and Honavar (2015) 2 :
$$\begin{array} { r l } & { \bullet P = [ E _ { i } , \dots , R _ { j } , \dots , E _ { k } ] } & { \bullet Q = [ E _ { i } , \dots , R _ { j } , \dots , R _ { m } , E _ { k } , \dots , E _ { q } ] } \\ & { \bullet S = [ E _ { i } , \dots , R _ { j } , \dots , R _ { m } , \dots , E _ { s } ] } & { \bullet S ^ { \prime } = P + [ R _ { m } , \dots , E _ { s } ] } \end{array}$$
where E i , E k , E q , E s are some entity classes, R j , R m are relationship classes, ' . . . ' are arbitrary valid sequences of entities and relationships, and + represents the concatenation of relational paths. Let's assume two relational dependencies exist in the given model, P.X β Q.Y and S.Z β Q.Y where X,Y,Z are attributes of corresponding entity classes. By definition, the corresponding edges P.X β Q.Y , S.Z β Q.Y appear in the AGG. Since S and S β² are intersectable an additional edge S.Z β© S β² .Z β Q.Y also appears in the AGG. Such a model can be realized in many possible ground graphs. However, if we restrict the relationships to be strictly one-to-one, then there is only one skeleton structure possible to satisfy the relational dependencies at the cost of S β² .Z having empty
2. The complete counterexample and figure explaining it is given in the Appendix
terminal sets since an intance of R m can connect to only one instance of E k . If we allow many-tomany relationships then we can always construct a skeleton where an instance of R m connects to two instances of E k to produce non-empty terminal sets for both Q and S β² .
Since assumption 3 prohibits the second condition, it essentially implies that any relational variable in AGG M s results into non-empty terminal sets in corresponding ground graphs for every acyclic relational model M and skeleton s β β S which completes the proof for Lemma 1.
The following proposition establishes the completeness of AGG for relational d -separation under the assumption of a minimum degree greater than 1.
Proposition 1 AGG is sound and complete for relational d -separation under assumption 3.
Proof Following lemma 1, the original proof of soundness and completeness of relational d -separation by Maier et al. (2013b) directly applies which proves proposition 1.
The correctness of our approach to relational Ο -separation relies on several facts which are similar to the case for AGG: (1) Ο -separation is valid for directed cyclic graphs; (2) ground graphs are directed cyclic graphs; and (3) Ο -AGGs are directed cyclic graphs that represent exactly the edges that could appear in all possible ground graphs. Note that we no longer need assumption 1, but assumptions 2 and 3 are adopted from relational d -separation. Using the previous definitions and lemmas, the following additional assumptions and sequence of results proves the correctness of our approach to identifying independence in cyclic relational models.
A 4 The given cyclic relational model structure is Ο -faithful.
Theorem 1 The rules of Ο -separation are sound and complete for cyclic directed graphs.
Proof ForrΒ΄ e and Mooij (2017) show that for quite general structural equation models HEDGes 3 always follow a directed global Markov property based on Ο -separation which completes the proof for soundness since directed cyclic graphs are subsets of HEDGes. The completeness claim is already covered by Assumption 4.
Theorem 2 For every cyclic RCM M = ( S , D ) and skeleton s β β S such that relational variables involved in D are non-empty, the ground graph GG M s is a cyclic directed graph.
## Proof
Let's assume for contradiction that there exists an acyclic ground graph g which is a realization of a given cyclic RCM M = ( S , D ) and skeleton s β β S . According to the definition of ground graphs, the edges of ground graphs are directly constructed based on the relational dependencies of the model. Definition 4 states that a cyclic RCM consists of one or more cycles formed by the relational dependencies. Assume a cycle in cyclic RCM is formed by the pair of relational dependencies as follows: a) P.j β Q.k , and b) Q.k β P.j where P and Q are relational paths from some perspective b and i, j refers to two attribute classes. By construction of g there must be two nodes a, b in g corresponding to P.j and Q.k respectively. Moreover, the definition of g requires two edges a β b and b β a to be present in the ground graph. But such edges constructs a cycle which is contradictory to the initial claim. Thus, the ground graph g must be cyclic.
3. The definition of HEDG is provided in the Appendix
Theorem 3 For every cyclic relational model structure M and perspective B β E βͺ R , the Ο -AGG M s is sound and complete for all ground graphs GG M s with skeleton s β β S .
The proof follows from the proof of soundness and completeness of AGG (Maier et al., 2013b). The complete proof is provided in the Appendix.
Theorem 4 The abstract ground graph Ο -AGG M s is a cyclic directed graph if and only if the underlying relational model structure is cyclic.
## Proof
Let M be an arbitrary (possibly) cyclic relational model structure, and let B β E βͺ R be an arbitrary perspective. It is clear by Definition 7 that every edge in the abstract ground graph Ο -AGG M s is directed by construction. Assume for contradiction that no cycles exists in Ο -AGG M s even if the relational dependencies form one or more cycles. Now assume the following two dependencies are part of the given relational model M : 1. [ I j , ..., I k ] .X β [ I j ] .Y β D , 2. [ I j ] .Y β [ I j , ..., I k ] .X β D where I j , ..., I k β E βͺ R . By Definition 7, all edges inserted in Ο -AGG M s are drawn from some dependency in M and edges in Ο -AGG M s are constructed for all the dependencies in D . As a result there must be corresponding edges in the Ο -AGG M s for both dependencies that form a cycle, which contradicts the assumption.
Now, assume that a Ο -AGG M s is acyclic even if the underlying RCM is cyclic. Using the same argument as above we can say that the edges in the Ο -AGG M s constructed based on the dependencies in D . If a cycle exists in the Ο -AGG M s it directly implies the existence of a cycle in the RCM which leads to a contradiction. Thus the proof completes from both directions.
Theorem 5 Relational Ο -separation is sound and complete for Ο -AGG. Let M be a (possibly) cyclic relational model structure, and let X , Y , and Z be three distinct sets of relational variables defined over relational schema S . Then, X and Y are Ο -separated by Z on the abstract ground graph Ο -AGG M s if and only if for all skeletons s β β S and for all perspectives b β s ( B ) , X | b and Y | b are Ο -separated by Z | b in ground graph GG M s .
## Proof
We must show that Ο -separation on an abstract ground graph implies Ο -separation on all ground graphs it represents (soundness) and that Ο -separation facts that hold across all ground graphs are also entailed by Ο -separation on the abstract ground graph (completeness). The proof follows from the proof of soundness and completeness of AGG (Maier et al., 2013b).
## Soundness:
Assume that Β― X and Β― Y are Ο -separated by Β― Z on Ο -AGG M s . Assume for contradiction that there exists an item instance b such that X | b and Y | b are not Ο -separated by Z | b in the ground graph GG M s for some arbitrary skeleton s . Then, there must exist a Ο -connecting path p from some x β Β― X | b to some y β Β― Y | b given all z β Β― Z | b . By Theorem 3, Ο -AGG M s is complete, so all edges in GG M s are captured by edges in Ο -AGG M s . So, path p must be represented from some node in { N x | x β N x | b } to some node in { N y | y β N y | b } , where N x , N y are nodes in Ο -AGG M s . If p is Ο -connecting in GG M s , then it is Ο -connecting in Ο -AGG M s , implying that Β― X and Β― Y are not Ο -separated by Β― Z . So, X | b and Y | b must be Ο -separated by Z | b .
## Completeness:
Assume that X | b and Y | b are Ο -separated by Z | b in the ground graph GG M s for all skeletons s for all b β s ( B ) . Assume for contradiction that Β― X and Β― Y are not Ο -separated by Β― Z on Ο -AGG M s . Then, there must exist a Ο -connecting path p for some relational variable X β Β― X to some Y β Β― Y given all Z β Β― Z . By Theorem 3, Ο -AGG M s is sound, so every edge in Ο -AGG M s must correspond to some pair of variables in some ground graph. So, if p is Ο -connecting in Ο -AGG M s , then there must exist some skeleton s such that p is Ο -connecting in GG M s for some b β s ( B ) , implying that Ο -separation does not hold for that ground graph. So, Β― X and Β― Y must be Ο -separated by Β― Z on Ο -AGG M s .
Maier et al. (2013b) show that relational d -separation is equivalent to the Markov condition on acyclic relational models. However, it doesn't hold for cyclic relational model. Here, we show how relational Ο -separation is equivalent to the Markov condition on cyclic relational models.
Definition 8 (Relational Ο -separation Markov Condition) Let X,Y,Z be relational variables for perspective B β E βͺ R defined over relational schema S . For any solution ( X , ) of a relational model M which follows a simple SCM,
$$X \underset { \mathcal { M } } { \overset { \perp } { \| } } Y | Z \implies \pm b { \ m a t h s c r { x } } _ { X } \underset { \mathbb { P } _ { \mathcal { M } } ( \pm b { \ m a t h s c r { x } } ) } { \overset { \perp } { \| } } \pm b { \ m a t h s c r { X } } _ { Y } | \pm b { \ m a t h s c r { X } } _ { Z } , i f a n d o n l y i f$$
$$\underset { G G _ { \mathcal { M } } } { x \overset { \sigma } { \| } } y | z \implies \mathfrak { x } _ { x } ^ { \prime } \underset { \mathbb { P } _ { G G _ { \mathcal { M } } } ( \mathcal { X } ^ { \prime } ) } { \overset { \| } { \perp } } \mathcal { X } _ { y } ^ { \prime } | \mathcal { X } _ { z } ^ { \prime } , f o r \, \forall x \in X | _ { b } , \, \forall y \in Y | _ { b } , \, \forall z \in Z | _ { b }$$
$$X _ { \mathcal { M } } ^ { \underline { \sigma } } | Y | Z \implies x _ { X } _ { \mathbb { P } _ { \mathcal { M } } ( \mathcal { X } ) } \, | x _ { X } \, \underline { \perp }$$
in ground graph GG M s for all skeletons s β β S and for all b β s ( B ) where ( X β² , β² ) refers to the solution of the SCM corresponding to the ground graphs.
In other words, Ο -separation of two relational variables X and Y given a third relational variable Z would imply X and Y are conditionally independent given Z if and only if, for all instances of X , Y , Z in all possible ground graphs, the same condition holds. Since ground graphs of cyclic RCMare directed cyclic graphs and Ο -separation on Ο -AGG M s is sound and complete (by Theorem 5), we can conclude that relational Ο -separation is equivalent to the relational Markov property.
## 5. Conclusion
Cycles or feedback loops are common elements of many real-world system. Unfortunately, it is hardly studied in the field of causal inference primarily because it breaks the nice properties of directed acyclic graphs. As a result, cycles and feedback loops are mostly avoided in the domain of relational causal model. In this study, we take a step forward to bridge this gap by developing an abstract representation and a criterion to reason about statistical relationships in relational models with or without cycles under a general framework. We show that the new criterion called Ο -separation can consistently capture the statistical independence relationships of all possible instantiations of a relational causal model. We believe that this work will open the door for further development including but not limited to causal structure learning of relational models with cycles.
## 6. Acknowledgment
This material is based on research sponsored in part by the Defense Advanced Research Projects Agency (DAPRA) under contract number HR001121C0168 and the National Science Foundation under grant No. 2047899. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of DARPA or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright annotation therein.
## References
- Rohit Bhattacharya, Daniel Malinsky, and Ilya Shpitser. Causal inference under interference and network uncertainty. In Uncertainty in Artificial Intelligence , pages 1028-1038. PMLR, 2020.
- N. A. Christakis and J. H. Fowler. The spread of obesity in a large social network over 32 years. NEJM , 357(4):370-379, 2007.
- A Philip Dawid. Beware of the dag! In Causality: objectives and assessment , pages 59-86. PMLR, 2010.
- Patrick ForrΒ΄ e and Joris M Mooij. Markov properties for graphical models with cycles and latent variables. arXiv preprint arXiv:1710.08775 , 2017.
- D. Heckerman, C. Meek, and D. Koller. Probabilistic entity-relationship models, prms, and plate models. Introduction to statistical relational learning , pages 201-238, 2007.
- Steffen L Lauritzen and Thomas S Richardson. Chain graph models and their causal interpretations. Journal of the Royal Statistical Society: Series B (Statistical Methodology) , 64(3):321-348, 2002.
- Sanghack Lee and Vasant Honavar. Towards robust relational causal discovery. In Uncertainty in Artificial Intelligence , pages 345-355. PMLR, 2020.
- Sanghack Lee and Vasant G Honavar. Lifted representation of relational causal models revisited: Implications for reasoning and structure learning. In ACI@UAI , 2015.
- Marc Maier, Katerina Marazopoulou, David Arbour, and David Jensen. A sound and complete algorithm for learning causal models from relational data. arXiv preprint arXiv:1309.6843 , 2013a.
- Marc Maier, Katerina Marazopoulou, and David Jensen. Reasoning about independence in probabilistic models of relational data. arXiv preprint arXiv:1302.4381 , 2013b.
- Radford M Neal. On deducing conditional independence from d-separation in causal graphs with feedback (research note). Journal of Artificial Intelligence Research , 12:87-91, 2000.
- Elizabeth L Ogburn, Tyler J VanderWeele, et al. Causal diagrams for interference. Statistical science , 29(4):559-578, 2014.
- Elizabeth L Ogburn, Ilya Shpitser, and Youjin Lee. Causal inference, social networks and chain graphs. Journal of the Royal Statistical Society: Series A (Statistics in Society) , 183(4):16591676, 2020.
- Judea Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference . Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1988.
- Kari Rantanen, Antti Hyttinen, and Matti J¨ arvisalo. Discovering causal graphs with cycles and latent confounders: An exact branch-and-bound approach. Int. J. Approx. Reason. , 117:29-49, 2020.
- T Richardson. A discovery algorithm for directed cyclic graphs. In Proceedings of the 12th Conference on Uncertainty in Artificial Intelligence , 1996.
- Thomas S. Richardson. A characterization of markov equivalence for directed cyclic graphs. Int. J. Approx. Reason. , 17:107-162, 1997.
- Donald B Rubin. Randomization analysis of experimental data: The fisher randomization test comment. Journal of the American Statistical Association , 75(371):591-593, 1980.
- Cosma Rohilla Shalizi and Andrew C Thomas. Homophily and contagion are generically confounded in observational social network studies. Sociological methods & research , 40(2):211239, 2011.
- Ilya Shpitser. Segregated graphs and marginals of chain graph models. Advances in Neural Information Processing Systems , 28:1720-1728, 2015.
- Peter L Spirtes. Directed cyclic graphical representations of feedback models. Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence , 1995.
- Eric V. Strobl. A constraint-based algorithm for causal discovery with cycles, latent variables and selection bias. International Journal of Data Science and Analytics , pages 1-24, 2019.
- Tyler J VanderWeele, Eric J Tchetgen Tchetgen, and M Elizabeth Halloran. Components of the indirect effect in vaccine trials: identification of contagion and infectiousness effects. Epidemiology (Cambridge, Mass.) , 23(5):751, 2012.
## Appendix A. Preliminaries
## A.1. Directed Graphs with Hyperedges (HEDGes)
A directed graph with hyperedges or hyperedged directed graph (HEDG) is a tuple G = ( V , E , H ) , where ( V , E ) is a directed graph (with or without cycles) and H a simplicial complex over the set of vertices V of G . A simplicial complex H over V is a set of subsets of V such that: 1) all single element sets { v } are in H for v β V , and 2) if F β H then also all subsets F β² β F are elements of V .
The general directed global Markov property (gdGMP) for the HEDGes is stated as follows:
Definition 9 (gdGMP) For all subsets X,Y,Z β V we have the implication:
$$X _ { \frac { \| } { G } } ^ { \stackrel { \sigma } { \| } } Y | Z \implies X _ { \frac { \| } { P _ { v } } } Y | Z$$
## A.2. Counterexample by Lee and Honavar (2015)
The following counterexample shows that AGG is not complete for relational d -separation.
Example. Let S = γE , R , A , card γ be a relational schema such that: E = { E i } 5 i =1 ; R = { R j } 3 j =1 with R 1 = γ E 1 , E 2 , E 4 γ , R 2 = γ E 2 , E 3 γ , R 3 = γ E 3 , E 4 , E 5 γ ; A = { E 2 : { Y } , E 3 : { X } , E 5 : { Z }} ; and β R βR β E βE card ( R,E ) = one . Let M = γS , D γ be a relational model with
$$\pm b { \ m a t h s c r D } = \{ D _ { 1 } . X \to [ I _ { Y } ] . Y , D _ { 2 } . Z \to [ I _ { Y } ] . Y \}$$
such that D 1 = [ E 2 , R 2 , E 3 , R 3 , E 4 , R 1 , E 2 , R 2 , E 3 ] and D 2 = [ E 2 , R 2 , E 3 , R 3 , E 5 ] . Let P.X,Q.Y,S.Z,S β² .Z be four relational variables of the same perspective B = E 1 where their relational paths are distinct where
- P = [ E 1 , R 1 , E 2 , R 2 , E 3 ] Β· Q = [ E 1 , R 1 , E 4 , R 3 , E 3 , R 2 , E 2 ]
- S = [ E 1 , R 1 , E 4 , R 3 , E 5 ] Β· S β² = [ E 1 , R 1 , E 2 , R 2 , E 3 , R 3 , E 5 ]
Figure 4: Construction of the counterexample by Lee and Honavar (2015). The notations inside the circles/rhombus refer to instances of the corresponding entity/relationship classes which are mentioned outside the shapes. The dashed lines represent relational dependencies. The dotted line represents a hypothetical connection that can nullify the counterexample under assumption 3.
<details>
<summary>Image 4 Details</summary>

### Visual Description
## Diagram: Process Flow Architecture
### Overview
The diagram illustrates a multi-stage process flow with interconnected components, decision nodes, and feedback loops. It includes labeled nodes (e.g., `E_y`, `i_y`, `R_n`), directional arrows, and dashed lines indicating alternative or optional pathways.
### Components/Axes
- **Nodes**:
- **Input/Output Nodes**:
- `E_y`, `E_x`, `E_z`, `E_m`, `E_b` (likely represent events or entities).
- `i_y`, `i'_x`, `i_z`, `i_x`, `e_b` (possibly intermediate states or variables).
- **Process Nodes**:
- `R_n`, `r_m` (diamond-shaped, suggesting decision points or rules).
- **Connections**:
- Solid arrows indicate primary flow (e.g., `E_y β i_y β E_x β i'_x β R_n β i_z β E_z`).
- Dashed lines suggest secondary or conditional pathways (e.g., `E_y β i_y` feedback loop, `i_x β E_m β r_m β E_b`).
### Detailed Analysis
1. **Top Pathway**:
- `E_y` (event) β `i_y` (intermediate state) β `E_x` (event) β `i'_x` (modified state) β `R_n` (decision node) β `i_z` (intermediate state) β `E_z` (event).
- Dashed line from `E_y` back to `i_y` implies a feedback loop or retry mechanism.
2. **Bottom Pathway**:
- `E_x` (event) β `i_x` (intermediate state) β `E_m` (event) β `r_m` (decision node) β `E_b` (event) β `e_b` (final state).
- Dashed line from `i_x` to `E_m` may indicate an optional or parallel process.
3. **Decision Nodes**:
- `R_n` and `r_m` (diamond shapes) likely represent conditional logic (e.g., "if-then" rules) governing transitions between states.
### Key Observations
- **Feedback Loops**: The dashed line from `E_y` to `i_y` suggests iterative processing or error correction.
- **Parallel Paths**: `E_x` splits into two branches: one through `R_n` and another through `r_m`, indicating divergent workflows.
- **Final States**: `E_z` and `e_b` represent terminal outputs, with `e_b` possibly denoting a base or default outcome.
### Interpretation
This diagram likely models a system with:
1. **Sequential Processing**: Primary flow from `E_y` to `E_z` via `R_n`, with intermediate states (`i_y`, `i'_x`, `i_z`).
2. **Conditional Branching**: Decision nodes (`R_n`, `r_m`) dictate transitions based on rules or thresholds.
3. **Redundancy/Alternatives**: Dashed lines imply fallback mechanisms or optional steps (e.g., retrying `i_y` after `E_y`).
The structure emphasizes modularity, with distinct input/output nodes and decision points. The feedback loop at `E_y`/`i_y` highlights resilience, while parallel paths (`E_x`β`R_n` vs. `E_x`β`r_m`) suggest scalability or adaptability in handling events.
</details>
Given the above example, Lee and Honavar (2015) make two claims:
$$C l a i m \, 1 . ( \overline { P . X } \not \leq \overline { S ^ { \prime } . Z } | \overline { Q . Y } ) _ { A G G _ { M } }$$
$$\begin{array} { r l } & { C l a i m \, 2 . T h e r e i s n o s \in \sum _ { S } a n d b \in s ( B ) s u c h t h a t \left ( P . X | _ { b } \not \subset S ^ { \prime } . Z | _ { b } | Q . Y | _ { b } \right ) _ { G G _ { M _ { s } } } } \\ & { F i n } \end{array}$$
Figure 4 shows the general pattern discussed in Section 4.4 regarding the construction of the counterexample by Lee and Honavar (2015).
## Appendix B. Proofs
The proof for Theorem 3 is given as follows:
## Proof
Let M = ( S , D ) be an arbitrary cyclic relational model structure and B β E βͺ R an arbitrary perspective.
Soundness: To prove that Ο -AGG M s is sound, we must show that for every edge P k .X β P j .Y in Ο -AGG M s , there exists a corresponding edge i k .X β i j .Y in the ground graph GG M s for some skeleton s β β S , where i k β P k | b and i j β P j | b for some b β s ( B ) . There are three subcases, one for each type of edge in an abstract ground graph:
(a) Let [ B,..., I k ] .X β [ B,..., I j ] .Y β RVE be an arbitrary edge in Ο -AGG M s between a pair of relational variables. Assume for contradiction that there exists no edge i k .X β i j .Y in any ground graph:
$$\begin{array} { r } { \forall s \in \Sigma _ { \mathcal { S } } , \forall i _ { k } \in [ B , \dots , I _ { k } ] | _ { b } , \forall i _ { j } \in [ B , \dots , I _ { j } ] | _ { b } } \\ { ( i _ { k } . X \to i _ { j } . Y \notin G G _ { \mathcal { M } _ { S } } ) } \end{array}$$
By Definition 7 for Ο -AGG M s , if [ B,..., I k ] .X β [ B,..., I j ] .Y β RVE , then the model must have dependency [ I j , ..., I k ] .X β [ I j ] .Y β D such that [ B,..., I k ] β extend ([ B,..., I j ] , [ I j , ..., I k ]) . So, by the definition of ground graphs, there is an edge from every i k .X to every i j .Y , where i k is in the terminal set for i j along [ I j , ..., I k ] . Therefore, there exists a ground graph GG M s such that i k .X β i j .Y β GG M s , which contradicts the assumption.
- (b) Let P 1 .X β© P 2 .X β [ B,..., I j ] .Y β IVE be an arbitrary edge in Ο -AGG M s between an intersection variable and a relational variable, where P 1 = [ B,..., I m , ..., I k ] and P 2 = [ B,..., I n , ..., I k ] with I m = I n . By Definition 7, if the Ο -abstract ground graph has edge P 1 .X β© P 2 .X β [ B,..., I j ] .Y β IVE , then either P 1 .X β [ B,..., I j ] .Y β RVE or P 2 .X β [ B,..., I j ] .Y β RVE . Then, as shown in case (a), there exists an i j β [ B,..., I j ] | b such that i k .X β i j .Y β GG M s , which contradicts the assumption.
(c) Let [ B,..., I k ] .Y β P 1 .X β© P 2 .X β IVE be an arbitrary edge in Ο -AGG M s between an intersection variable and a relational variable, where P 1 = [ B,..., I m , ..., I j ] and P 2 = [ B,..., I n , ..., I j ] with I m = I n . The proof follows case (b) to show that there exists a skeleton s β β S and b β s ( B ) such that for all i k β [ B,..., I k ] | b there exists an i j β P 1 β© P 2 | b such that i k .X β i j .Y β GG M s .
Completeness: To prove that the Ο -abstract ground graph Ο -AGG M s is complete, we show that for every edge i k .X β i j .Y in every ground graph GG M s where s β β S , there is a set of corresponding edges in Ο -AGG M s . Specifically, the edge i k .X β i j .Y yields two sets of relational variables for some b β s ( B ) , namely P k .X = { P k .X | i k β P k | b } and P j .Y = { P j .Y | i j β P j | b } . Note that all relational variables in both P k .X and P j .Y are nodes in Ο -AGG M s , as are all pairwise intersection variables. We show that for all P k .X β P k .X and for all P j .Y β P j .Y either (a)
P k .X β P j .Y β Ο -AGG M s (b) P k .X β© P β² k .X β P j .Y β Ο -AGG M s where P β² k .X β P k .X , or (c) P k .X β P j .Y β© P β² j .Y β Ο -AGG M s where P β² j .Y β P j .Y .
Let s β β S be an arbitrary skeleton, let i k .X β i j .Y β GG M s be an arbitrary edge drawn from [ I j , ..., I k ] .X β [ I j ] .Y β D , and let P k .X β P k .X, P j .Y β P j .Y be an arbitrary pair of relational variables.
- (a) If P k β extend ( P j , [ I j , ..., I k ]) , then P k .X β P j .Y β Ο -AGG M s by Definition 7.
- (b) If P k / β extend ( P j , [ I j , ..., I k ]) , but β P β² k β extend ( P j , [ I j , ..., I k ]) such that P β² k .X β P k .X , then P β² k .X β P j .Y β Ο -AGG M s , and P k .X β© P β² k .X β P j .Y β Ο -AGG M s by Definition 7.
- (c) If β P β extend ( P j , [ I j , ..., I k ])( P.X / β Pk.X ) , then β P β² j such that i j β P β² j | b and P k β extend ( P β² j , [ I j , ..., I k ]) . Therefore, P β² j .Y β P j .Y, P k .X β P β² j .Y β Ο -AGG M s , and P k .X β P β² j .Y β© P j .Y β Ο -AGG M s by Definition 7.