## Causal Diagram: Three-Level Fairness Model
### Overview
The image displays three causal diagrams, labeled "1) Level-One", "2) Level-Two", and "3) Level-Three", arranged horizontally from left to right. They model the relationships between protected attributes (`SEX`, `RACE`), academic and test scores (`GPA`, `LSAT`), and an outcome variable (`FYA`, most likely first-year average grade given the law school setting) under three different fairness modeling assumptions. A legend at the bottom of the image explains the node types and connection styles.
### Components/Axes (Legend)
The legend at the bottom defines the following visual elements:
* **Node Types (by color and border):**
    * **Blue circle (solid border):** "Prot. Attr" (Protected Attribute). Nodes: `SEX`, `RACE`.
    * **Orange circle (dashed border):** "Outcome". Node: `FYA`.
    * **Purple circle (solid border):** "Unfair Observable". Nodes: `GPA`, `LSAT`.
    * **Green circle (solid border):** "Fair Unobservable". Node: `K` (in Level-Two).
    * **Green circle (dashed border):** "Fair Unobservable". Nodes: `ε_GPA`, `ε_LSAT` (in Level-Three).
    * **Yellow circle (dashed border):** "Fair Observable". Node: `X_fair` (in Level-One).
* **Connection Types:**
    * **Solid arrow (→):** "Cause". Indicates a direct causal influence.
    * **Dotted line (........):** "Additive Noise". Indicates that the connected term enters the variable as an additive noise (error) term.
    * **Dashed circle border:** "Seen by CFP". Indicates the variable is available as an input to the "CFP" (presumably a counterfactually fair predictor).
### Detailed Analysis
#### **1) Level-One Diagram**
* **Components:** `SEX`, `RACE` (Protected Attributes); `GPA`, `LSAT` (Unfair Observables); `FYA` (Outcome); `X_fair` (Fair Observable).
* **Flow & Relationships:**
    * `SEX` and `RACE` have direct causal arrows pointing to `GPA`, `LSAT`, and `FYA`.
    * `GPA` has a causal arrow pointing to `LSAT`.
    * `GPA`, `LSAT`, and `X_fair` all have causal arrows pointing to `FYA`.
    * The `X_fair` node has a dashed border, indicating it is "Seen by CFP".
* **Interpretation:** This level introduces a single, fair observable variable (`X_fair`) that directly influences the outcome (`FYA`). The protected attributes (`SEX`, `RACE`) still influence both the unfair observables (`GPA`, `LSAT`) and the outcome directly.
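
The Level-One structure can be checked mechanically. The following is a minimal sketch, assuming the edges are exactly those listed above; the helper names `descendants` and `fair_inputs` are hypothetical and not taken from the figure. It finds the candidate inputs that are not downstream of any protected attribute.

```python
# Edges of the Level-One diagram as described above (parent -> children).
LEVEL_ONE = {
    "SEX": ["GPA", "LSAT", "FYA"],
    "RACE": ["GPA", "LSAT", "FYA"],
    "GPA": ["LSAT", "FYA"],
    "LSAT": ["FYA"],
    "X_fair": ["FYA"],
    "FYA": [],
}
PROTECTED = {"SEX", "RACE"}


def descendants(graph, sources):
    """Return every node reachable from `sources` by following arrows."""
    seen = set()
    stack = list(sources)
    while stack:
        node = stack.pop()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def fair_inputs(graph, protected, outcome="FYA"):
    """Nodes that are not protected, not the outcome, and not downstream
    of any protected attribute."""
    tainted = descendants(graph, protected)
    return sorted(set(graph) - tainted - set(protected) - {outcome})


print(fair_inputs(LEVEL_ONE, PROTECTED))  # ['X_fair']
```

Under these assumptions, `X_fair` is the only node that qualifies, which matches it being the only node marked "Seen by CFP" in this panel.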
#### **2) Level-Two Diagram**
* **Components:** `SEX`, `RACE` (Protected Attributes); `GPA`, `LSAT` (Unfair Observables); `FYA` (Outcome); `K` (Fair Unobservable).
* **Flow & Relationships:**
    * `SEX` and `RACE` have direct causal arrows pointing to `GPA`, `LSAT`, and `FYA`.
    * `GPA` has a causal arrow pointing to `LSAT`.
    * The new node `K` (Fair Unobservable, green circle) has causal arrows pointing to `GPA`, `LSAT`, and `FYA`.
    * The `K` node is drawn with a dashed border in this panel, indicating it is "Seen by CFP".
* **Interpretation:** This level replaces the fair observable (`X_fair`) with a fair unobservable (`K`). This `K` variable influences all downstream variables (`GPA`, `LSAT`, `FYA`). The protected attributes still have their direct paths.
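
To illustrate why a predictor that sees only `K` behaves differently from one that sees `GPA` and `LSAT`, here is a small deterministic sketch. Everything numeric in it is an assumption: the linear form, the coefficients, and the two toy predictors are invented for illustration, and noise terms are left out. In practice `K` is latent and would have to be inferred rather than set by hand.

```python
def world(sex, race, k):
    """Hypothetical linear structural equations following the Level-Two arrows.

    Coefficients are made up; noise is omitted to keep the example deterministic.
    """
    gpa = 3.0 + 0.5 * k - 0.2 * sex - 0.1 * race
    lsat = 30.0 + 4.0 * k + 2.0 * gpa - 1.0 * sex - 1.5 * race
    fya = 0.8 * k - 0.3 * sex - 0.2 * race
    return {"K": k, "GPA": gpa, "LSAT": lsat, "FYA": fya}


def predict_from_k(w):
    """Toy predictor that sees only the latent factor K."""
    return 0.8 * w["K"]


def predict_from_scores(w):
    """Toy predictor that sees the unfair observables GPA and LSAT."""
    return 0.4 * w["GPA"] + 0.02 * w["LSAT"]


k = 1.2  # the same individual's latent factor in both worlds
factual = world(sex=1, race=0, k=k)
counterfactual = world(sex=0, race=0, k=k)  # flip SEX, hold K fixed

k_gap = predict_from_k(factual) - predict_from_k(counterfactual)
score_gap = predict_from_scores(factual) - predict_from_scores(counterfactual)

print("gap using K only:   ", k_gap)
print("gap using GPA, LSAT:", score_gap)
```

Flipping `SEX` while holding `K` fixed leaves the `K`-only prediction unchanged, whereas the score-based prediction shifts because `GPA` and `LSAT` sit downstream of `SEX`.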
#### **3) Level-Three Diagram**
* **Components:** `SEX`, `RACE` (Protected Attributes); `GPA`, `LSAT` (Unfair Observables); `FYA` (Outcome); `ε_GPA`, `ε_LSAT` (Fair Unobservables/Noise).
* **Flow & Relationships:**
    * `SEX` and `RACE` have direct causal arrows pointing to `GPA`, `LSAT`, and `FYA`.
    * `GPA` has a causal arrow pointing to `LSAT`.
    * Two new nodes, `ε_GPA` and `ε_LSAT` (Fair Unobservable, dashed green circles), are introduced.
    * `ε_GPA` is connected to `GPA` via a dotted line ("Additive Noise").
    * `ε_LSAT` is connected to `LSAT` via a dotted line ("Additive Noise").
    * Both `ε_GPA` and `ε_LSAT` have dashed borders, indicating they are "Seen by CFP".
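* **Interpretation:** This level is most naturally read as an additive noise model: each unfair observable is a function of its parents plus an error term, so `ε_GPA` and `ε_LSAT` are the parts of `GPA` and `LSAT` not explained by the arrows pointing into them. These error terms, rather than the raw scores, are what the CFP sees. The direct influence of the protected attributes on the observables and the outcome remains.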
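
Under an additive reading, `GPA = f(SEX, RACE) + ε_GPA`, so the error term can be estimated as a residual. The sketch below assumes `f` is simply the mean `GPA` within each (`SEX`, `RACE`) group; the records are invented, and `additive_residuals` is a hypothetical helper name. `LSAT` also has `GPA` as a parent in this diagram, so its residual would need a model that includes `GPA`; that step is omitted here.

```python
from collections import defaultdict

# Hypothetical records: (sex, race, gpa). Values are invented for illustration.
records = [
    (0, 0, 3.6), (0, 0, 3.2),
    (0, 1, 3.1), (0, 1, 2.9),
    (1, 0, 3.8), (1, 0, 3.4),
    (1, 1, 3.0), (1, 1, 3.4),
]


def additive_residuals(rows):
    """Return eps = value - mean(value | sex, race) for each row."""
    groups = defaultdict(list)
    for sex, race, value in rows:
        groups[(sex, race)].append(value)
    means = {key: sum(vals) / len(vals) for key, vals in groups.items()}
    return [value - means[(sex, race)] for sex, race, value in rows]


eps_gpa = additive_residuals(records)

# Within every protected-attribute group the residuals average to zero,
# so the group-level differences in GPA have been removed.
for (sex, race, _), eps in zip(records, eps_gpa):
    print(sex, race, round(eps, 3))
```

A predictor restricted to `eps_gpa` no longer sees the between-group differences in `GPA`, only each individual's deviation from their group mean.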
### Key Observations
1. **Progression of Intervention:** The diagrams show a progression in what the fair predictor is allowed to use: a separate fair observable (Level-One), a single latent fair factor (Level-Two), and the additive error terms of the unfair observables (Level-Three).
2. **Persistent Bias Path:** In all three levels, the protected attributes (`SEX`, `RACE`) maintain direct causal links to both the intermediate variables (`GPA`, `LSAT`) and the final outcome (`FYA`). The interventions do not sever these paths but attempt to add "fair" components alongside them.
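3. **"Seen by CFP" Condition:** In every level the fair component (`X_fair`, `K`, or the `ε` terms) is marked as seen by the CFP, suggesting that the fairness-aware predictor takes exactly these components as its inputs. For `K` and the `ε` terms, which are unobservable, this presumably means inferred values rather than direct measurements.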
4. **Node Styling Consistency:** The styling is consistent with the legend. Outcome (`FYA`) always has a dashed orange border. Protected attributes are always solid blue. Unfair observables are solid purple. The "fair" components vary in color (yellow/green) and border style (solid/dashed) based on their type and level.
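
The second observation can be verified by comparing edge sets. This sketch records only the edges stated in the descriptions above (so arrows the text does not mention for a given level are not assumed), and treats each dotted "Additive Noise" link as an edge from the noise term into its variable.

```python
# Arrows from the protected attributes, listed identically for all three levels.
PROTECTED_EDGES = {
    (attr, target)
    for attr in ("SEX", "RACE")
    for target in ("GPA", "LSAT", "FYA")
}
SHARED = PROTECTED_EDGES | {("GPA", "LSAT")}

LEVELS = {
    "Level-One": SHARED | {("GPA", "FYA"), ("LSAT", "FYA"), ("X_fair", "FYA")},
    "Level-Two": SHARED | {("K", "GPA"), ("K", "LSAT"), ("K", "FYA")},
    "Level-Three": SHARED | {("ε_GPA", "GPA"), ("ε_LSAT", "LSAT")},
}

# Edges present in every level, and the edges that distinguish each level.
common = set.intersection(*LEVELS.values())
for name, edges in LEVELS.items():
    print(name, sorted(edges - common))

print("protected paths persist:", PROTECTED_EDGES <= common)
```

The intersection contains all six arrows out of `SEX` and `RACE`, while the level-specific remainder is exactly the fair component each panel adds.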
### Interpretation
These diagrams visually formalize different approaches to achieving fairness in a predictive model (e.g., for law school admission, given GPA and LSAT scores). The core problem is that protected attributes like `SEX` and `RACE` causally influence both the inputs (`GPA`, `LSAT`) and the outcome (`FYA`), leading to potential unfairness.
* **Level-One** suggests fairness can be achieved by discovering and using an alternative, fair observable (`X_fair`) that captures legitimate predictive power without bias.
* **Level-Two** posits the existence of a single, unobservable but fair latent factor (`K`) that drives both academic performance and the outcome. If this factor can be estimated or proxied, fairness might be attainable.
* **Level-Three** takes a more statistical approach: it models each biased observable as a function of its causes plus an additive error term, and treats those error terms (`ε_GPA`, `ε_LSAT`) as the fair information a predictor may use in place of the raw scores.
The diagrams highlight a fundamental tension: the protected attributes have a causal effect on both the observed variables and the outcome, and none of the three models removes those arrows. Instead, each level identifies a component that is not downstream of the protected attributes (a fair observable, a latent factor, or additive error terms) and marks it as what the predictor sees. The choice of level implies different assumptions about what "fairness" means and about how much causal structure can be assumed or estimated.