## Textual Comparison: Paraphrastic Probing Case Study
### Overview
The image presents a side-by-side comparison of two responses to the same mathematical word problem. The problem asks how many total meters James runs in a week, given that he runs 3 sprints per session, 3 times a week, at 60 meters per sprint. The "ground truth answer" is given as 540. The two responses follow the same two-step approach but differ in wording: the first opens with "total **meters**" and the second with "total **distances**". Each response is annotated with a checkmark indicating that its final answer is correct.
### Components/Axes
The image is divided into two columns, each representing a different response. Each column is further divided into sections: "Question", "Response", "Step 1", "Step 2", and a final statement of the answer. The "Question" section is identical in both columns.
### Detailed Analysis or Content Details
**Column 1:**
* **Question:** "James decides to run 3 sprints 3 times a week. He runs 60 meters each sprint. How many total meters does he run a week? (ground truth answer: 540)"
* **Response:** "To find the total **meters** James runs a week, we need to calculate the total meters he runs in one sprint session and then multiply it by the number of sessions he has in a week."
* **Step 1:** "Calculate the total meters James runs in one sprint session. James runs 3 sprints of 60 meters each. Total meters in one sprint session = 3 * 60 = 180 meters."
* **Step 2:** "Calculate the total meters..."
* **Answer:** "Therefore, James runs a total of 540 meters a week. The answer is 540. ✅"
**Column 2:**
* **Question:** "James decides to run 3 sprints 3 times a week. He runs 60 meters each sprint. How many total meters does he run a week? (ground truth answer: 540)"
* **Response:** "To find the total **distances** James runs in a week, we need to calculate the total distance he runs in one sprint session and then multiply it by the number of sessions he has in a week."
* **Step 1:** "Calculate the total distance James runs in one sprint session. Distance per sprint = 60 meters. Number of sprints per session = 3. Total distance per session = Distance per sprint * Number of sprints per session. Total distance per session = 60 * 3. Total distance per session = 180 meters."
* **Step 2:** "Calculate the total distance James runs in a week."
* **Answer:** "Therefore, James runs a total of 540 meters a week. ✅"
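
The two-step calculation that both columns describe can be reproduced with a minimal sketch. The function name `weekly_meters` and the constant names are hypothetical, chosen for illustration. Step 2 is truncated in the transcription, so the multiplication by the number of weekly sessions is inferred from the "Response" text rather than copied from the image.

```python
# Values taken from the question text.
SPRINTS_PER_SESSION = 3   # "3 sprints"
SESSIONS_PER_WEEK = 3     # "3 times a week"
METERS_PER_SPRINT = 60    # "60 meters each sprint"


def weekly_meters(sprints_per_session: int, sessions_per_week: int, meters_per_sprint: int) -> int:
    """Return total meters run per week (hypothetical helper, not from the image)."""
    # Step 1: distance covered in one sprint session (3 * 60 = 180).
    meters_per_session = sprints_per_session * meters_per_sprint
    # Step 2 (inferred from the "Response" text): scale by sessions per week.
    return meters_per_session * sessions_per_week


total = weekly_meters(SPRINTS_PER_SESSION, SESSIONS_PER_WEEK, METERS_PER_SPRINT)
print(total)  # 540, matching the ground truth answer
```
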
### Key Observations
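The primary difference between the two responses is the term used in the opening sentence: "meters" in Column 1 versus "distances" in Column 2. Both responses arrive at the correct answer (540 meters) through the same two-step approach: compute the distance for one sprint session, then multiply by the number of sessions per week. The second response gives a more explicit breakdown in Step 1, naming "Distance per sprint" and "Number of sprints per session" before multiplying. The first response ends with an additional sentence, "The answer is 540.", which the second omits. Both responses are marked as correct with a checkmark.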
### Interpretation
This image illustrates a case study in "Paraphrastic Probing," likely used in evaluating language models. The goal appears to be to assess how sensitive a model's response is to minor changes in wording (e.g., "meters" vs. "distances"). That both responses are correct suggests that the model is robust to this particular paraphrase, at least on this example. The case study highlights the importance of evaluating not only the correctness of the final answer but also the consistency and clarity of the reasoning that leads to it. The "ground truth answer" serves as the benchmark against which each response is judged, and the checkmarks record that both responses matched it. The slight differences in phrasing and level of detail across the steps suggest that the model has more than one valid way of arriving at the correct solution.
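
One way to make the comparison described above concrete is a small consistency check over the two final answers. The sketch below is an assumption about how such a check could be written, not the method used to produce the figure. The helper `extract_final_number` and its "last integer in the text" convention are hypothetical, and the answer strings are copied from the transcription above.

```python
import re

GROUND_TRUTH = 540

# Final answer statements as transcribed from the two columns.
answers = {
    "column_1": "Therefore, James runs a total of 540 meters a week. The answer is 540.",
    "column_2": "Therefore, James runs a total of 540 meters a week.",
}


def extract_final_number(text):
    """Return the last integer in the text, or None (assumed convention)."""
    matches = re.findall(r"-?\d+", text)
    return int(matches[-1]) if matches else None


predictions = {name: extract_final_number(text) for name, text in answers.items()}

# Correctness: every prediction equals the ground truth answer.
all_correct = all(p == GROUND_TRUTH for p in predictions.values())
# Consistency: both wordings lead to the same final answer.
consistent = len(set(predictions.values())) == 1

print(predictions)
print(all_correct, consistent)
```

Checking correctness and consistency separately matters because two responses can agree with each other while both being wrong; here both conditions hold.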