2304.10877v1

Model: nemotron-free

## Timing the Transient Execution: A New Side-Channel Attack on Intel CPUs ∗ ∗ † ‡ Yu Jin , Pengfei Qiu , Chunlu Wang , Yihao Yang , Dongsheng Wang § , Gang Qu ¶ Key Laboratory of Trustworthy Distributed Computing and Service (BUPT), Ministry of Education § Tsinghua University ¶ University of Maryland lambda.jinyu@gmail.com, { qpf, wangcl } @bupt.edu.cn, khaosyg@gmail.com, wds@tsinghua.edu.cn, gangqu@umd.edu Abstract -The transient execution attack is a type of attack leveraging the vulnerability of modern CPU optimization technologies. New attacks surface rapidly. The side-channel is a key part of transient execution attacks to leak data. In this work, we discover a vulnerability that the change of the EFLAGS register in transient execution may have a side effect on the Jcc (jump on condition code) instruction after it in Intel CPUs. Based on our discovery, we propose a new side-channel attack that leverages the timing of both transient execution and Jcc instructions to deliver data. This attack encodes secret data to the change of register which makes the execution time of context slightly slower, which can be measured by the attacker to decode data. This attack doesn't rely on the cache system and doesn't need to reset the EFLAGS register manually to its initial state before the attack, which may make it more difficult to detect or mitigate. We implemented this side-channel on machines with Intel Core i7-6700, i7-7700, and i9-10980XE CPUs. In the first two processors, we combined it as the side-channel of the Meltdown attack, which could achieve 100% success leaking rate. We evaluate and discuss potential defenses against the attack. Our contributions include discovering security vulnerabilities in the implementation of Jcc instructions and EFLAGS register and proposing a new side-channel attack that does not rely on the cache system. A number of the microarchitectural side-channel attacks are based on the side effects and state of the cache system. There are many ways to leak info through the cache system. After the first cache-timing attack reported by Bernstein in 2005 [17]. Multiple variations such as evict+time (2006) [18], flush+reload(2014) [19], prime+probe(2015) [3], flush+flush(2016) [20]. New cache side-channel attacks are still being discovered, e.g., attacks in cache replacement policies(2020) [21], streamline(2021) [22], using cache dirty states(2022) [23], cache coherence(2022) [24]. Besides, some attacks do not directly rely on the cache system, such as PortSmash [25], PlatyPus(2021) [26], PMU-Spill [27]. Index Terms -Timing, EFLAGS register, side-channel attacks ## I. INTRODUCTION The increasing complexity and aggressive optimizations of modern CPUs, with their many microarchitectural features, have led to improved performance, but they have also created a range of security vulnerabilities [1], [2]. This complexity and optimization are the root cause of many security issues, including side-channel attacks [3], Meltdown attack [4], [5], Spectre attack [6], [7], Microarchitectural data sampling (MDS) attack [8]-[10], fault injection attack [11]-[16], and more. The complex and dynamic nature of modern CPUs has made them a challenging target for security researchers and developers to discover and mitigate, and a constant source of concern for users. As the field of computer security continues to evolve, new techniques and countermeasures will be needed to keep pace with the ever-evolving threat landscape. * Equal contributions. Transient execution attacks [28], including Meltdown, Spectre, and MDS attacks, exploit the complex and aggressive optimizations [10] of modern CPUs to leak sensitive information through transient states. As a high-level overview [28], transient execution attacks consist of five phases: (1) microarchitectural preparation, (2) triggering a fault, (3) encoding secret data to a covert channel, (4) flushing transient instructions, and (5) decoding the secret data. The success of phases 3 and 4 depends on side-channels, which require the channel state to be initialized or set to a specific state. For example, in the flush+reload attack [19], the attacker needs to flush the monitored memory line from the cache in phase 1 and encode secret data by loading one index of the memory line into the cache in phase 4. This allows the attacker to measure the time of the monitored memory line being loaded in the cache in phase 5 to decode data. Reverse engineering efforts, such as those by uops.info [29], have attempted to reveal information about the behavior of a processor's microarchitecture despite the lack of publicly available implementation details. Intel's manual [30] has provided additional insights into instruction performance characteristics, highlighting that certain instructions can cause pipeline stalls or other effects due to their functional requirements [31]. For example, the MFENCE instruction introduces a stall in the pipeline until all previous memory operations have been completed. Awareness of these nuances is crucial for optimizing code to maximize performance on a particular processor architecture and identify potential security vulnerabilities. In our work, we conduct in-depth research on the behavior ∗†‡ Fig. 1: Overview of Transient Execution Timing Side-Channel. <details> <summary>Image 1 Details</summary> ![9bf30394](/v1/image/9bf30394c2021211586d0150d9f66ed1243248604e80fe80eb22a9dc4043bf7b) ### Visual Description ## Diagram: Attacker Process with Transient Execution Timing Side Channel ### Overview This diagram illustrates an attacker's process to exploit transient execution timing side channels. It is divided into two phases, with decision points based on CPU flags (ZF/Zero Flag) and exception handling. The flow involves recording timestamps, loading secrets, and conditional branching based on CPU operations. ### Components/Axes - **Phase 1** (dashed box): - **1: record timestamp** → **2: Load Secret** → **3: JS (rely on ZF)** → **4: record timestamp** - **Exception Handler** (red box) with **CMP/SUB %bl, (%rcx)** comparison - **Transient Execution** (blue box) with **ZF (Zero Flag)** checks - **Phase 2** (solid box): - **2.5: ZF still 0** (short path) vs. **ZF set to 1** (longer path) - Arrows labeled **"short"** and **"longer"** indicate timing differences - **Legend**: - Red: Exception Handler - Blue: Transient Execution Timing Side Channel ### Detailed Analysis 1. **Phase 1**: - **Step 1**: Attacker records a timestamp. - **Step 2**: Loads a secret into memory. - **Step 3**: Executes a transient operation (`CMP/SUB %bl, (%rcx)`) to compare values. - If **CMP/SUB result ≠ 0**, exception handling triggers. - If **CMP/SUB result = 0**, flow proceeds to Phase 2. - **Step 4**: Records a second timestamp to measure execution time. 2. **Phase 2**: - **Step 2.5**: - If **ZF still 0** (short path), flow proceeds directly to Step 3. - If **ZF set to 1** (longer path), flow takes a detour, increasing execution time. - **Step 3**: JS operation relies on ZF state. - **Step 4**: Final timestamp recorded to infer ZF state via timing differences. ### Key Observations - **Timing Side Channel**: The attacker infers ZF state by measuring execution time differences between "short" and "longer" paths. - **Exception Handling**: The red box highlights a critical component where exceptions may leak information. - **Conditional Branching**: ZF state determines the attack's success, with timing discrepancies revealing secrets. ### Interpretation This diagram models a **transient execution attack** where the attacker exploits timing variations caused by CPU flag states (ZF) and exception handling. By comparing timestamps before and after transient operations, the attacker deduces whether ZF was set to 0 or 1, indirectly revealing secret data. The red box around Step 2.5 emphasizes the critical decision point where timing differences are maximized. The blue shading of the "Transient Execution Timing Side Channel" box underscores its role as the attack vector. This aligns with real-world side-channel attacks like **Speculative Execution vulnerabilities (e.g., Spectre)**. </details> and side effects of transient execution attacks and discover a vulnerability of the implementation in Intel CPUs. Specifically, the change of the EFLAGS register in transient execution may influence the Jcc instruction after it. Based on our discovery, we introduce a novel side-channel attack that leverages the timing of transient execution with Jcc instruction. The change of EFLAGS in transient execution could make some Jcc instructions after it slightly slower. As we showed in Fig.1, by encoding secret data to the EFLAGS register, we can measure the execution time of the Jcc instruction's context to decode data without the need to reset the EFLAGS register to its initial state in phase 1 of transient attack. This attack does not rely on the cache system, which may make it more difficult to detect compared to previous side-channel attacks. We implement this side-channel in real machines with Intel Core i7-6700 and i7-7700 and i9-10980XE CPUs. We build the Meltdown attack with our side-channel attack and evaluate it on i7-6700 and i7-7700. In practice, our side-channel can achieve 100% success rate. To mitigate this attack, we propose several practical mitigation methods based on our evaluation. Our research makes several contributions: - 1) We discover security vulnerabilities in the implementation of EFLAGS register and Jcc Instruction. The change of EFLAGS register during transient execution can affect the timing of Jcc instruction after it. - 2) We propose a novel side-channel attack that exploits the timing of execution affected by Jcc instructions, which are dependent on the EFLAGS register. Our attack is distinct from previous side-channel attacks appearing in that it does not rely on the cache system and does not require resetting the initial state during the preparation phase of the transient execution attack. - 3) We implement this side-channel in Intel Core i7-6700 and i7-7700 and i9-10980XE CPUs. We build the Meltdown attack with this side-channel on Intel Core i7-6700 and i7-7700 CPUs on a real machine and it could achieve 100% success rate. To the best of our knowledge, this is the first time that the EFLAGS register has been used as a side-channel. We hope that our work can help to improve the security of future CPUs and bring insight for microarchitecture attack research. The source code of our attacks would be published at https://github.com/. The rest of the paper is organized as follows. In Section II, we introduce the background knowledge of side-channel attacks and transient execution attacks. In Section III, we present the details of the attack. In Section IV, we evaluate our attack on Intel CPU. In Section V, we propose several mitigations for this side-channel. In Section VI, we discuss the limitation of this attack and future work. Finally, we conclude our work in Section VII. ## II. BACKGROUND ## A. Microarchitecture The microarchitecture [32] of each core of the CPU is composed of several components, such as the cache system, the frontend that includes the branch predictor, the out-of-order execution unit, etc. The CPU core is the core of the CPU, which is responsible for the execution of the instructions. The cache system is used to store the data and instructions that are frequently accessed. The frontend is responsible for the instruction fetch and decode. The out-of-order execution unit is responsible for the out-of-order execution of the instructions. As the modern CPU is an microarchitecture complex system. For most commercial CPUs, the microarchitecture of the CPU is a black box, and much research trying to reverse engineering in it [33]-[35]. ## B. Side-Channel Attacks Side-Channel Attacks in microarchitecture [36] is a class of attacks that exploit the side effects of a program to leak information about the program's execution. The side effects can be the cache system [3], [17]-[21], [37], the branch predictor [38], [39], the power [26], etc. For example, the cache system can be used to leak info about the memory access pattern of the program. The branch predictor can be used to leak info about the control flow of the program. The root cause [1] of most of the side-channel attacks in microarchitecture is the shared resource, which is one key to the performance optimization of the CPU. ## C. Transient Execution Attacks Fig. 2: Category of Transient Execution Attacks <details> <summary>Image 2 Details</summary> ![e5acf22d](/v1/image/e5acf22d038e3bbac98a63a43aae170a17af2f8250ad17020ca1cba60928aa51) ### Visual Description ## Flowchart: Taxonomy of Transient Execution Attacks ### Overview The diagram categorizes transient execution attacks into three primary types: **Meltdown-Type**, **MDS**, and **Spectre-Type**, with hierarchical subcategories. A legend on the right labels the entire structure as a "Transient Execution Attack" taxonomy. ### Components/Axes - **Legend**: - Position: Top-right corner. - Label: "Transient Execution Attack" (black text). - **Main Categories**: 1. **Meltdown-Type** (red): - Subcategories: - Meltdown - ForeShadow - ForeShadow-NG 2. **MDS** (red): - Subcategories: - RIDL - Fallout - Zombieload - LVI - CrossTalk 3. **Spectre-Type** (green): - Subcategories: - **Branch Predict**: - Spectre V1 - PHT - BTB - **Target Predict**: - Spectre V2 - BTB - RSB 4. **Other-Type** (green): No subcategories listed. ### Detailed Analysis - **Hierarchy**: - Arrows indicate parent-child relationships (e.g., "Meltdown-Type" → "Meltdown"). - Subcategories are grouped under their parent type (e.g., "Spectre V1" under "Branch Predict"). - **Color Coding**: - Red: Meltdown-Type and MDS. - Green: Spectre-Type and Other-Type. - **Textual Labels**: - All labels are in English. No non-English text detected. ### Key Observations 1. **Categorization**: - Meltdown-Type and MDS are grouped under red, while Spectre-Type and Other-Type use green. - Spectre-Type has the most granular subcategories (e.g., Branch Predict → Spectre V1, PHT, BTB). 2. **Structure**: - MDS has the most subcategories (5), followed by Spectre-Type (6 subcategories across two branches). - Other-Type is a standalone category with no further subdivisions. ### Interpretation - The diagram provides a structured taxonomy of transient execution attacks, grouping them by type and subcategory. - **Meltdown-Type** and **MDS** represent distinct attack families, while **Spectre-Type** is further divided into prediction-based subcategories (Branch Predict and Target Predict). - The inclusion of "Other-Type" suggests the taxonomy is not exhaustive, leaving room for unclassified attacks. - The color coding (red vs. green) may imply severity, prevalence, or technical similarity, though this is not explicitly stated. - No numerical data or trends are present, as the diagram focuses on categorical relationships rather than quantitative analysis. </details> Since Meltdown [4] and Spectre [6] attacks have been discovered, the transient execution attacks [28] has been a hot topic in the security community. The transient execution can be caused by the fault, the branch misprediction, the cache miss, etc. And there are several variations of transient execution attacks [5], [7]-[10], [40]-[44]. IP vendors like Intel and AMD have released the microcode update to mitigate the transient execution attacks [45]. The researcher also proposed some countermeasures to mitigate it [1], [28], [46]-[48]. ## III. TRANSIENT EXECUTION TIMING ATTACKS ## A. Assumption and Threat Model Assumption: The secret data can be accessed through a transient execution attack. There are lots of transient execution attacks such as Meltdown [4], Spectre [6], Foreshadow [5], ZombieLoad [10], etc. Although most existing attacks have been mitigated, they may have undisclosed transient execution vulnerabilities in the CPUs and have been exploited in the future [27]. Threat Model: The attacker runs in the unprivileged mode. And there is another victim process that runs on the same machine. We define the threat model of this study as: the attacker utilizes transient execution timing to recover the secret data acquired in the transient execution attacks. ## B. Attack Overview We implement the attack as the side-channel of the Meltdown attack, shown in Fig. 1. The attack is composed of two phases. In the first phase, we trigger transient execution and encode the secret data through the EFLAGS register. In the second phase, we measure the execution time of the Jcc instruction's context to decode data. To encode a secret through a binary flag, we need to use iteration test\_num to set the flag. If the test\_num equals the secret, the flag will be set and the secret would be encoded successfully. ## C. Implement Detail We notate the secret\_addr as the address of the secret data. And the offset is the offset of the secret\_addr . The EFLAGS instruction is the instruction that can change the EFLAGS register. The Jcc instruction is the instruction that can be influenced by the EFLAGS register. The available instruction set is the list in Table I. We use \_\_rdtsc from x86intrin.h to get the time-stamp counter of the CPU. ``` __rdtsc from x86intrin.h to get the time-stamp counter of the CPU. for (uint8_t test_num = 0; test_num <= TO; test_num++){ start_time = __rdtsc(); // timing context start asm volatile( "MOV %0, %%RCX;" "MOV %1, %%BL;" : : "r"(secret_addr + offset), "r"(test_num) :); if (xbegin() == (~0u)) { // EFLAGS instruction asm volatile("SUB %BL, (%RCX);"); } asm volatile( "JZ equal;" // Jcc instruction "JMP notequal;" "equal: NOP;" "notequal: NOP;"); // timing context end spend_time = __rdtsc() - start_time; if (max_time < spend_time) { max_time = spend_time; argmax = i; argmax = i; } } ``` Listing 1: Pseudocode for timing the transient execution attack in Intel X86 architecture. The secret\_addr is an address in kernel space. And the attacker running in an unprivileged mode which can not access the secret\_addr . The offset is the offset of the secret\_addr . In the TSX transaction, the attacker will try to access the secret\_addr by sub instruction. A transient execution will be caused by the fault. During the transient execution, the ZF may be set to 1 if the secret data in *(secret\_addr+offest) is equal to i . And the ZF will be restored to 0 after the transient execution. The ZF will be used by the JZ instruction to determine whether to jump to the equal label or the notequal label. In our experiment, if the secret data in *(secret\_addr+offest) is equal to i , the execution time of the context will be slower than the execution time of the context that the secret data in *(secret\_addr+offest) is not equal to i . The max\_time is the maximum execution time of the context. The argmax is the secret data, as we have shown the distribution in Fig. 3. The max\_time and argmax can be used to decode the secret data. TABLE I: Instruction for our Attack. | Type | Instruction | Description | EFLAGS | |--------|---------------|-------------------------------|----------| | EFLAGS | SUB | Subtract. | ZF | | | CMP | Compare Two Operands. | ZF | | | CMPXCHG | Compare and Exchange. | ZF | | Jcc | JE | Jump short if equal (ZF = 1). | ZF | | | JZ | Jump short if zero (ZF = 1). | ZF | ## IV. EXPERIMENT AND EVALUATION ## A. Experimental Setup and Result We implement the attack in Intel i7-6700, i7-7700, and i9-10980XE CPU. The experiment is running in the Ubuntu 16.04 xenial with kernel version 4.15.0 (i7-6700, i7-7700) and Ubuntu 22.04 jammy with kernel version 5.15.0 (i9-10980XE). We build the Meltdown attack with our side-channel to read kernel memory from user space and achieve 100% success leaking rate in the first two processors. ## B. Evaluation In our experiment, we found that the influence of the EFLAGS register on the execution time of Jcc instruction is not as persistent as the cache state. For about 6-9 cycles after the transient execute, the Jcc execute time will not be about to construct a side-channel. Empirically, the attack needs to repeat thousands of times for higher accuracy. For the Listing1, we use Intel TSX, a transactional memory implementation, to completely suppress the exception. We also use system interrupt handlers to suppress the exception and achieve the same effect. The TSX is more efficient than the system interrupt handlers for transient execution attacks. Though the statistical distribution of the argmax of timing can be used to decode the secret data. The distribution of the average clock, due to the noise, cannot be used as a sidechannel, as we have shown in Fig. 4. ## V. MITIGATION For mitigating the Transient Execution Timing attacks, we can use two gadgets to delay the Jcc instruction or rewrite the EFLAGS register after the transient execution. ## A. Hareware Mitigation The implementation of the Jcc instruction should not have timing or other side effects on different conditions to avoid the adversarial execution measuring. ## B. Delay Jcc If the Jcc instruction is not executed immediately after the EFLAGS register has been changed, the influence of the EFLAGS register can be reduced. 10 cycles are enough to reduce the influence of the EFLAGS register. .rept count Repeat the sequence of lines between the .rept directive and the next .endr directive count times. By using the NOP instruction to delay the Jcc instruction, we can reduce the influence of the EFLAGS register on the execution time. There are lots of ways to delay, We just give one example here. ## C. Rewrite EFLAGS The LAHF and SAHF instructions are x86 assembly language instructions that are used to manipulate the low 8 bits of the FLAGS register in the x86 processor [30]. The LAHF instruction is short for 'Load AH from Flags'. It loads the low 8 bits of the FLAGS register into the AH register while leaving the upper 8 bits of the AH register unchanged. The AH register is a 16-bit register that is used to store the high byte of the AX register. The SAHF instruction is short for 'Store AH into Flags'. It stores the low 8 bits of the AH register into the low 8 bits of the FLAGS register while leaving the upper 8 bits of the FLAGS register unchanged. The PUSHF and POPF instructions are x86 assembly language instructions that are used to push and pop the contents of the FLAGS register onto and off of the stack, respectively. The PUSHF instruction pushes the entire 16-bit FLAGS register onto the top of the stack. This includes the status flags that are used to indicate the result of arithmetic and logic operations, as well as other control flags that control the behavior of the processor. The POPF instruction pops the contents of the top of the stack into the FLAGS register. This can be used to restore the state of the FLAGS register after it has been saved by a previous PUSHF instruction. By rewriting the EFLAGS though LAHF and SAHF , or PUSHF and POPF instructions, the influence of the EFLAGS register can be reduced. ## VI. DISCUSSION AND FUTRUE WORK The root causes of this attack are still not fully understood. We guess that there is some buffer in the execution unit of the Intel CPU which need some time to revert if the execution should be withdrawn. This withdrawal process will cause a stall if the following instruction depends on the target of the buffer. ## A. Limitation This timing attack relies on other transient execution attacks to build a real-world attack and it is easy to be disturbed by noise. But it is still a new side-channel attack and worth further exploration. This attack may bring insight for new microarchitecture attacks and give a new way to build sidechannel attacks in cache side-channel resistant CPU. <details> <summary>Image 3 Details</summary> ![b2497690](/v1/image/b2497690a1ed05f87cc5b7dd69007a1699b67d680a4d9e327495b664e990f04d) ### Visual Description ## Line Graph: Number of Argtmax vs ASCII of test_num ### Overview The image depicts a line graph illustrating the relationship between the ASCII values of test numbers (A-Z) and the corresponding "Number of Argtmax." The graph includes a single data series labeled "GT of Secret: S" (blue line) and highlights a peak labeled "argmax: S" at ASCII value S (~350). The y-axis ranges from 0 to 500, while the x-axis spans ASCII values A-Z. ### Components/Axes - **X-axis**: Labeled "ASCII of test_num," with discrete markers A-Z (representing ASCII values 65-90). - **Y-axis**: Labeled "Number of Argtmax," scaled from 0 to 500 in increments of 100. - **Legend**: Located in the top-right corner, with two entries: - "GT of Secret: S" (blue line with circular markers). - "argmax: S" (annotated peak at S). - **Line**: A single blue line with circular markers, representing the "GT of Secret: S" data series. ### Detailed Analysis - **Data Points**: - **A**: ~250 (blue marker). - **B**: ~300 (blue marker, local peak). - **C**: ~150 (blue marker). - **D**: ~180 (blue marker). - **E**: ~170 (blue marker). - **F**: ~120 (blue marker). - **G**: ~90 (blue marker). - **H**: ~100 (blue marker). - **I**: ~110 (blue marker). - **J**: ~80 (blue marker). - **K**: ~60 (blue marker). - **L**: ~70 (blue marker). - **M**: ~100 (blue marker). - **N**: ~80 (blue marker). - **O**: ~60 (blue marker). - **P**: ~50 (blue marker). - **Q**: ~100 (blue marker). - **R**: ~50 (blue marker). - **S**: ~350 (blue marker, annotated as "argmax: S"). - **T**: ~70 (blue marker). - **U**: ~100 (blue marker). - **V**: ~60 (blue marker). - **W**: ~50 (blue marker). - **X**: ~50 (blue marker). - **Y**: ~90 (blue marker). - **Z**: ~40 (blue marker). ### Key Observations 1. **Peak at S**: The highest value (~350) occurs at ASCII S, explicitly annotated as "argmax: S." 2. **General Trend**: The line exhibits a downward trend after B (~300), with fluctuations (e.g., minor peaks at Q and U). 3. **Lowest Values**: The minimum (~40) occurs at Z, with several points (P, R, V, W, X) below 100. 4. **Anomaly**: The sharp spike at S (~350) stands out as an outlier compared to other values (mostly 50–200). ### Interpretation - **Data Meaning**: The graph suggests that the "GT of Secret: S" metric peaks at ASCII S, potentially indicating a critical test case or anomaly. The fluctuations may reflect variability in performance or data collection across test numbers. - **Relationships**: The "argmax: S" annotation directly correlates with the highest y-axis value, confirming S as the maximum. The blue line consistently represents the "GT of Secret: S" series. - **Significance**: The outlier at S warrants further investigation, as it deviates significantly from the general trend. This could indicate a unique condition or error in the dataset. </details> Fig. 3: Distribution of argmax. Fig. 4: Distribution of average clock. <details> <summary>Image 4 Details</summary> ![048d6b33](/v1/image/048d6b33d426d9f4d7ef75ee0bdc202cf96e678f510767126d7e1cd2769c3263) ### Visual Description ## Line Chart: Clock Cycle vs. ASCII of test_num ### Overview The image depicts a line chart illustrating the relationship between the "Clock Cycle" (y-axis) and the "ASCII of test_num" (x-axis). The chart includes a single data series represented by a blue line with circular markers. A notable annotation highlights the maximum value ("argmax: T") at a specific point on the x-axis. ### Components/Axes - **X-axis (Horizontal)**: Labeled "ASCII of test_num", with categories A–Z (26 discrete points). The axis uses uppercase letters as labels, though the title suggests a connection to ASCII values (e.g., A=65, B=66, ..., Z=90). - **Y-axis (Vertical)**: Labeled "Clock Cycle", scaled from 250 to 370 in increments of 20. - **Legend**: Located in the top-right corner, with a blue line labeled "GT of Secret: S". - **Annotation**: A black arrow labeled "argmax: T" points to the peak of the line at the x-axis label "T". ### Detailed Analysis - **Data Series**: - The blue line fluctuates between approximately 290 and 310 for most x-axis labels (A–Z), except at "T", where it peaks sharply to ~330. - The line starts at ~295 (A), dips to ~290 (B), and exhibits minor oscillations (e.g., ~310 at G, ~290 at H, ~310 at J). - The peak at "T" is the only value exceeding 310, reaching ~330. After "T", the line declines to ~290 (W) and stabilizes around ~295–300 for the remaining labels (X–Z). ### Key Observations 1. **Peak at "T"**: The maximum clock cycle (~330) occurs at the x-axis label "T", confirmed by the "argmax: T" annotation. 2. **Stability**: The clock cycle remains relatively stable (~290–310) for all labels except "T". 3. **Asymmetry**: The peak at "T" is followed by a gradual decline, suggesting a potential causal relationship between "T" and subsequent values. ### Interpretation - **Data Meaning**: The chart demonstrates that the "GT of Secret: S" (likely a ground-truth metric) exhibits a significant anomaly at test "T", with clock cycles spiking to ~330. This could indicate a computational bottleneck, error condition, or intentional stress test at "T". - **Relationships**: The stability before and after "T" implies that "T" is an outlier rather than part of a broader trend. The annotation "argmax: T" explicitly identifies this as the global maximum. - **Uncertainties**: - The x-axis label "ASCII of test_num" is ambiguous. If "test_num" refers to numerical test identifiers (e.g., 1, 2, ..., 26), the use of letters (A–Z) may be a mapping error. - The purpose of "GT of Secret: S" is unclear without additional context (e.g., is this a hardware performance metric, algorithmic complexity, or something else?). ### Spatial Grounding - **Legend**: Top-right corner, aligned with the line's color (blue). - **Annotation**: Centered above the peak at "T", ensuring clarity. - **Axis Labels**: X-axis labels (A–Z) are evenly spaced below the chart; y-axis ticks are on the left. ### Content Details - **Numerical Values**: - A: ~295 | B: ~290 | C: ~295 | D: ~295 | E: ~295 | F: ~295 | G: ~310 | H: ~290 | I: ~295 | J: ~310 | K: ~295 | L: ~295 | M: ~295 | N: ~295 | O: ~295 | P: ~295 | Q: ~295 | R: ~295 | S: ~310 | T: ~330 | U: ~300 | V: ~300 | W: ~290 | X: ~300 | Y: ~300 | Z: ~295. - **Trends**: - The line slopes upward to "T" (argmax) and then slopes downward, forming a single prominent peak. ### Final Notes The chart provides a clear visual of a singular anomaly at "T", but the lack of context for "GT of Secret: S" and the x-axis labeling ambiguity limits deeper analysis. Further investigation into the system or test conditions at "T" would be warranted to understand the cause of the spike. </details> Listing 2: Pseudocode 1 for mitigating Transient Execution Timing attacks in Intel X86 architecture. <details> <summary>Image 5 Details</summary> ![7c93e264](/v1/image/7c93e264be63f1bcf1b10ba35d0aac1e37b03c45125dd0256cca56b0503d2748) ### Visual Description ## Code Snippet: `volatile()` Function Definition ### Overview The image displays a code snippet defining a function named `volatile()` with three parameters. The code uses syntax highlighting, with keywords in red (`asm`, `volatile`), strings in purple (e.g., `".rept 6 \n\t"`), and line numbers in black. ### Components/Axes - **Function Declaration**: - `asm volatile(`: Indicates an assembly function with volatile behavior (prevents compiler optimizations). - **Parameters**: 1. `".rept 6 \n\t"`: A string literal containing a repetition directive (`rept`) and escape sequences (`\n\t` for newline and tab). 2. `"NOP \n\t"`: A string literal with the `NOP` (no-operation) instruction and escape sequences. 3. `".endr \n\t"`: A string literal ending the repetition block (`endr`) with escape sequences. - **Line Numbers**: Left-aligned numbers (1–4) indicate line breaks in the code. ### Detailed Analysis - **Line 1**: `asm volatile(` - Declares an assembly function with `volatile` qualifier. - **Line 2**: `".rept 6 \n\t"` - A string parameter containing the directive `.rept 6`, which likely repeats the following instruction 6 times (common in assembly languages like ARM or x86). - **Line 3**: `"NOP \n\t"` - A string parameter with the `NOP` instruction, which does nothing but consumes a clock cycle. - **Line 4**: `".endr \n\t"` - Ends the repetition block (`endr`), terminating the `.rept` loop. ### Key Observations - The code uses escape sequences (`\n\t`) to embed newline and tab characters within string literals. - The `volatile` keyword ensures the compiler does not optimize away the function, critical for low-level operations like hardware register manipulation. - The `.rept` and `.endr` directives suggest this is assembly code for a processor architecture that supports loop repetition (e.g., ARM). ### Interpretation This code defines a volatile assembly function that executes a `NOP` instruction 6 times. The use of `volatile` prevents the compiler from optimizing the function away, which is essential for ensuring side effects (e.g., hardware register writes) are preserved. The `.rept` and `.endr` directives indicate a loop structure, common in assembly programming for repetitive tasks. The escape sequences (`\n\t`) suggest the strings may be used in contexts requiring formatted output or control characters. No numerical data or trends are present, as this is a textual code snippet. The focus is on syntactic structure and low-level programming conventions. </details> ## B. Futrue Work - a) Combined with other Transient Execution Attacks: More experiments are underway to fully understand it. Which will be published in the future. - b) Finding other various: There may be other microarchitectural components and instructions that can be used as timing side-channels. We will continue to explore this area. ## VII. CONCLUSIONS We present a new side-channel attack that leaks info through the timing of execution. When the ZF has been changed from 0 to 1 during transient execution caused by a fault in the Meltdown attack, though the ZF will be restored to 0 after the transient execution, the instruction like JZ execution time will be slightly longer. As a result, we can leak info throw EFLAGS register by measuring the execution time of context. Compared with previous side-channel attacks, our attack does not rely on the cache system, which may make it difficult to be detected by existing tools or methods [49]-[51]. As far as we know, our work is the first that build sidechannel with the EFLAGS register. Hope our work can help to improve the security of future CPUs. Listing 3: Pseudocode 2 for mitigating Adversarial Jcc attacks in Intel X86 architecture. <details> <summary>Image 6 Details</summary> ![861c37ff](/v1/image/861c37ff157d44c3b3b137b15a2e579fa1784b8f9167dd1c8b6a41943b5177da) ### Visual Description ## Code Snippet: `asm volatile` Function Definition ### Overview The image shows a code snippet defining an `asm volatile` block, likely in a low-level programming context (e.g., assembly or inline assembly in C/C++). The block contains four string literals and a comment, with syntax highlighting applied to distinguish keywords, strings, and comments. ### Components/Axes - **Function Declaration**: - `asm volatile(`: Keyword `asm` in black, `volatile` in magenta, parentheses in black. - **String Literals**: - `"LAHF";` (Line 2): String in purple, semicolon in black. - `"SAHF";` (Line 3): String in purple, semicolon in black. - `"PUSHF";` (Line 5): String in purple, semicolon in black. - `"POPF";` (Line 6): String in purple, semicolon in black. - **Comment**: - `// or` (Line 4): Comment in green, indicating an alternative or optional path. ### Detailed Analysis - **Line Numbers**: Left-aligned numeric labels (1–6) in gray. - **Syntax Highlighting**: - Keywords (`asm`, `volatile`) in black/magenta. - Strings (`"LAHF"`, `"SAHF"`, `"PUSHF"`, `"POPF"`) in purple. - Comment (`// or`) in green. - **Structure**: - The `asm volatile` block is enclosed in parentheses, with each string literal on a separate line. - The comment `// or` appears between `"SAHF";` and `"PUSHF";`, suggesting an alternative implementation or configuration. ### Key Observations 1. **Volatile Keyword**: The `volatile` qualifier ensures the compiler does not optimize away the assembly instructions, critical for hardware interactions where side effects must be preserved. 2. **Hardware Instructions**: The strings `"LAHF"`, `"SAHF"`, `"PUSHF"`, and `"POPF"` correspond to x86 assembly instructions for loading/storing flags and pushing/popping the flags register. 3. **Comment Ambiguity**: The `// or` comment lacks context, leaving the relationship between the instructions unclear (e.g., whether `"PUSHF"`/`"POPF"` are alternatives to `"LAHF"`/`"SAHF"`). ### Interpretation This code snippet defines a volatile inline assembly block, likely used to execute hardware-specific instructions without compiler optimization. The presence of `"LAHF"`/`"SAHF"` (Load/Store All Flags) and `"PUSHF"`/`"POPF"` (Push/Pop Flags) suggests manipulation of processor flags, which are critical for controlling interrupt handling, arithmetic operations, and other low-level system states. The comment `// or` implies a choice between two implementations, but without additional context, it is unclear whether the alternatives are mutually exclusive or combinable. The use of `volatile` emphasizes the need for precise execution order, which is essential in scenarios like atomic operations or direct hardware register manipulation. The code’s structure and syntax highlighting confirm it is part of a low-level programming environment, such as embedded systems or performance-critical applications where direct hardware access is required. </details> ## REFERENCES - [1] N. R. Holtryd, M. Manivannan, and P. Stenstr¨ om, 'Sok: Analysis of root causes and defense strategies for attacks on microarchitectural optimizations,' arXiv preprint arXiv:2212.10221 , 2022. - [2] Intel, 'Analysis of Speculative Execution Side Channels,' https: //software.intel.com/sites/default/files/managed/b9/f9/336983-IntelAnalysis-of-Speculative-Execution-Side-Channels-White-Paper.pdf, 2018. - [3] F. Liu, Y. Yarom, Q. Ge, G. Heiser, and R. B. Lee, 'Last-level cache side-channel attacks are practical,' in IEEE Symposium on Security and Privacy , 2015. - [4] M. Lipp, M. Schwarz, D. Gruss, T. Prescher, W. Haas, A. Fogh, J. Horn, S. Mangard, P. Kocher, D. Genkin, Y. Yarom, and M. Hamburg, 'Meltdown: Reading Kernel Memory from User Space,' in Usenix Security , 2018. - [5] J. V. Bulck, M. Minkin, O. Weisse, D. Genkin, B. Kasikci, F. Piessens, M. Silberstein, T. Wenisch, Y. Yarom, and R. Strackx, 'Foreshadow: Extracting the Keys to the Intel SGX Kingdom with Transient Out-ofOrder Execution,' in Usenix Security , 2018. - [6] P. Kocher, J. Horn, A. Fogh, D. Genkin, D. Gruss, W. Haas, M. Hamburg, M. Lipp, S. Mangard, T. Prescher, M. Schwarz, and Y. Yarom, 'Spectre Attacks: Exploiting Speculative Execution,' in S&P , 2019. - [7] E. M. Koruyeh, K. N. Khasawneh, C. Song, and N. Abu-Ghazaleh, 'Spectre Returns! Speculation Attacks using the Return Stack Buffer,' in WOOT , 2018. - [8] S. van Schaik, A. Milburn, S. ¨ Osterlund, P. Frigo, G. Maisuradze, K. Razavi, H. Bos, and C. Giuffrida, 'RIDL: Rogue In-flight Data Load,' in S&P . IEEE, 2019. - [9] C. Canella, D. Genkin, L. Giner, D. Gruss, M. Lipp, M. Minkin, D. Moghimi, F. Piessens, M. Schwarz, B. Sunar, J. Van Bulck, and Y. Yarom, 'Fallout: Leaking data on Meltdown-resistant CPUs,' in CCS , 2019. - [10] M. Schwarz, M. Lipp, D. Moghimi, J. Van Bulck, J. Stecklina, T. Prescher, and D. Gruss, 'ZombieLoad : Cross-privilege-boundary data sampling,' in CCS , 2019. - [11] K. Murdock, D. Oswald, F. D. Garcia, J. Van Bulck, D. Gruss, and F. Piessens, 'Plundervolt: Software-based fault injection attacks against intel sgx,' in 2020 IEEE Symposium on Security and Privacy (SP) . IEEE, 2020, pp. 1466-1482. - [12] A. Tang, S. Sethumadhavan, and S. Stolfo, ' { CLKSCREW } : Exposing the perils of { Security-Oblivious } energy management,' in 26th USENIX Security Symposium (USENIX Security 17) , 2017, pp. 1057-1074. - [13] P. Qiu, D. Wang, Y. Lyu, and G. Qu, 'Voltjockey: Breaking sgx by software-controlled voltage-induced hardware faults,' in 2019 Asian Hardware Oriented Security and Trust Symposium (AsianHOST) . IEEE, 2019, pp. 1-6. - [14] --, 'Voltjockey: Breaching trustzone by software-controlled voltage manipulation over multi-core frequencies,' in Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security , 2019, pp. 195-209. - [15] P. Qiu, D. Wang, Y. Lyu, R. Tian, C. Wang, and G. Qu, 'Voltjockey: A new dynamic voltage scaling-based fault injection attack on intel sgx,' IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems , vol. 40, no. 6, pp. 1130-1143, 2020. - [16] Z. Kenjar, T. Frassetto, D. Gens, M. Franz, and A.-R. Sadeghi, ' { V0LTpwn } : Attacking x86 processor integrity from software,' in 29th USENIX Security Symposium (USENIX Security 20) , 2020, pp. 14451461. - [17] D. J. Bernstein, 'Cache-timing attacks on AES,' 2005. - [18] D. A. Osvik, A. Shamir, and E. Tromer, 'Cache Attacks and Countermeasures: The Case of AES,' in CT-RSA , 2006. - [19] Y. Yarom and K. Falkner, 'Flush+Reload: A High Resolution, Low Noise, L3 Cache Side-channel Attack,' in Usenix Security , 2014. - [20] D. Gruss, C. Maurice, K. Wagner, and S. Mangard, 'Flush+ Flush: a fast and stealthy cache attack,' in International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment . Springer, 2016, pp. 279-299. - [21] S. Briongos, P. Malag´ on, J. M. Moya, and T. Eisenbarth, 'RELOAD+ REFRESH: Abusing cache replacement policies to perform stealthy cache attacks,' in 29th USENIX Security Symposium (USENIX Security 20) , 2020, pp. 1967-1984. - [22] G. Saileshwar, C. W. Fletcher, and M. Qureshi, 'Streamline: a fast, flushless cache covert-channel attack by enabling asynchronous collusion,' in Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems , 2021, pp. 1077-1090. - [23] Y. Cui and X. Cheng, 'Abusing cache line dirty states to leak information in commercial processors,' in 2022 IEEE International Symposium on High Performance Computer Architecture (HPCA) . IEEE, 2022. - [24] Y. Guo, A. Zigerelli, Y. Zhang, and J. Yang, 'Adversarial prefetch: New cross-core cache side channel attacks,' in 2022 IEEE Symposium on Security and Privacy (SP) . IEEE, 2022, pp. 1458-1473. - [25] A. C. Aldaya, B. B. Brumley, S. ul Hassan, C. P. Garc´ ıa, and N. Tuveri, 'Port contention for fun and profit,' in 2019 IEEE Symposium on Security and Privacy (SP) . IEEE, 2019, pp. 870-887. - [26] M. Lipp, A. Kogler, D. Oswald, M. Schwarz, C. Easdon, C. Canella, and D. Gruss, 'Platypus: Software-based power side-channel attacks on x86,' in 2021 IEEE Symposium on Security and Privacy (SP) . IEEE, 2021, pp. 355-371. - [27] P. Qiu, Q. Gao, D. Wang, Y. Lyu, C. Liu, X. Li, C. Wang, and G. Qu, 'Pmu-spill: Performance monitor unit counters leak secrets in transient executions,' in 2022 Asian Hardware Oriented Security and Trust Symposium (AsianHOST) . IEEE, 2022, pp. 1-6. - [28] C. Canella, J. Van Bulck, M. Schwarz, M. Lipp, B. Von Berg, P. Ortner, F. Piessens, D. Evtyushkin, and D. Gruss, 'A systematic evaluation of transient execution attacks and defenses,' in 28th USENIX Security Symposium (USENIX Security 19) , 2019. - [29] A. Abel and J. Reineke, 'uops.info: Characterizing latency, throughput, and port usage of instructions on Intel microarchitectures,' in ASPLOS , 2019. - [30] Intel Corporation, Intel ® 64 and IA-32 Architectures Software Developer's Manual , 2019. - [31] U. Degenbaev, 'Formal Specification of the x86 Instruction Set Architecture,' Ph.D. dissertation, Universit¨ at des Saarlandes, 2012. - [32] WikiChip, 'Skylake (client) -Microarchitectures -Intel,' https:// en.wikichip.org/wiki/intel/microarchitectures/skylake (client), 2018, accessed: May, 2021. - [33] B. Gras, C. Giuffrida, M. Kurth, H. Bos, and K. Razavi, 'ABSynthe: Automatic Blackbox Side-channel Synthesis on Commodity Microarchitectures,' in NDSS , 2020. - [34] D. Weber, A. Ibrahim, H. Nemati, M. Schwarz, and C. Rossow, 'Osiris: Automated discovery of microarchitectural side channels,' in Usenix Security , 2021. - [35] D. Lustig, M. Pellauer, and M. Martonosi, 'Pipecheck: Specifying and verifying microarchitectural enforcement of memory consistency models,' in MICRO , 2014. - [36] J. R. S. Vicarte, P. Shome, N. Nayak, C. Trippel, A. Morrison, D. Kohlbrenner, and C. W. Fletcher, 'Opening Pandora's Box: A Systematic Study of New Ways Microarchitecture Can Leak Private Data,' in ISCA , 2021. - [37] B. Gras, K. Razavi, H. Bos, and C. Giuffrida, 'Translation leak-aside buffer: Defeating cache side-channel protections with TLB attacks,' in 27th USENIX Security Symposium (USENIX Security 18) , 2018, pp. 955-972. - [38] D. Evtyushkin, R. Riley, N. C. Abu-Ghazaleh, ECE, and D. Ponomarev, 'Branchscope: A new side-channel attack on directional branch predictor,' in Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems , ser. ASPLOS '18, 2018, pp. 693-707. - [39] A. R. H. Coronado, W. Lee, and W.-M. Lin, 'Branchboozle: a sidechannel within a hidden pattern history table of modern branch prediction units.' in SAC , 2021, pp. 1617-1625. - [40] J. Van Bulck, D. Moghimi, M. Schwarz, M. Lipp, M. Minkin, D. Genkin, Y. Yarom, B. Sunar, D. Gruss, F. Piessens, and K. Leuven, 'LVI: Hijacking Transient Execution through Microarchitectural Load Value Injection,' in S&P , 2020. - [41] H. Ragab, A. Milburn, K. Razavi, H. Bos, and C. Giuffrida, 'CrossTalk: Speculative Data Leaks Across Cores Are Real,' in S&P . IEEE, 2021. - [42] G. Maisuradze and C. Rossow, 'ret2spec: Speculative Execution Using Return Stack Buffers,' in CCS , 2018. - [43] P. Z. Google, 'Speculative Execution, Variant 4: Speculative Store Bypass,' https://bugs.chromium.org/p/project-zero/issues/detail?id=1528, 2018, accessed: May, 2021. - [44] V. Kiriansky and C. Waldspurger, 'Speculative Buffer Overflows: Attacks and Defenses,' arXiv , 2018. - [45] Intel, 'Affected Processors: Transient Execution Attacks & Related Security Issues by CPU,' https://www.intel.com/content/www/us/en/ developer/topic-technology/software-security-guidance/processorsaffected-consolidated-product-cpu-model.html, 2022, accessed: August, 2022. - [46] S. Cauligi, C. Disselkoen, D. Moghimi, G. Barthe, and D. Stefan, 'SoK: Practical Foundations for Spectre Defenses,' in S&P . IEEE, 2022. - [47] M. F. A. Kadir, J. K. Wong, F. Ab Wahab, A. F. A. A. Bharun, M. A. Mohamed, and A. H. Zakaria, 'Retpoline technique for mitigating spectre attack,' in 2019 6th International Conference on Electrical and Electronics Engineering (ICEEE) . IEEE, 2019, pp. 96-101. - [48] A. Gonzalez, B. Korpan, J. Zhao, E. Younis, and K. Asanovi´ c, 'Replicating and Mitigating Spectre Attacks on an Open Source RISC-V Microarchitecture,' in CARRV , 2019. - [49] C. Li and J.-L. Gaudiot, 'Detecting spectre attacks using hardware performance counters,' IEEE Transactions on Computers , 2022. - [50] G. Wang, S. Chattopadhyay, A. K. Biswas, T. Mitra, and A. Roychoudhury, 'KLEESpectre: Detecting information leakage through speculative cache attacks via symbolic execution,' TOSEM , 2020. - [51] M. Guarnieri, B. K¨ opf, J. Morales, J. Reineke, and A. Sanchez, 'Spectector: Principled Detection of Speculative Information Flows,' in S&P . IEEE, 2020. ## APPENDIX ## Appendix A: Victim The victim code is shown in Fig. 4, as same as the POC in Meltdown [4]. We run it parallelly with the attacker in the same physical core but a different logical core for a higher reading rate. The victim will try to keep the secret string cached in the cache line. ``` char *strings[] = {SECRET_STR}; while (1) { // keep string cached for better results volatile size_t dummy = 0, i; for (i = 0; i < len; i++) { dummy += secret[i]; } sched_yield(); } ``` Listing 4: Code Snippet for victim.

Rendering Paper...