2111.08404v1

Model: gemini-2.0-flash

## Practical Timing Side Channel Attacks on Memory Compression Martin Schwarzl Graz University of Technology martin.schwarzl@iaik.tugraz.at Pietro Borrello Sapienza University of Rome borrello@diag.uniroma1.it Michael Schwarz CISPA Helmholtz Center for Information Security michael.schwarz@cispa.saarland Abstract -Compression algorithms are widely used as they save memory without losing data. However, elimination of redundant symbols and sequences in data leads to a compression side channel. So far, compression attacks have only focused on the compression-ratio side channel, i.e. , the size of compressed data, and largely targeted HTTP traffic and website content. In this paper, we present the first memory compression attacks exploiting timing side channels in compression algorithms, targeting a broad set of applications using compression. Our work systematically analyzes different compression algorithms and demonstrates timing leakage in each. We present Comprezzor, an evolutionary fuzzer which finds memory layouts that lead to amplified latency differences for decompression and therefore enable remote attacks. We demonstrate a remote covert channel exploiting small local timing differences transmitting on average 643 . 25bit ♥ h over 14 hops over the internet. We also demonstrate memory compression attacks that can leak secrets bytewise as well as in dictionary attacks in three different case studies. First, we show that an attacker can disclose secrets co-located and compressed with attacker data in PHP applications using Memcached. Second, we present an attack that leaks database records from PostgreSQL, managed by a Python-Flask application, over the internet. Third, we demonstrate an attack that leaks secrets from transparently compressed pages with ZRAM, the memory compression module in Linux. We conclude that memory-compression attacks are a practical threat. ## I. INTRODUCTION Data compression plays a vital role for performance and efficiency. For example, compression leads to smaller data footprints, allowing more data to be stored in memory. Memory accesses are typically faster than retrieving data from slower mediums such as hard disks or solid-state disks. As a result, both Microsoft [63] and Apple [3] rely on memory compression in their operating systems (OSs) to better utilize memory. Similarly, memory compression is also used in databases [46] or keyvalue stores [45]. Compression can also increase performance when storing or transferring data on slow mediums. Hence, data compression is also widely used for HTTP traffic [41], [15] and file-system compression [6]. While compression has many advantages, it is problematic in scenarios where sensitive data is compressed [49], [20], [4], [58], [57], [32]. Attacks such as CRIME [49], BREACH [20], Gururaj Saileshwar Georgia Tech Graz University of Technology gururaj.s@gatech.edu Hanna M¨ uller Graz University of Technology h.mueller@student.tugraz.at Daniel Gruss Graz University of Technology daniel.gruss@iaik.tugraz.at TIME [4], or HEIST [58], exploit the combination of encryption and compression in TLS-encrypted HTTP traffic. Karaskostas et al. [32] extended BREACH to commonly used ciphers, such as AES. These attacks demonstrated that an attacker could learn secrets when they are compressed together with attacker-controlled input. However, all these attacks focused on web traffic and only exploited differences in the compressed size of data. The size of compressed data is either accessed directly [49], e.g., in a man-in-the-middle (MITM) attack or indirectly by observing the impact of the compressed size on the transmission time [4]. Although attacks exploiting the compression ratio were first described by Kelsey et al. [33] in 2002, most attacks thus far have focused on compressed web traffic. Surprisingly, security implications of compression in other settings, such as virtual memory and file systems, have not been studied thoroughly. Also, compression fundamentally provides a reduction in size of data at the expense of additional time required for compression and decompression. However, so far, only the result of the compression, i.e. , the compressed size, has been exploited to leak data but not the time consumed by the process of compression or decompression itself. In this paper, we demonstrate that the decompression time also leaks information about the compressed data. Instead of focussing on the final compressed data, we measure the time it takes to decompress data. Hence, we do not need to observe the transport of the compressed data. Decompression timing can be measured in several scenarios, e.g., a simple read access to the data can suffice for compressed virtual memory or file systems. Moreover, decompression timings can be observed remotely, even when the compressed data never leaves the victim system. We systematically analyzed six compression algorithms, including widely-used algorithms such as DEFLATE (in zlib), PGLZ (in PostgreSQL), and zstd (by Facebook). We observe that the decompression time not only correlates with the entropy of the uncompressed data, but also with various other aspects, such as the relative position or alignment of compressible data. In general, these timing differences arise due to the design of the compression algorithm and its implementation. To explore these parameters in an automated fashion, we introduce Comprezzor, an evolutional fuzzer. In line with previous attacks on compression, Comprezzor assumes that parts of the compressed data are controlled by an attacker. Based on this assumption and a target secret, Comprezzor searches for a memory layout that maximizes the timing differences for decompression timing attacks. As a result, Comprezzor discovers a memory layout for compression algorithms which leads to decompression timing attacks with high-latency differences up to three orders of magnitude higher than manually discovered layouts. Based on the results of Comprezzor, we present three case studies demonstrating that these timing differences can be exploited in realistic scenarios to leak sensitive data. We demonstrate that these timing differences can be exploited even in remote attacks on an in-memory database system without executing code on the victim machine and without observing the victim's network traffic. Hence, our case studies show that compressing sensitive data poses a security risk in any scenario using compression and not just for web traffic. In the first case study, we build a remote covert channel which works across 14 hops on the internet, abusing the memory compression of Memcached, an in-memory object caching system. By measuring the execution time of a public PHP API of an application using Memcached, we can covertly transfer data between two computers at an average transmission rate of 643 . 25 bit ♥ h with an error rate of only 0 . 93 % . In a similar setup, we also demonstrate the capability of leaking a secret bytewise co-located with data compressed by a PHP application into Memcached, in 31 . 95 min ( n 20 , σ 8 . 48% ) over the internet. Using a dictionary with 100 guesses, we leak a 6 B -secret in 14 . 99 min . In the second case study, we exploit the transparent database compression of PostgreSQL. Our exploit leaks database records from an inaccessible database table of a remote web app. We show that as long as an attacker can co-locate data with a secret, it is possible for the attacker to influence and observe decompression times when accessing the data, thus, leaking secret data. Such a scenario might arise if structured data, i.e. , JSON documents, are stored with attack-controlled fields into a single cell. Using an attack that leaks a secret bytewise, we leak a 6 B -secret in 7 . 85 min . Our dictionary attack runs in 8 . 28 min on average over the internet for 100 guesses. In the third case study, we exploit ZRAM, the memory compression module in Linux. We demonstrate that even if the application itself or the file system does not compress data, the OS's memory compression can transparently introduce timing side channels. We demonstrate that a dictionary attack with 100 guesses on ZRAM decompression can leak a 6 B -secret co-located with attacker data in a page within 2 . 25 min on average. Our fuzzer shows that most if not all compression algorithms are susceptible to timing side channels when observing the decompression of data. With our case studies, we demonstrate that data leakage is not only a concern for data in transit but also for data at rest. With this work, we seek to highlight the importance of evaluating the trade-off between compressing data, and leaking side-channel information about the compressed data, for any system adopting compression. Contributions. The main contributions of this work are: Fig. 1: DEFLATE algorithm has two parts: LZ77 part to compress sequences and a Huffman part to compress symbols. <details> <summary>Image 1 Details</summary> ![775a9351](/v1/image/775a93518dbfc3346f88533a6acb1f917064e8747e3917dcf2f2203e4c3db432) ### Visual Description ## Diagram: Deflate and Inflate Data Compression ### Overview The image illustrates the processes of data compression (deflate) and decompression (inflate) using LZ77 and Huffman encoding/decoding. It shows the flow of data through these processes, highlighting the order of encoding and decoding steps. ### Components/Axes * **Data:** Represents the original, uncompressed data. * **Compressed Data:** Represents the data after compression. * **LZ77 Encoding/Decoding:** Indicates the LZ77 encoding or decoding process. The blocks are colored light red. * **Huffman Encoding/Decoding:** Indicates the Huffman encoding or decoding process. The blocks are colored light blue. * **Deflate:** The overall compression process, enclosed in a dotted box. * **Inflate:** The overall decompression process, enclosed in a dotted box. * **Arrows:** Indicate the direction of data flow. ### Detailed Analysis **Deflate (Compression):** 1. **Input:** Data enters the Deflate process. 2. **LZ77 Encoding:** The data first undergoes LZ77 encoding. 3. **Huffman Encoding:** The output of LZ77 encoding is then processed by Huffman encoding. 4. **Output:** The result is Compressed Data. **Inflate (Decompression):** 1. **Input:** Compressed Data enters the Inflate process. 2. **Huffman Decoding:** The compressed data first undergoes Huffman decoding. 3. **LZ77 Decoding:** The output of Huffman decoding is then processed by LZ77 decoding. 4. **Output:** The result is Data, which should be the same as the original data before compression. ### Key Observations * The Deflate process involves LZ77 encoding followed by Huffman encoding. * The Inflate process involves Huffman decoding followed by LZ77 decoding, reversing the order of the encoding process. * The diagram clearly shows the flow of data from uncompressed to compressed and back to uncompressed. ### Interpretation The diagram illustrates a common data compression technique where two algorithms, LZ77 and Huffman coding, are used in sequence. The "Deflate" process compresses data by first using LZ77 to remove redundancy by replacing repeated sequences with references to earlier occurrences of the same sequence. Then, Huffman coding is applied to further compress the data by assigning shorter codes to more frequent symbols. The "Inflate" process reverses these steps to decompress the data back to its original form. The order of operations is crucial for successful compression and decompression. The diagram highlights the relationship between these two processes and the flow of data through them. </details> - 1) We present a systematic analysis of timing leakage for several lossless data-compression algorithms. - 2) We demonstrate an evolutional fuzzer to automatically search for memory layouts causing large timing differences for timing side channels on memory compression. - 3) We show a remote covert channel exploiting the in-memory compression of PHP-Memcached. - 4) We demonstrate that compression-based timing side channels can be introduced by compression in applications, databases, or the system's memory compression. - 5) We leak secrets from Memcached, PostgreSQL, and ZRAM within minutes. Outline. In Section II, we provide background. In Section III, we present attack model and building blocks. In Section IV, we systematically analyze compression algorithms. In Section V, we present Comprezzor. In Section VI, we demonstrate local and remote attacks exploiting decompression timing. We discuss mitigations in Section VII and conclude in Section VIII. We responsibly disclosed our findings to the developers and will open-source our fuzzing tool and attacks on Github. ## II. BACKGROUND AND RELATED WORK In this section, we provide background on data compression algorithms, prior side-channel attacks on data compression, and the use of fuzzing to discover side channels. ## A. Data Compression Algorithms Lossless compression reduces the size of data without losing information. One of the most popular algorithms is the DEFLATE compression algorithm, which is used in gzip (zlib). The DEFLATE compression algorithm [12] consists of two main parts, LZ77 and Huffman encoding and decoding, as shown in Figure 1. The Lempel-Ziv (LZ77) part scans for the longest repeating sequence within a sliding window and replaces repeated sequences with a reference to the first occurrence [8]. This reference stores distance and length of the occurrence. The Huffman -coding part tries to reduce the redundancy of symbols. When compressing data, DEFLATE first performs LZ77 encoding and Huffman encoding [8]. When decompressing data (inflate), they are performed in reverse order. The algorithm provides different compression levels to optimize for compression speed or compression ratio. The smallest possible sequence has a length of 3 B [12]. Other algorithms provide different design points for compressibility and speed. Zstd, designed by Facebook [10] for modern CPUs, improves both compression ratio and speed, and is used for compression in file systems (e.g., btrfs, squashfs) and databases (e.g., AWS Redshift, RocksDB). LZ4 and LZO are other algorithms used in file systems, which are optimized for compression and decompression speed. LZ4, in particular, gains its performance by using a sequence compression stage (LZ77) without the symbol encoding stage (Huffman) like in DEFLATE. FastLZ, similar to LZ4, is a fast compression algorithm implementing LZ77. PGLZ is a fast LZ-family compression algorithm used in PostgreSQL for varying-length data in the database [46]. ## B. Prior Data Compression Attacks In 2002, Kelsey [33] first showed that any compression algorithm is susceptible to information leakage based on the compression-ratio side channel. Duong and Rizzo [49] applied this idea to steal web cookies with the CRIME attack by exploiting TLS compression. In the CRIME attack, the attacker adds additional sequences in the HTTP request, which act as guesses for possible cookies values, and observes the request packet length, i.e. , the compression ratio of the HTTP header injected by the browser. If the guess is correct, the LZ77-part in gzip compresses the sequence, making the compression ratio higher, thus allowing the secret to be discovered. To perform CRIME, the attacker needs to be able to spy on the packet length, and the secret needs to have a known prefix such as sessionid= or cookie= . To mitigate CRIME, TLS-level compression was disabled for requests [4], [20]. The BREACH attack [20] revived the CRIME attack by attacking HTTP responses instead of requests and leaking secrets in the HTTP responses such as cross-site-request-forgery tokens. The TIME attack [4] uses the time of a response as a proxy for the compression ratio, as it can be measured even via JavaScript. To reliably amplify the signal, the attacker chooses the size of the payload such that additional bytes, due to changes in compressibility, cross a boundary and cause significantly higher delays in the round-trip time (RTT). TIME exploits the compression ratio to amplify timing differences via TCP windows and does not exploit timing differences in the underlying compression algorithm itself. A malicious site can perform a TIME attack on their visitors to break same-origin policy or leak data from sites that reflect user input in the response, such as a search engine [4]. Vanhoef and Van Goethem [58] showed with HEIST that HTTP/2 features can also be used to determine the size of crossorigin responses and to exploit BREACH using the information. Van Goethem et al. [57] similarly showed that compression can be exploited to determine the exact size of any resource in browsers. Karaskostas and Zindros[32] presented Rupture, extending BREACH attacks to web apps using block ciphers such as AES. Tsai et al. [55] demonstrated cache timing attacks on compressed caches, which can leak a secret key in under 10 ms . Common Theme. All of the prior attacks primarily exploit the compression-ratio side channel. However, the time taken by the underlying compression algorithm for compression or decompression is not analyzed or exploited as side channels. Additionally, these attacks largely target the HTTP traffic and website content, and do not focus on the broader applications Fig. 2: Overview of a memory compression attack exploiting a timing side channel. <details> <summary>Image 2 Details</summary> ![b33ade50](/v1/image/b33ade500d7300ea6ccb3145e3ed83751f86b3a473c1d3fddf9f3f6a583d7ec2) ### Visual Description ## Diagram: Data Compression Attack ### Overview The image illustrates a data compression attack targeting a victim's RAM/Disk page. The diagram shows the steps involved, starting with the attacker sending data to the victim, compressing secret data co-located with attacker data, requesting data, decompressing data, and calculating the round-trip time. ### Components/Axes * **Title:** Victim's RAM/Disk Page * **Attacker:** A gray box on the left side of the diagram. * **Victim:** A representation of a computer screen with a key icon in the bottom-right corner, located on the right side of the diagram. * **Memory Block:** A stacked block representing the victim's RAM/Disk page, consisting of two sections: * **SECRET (victim data):** Top section, colored light red. * **XYZ...GUESS (attacker data):** Bottom section, colored light gray. * **Steps:** Numbered steps indicating the flow of the attack. ### Detailed Analysis 1. **Send data to a victim application/server:** A horizontal arrow points from the "Attacker" to the "Victim". 2. **Compress secret data co-located with attacker data:** A curved arrow points from the "Attacker" to the memory block, specifically targeting both the "SECRET" and "XYZ...GUESS" sections. 3. **Request data (Save timestamp):** A horizontal arrow points from the "Victim" to the "Attacker". 4. **Decompress data:** A vertical arrow points upwards from the memory block. 5. **Calculate Round-Trip Time (RTT):** A horizontal arrow points from the "Attacker" to the "Victim". ### Key Observations * The attacker manipulates the victim's memory by co-locating their data with the victim's secret data. * The attack involves compressing and decompressing data, likely to exploit vulnerabilities in the compression algorithm or the way the data is handled. * The round-trip time is calculated, suggesting that timing analysis is part of the attack strategy. ### Interpretation The diagram depicts a sophisticated attack where the attacker attempts to extract secret information from the victim's memory by manipulating data compression. The attacker first sends data to the victim, then influences the compression process by co-locating their data with the victim's secret data. By requesting the data and measuring the round-trip time, the attacker can potentially infer information about the compressed data and, ultimately, the secret data itself. This type of attack exploits vulnerabilities related to data compression and timing analysis to compromise the victim's system. </details> of compression such as memory-compression, databases, file systems, and others, that we target in this paper. ## C. Fuzzing to Discover Side Channels Historically, fuzzing has been used to discover bugs in applications [64], [16]. Typically, it involves feedback based on novelty search, executing inputs, and collecting ones that cover new program paths in the hope of triggering bugs. Other fuzzing proposals use genetic algorithms to direct input generation towards interesting paths [48], [54]. Recently fuzzing has also been used to discover side channels both in software and in the microarchitecture [59], [21], [40], [18]. ct-fuzz [27] used fuzzing to discover timing side channels in cryptographic implementations. Nilizadeh et al. [42] used differential fuzzing to detect compression-ratio side channels that enable the CRIME attack. In this work, we build on these and use fuzzing to discover timing channels in compression algorithms. ## III. HIGH-LEVEL OVERVIEW In this section, we discuss the high-level overview of memory compression attacks and the attack model. ## A. Attack Model & Attack Overview Most prior attacks discussed in Section II-B focused on the compression ratio side channel. Even the TIME attack and its variants by Vanhoef and Van Goethem et al. [58], [57] only exploited timing differences due to the TCP protocol. None of these exploited or analyzed timing differences due to the compression algorithm itself, which is the focus of our attack. We assume that the attacker can co-locate arbitrary data with secret data. This co-location can be given, e.g., via a memory storage API like Memcached or a shared database between the attacker and the victim. Once the attacker data is compressed with the secret, the attacker only needs to measure the latency of a subsequent access to its data. Furthermore, we assume no software vulnerabilities i.e. , memory corruption vulnerabilities in the application it self. Figure 2 illustrates an overview of a memory compression attack in five steps. The victim application can be a web server with a database or software cache, or a filesystem that compresses stored files. First , the attacker sends its data to be stored to the victim's application. Second , the victim application compresses the attacker-controlled data, together with some co-located secret data, and stores the compressed data. The attacker-controlled data contains a partial guess of the co-located victim's data SECRET or, in the case where a prefix is known, prefix=SECRET . The guess can be performed bytewise to reduce the guessing entropy. If the partial guess is correct, i.e. , SECR , the compressed data not only has a higher compression ratio, but it also influences the decompression time. Third , after the compression happened, the attacker requests the content of the stored data again and takes a timestamp. Fourth , the victim application decompresses the input data and responds with the requested data. Fifth , the attacker takes another timestamp when the application responds and computes the RTT as the difference between the two timestamps. Based on the RTT, which depends on the decompression latency of the algorithm, the attacker infers whether the guess was correct or not and thus infers the secret data. Thus, the attack relies on the timing differences of the compression algorithm itself, which we characterize next. ## IV. SYSTEMATIC STUDY: COMPRESSION ALGORITHMS In this section, we provide a systematic analysis of timing leakage in compression algorithms. We choose six popular compression algorithms (zlib, zstd, LZ4, LZO, PGLZ, and FastLZ), and evaluate the compression and decompression times based on the entropy of the input data. Zlib (which implements the DEFLATE algorithm) is the most popular as a standard for compressing files and is used in gzip. Zstd is Facebook's alternative to Zlib. PGLZ is used in the popular relational database management system PostgreSQL. LZ4, FastLZ, and LZO were built to increase compression speeds. For each of the algorithms, we observe timing differences in the range of hundreds to thousands of nanoseconds based on the content of the input data ( 4 kB -page). ## A. Experimental Setup We conducted the experiments on an Intel i7-6700K (Ubuntu 20.04, kernel 5.4.0) with a fixed frequency of 4 GHz . We evaluate the latency of each compression algorithm with three different input values, each 4 kB in size. The first input is the same byte repeated 4096 times, which should be fully compressible . The second input is partly compressible and a hybrid of two other inputs: half random bytes and half compressible repeated bytes. The third input consists of random bytes which are theoretically incompressible . With these, we show that compression algorithms can have different timings dependent on the compressibility of the input. ## B. Timing Differences for Different Inputs For each algorithm and input, we measure the decompression and compression time over 100 000 repetitions and compute the mean values and standard deviations. Decompression. Table I lists the decompression latencies for all evaluated compression algorithms. Depending on the entropy of the input data, there is considerable variation in the decompression time. All algorithms incur a higher latency for decompressing a fully compressible page compared to an incompressible page, leading to a timing difference of few hundred to few thousand nanoseconds for different TABLE I: Different compression algorithms yield distinguishable timing differences when decompressing content with a different entropy ( n 100000 ). | Algorithm | Fully Compressible (ns) | Partially Compressible (ns) | Incompressible (ns) | |-------------|---------------------------------|----------------------------------|---------------------------------| | FastLZ | 7257 . 88 ( 0 . 23% ) | 4264 . 56 ( 2 . 27% ) | 1155 . 57 ( 0 . 92% ) | | LZ4 | 605 . 79 ( 1 . 02% ) | 218 . 68 ( 1 . 76% ) | 107 . 90 ( 2 . 49% ) | | LZO | 2115 . 65 ( 2 . 05% ) | 1220 . 07 ( 3 . 64% ) | 309 . 44 ( 6 . 27% ) | | PGLZ | 813 . 75 ( 0 . 71% ) | 5340 . 47 ( 0 . 38% ) | - | | zlib | 7016 . 02 ( 0 . 33% ) | 13212 . 53 ( 0 . 35% ) | 1640 . 09 ( 1 . 51% ) | | zstd | 941 . 05 ( 0 . 94% ) | 772 . 55 ( 0 . 77% ) | 370 . 59 ( 2 . 87% ) | algorithms. This is because, for incompressible data, algorithms can augment the raw data with additional metadata to identify such cases and perform simple memory copy operations to 'decompress' the data, as is the case for zlib where the decompression for an incompressible page is 5375 . 93 ns faster than a fully compressible page. For decompression of partially compressible pages, some algorithms (FastLZ, LZ4, LZO, zstd) have lower latency than fully compressible pages, while others (zlib and PGLZ) have a higher latency than fully compressible pages. This shows the existence of even algorithm-specific variations in timings. PGLZ does not create compressible memory in the case of an incompressible input, and hence we do not measure its latency for this input. Compression. For compression, we observed a trend in the other direction (Table IV in Appendix A lists compression latencies for different algorithms). For a fully compressible page, there are also latencies between the three different inputs, which are clearly distinguishable in the order of multiple hundreds to thousands of nanoseconds. Thus, timing side channels from compression might also be used to exploit compression of attacker-controlled memory co-located with secret memory. However, attacks using the compression side channel might be harder to perform in practice as the compression of data might be performed in a separate task (in the background), and the latency might not be easily observable for a user. Hence, our work focuses on attacks exploiting the decompression timing side channel. Handling of Corner Cases. For incompressible pages, the 'compressed' data can be larger than the original size with the additional compression metadata. Additionally, it is slower to access after compression than raw uncompressed data. Hence, this corner-case with incompressible data may be handled in an implementation-specific manner, which can itself lead to additional side channels. For example, a threshold for the compression ratio can decide when a page is stored in a raw format or in a compressed state, like in MemcachedPHP [45]. Alternatively, PGLZ, the algorithm used in PostgreSQL database, which computes the maximum acceptable output size for input by checking the input size and the strategy compression rate, could simply fail to compress inputs in such corner cases. In Section VI, we show how real-world applications like Memcached, PostgreSQL, and ZRAM deal with such corner cases and demonstrate attacks on each of them. ## C. Leaking secrets via timing side channel Thus far, we analyzed timing differences for decompressing different inputs, which in itself is not a security issue. In this section, we demonstrate Decomp+Time to leak secrets from compressed pages using these timing differences. We focus on sequence compression, i.e. , LZ77 in DEFLATE. 1) Building Blocks for Decomp+Time: Decomp+Time consists of three building blocks: sequence compression that we use to modulate the compressibility of an input, colocation of attacker data and secrets, and timing variation for decompression depending on the change in compressibility of the input. Sequence compression: Sequence compression i.e. , LZ77 tries to reduce the redundancy of repeated sequences in an input by replacing each occurrence with a pointer to the first occurrence. This results in a higher compression ratio if redundant sequences are present in the input and a lower ratio if no such sequences are present. This compressibility side channel can leak information about the compressed data. Co-location of attacker data and secrets: If the attacker can control a part of data that is compressed with a secret, as described in Figure 2, then the attacker can place a guess about the secret and place it co-located with the secret to exploit sequence compression. If the compression ratio increases, the attacker can infer if the guess matches the secret or not. While the CRIME attack [49] previously used a similar set up and observed the compressed size of HTTP requests to steal secrets like HTTP cookies, we target a more general attack and do not require observability of compressed sizes. Timing Variation in Decompression: We infer the change in compressibility via its influence on decompression timing. We observe that even sequence compression can cause variation in the decompression timing based on compressibility of inputs (for all algorithms in Section IV-B). If the sequence compression reduces redundant symbols in the input and increases the compression ratio, we observe faster decompression due to fewer symbols. Otherwise, with a lower compression ratio and more number of symbols, decompression is slower. Hence, the attacker can infer the compressibility changes for different guesses by observing differences in decompression time due to sequence compression. For a correct guess, the guess and the secret are compressed together and the decompression time is faster due to fewer symbols. Otherwise, it is slower for incorrect guesses with more symbols. 2) Launching the Decomp+Time: Using the building blocks described above, we setup the attack with an artificial victim program that has a 6 B -secret string ( SECRET ) embedded in a 4 kB -page. The page also contains attacker-controlled data that is compressed together with the secret, like the scenario shown in Figure 2. The attacker can update its own data in place, allowing it to make multiple guesses. The attacker can also read this data, which triggers a decompression of the page and allows the attacker to measure the decompression time. A correct guess that matches with the secret results in faster decompression. We perform the attack on the zlib library (1.2.11) and use 8 different guesses, including the correct guess. For each guess, a single string is placed 512 B away from the secret value; other data in the page is initialized with dummy values (repeated number sequence from 0 to 16). To measure the execution time, we use the rdtsc instruction. Fig. 3: Decompression time with Decomp+Time for a dictionary attack. There is a clear threshold (line) separating the correct from wrong secrets. <details> <summary>Image 3 Details</summary> ![3ff34325](/v1/image/3ff343251572c195abc2931e5d5685235fbb6f04ab4bf9f0c7512e5a9811e584) ### Visual Description ## Timing Attack Chart: Guess Timing Analysis ### Overview The image is a scatter plot showing the timing (in nanoseconds) for different "Guess" values. The x-axis represents the "Guess" values, and the y-axis represents the "Timing [ns]". A horizontal red line is present at 3950 ns. Most data points are blue, except for "PYTHON" which is red. ### Components/Axes * **X-axis:** "Guess" - Categorical values: FOOBAR, SECRET, PYTHON, 123456, COOKIE, SOMEDA, ADMINI, NOPEΝΟ * **Y-axis:** "Timing [ns]" - Numerical scale with markers at 3950 and 4000. * **Horizontal Line:** A red line at Timing = 3950 ns. * **Data Points:** Blue circles, except for PYTHON which is a red circle. ### Detailed Analysis * **FOOBAR:** Timing is approximately 4030 ns (blue). * **SECRET:** Timing is approximately 4030 ns (blue). * **PYTHON:** Timing is approximately 3930 ns (red). * **123456:** Timing is approximately 4020 ns (blue). * **COOKIE:** Timing is approximately 4000 ns (blue). * **SOMEDA:** Timing is approximately 4010 ns (blue). * **ADMINI:** Timing is approximately 4030 ns (blue). * **NOPEΝΟ:** Timing is approximately 3980 ns (blue). ### Key Observations * The timing for "PYTHON" is significantly lower than the other "Guess" values. It is also the only data point colored red. * All other "Guess" values have timings clustered around 4000-4030 ns. * The red horizontal line at 3950 ns serves as a visual reference point. ### Interpretation The chart likely represents the results of a timing attack. The "Guess" values are attempts to guess a secret. The timing of each guess is measured. The significantly lower timing for "PYTHON" suggests that this guess might be correct or have a different execution path, making it distinguishable from the other guesses. The red color highlights this anomaly. The horizontal line at 3950 ns could represent a threshold or a baseline timing value. The data suggests that "PYTHON" is a potential vulnerability. </details> Fig. 4: Leaking the secret sixth offset character-wise. Still, the correct guess is distinguishable from the wrong guesses. A clear threshold can be used to separate the correct guess from the incorrect guesses. <details> <summary>Image 4 Details</summary> ![77567ac2](/v1/image/77567ac2c84dce012b0e91ec21514f8dd4fa4771aa5897ae5d912f5d2f061a5b) ### Visual Description ## Scatter Plot: Timing vs. Guess ### Overview The image is a scatter plot showing the timing (in nanoseconds) for different "guesses." The x-axis represents the guesses, and the y-axis represents the timing. A horizontal red line is present at approximately 3528 ns. Most guesses have a timing around 3550 ns, except for one guess labeled "SECRET" which has a significantly lower timing. ### Components/Axes * **X-axis:** "Guess" with categories: SECRES, SECRE1, SECRET, SECRE2, SECRE5, SECRE7, SECREP, SECREZ. * **Y-axis:** "Timing [ns]" with scale from 3520 to 3560 ns. Axis markers are present at 3520, 3540, and 3560 ns. * **Horizontal Line:** A red horizontal line is present at approximately 3528 ns. ### Detailed Analysis * **SECRES:** Timing is approximately 3546 ns (blue dot). * **SECRE1:** Timing is approximately 3559 ns (blue dot). * **SECRET:** Timing is approximately 3512 ns (red dot). * **SECRE2:** Timing is approximately 3557 ns (blue dot). * **SECRE5:** Timing is approximately 3551 ns (blue dot). * **SECRE7:** Timing is approximately 3557 ns (blue dot). * **SECREP:** Timing is approximately 3558 ns (blue dot). * **SECREZ:** Timing is approximately 3559 ns (blue dot). ### Key Observations * The timing for the "SECRET" guess is significantly lower than all other guesses. * All other guesses have timings clustered between 3546 ns and 3559 ns. * The red horizontal line is positioned at approximately 3528 ns. ### Interpretation The scatter plot suggests that the "SECRET" guess results in a significantly faster timing compared to the other guesses. This could indicate that the "SECRET" guess is the correct one, or that it triggers a different code path that executes more quickly. The red line could represent a threshold or a target timing value. The fact that the "SECRET" guess falls below this line while all other guesses are above it further emphasizes its distinct behavior. </details> Evaluation. Our evaluation was performed on an Intel i76700K (Ubuntu 20.04, kernel 5.4.0) with a fixed frequency of 4 GHz . To get stable results, we repeat the decompression step with each guess 10 000 times and repeat the entire attack 100 times. For each guess, we take the minimum timing difference per guess and choose the global minimum timing difference to determine the correct guess. Figure 3 illustrates the minimum decompression times. With zlib, we observe that the correct guess is faster on average by 71 . 5 ns ( n 100 , σ 27 . 91% ) compared to the second-fastest guess. Our attack correctly guessed the secret in all 100 repetitions of the attack. While we used a 6 B secret, our experiment also works for smaller secrets down to a length of 4 B . We also observe the attack is faster if the secret has a known prefix with a length of 3 B or more. Bytewise leakage. If the attacker manages to guess or know the first three bytes of the secret, the subsequent bytes can even be leaked bytewise using our attack. Both CRIME and BREACH, assume a known prefix such as cookie= . Similar to CRIME and BREACH [49], [20], [32], we try to perform a bytewise attack by modifying our simple layout. We use the first 5 characters of SECRET as a prefix "SECRE" and guess the last byte with 7 different guesses. On average the latency is 28 . 37 ns ( n 100 , σ 65 . 78% ), between the secret and second fastest guess. Figure 4 illustrates the minimum decompression times for the different guesses. However, we observe an error rate of 8 % for this experiment, which might be caused by the Huffmann-decoding part in DEFLATE. While techniques like the Two-Tries method [49], [20], [32] have been proposed to overcome the effects of Huffmancoding in DEFLATE to improve the fidelity of byte-wise attacks exploiting compression ratio, we seek to explore whether bytewise leakage can be reliably performed via the timing only by amplying the timing differences. 3) Challenge Of Amplifying Timing: While the decompression timing side channels can be directly used in attacks as demonstrated, the timing differences are still quite small for practical exploits on real-world applications. For example, the timing differences we observed for the correct guess are in tens of nanoseconds, while most practical use-cases of compression such as in a Memcached server accessed over the network, or PostgreSQL database accessed from a filesystem could have access latencies of milliseconds. Amplification. To enable memory compression attacks via the network or on data stored in file systems, we need to amplify the timing difference between correct and incorrect guesses. However, it is impractical to manually identify inputs that could amplify the timing differences, as each compression algorithm has a different implementation that is often highly optimized. Moreover, various input parameters could influence the timing of decompression, such as frequency of sequences, alignments of the secret and attack-controlled data, size of the input, entropy of the input, and different compression levels provided by algorithms. As an additional factor, the underlying hardware microarchitecture might also have undesirable influences. Therefore, to automate this process in a systematic manner, we develop an evolutionary fuzzer, Comprezzor, to find input corner cases that amplify the timing difference between correct and incorrect guesses for different compression algorithms. ## V. EVOLUTIONARY COMPRESSION-TIME FUZZER Compression algorithms are highly optimized, complex and even though most of them are open-source, their internals are only partially documented. Hence, we introduce Comprezzor, an evolutionary fuzzer to discover attacker-controlled inputs for compression algorithms that maximize differences in decompression times for certain guesses enabling both bytewise leakage and dictionary attacks. Comprezzor empowers genetic algorithms to amplify decompression side channels. It treats the decompression process of a compression algorithm as an opaque box and mutates inputs to the compression while trying to maximize the output, i.e. , timing differences for decompression with different guesses. The mutation process in Comprezzor focuses on not only the entropy of the data, but also the memory layout and alignment that end up triggering optimizations and slow paths in the implementation that are not easily detectable manually. While previous approaches used fuzzing to detect timing side channels [42], [27], Comprezzor can dramatically amplify timing differences by being specialized for compression algorithms by varying parameters like the input size, layout, and entropy that affect the decompression time. The inputs discovered by Comprezzor can amplify timing differences to such an extent that they are even observable remotely. ## A. Design of Comprezzor In this section, we describe the key components of our fuzzer: Input Generation, Fitness Function, Input Mutation and Input Evolution. Input Generation. Comprezzor generates memory layouts for Decomp+Time by maximizing the timing differences on decompression of the correct guess compared to incorrect ones. Comprezzor creates layouts with sizes in the range of 1 kB to 64 kB . It uses a helper program that takes the memory layout configuration as input, builds the requested memory layout for each guess, compresses them using the target compression algorithm, and reports the observed timing differences in the decompression times among the guesses. A memory layout configuration is composed of a file to start from, the offset of the secret in the file, the offset of guesses, how often the guesses are repeated in the layout, the compression level ( i.e. , between 1 and 9 for zlib), and a modulus for entropy reduction that reduces the range of the random values. The fuzzer can be used in cases where a prefix is known and unknown. Fitness Function. The evolutionary algorithm of Comprezzor starts from a random population of candidate layouts (samples) and takes as feedback the difference in time between decompression of the generated memory containing the correct guess and the incorrect ones. Comprezzor uses the timing difference between the correct guess and the second-fastest guess as the fitness score for a candidate. The fitness function is evaluated using a helper program performing an attack on the same setup as in Section IV-A. The program performs 100 iterations per guess and reports the minimum decompression time per guess to reduce the impact of noise. This minimum decompression time is the output of the fitness function for Comprezzor. Input Mutation. Comprezzor is able to amplify timing differences thanks to its set of mutations over the samples space specifically designed for data compression algorithms. Data compression algorithms leverage input patterns and entropy to shrink the input into a compressed form. For performance reasons, their ability to search for patterns in the input is limited by different internal parameters, like lookback windows, lookahead buffers, and history table sizes [46], [8]. We designed the mutations that affect the sample generation process to focus on input characteristics that directly impact compression algorithm strategies and limitations towards corner cases. Comprezzor mutations randomize the entropy and size of the samples that are generated. This has an effect on the overall compressibility of sequences and literals in the sample [8]. Moreover, the mutator varies the number of repeated guesses and their position in the resulting sample stressing the capability of the compression algorithm to find redundant sequences over different parts of the input. This affects the sequence compression and can lead to corner cases, i.e. , where subsequent blocks to be compressed are directly marked as incompressible, as we will later show for zlib in Section V-B. All of these factors together contribute to Comprezzor's ability to amplify timing differences. Input Evolution. Comprezzor follows an evolutionary approach to generate better inputs that maximize timing differences. It generates and mutates candidate layout configurations for the attack. Each configuration is forwarded to the helper program that builds the requested layout, inserts the candidate guess, compresses the memory, and returns the decompression time. Algorithm 1 shows the pseudocode of the evolutionary algorithm used by Comprezzor. ``` ``` Algorithm 1: Comprezzor evolutionary pseudo-code TABLE II: Timing differences between correct and incorrect guesses found by Comprezzor and the corresponding runtime. | Algorithm | Max difference for correct guess (ns) | Runtime (h) | |-------------|-----------------------------------------|---------------| | PGLZ | 109233 | 2.09 | | zlib | 71514.8 | 2.46 | | zstd | 4239.25 | 1.73 | | LZ4 | 2530.5 | 1.64 | Comprezzor iterates through different generations, with each sample having a probability of survival to the new generation that depends on its fitness score, where the fitness score is the time difference between the correct guess and the nearest incorrect one. Comprezzor discards all the samples where the correct guess is not the fastest or slowest. A retention factor decides the percentage of samples selected to survive among the best ones in the old generation ( 5 % by default). The population for each new generation is initialized with the samples that survived the selection process and enhanced by random mutations of such samples, tuning evolution towards local optimal solutions. By default, 70 % of the new population is generated by mutating the best samples from the previous generation. However, to avoid trapping the fuzzer in locally optimal solutions, a percentage of completely random new samples is injected in each new generation. Comprezzor runs until the maximum number of generations is evaluated, and the best candidate layouts found are returned. ## B. Results: Fuzzing Compression Algorithms Evaluation. Our test system has an Intel i7-6700K (Ubuntu 20.04, kernel 5.4.0) with a fixed frequency of 4 GHz . We run Comprezzor on four compression algorithms: zlib (1.2.11), Facebook's Zstd (1.5.0), LZ4 (v1.9.3), and PGLZ in PostgreSQL (v12.7). Comprezzor can support new algorithms by just adding compression and decompression functions. We run Comprezzor with 50 epochs and a population of 1000 samples per epoch. We set the retention factor to 5 % , selecting the best 50 samples in each generation of 1000 samples. We randomly mutate the selected samples to generate 70 % of the children and add 25 % of completely randomly generated layouts in the new generation. The runtime of Comprezzor to finish all 50 epochs was for zlib, 2 . 46 h , for zstd, 1 . 73 h , for LZ4, 1 . 64 h , and for PGLZ, 2 . 09 h . Table II lists the maximum timing differences found for Decomp+Time on the four compression algorithms, all of which use sequence compression like LZ77. Particularly, for zlib and PGLZ, the fuzzer discovers cases with timing differences of the scale of microseconds between correct and incorrect guesses, that is, orders of magnitude higher than the manually discovered differences in Section IV. Since zlib is one of the more popular algorithms, we analyze its high latency difference in more detail below. Zlib. Using Comprezzor, we found a corner case in zlib where all incorrect guesses lead to a slow code path, and the correct guess leads to a significantly faster execution time. We ran Comprezzor with a known prefix i.e. , cookie. We observe a high timing difference of 71 514 . 75 ns between correct and incorrect guesses, which is 3 orders of magnitude higher compared to the latency difference found manually in Section IV-C. This memory layout also leads to similarly high timing differences across all compression levels of zlib. To rule out micro-architectural effects, we re-run the experiment on different systems equipped with an Intel i5-8250U, AMD Ryzen Threadripper 1920X, and Intel Xeon Silver 4208, and observe similar timing differences. On further analysis, we observe that the corner case identified by the fuzzer is due to incompressible data. The initial data in the page, from a uniform distribution, is primarily incompressible. For such incompressible blocks, DEFLATE algorithm can store them as raw data blocks, also called stored blocks [12]. Such blocks have fast decompression times as only a single memcpy operation is needed on decompression instead of the actual DEFLATE process. In this particular corner case, the fuzzer discovered the correct guess results in such an incompressible stored block which is faster, while an incorrect guess results in a partly compressible input which is slower, as we describe below. Correct Guess. On a compression for the case where the guess matches the secret, the entire guess string, i.e. , cookie=SECRET , is compressed with the secret string. All subsequent data in the input is incompressible and treated as a stored block and decompressed with a single memcpy operation, which is fast without any Huffman or LZ77 decoding. Incorrect Guess. On a compression for the case where the guess does not match the secret, only the prefix of the guess, i.e. , cookie= , is compressed with the prefix of the secret, while another longer sequence i.e. , cookie=FOOBAR leads to forming a new block. On a decompression, this block must now undergo the Huffman decoding (and LZ77), which results in several table lookups, memory accesses, and higher latency. Thus, the timing differences for the correct and incorrect guesses are amplified by the layout that Comprezzor discovered. We provide more details about this layout in Figure 14 in the Appendix C and also provide listings of the debug trace from zlib for the decompression with the correct and incorrect guesses, to illustrate the root-cause of the amplified timing differences with this layout. Takeaway: We showed that it is possible to amplify timing differences for decompression timing attacks. With Comprezzor, we presented an approach to automatically find high timing differences in compression algorithms. The high latencies can be used for stable remote attacks. <details> <summary>Image 5 Details</summary> ![b8567bf5](/v1/image/b8567bf57327d98c3e2c39782aa22770ed77d0f60f03096661680029b9d02c61) ### Visual Description ## Data Flow Diagram: Key-Value Store ### Overview The image is a data flow diagram illustrating the interaction between a sender, a key-value store, and a receiver. The diagram shows data being stored/updated from the sender to the key-value store, where it is compressed/decompressed, and then fetched by the receiver. ### Components/Axes * **Sender:** A rectangular box on the left side of the diagram. * **Key-Value Store:** A larger rectangular box in the center, labeled "Key-Value Store Compress/Decompress". The bottom portion of this box is shaded with diagonal lines. * **Receiver:** A rectangular box on the right side of the diagram, shaded with diagonal lines. * **Store/Update:** An arrow pointing from the Sender to the Key-Value Store, labeled "Store/Update". * **Fetch data:** An arrow pointing from the Key-Value Store to the Receiver, labeled "Fetch data". ### Detailed Analysis 1. **Sender:** The sender is represented as a simple rectangular box. 2. **Key-Value Store:** The key-value store is a larger rectangular box, indicating its role as a storage component. The "Compress/Decompress" label suggests that data is compressed when stored and decompressed when retrieved. The bottom shaded portion of the box likely represents the storage medium or a specific layer within the key-value store. 3. **Receiver:** The receiver is represented as a shaded rectangular box, indicating that it receives data. 4. **Data Flow:** * The "Store/Update" arrow indicates that the sender sends data to the key-value store for storage or updating existing data. * The "Fetch data" arrow indicates that the receiver retrieves data from the key-value store. ### Key Observations * The diagram illustrates a unidirectional data flow from the sender to the key-value store and then to the receiver. * The key-value store acts as an intermediary, providing storage and compression/decompression capabilities. ### Interpretation The diagram depicts a common data storage and retrieval pattern. The sender transmits data to the key-value store, which manages the data, potentially compressing it for efficient storage. The receiver then retrieves the data from the key-value store, which decompresses it before sending it to the receiver. This architecture is often used in distributed systems and cloud computing environments where data needs to be stored and accessed efficiently. The compression/decompression step is crucial for optimizing storage space and network bandwidth. </details> Fig. 5: Covert communication via a key-value store using memory compression. <details> <summary>Image 6 Details</summary> ![a52b408b](/v1/image/a52b408b72255f2d1862a5407f325027d64fb8e34fd77b82e4b9450238bb76e1) ### Visual Description ## Histogram: Response Time vs. Amount for Different Entropy Levels ### Overview The image is a histogram comparing the distribution of response times for two different entropy levels: "Lower entropy" and "Higher entropy". The x-axis represents response time in nanoseconds (ns), scaled by 10^4, and the y-axis represents the amount or frequency of occurrence. ### Components/Axes * **X-axis:** Response time [ns] * 10^4, ranging from 0.6 to 1.2 with increments of 0.1. * **Y-axis:** Amount, ranging from 0 to 600 with increments of 200. * **Legend (Top-Right):** * Blue dotted line: Lower entropy (0x42*4096) * Red solid line: Higher entropy ### Detailed Analysis * **Higher Entropy (Red):** The distribution is concentrated around 0.67 ns. The peak is approximately at an amount of 550. * **Lower Entropy (Blue):** The distribution is concentrated around 1.04 ns. The peak is approximately at an amount of 450. ### Key Observations * The response times for higher entropy are significantly lower than those for lower entropy. * The distribution for higher entropy appears to be slightly more concentrated than the distribution for lower entropy. ### Interpretation The histogram suggests that higher entropy levels are associated with faster response times. The two distinct peaks indicate that the entropy level has a clear impact on the response time distribution. The difference in response times could be due to the computational complexity or data processing requirements associated with different entropy levels. The data suggests that systems with lower entropy may experience longer processing times, potentially due to factors such as increased data redundancy or less efficient algorithms. </details> Fig. 6: Timing when decompressing a single zlib-compressed 4 kB page with a high-entropy in comparison to a page with a low-entropy. ## VI. CASE STUDIES In this section, we present three case studies showing the security impacts of the timing side channel. In Section VI-A, we present a local covert channel that allows hidden communication via the latency decompression. Furthermore, we present a remote covert channel that exploits the decompression of memory objects in a PHP application using Memcached. We demonstrate Decomp+Time on a PHP application that compresses secret data together with attacker-data to leak the secret bytewise. In Section VI-D, we leak inaccessible values from a database, exploiting the internal compression of PostgreSQL. In Section VI-E, we show that OS-based memory compression also has timing side channels that can leak secrets. ## A. Covert channel To evaluate the transmission capacity of memorycompression attacks, we evaluate the transmission rate for a covert channel, where the attacker controls both the sending and receiving end. First, we evaluate a cross-core covert channel that relies on shared memory. This is in line with state-of-theart work to measure the maximum capacity of the memory compression side channel [24], [35], [37], [62], [23], [25]. The maximum capacity poses a leakage rate limit for other attacks that use memory compression as a side channel. Our local covert channel achieves a capacity of 448 bit ♥ s ( n 100 , σ 0 . 001% ). Setup. Figure 5 illustrates the covert communication between a sender and receiver via shared memory. We create a simple key-value store that communicates via UNIX sockets. The store takes input from a client and stores it on a 4 kB -aligned page. The sender inserts a key and value into the first page to communicate with the server. The receiver inserts a small key and value as well, which should be placed on the same 4 kB . If the 4 kB -page is fully written, the key-value store compresses the whole page. Compressing full 4 kB -page separately also occurs on filesystems like BTRFS [6]. Sender and receiver agree on a time frame to send and read content. The basic idea is to communicate via the observation on zlib that memory with low entropy, e.g., 4096 times the same value, requires more time when decompressing compared to pages with a higher entropy, e.g., repeating sequence number from 0 to 255 . Figure 6 shows the histogram of latency when decompression is triggered for both cases for the key-value on an Intel i7-6700K running at 4 GHz . On average, we observe a timing difference of 3566 . 22 ns ( 14 264 . 88 cycles, n 100000 ). These two timing differences are easy to distinguish and enable covert communication. Transmission. We evaluate our cross-core covert channel by generating and sending random content from /dev/urandom through the memory compression timing side channel. The sender transmits a '1'-bit by performing a store with the page that has a higher entropy 4095 B in the key-value store. Conversely, to transmit a '0'-bit, the sender compresses a lowentropy 4 kB page. To trigger the compression, the receiver also stores a key-value pair in the page, wheres as 1 B fills the remaining page. The key-value store will now compress the full 4 kB -page, as it is fully used. The receiver performs a fetch request from the key-value store, which triggers a decompression of the 4 kB -page. To distinguish bits, the receiver measures the mean RTT of the fetch request. Evaluation. Our test machine is equipped with an Intel Core i7-6700K (Ubuntu 20.04, kernel 5.4.0), and all cores are running on a fixed CPU frequency of 4 GHz . We repeat the transmission 50 times and send per run 640 B . To reduce the error rate, the receiver fetches the receiver-controlled data 50 times and compares the average response time against the threshold. Our cross-core covert channel achieves an average transmission rate of 448 bit ♥ s ( n 100 , σ 0 . 007% ) with an error rate of 0 . 082 % ( n 100 , σ 2 . 85% ). This capacity of the unoptimized covert channel is comparable to other stateof-the-art microarchitectural cross-core covert channels that do not rely on shared memory [60], [36], [14], [44], [37], [53], [47], [59]. ## B. Remote Covert Channel We extend the scope of our covert channel to a remote covert channel. In the remote scenario, we rely on Memcached on a web server for memory compression and decompression. Memcached. Memcached is a widely used caching system for web sites [38]. Memcached is a simple key-value store that internally uses a slab allocator. A slab is a fixed unit of contiguous physical memory, which is typically assigned to a certain slab class which is typically a 1 MB region [29]. Certain programming languages such as PHP or Python offer the option to compress memory before it is stored in Memcached [45], [5]. For PHP, memory compression is enabled per default if Memcached is used [45]. PHP-Memcached has a threshold that decides at which size data is compressed, with the default value at 2000 B . Furthermore, PHP-Memcached computes the compression ratio to a compression factor and decides whether it stores the data compressed or uncompressed in Memcached. The default value for the compression factor is 1 . 3 , i.e. , 23 % of the space needs to be saved from the original data size to store it compressed [45]. Fig. 7: Distribution of HTTP response times for zlibdecompressed pages stored in Memcached on memory compression encoding a '1' and a '0'-bit. <details> <summary>Image 7 Details</summary> ![4949f006](/v1/image/4949f006be9a0cb5c4d284f26ce904cc42b292b60023e6edb6d474035de0bc58) ### Visual Description ## Chart: Response Time Distribution by Category ### Overview The image is a line chart comparing the distribution of response times for two categories, labeled "0" and "1". The x-axis represents response time in nanoseconds (ns), scaled by 10^6, and the y-axis represents the amount or frequency of each response time. ### Components/Axes * **X-axis:** Response time [ns] * 10^6. Scale ranges from 0.4 to 2.0, with increments of 0.2. * **Y-axis:** Amount. Scale ranges from 0 to 400, with increments of 200. * **Legend (Top-Right):** * Dotted Blue Line: "0" * Solid Red Line: "1" ### Detailed Analysis * **Category "0" (Dotted Blue Line):** * The line starts near 0 at x=0.4. * It rises to a peak around x=0.9, with a value of approximately 300. * It then decreases, reaching approximately 250 at x=1.0. * The line continues to decrease, reaching approximately 150 at x=1.2. * The line approaches 0 after x=1.4. * **Category "1" (Solid Red Line):** * The line starts near 0 at x=0.4. * It rises to a peak around x=0.95, with a value of approximately 380. * It then decreases, reaching approximately 240 at x=1.1. * The line continues to decrease, reaching approximately 20 at x=1.4. * The line approaches 0 after x=1.6. ### Key Observations * Both categories exhibit a unimodal distribution, with a single peak. * Category "1" has a higher peak amount (approximately 380) compared to category "0" (approximately 300). * Category "1" peaks slightly later (around 0.95) than category "0" (around 0.9). * Both categories have a similar spread, with response times concentrated between 0.6 and 1.4. ### Interpretation The chart compares the distribution of response times for two distinct categories, "0" and "1". The data suggests that category "1" generally has a higher frequency of responses around its peak compared to category "0". While both categories have similar response time ranges, the slight shift in peak and the higher peak amount for category "1" indicate a difference in the underlying processes or conditions that generate these response times. The data could be used to understand performance differences or identify potential bottlenecks in a system where these categories represent different states or operations. </details> Bypassing the Compression Factor. While the compression factor already introduces a timing side channel, we focus on scenarios where data is always compressed. This is later useful for leakage on co-located data. Intuitively, it should suffice to prepend highly-compressible data to enforce compression. However, we found that only prepending and adopting the offsets for secret repetitions, as for zlib, also influenced the corner case we found and the large timing difference. We integrate prepending of compressible pages to Comprezzor and also add the compression factor constraint to automatically discover inputs that fulfills the constraint and leads to large latencies between a correct and incorrect guess. Transmission. We use the page found by Comprezzor which triggers a significantly lower decompression time to encode a '1'-bit. Conversely, for a '0'-bit, we choose content that triggers a significantly higher decompression time. The sender places a key-value pair for each bit index at once into PHPMemcached. The receiver sends GET requests to the resource, causing decompression of the data that contains the sender content. The timing difference of the decompression is reflected in the RTT of the HTTP request. Hence, we measure the timing difference between the HTTP request being sent and the first response packet received. Evaluation. Our sender and receiver are equipped with an Intel i7-6700K (Ubuntu 20.04, kernel 5.4.0) and connected to the internet with a 10 Gb/s connection. For the web server, we use a dedicated server in the Equinix [13] cloud, 14 hops away from our network (over 1000 km physical distance) with a 10 Gbit ♥ s connection. The victim server is equipped with an Intel Xeon E3-1240 v5 (Ubuntu 20.04, kernel 5.4.0). Our server runs a PHP (version 7.4) website, which allows storing and retrieving data, backed by Memcached. We installed Memcached 1.5.22, the default version on Ubuntu 20.04. The PHP website is hosted on Nginx 1.18.0 with PHP-FPM. We perform a simple test where we perform 5000 HTTP requests to a PHP site that stores zlib-compressed memory in Memcached. Figure 7 illustrates the timing difference between a '0'-bit and a '1'-bit. The timing difference between the mean values for a '0'- and '1'-bit is 61 622 . 042 ns . Figure 7 shows that the two distributions overlap to a certain extent. We call this overlap the critical section . We implement an early-stopping mechanism for bits in the covert channel that clearly belong into one of the two regions below or above the critical section. After 25 requests, we start to classify the RTT per bit. Fig. 8: Distribution of the response time for the correct guess (S) and the incorrect guesses (0-9, A-R, T-Z) for the first byte leaked in the byte-wise leakage of the secret from PHPMemcached. Similar trends are observed for subsequent bytes leaked as shown in Appendix D. <details> <summary>Image 8 Details</summary> ![b1beec63](/v1/image/b1beec6323ad94f553f11c233b317d3adc6fed9b4edf1936ac4bfcb722cfdb3d) ### Visual Description ## Scatter Plot: Timing vs. Guess for Byte 0 ### Overview The image is a scatter plot showing the timing in nanoseconds (ns) against different "Guess" values, which represent cookie values. The plot appears to be analyzing the timing behavior for "Byte 0" based on different cookie values. Most data points are blue, but one is red. ### Components/Axes * **Title:** Byte 0 * **Y-axis:** * Label: Timing [ns] * Scale: 1 x 10^6 to 1.06 x 10^6 ns, with tick marks at 1 x 10^6, 1.02 x 10^6, 1.04 x 10^6, and 1.06 x 10^6. * **X-axis:** * Label: Guess * Categories: cookie=0, cookie=5, cookie=9, cookie=A, cookie=H, cookie=M, cookie=S, cookie=Z ### Detailed Analysis The data points are scattered across the x-axis categories. The y-axis represents timing in nanoseconds, and the x-axis represents different cookie values. * **cookie=0:** Timing values are approximately 1.01 x 10^6 ns to 1.06 x 10^6 ns. * **cookie=5:** Timing values are approximately 1.04 x 10^6 ns to 1.06 x 10^6 ns. * **cookie=9:** Timing values are approximately 1.04 x 10^6 ns to 1.05 x 10^6 ns. * **cookie=A:** Timing values are approximately 1.04 x 10^6 ns. * **cookie=H:** Timing values are approximately 1.02 x 10^6 ns to 1.05 x 10^6 ns. * **cookie=M:** Timing values are approximately 1.04 x 10^6 ns to 1.05 x 10^6 ns. * **cookie=S:** A single data point, colored red, is at approximately 1.00 x 10^6 ns. * **cookie=Z:** Timing values are approximately 1.04 x 10^6 ns to 1.06 x 10^6 ns. ### Key Observations * The timing values for most cookie values are clustered between 1.02 x 10^6 ns and 1.06 x 10^6 ns. * The "cookie=S" value has a significantly lower timing value (approximately 1.00 x 10^6 ns) compared to the other cookie values. This data point is also highlighted in red. ### Interpretation The scatter plot suggests that the "cookie=S" value results in a different timing behavior for "Byte 0" compared to other cookie values. The significantly lower timing value for "cookie=S" could indicate a successful guess or a different processing path within the system. The red color highlights this anomaly, suggesting it is of particular interest. The data suggests a potential vulnerability or exploitable behavior related to the "cookie=S" value. </details> At most, we perform 200 HTTP requests per bit. If there are still unclassified bits, we compare the amount of requests classified as '0'-bit and those classified as '1'-bit and decide on the side with more elements. We transmit a series of different messages of 8 B over the internet. Our simple remote covert channel achieves an average transmission rate of 643 . 25 bit ♥ h ( n 20 , σ 6 . 66% ) at an average error rate of 0 . 93 % . Our covert channel outperforms the one by Schwarz et al. [52] and Gruss [22],although our attack works with HTTP instead of the more lightweight UDP sockets. Other remote timing attacks usually do not evaluate their capacity with a remote covert channel [65], [30], [2], [51], [1], [2], [56]. ## C. Remote Attack on PHP-Memcached Using our building blocks to perform Decomp+Time and the remote covert channel, we perform a remote attack on PHP-Memcached to leak secret data from a server over the internet. We assume a memory layout where secret memory is co-located to attacker-controlled memory, and the overall memory region is compressed. Attack Setup. We use the same PHP-API as in Section VI-B that allows storing and retrieving data. We run the attack using the same server setup as used for the remote covert channel attacking a dedicated server in the Equinix [13] cloud 14 hops away from our network. We define a 6 B long secret ( SECRET ) with a 7 B long prefix ( cookie= ) and prepend it to the stored data of users. PHP-Memcached will compress the data before storing it in Memcached and decompress it when accessing it again. For each guess, the PHP application stores the uploaded data to a certain location in Memcached. On each data fetch, the PHP application decompresses the secret data together with the co-located attacker-controlled data and then responds onlythe attacker-controlled data. The attacker measures the RTT per guess and distinguishes the timing differences between the guesses. In the case of zlib, the guess with the fastest response time is the correct guess. We demonstrate a byte-wise leakage of an arbitrary secret and also a dictionary attack with a dictionary size of 100 guesses. Evaluation. For the byte-wise attack, we assume each byte of the secret is from 'A-Z,0-9' (36 different options). For each of the 6 bytes to be leaked, we generate a separate memory layout using Comprezzor that maximizes the latency between the guesses. We repeat the experiment 20 times. On average, our attack leaks the entire 6 B secret string in 31 . 95 min ( n Fig. 9: Distribution of the response time for the correct guess (SECRET) and the 99 other incorrect guesses in dictionary attack on PHP-Memcached. Note that we only label SECRET and 6 out of 99 incorrect guesses on the X-axis for brevity. <details> <summary>Image 9 Details</summary> ![76a3bb6b](/v1/image/76a3bb6bfcf5552e35d8b2ea299fd7c2015080c4ffb659483444f9a4329f08ba) ### Visual Description ## Scatter Plot: Timing vs. Guess ### Overview The image is a scatter plot showing the timing (in nanoseconds) for different guesses. The x-axis represents the "Guess", and the y-axis represents the "Timing [ns]". Most guesses have a timing around 1.05 * 10^6 ns, while one guess labeled "SECRET" has a significantly lower timing. ### Components/Axes * **X-axis:** "Guess" with labels: "03eBxD", "8kX92B", "IEMle7", "SECRET", "dDFqk3", "m4GWKM", "t2XtoC". * **Y-axis:** "Timing [ns]" with scale markers: 0.95 * 10^6, 1 * 10^6, 1.05 * 10^6, 1.1 * 10^6. ### Detailed Analysis * **Blue Data Points:** The blue data points represent the timing for each guess, except for the "SECRET" guess. The blue points are clustered around 1.04 * 10^6 ns to 1.12 * 10^6 ns. * **03eBxD:** Timing values range from approximately 1.03 * 10^6 ns to 1.07 * 10^6 ns. * **8kX92B:** Timing values range from approximately 1.04 * 10^6 ns to 1.06 * 10^6 ns. * **IEMle7:** Timing values range from approximately 1.04 * 10^6 ns to 1.09 * 10^6 ns. * **dDFqk3:** Timing values range from approximately 1.04 * 10^6 ns to 1.12 * 10^6 ns. * **m4GWKM:** Timing values range from approximately 1.03 * 10^6 ns to 1.12 * 10^6 ns. * **t2XtoC:** Timing values range from approximately 1.04 * 10^6 ns to 1.08 * 10^6 ns. * **Red Data Point:** The red data point represents the timing for the "SECRET" guess. It has a timing value of approximately 0.94 * 10^6 ns. ### Key Observations * The timing for the "SECRET" guess is significantly lower than the timing for other guesses. * The timing values for the other guesses are relatively consistent, ranging from approximately 1.03 * 10^6 ns to 1.12 * 10^6 ns. ### Interpretation The scatter plot suggests that the "SECRET" guess has a significantly different timing profile compared to the other guesses. This could indicate that the "SECRET" guess triggers a different code path or algorithm, resulting in a faster execution time. The consistent timing values for the other guesses suggest that they might be processed similarly. The plot highlights the "SECRET" guess as an outlier, potentially revealing information about the system's behavior when the correct secret is provided. </details> 20 , σ 8 . 48% ). Since the latencies between a correct and incorrect guess are in the microseconds range, we do not observe false positives with our approach. Figure 8 shows the median response time for each guess in the first iteration (leaking the first byte) as a representative example. It can be seen that the response time for the correct guess ( S ) for the first byte is significantly faster than the incorrect guesses. We observe similar results for the remaining bytes ( E,C,R,E,T ). Dictionary attack. While bytewise leakage is the more generic and practical approach for decompression timing attacks, there might be use cases where also a dictionary attack is applicable. For instance, if the attacker would try the top n most common usernames or passwords, which are publicly available. Another use case for dictionary attacks could be to profile if certain images are forbidden in a database used for digital rights management. We perform a dictionary attack over the network using randomly generated 100 guesses, including the correct guess. We repeat the experiment 20 times. For each run, we re-generate the guesses and randomly shuffle the order of the guesses per request iteration to not create a potential bias. On average, our attack leaks the secret string in 14 . 99 min ( n 20 , σ 16 . 27% ). To classify the correct guess, we use the same early stopping technique as was used for the covert channel. Since the timing difference is in the microseconds range, we do not observe false positives with our approach. Figure 9 illustrates the median of the response times for each guess from one iteration. Figure 15 (in Appendix D) shows the latencies for all leaked bytes. It can be clearly seen that the response time for the correct guess ( SECRET ) is significantly faster. Takeaway: We show that a PHP application using Memcached to cache blobs hosted on Nginx enables covert communication with a transmission rate of 643 . 25 bit ♥ h . If secret data is stored together with attacker-controlled data, we demonstrate a remote memory-compression attack on Memcached leaking a 6 B -secret bytewise in 31 . 95 min . We further demonstrated that the attack can also be used for dictionary attacks. ## D. Leaking Data from Compressed Database In this section, we show that an attacker can also exploit compression in databases to leak inaccessible information, i.e. , from the internal database compression of PostgreSQL. PostgreSQL Data Compression. PostgreSQL is a widespread open-source relational database system using the SQL standard. PostgreSQL maintains tuples saved on disk using a fixed page size of commonly 8 kB , storing larger fields compressed and possibly split into multiple pages. By default, variable-length fields that may produce large values, e.g., TEXT fields, are stored compressed. PostgreSQL's transparent compression is known as TOAST (The OversizedAttribute Storage Technique) and uses a fast LZ-family compression, PGLZ [46]. By default, data in a cell is actually stored compressed if such a form saves at least 25 % of the uncompressed size to avoid wasting decompression time if it is not worth it. Indeed, data stored uncompressed is accessed faster than data stored compressed since the decompression algorithm is not executed. Moreover, the decompression time depends on the compressibility of the underlying data. A potential attack scenario for structured text in a cell is where JSON documents are stored and compressed within a single cell, and the attacker controls a specific field within the document. While our focus is restricted to cell-level compression, note that compressed columnar storage [9], [39], [7], a non-default extension available in PostgreSQL, may also be vulnerable to decompression timing attacks. Attack Setup. To assess the feasibility of an attack, we use a local database server and access two differently compressed rows with a Python wrapper using the psycopg2 library. The first row contains 8192 characters of highly compressible data, while the second one 8192 characters of random incompressible data. Both rows are stored in a table as TEXT data and accessed 1000 times. The median for the number of clock cycles required to access the compressible row is 249 031 , while for the uncompressed one is 221 000 , which makes the two accesses distinguishable. On our 4 GHz CPUs, this is a timing difference of 7007 . 75 ns . We use Comprezzor to amplify these timing differences and demonstrate a bytewise leakage of a secret and also a dictionary attack. Leaking First Byte. For the bytewise leeakage of the secret, we first create a memory layout to leak the first byte using Comprezzor against a standalone version of PostgreSQL's compression library, using a similar setup as the previous Memcached attack. A key difference in the use of Comprezzor with PostgreSQL is that the helper program measuring the decompression time returns a time of 0 when the input is not compressed, i.e. , the data compressed with PGLZ does not save at least 25 % of the original size. Comprezzor found a layout that sits exactly at the corner case where a correct guess in the secret results in a compressed size that saves 25 % of the original size. Hence, a correct guess is saved compressed, while for any wrong guess, the threshold is not met, and the data is saved uncompressed. This is because the correct guess characters match the secret bytes and result in a higher compression rate. Leaking Subsequent Bytes with Secret Shifting. We observed that one good layout is enough and can be reused for bytewise leakage in PGLZ. The prefix can be shifted by one character to the left by a single character, i.e. , from 'cookie=S' to 'ookie=SE', to accommodate an additional byte for the guess. This shifting allows bytewise leakage with the same memory layout. Note, that the same approach did not work Fig. 10: Distribution of the response time for the first-byte guesses in the remote PostgreSQL attack leaking a secret bytewise, with the correct guess (S) and incorrect guesses (0-9, A-R, T-Z). Similar trends are observed for subsequent bytes leaked as shown in Appendix D. <details> <summary>Image 10 Details</summary> ![cf24ef5a](/v1/image/cf24ef5ab25877adc171318a0322c63b1053606ad256717c37e1d6c2520c71f8) ### Visual Description ## Scatter Plot: Timing vs. Guess for Byte 0 ### Overview The image is a scatter plot showing the timing in nanoseconds (ns) against different "Guess" values, which represent cookie values. The plot visualizes the timing behavior for different cookie values, with one data point highlighted in red. ### Components/Axes * **Title:** Byte 0 * **Y-axis:** Timing [ns], scaled from approximately 1.2 x 10^6 to 1.28 x 10^6. * **X-axis:** Guess, with categorical values: cookie=0, cookie=5, cookie=9, cookie=A, cookie=H, cookie=M, cookie=S, cookie=Z. * **Data Points:** Blue circles represent timing values for each cookie guess. One data point for cookie=S is highlighted in red. ### Detailed Analysis The data points are clustered around specific timing values for each cookie guess. * **cookie=0:** Timing values range from approximately 1.21 x 10^6 ns to 1.23 x 10^6 ns. * **cookie=5:** Timing values range from approximately 1.205 x 10^6 ns to 1.23 x 10^6 ns. * **cookie=9:** Timing values range from approximately 1.21 x 10^6 ns to 1.235 x 10^6 ns. * **cookie=A:** Timing values range from approximately 1.22 x 10^6 ns to 1.25 x 10^6 ns. * **cookie=H:** Timing values range from approximately 1.205 x 10^6 ns to 1.24 x 10^6 ns. * **cookie=M:** Timing values range from approximately 1.21 x 10^6 ns to 1.23 x 10^6 ns. * **cookie=S:** Timing values range from approximately 1.18 x 10^6 ns to 1.24 x 10^6 ns. The red data point for cookie=S is at approximately 1.27 x 10^6 ns. * **cookie=Z:** Timing values range from approximately 1.19 x 10^6 ns to 1.23 x 10^6 ns. ### Key Observations * The timing values for different cookie guesses are relatively consistent within each guess category, with some variation. * The red data point for "cookie=S" is significantly higher than the other timing values for that guess and the other guesses. ### Interpretation The plot suggests that the "cookie=S" guess has a significantly higher timing value (highlighted in red) compared to other guesses and the other data points for "cookie=S". This could indicate a specific condition or behavior triggered by the "cookie=S" value, potentially related to a security vulnerability or performance issue. The other cookie values show relatively consistent timing, suggesting a baseline performance level. The outlier for "cookie=S" warrants further investigation. </details> Fig. 11: Distribution of the response time in the remote PostgreSQL dictionary attack, for the correct guess (SECRET) and the 99 other incorrect guesses (we only label 4 out of 99 incorrect guesses and SECRET on the X-axis for brevity). <details> <summary>Image 11 Details</summary> ![06ed9786](/v1/image/06ed9786bd20180879a1861f01f57934c106acb0afa089df031fdb284dddf9d6) ### Visual Description ## Scatter Plot: Timing vs. Guess ### Overview The image is a scatter plot showing the timing (in milliseconds) versus different guesses. Most guesses result in similar timing values, clustered around 1.2 * 10^6 ms, but one guess labeled "SECRET" has a significantly higher timing value. ### Components/Axes * **X-axis:** "Guess" with labels "02OdDt", "I6mfXm", "SECRET", "ibQ0tW", "youFg0". * **Y-axis:** "Timing [ms]" with a multiplier "·10^6". The y-axis ranges from approximately 1.18 * 10^6 ms to 1.32 * 10^6 ms. The y-axis markers are at 1.2 * 10^6, 1.25 * 10^6, and 1.3 * 10^6. * **Data Points:** Blue dots represent the timing for each guess. The "SECRET" guess is represented by a red dot. ### Detailed Analysis * The majority of the blue data points are clustered between 1.19 * 10^6 ms and 1.24 * 10^6 ms. * The red data point, corresponding to the "SECRET" guess, is located at approximately 1.31 * 10^6 ms. * The guesses "02OdDt", "I6mfXm", "ibQ0tW", and "youFg0" all have similar timing values. ### Key Observations * The "SECRET" guess has a significantly higher timing value compared to the other guesses. * The timing values for the other guesses are relatively consistent. ### Interpretation The scatter plot suggests that the "SECRET" guess takes significantly longer to process compared to the other guesses. This could indicate a vulnerability or a deliberate design feature where the system takes longer to process the correct guess. The consistent timing values for the other guesses suggest that they are processed in a similar manner. The plot highlights a potential timing attack vulnerability, where an attacker could determine the correct guess by observing the processing time. </details> on DEFLATE. We leave it as future work, to mount such a shifting approach to other compression algorithms. Evaluation. We perform a remote decompression timing attack against a Flask [17] web server that uses a PostgreSQL database to store user-provided data. We used the same Equinix cloudserver setup as used for the Memcached remote attack (cf. Section VI-C). The server runs Python 3.8.5 with Flask version 2.0.1 and PostgreSQL version 12.7. We define a 6 B long-secret with a 7 B prefix that the server colocates with attacker-provided data in a cell of the database through POST requests. The secret is never shown to the user. Using the layout found by Comprezzor, the entry in the database is stored compressed only when the secret matches in the provided data. A second endpoint in the server accesses the database to read the data without returning the secret to the attacker. The attack leaks bytewise over the internet by guessing again in the character set 'A-Z,0-9' ( 36 possibilities per character), including the correct one. We repeat the attack 20 times. The average time required to determine the guess with the highest latency (that the server had to decompress before returning) is 17 . 84 min ( 2 . 97 min ♥ B ) ( n 20 , σ 2 . 3% ). Figure 10 illustrates the median of the response times for a single guess in an example iteration, showing how the correct guess results in a slower response. Dictionary attack. We also performed a dictionary attack over the internet with 100 randomly generated guesses, including the correct one, and repeated the experiments 20 times. We measured the time required for the server to reply to a request in the second endpoint by observing the network packets arriving at our machine with the same setup as the remote Memcached attack, repeating each request 100 times. The average time required to brute-force all the guesses and select the guess with the highest latency (that the server had to decompress before returning) is 8 . 28 min ( n 20 , σ 0 . 003% ). Figure 11 illustrates the median of the response times for each guess in an example iteration, showing how the correct guess results in a slower response. Again, the timing difference is clearly distinguishable over 14 h ops in the internet. Takeaway: Secrets can be leaked from databases due to timing differences caused by PostgreSQL's transparent compression, if an application stores untrusted data with secrets in the same cell of a database. With a Flask appplication using a PostgreSQL database, we show that an attacker can infer when its inserted data matches the secret based on response times, which are influence by the decompression timing. Our decompression timing attack on PostgreSQL leaks a 6 B secret bytewise over the network in 17 . 84 min . ## E. Attacking OS Memory Compression In this section, we show how memory compression in modern OSs can introduce exploitable timing differences. In particular, we demonstrate a dictionary-based attack to leak secrets from compressed pages in ZRAM, the Linux implementation of memory compression. Background. Memory compression is a technique used in many modern OSs, e.g., Linux [34], Windows [28], or MacOS [11]. Similar to traditional swapping, memory compression increases the effective memory capacity of a system. When processes require more memory than available, the OS can transparently compress unused pages in DRAM to ensure they occupy a smaller footprint in DRAM rather than swapping them to disk. This frees up memory while still allowing the compressed pages to be accessed from DRAM. Compared to disk I/O, DRAM access is an order of magnitude faster, and even with the additional decompression overhead, memory compression is significantly faster than swapping. Hence, memory compression can improve the performance despite the additional CPU cycles required for compression and decompression. Such memory compression is adopted in the Linux kernel (since kernel version 3.14) in a module called ZRAM [26], which is enabled by default on Fedora [34] and Chrome OS [11]. 1) Characterizing Timing Differences in ZRAM: To understand how memory compression can be exploited, we characterize its behavior in ZRAM. On Linux systems, ZRAM appears as a DRAM-backed block device. When pages need to be swapped to free up memory, they are instead compressed and moved to ZRAM. Subsequent accesses to data in ZRAM result in a page fault, and the page is decompressed from ZRAM and copied to a regular DRAM page for use again. We show that the time to access data from a ZRAM page depends on its compressibility and thus the data values. We characterize the latency of accessing data from ZRAM pages with different entropy levels: pages that are incompressible (with random bytes), partially-compressible (random values for 2048 bytes and a fixed value repeated for the remaining TABLE III: Mean latency of accesses to ZRAM. Distinguishable timing differences exist based on data compressibility in the pages ( n 500 and 6% of samples removed as outliers with more than an order of magnitude higher latency). | Algorithm | Incompressible (ns) | Partly Compressible (ns) | Fully Compressible (ns) | |-------------|------------------------|----------------------------|---------------------------| | deflate | 1763 ( 12% ) | 12208 ( 2% ) | 1551 ( 12% ) | | 842 | 1789 ( 11% ) | 8785 ( 2% ) | 1556 ( 10% ) | | lzo | 1684 ( 9% ) | 4866 ( 4% ) | 1479 ( 12% ) | | lzo-rle | 1647 ( 9% ) | 4751 ( 4% ) | 1453 ( 12% ) | | zstd | 1857 ( 10% ) | 2612 ( 9% ) | 1674 ( 11% ) | | lz4 | 1710 ( 11% ) | 1990 ( 7% ) | 1470 ( 10% ) | | lz4hc | 1746 ( 9% ) | 2091 ( 9% ) | 1504 ( 11% ) | 2048 bytes), and fully-compressible (a fixed value in each of the 4096 bytes). We ensure a page is moved to ZRAM by accessing more memory than the memory limit allows. To ensure fast run times for the proof of concept, we allocate the process to a cgroup with a memory limit of a few megabytes. We measure the latency for accessing a 8-byte word from the page in ZRAM, and repeat this process 500 times. Table III shows the mean latency for ZRAM accesses for different ZRAM compression algorithms on an Intel i76700K (Ubuntu 20.04, kernel 5.4.0). The latency for accesses to ZRAM is much higher for partially compressible pages (with lower entropy) compared to incompressible pages (with higher entropy) for all the compression algorithms. This is because the process of moving compressed ZRAM pages to regular memory on an access requires additional calls to functions that decompress the page; on the other hand, ZRAM pages that are stored uncompressed do not require these function calls (Appendix B provides a trace of the kernel function calls on ZRAM accesses and their execution times to illustrate this difference). The largest timing difference is observed for deflate algorithm (close to 10 000 ns ) and 842 algorithm (close to 7000 ns ); moderate timing differences are observed for lzo and lzo-rle (close to 1000 ns ), and zstd (close to 750 ns ); the smallest timing difference is seen for lz4 and lz4hc (close to 250 ns ). These timing differences largely correspond with the raw decompression latency for these algorithms, some of which are measured in Table I. Accesses to a fully-compressible page in ZRAM, i.e. , a page containing the same byte repeatedly, are faster (by 200 ns ) than accesses to an incompressible page for all the compression algorithms. This is because ZRAM stores such pages with a special encoding as a single-byte (independent of the compression algorithm) that only requires reading a single byte from ZRAM on an access to such a page. 2) Leaking Secrets via ZRAM Decompression Timings: In this section, we exploit timing differences between accesses to a partially compressible and incompressible page in ZRAM (using deflate algorithm) to leak secrets. Like previous PoCs, we use Comprezzor to generate data layouts that maximize the decompression-timing difference for the correct guess, while leaking a secret byte-wise. Attack Setup. We demonstrate byte-wise leakage attack on a program with a 4KB page stored in ZRAM containing both a secret value and attacker-controlled data, as is common in many applications like databases. To determine optimal data layouts an attacker might use, we combine this program with Comprezzor. With a known secret value, Comprezzor runs the program with the attacker guessing each byte position successively. For each byte position, Comprezzor measures the decompression time for the page from ZRAM with different attacker data layouts and different attacker guesses, and the layout maximizing the timing difference when the attacker's guessed-byte matches the secret-byte is returned. In such an optimal layout, when the attacker's guess matches the secretbyte, the page entropy reduces (page is partially compressible) and ZRAM decompression takes longer; and for all other guesses, the entropy is high and ZRAM decompression is fast. We repeat this process to generate optimal data layouts for each byte position. Note that this optimal data layout only relies on the number of repetitions of the guess and the relative position of the guessed data and the secret (a property of the compression algorithm), and is applicable with any data values. Using these attacker data-layouts, we perform the byte-wise leakage of an unknown secret. At each step, the attacker makes one-byte guesses (0-9, A-Z) and then denotes the guess with the highest latency to be the 'correct' guess. It appends this value to its guess-string and then guesses the next byte-position. We repeat this attack for 100 random secret strings. Evaluation. Figure 12 shows the bytewise leakage for a secret value (cookie= SECRET ), with the decompression times for guesses of the first four bytes depicted in each of the graphs. For each byte, among guesses of (0-9,A-Z), the highest decompression time successfully leaks the secret byte value (shown in red). For example, for byte-0, the highest time is for cookie= S and similarly for byte-1, it is cookie= SE . For byte-2, we observe a false-positive, cookie= SE8 , which also has a high latency along with the correct guess cookie= SEC . But in the subsequent byte-3, with both these strings used as prefixes for the guesses, the false-positives are eliminated and cookie= SECR is obtained as the correct guess. The next two bytes are also successfully leaked to fully obtain cookie= SECRET , as shown in the Figure 17 in Appendix D. Our attack successfully completes in less than 5 minutes. Over 100 randomly generated secrets, we observe that 90% of secrets are exactly leaked successfully. In 9 out 10 remaining cases, we narrow down the secret to within four 6-byte candidates (due to false-positives) and in the last case we recover only 4 out of 6-bytes of the secret (as the false-positives grows considerably beyond this). These false-positives arise due to the data-layouts generated not being as robust as in previous PoCs. Comprezzor with ZRAM is a few orders of magnitude slower (almost 0.03x the speed) compared to iterations with raw algorithms studied in Section V-B, so it is unable to explore as large a search space as previous PoCs. This is because moving a page to ZRAM and compressing it requires accessing sufficient memory to swap the page out, which is slower than just executing the compression algorithm. But such false positives can be easily addressed by using multiple data-layouts per byte (each layout causes different false-positives), or by fuzzing for a longer duration to generate more robust data-layouts. The fuzzing speed can also be increased by implementing the ZRAM compression algorithm (a fork of zlib) inside Comprezzor, so this is not a fundamental limitation. Dictionary Attack. For completeness, we also demonstrate a dictionary attack where the secret is guessed from a range Fig. 12: Times for guesses (0-9, A-Z) for each of the first four bytes (S, E, C, R) in a 6-byte secret (cookie= SECRET ) leaked byte-wise from ZRAM. The highest times correspond to the secret-byte value (shown in red). The last two bytes (E, T) are also successfully leaked similarly and shown in Appendix D. <details> <summary>Image 12 Details</summary> ![9ee2b550](/v1/image/9ee2b5501eda46c344d7931f5896ca5fcc5b2c2310cf5979137da65dd8891625) ### Visual Description ## Timing Attack Analysis: Cookie Byte Guessing ### Overview The image presents four scatter plots, each representing a "Byte" (0 to 3). Each plot shows the timing (in nanoseconds) of operations based on different "cookie" guesses. The x-axis represents the "Guess" (cookie values), and the y-axis represents the "Timing [ns]". The plots aim to identify timing variations that could reveal information about the correct cookie value. Each plot has a cluster of blue dots at a low timing value, and a single red dot at a higher timing value, indicating a timing difference for a specific cookie guess. ### Components/Axes * **Titles:** "Byte 0", "Byte 1", "Byte 2", "Byte 3" (each plot is labeled with a byte number) * **X-axis:** "Guess" (with labels like "cookie=0", "cookie=5", "cookie=9", "cookie=A", etc.) * **Y-axis:** "Timing [ns]" with scale markers at 0, 0.5 \* 10^4, 1 \* 10^4, and 1.5 \* 10^4. * **Data Points:** Blue dots representing typical timing, and a single red dot representing an outlier timing. ### Detailed Analysis **Byte 0:** * X-axis labels: cookie=0, cookie=5, cookie=9, cookie=A, cookie=H, cookie=M, cookie=S, cookie=Z * Trend: Most guesses have a low timing. The guess "cookie=S" has a significantly higher timing (red dot). * Timing values: Blue dots are clustered around 0.2 \* 10^4 ns. The red dot for "cookie=S" is at approximately 1.4 \* 10^4 ns. **Byte 1:** * X-axis labels: cookie=S0, cookie=S5, cookie=S9, cookie=SA, cookie=SE, cookie=SH, cookie=SM, cookie=SS, cookie=SZ * Trend: Similar to Byte 0, most guesses have low timing. The guess "cookie=SE" has a significantly higher timing (red dot). * Timing values: Blue dots are clustered around 0.2 \* 10^4 ns. The red dot for "cookie=SE" is at approximately 1.6 \* 10^4 ns. **Byte 2:** * X-axis labels: cookie=SE0, cookie=SE2, cookie=SE5, cookie=SE8, cookie=SEC, cookie=SEH, cookie=SEM, cookie=SES, cookie=SEZ * Trend: Most guesses have low timing. The guess "cookie=SEC" has a significantly higher timing (red dot). There is also a blue dot with a slightly elevated timing at cookie=SE8. * Timing values: Blue dots are clustered around 0.2 \* 10^4 ns, with "cookie=SE8" at approximately 0.8 \* 10^4 ns. The red dot for "cookie=SEC" is at approximately 1.6 \* 10^4 ns. **Byte 3:** * X-axis labels: cookie=SE80, cookie=SE89, cookie=SE8A, cookie=SE8M, cookie=SE8Z, cookie=SEC0, cookie=SECA, cookie=SECR, cookie=SECRZ * Trend: Most guesses have low timing. The guess "cookie=SECR" has a slightly higher timing (red dot). * Timing values: Blue dots are clustered around 0.2 \* 10^4 ns. The red dot for "cookie=SECR" is at approximately 1.6 \* 10^4 ns. ### Key Observations * Each byte has one specific "cookie" guess that results in a significantly higher timing compared to other guesses. * The timing difference is substantial, suggesting a potential vulnerability. * The "cookie" guesses with high timing are "cookie=S" for Byte 0, "cookie=SE" for Byte 1, "cookie=SEC" for Byte 2, and "cookie=SECR" for Byte 3. ### Interpretation The data strongly suggests a timing attack vulnerability. The significant timing differences observed for specific "cookie" guesses indicate that the system takes longer to process these particular values. This information can be exploited to deduce the correct cookie value byte by byte. The correct cookie value is likely "SECR", as the timing increases as the cookie is built up one byte at a time. The timing difference is likely due to a conditional statement that is triggered when the correct byte is guessed, causing a longer execution path. This information could be used by an attacker to reconstruct the entire cookie and gain unauthorized access. </details> Fig. 13: Median latency for ZRAM accesses in dictionary attack, with correct guess SECRET and 99 incorrect guesses (we only label 4 out of 99 incorrect guesses on X-axis for brevity). <details> <summary>Image 13 Details</summary> ![7d82cd44](/v1/image/7d82cd44e9aa05da95b9e2bbad5fa96a9832ada37c87d4ec93b4486eecce0f13) ### Visual Description ## Scatter Plot: Timing vs. Guess ### Overview The image is a scatter plot showing the timing in nanoseconds (ns) on the y-axis versus different "Guess" values on the x-axis. Most data points are clustered near the bottom, but one data point, labeled "SECRET", has a significantly higher timing value. ### Components/Axes * **X-axis:** "Guess" with categories: 9Q7Lwt, IMP6Pv, SECRET, 0NSQFp, e4gvLN * **Y-axis:** "Timing [ns]" with a scale from 0 to 1.5, multiplied by 10^4. * **Data Points:** Blue dots represent timing values for each guess, except for the "SECRET" guess, which is represented by a red dot. ### Detailed Analysis * **X-axis Categories:** The x-axis represents different "Guess" values. The categories are: * 9Q7Lwt * IMP6Pv * SECRET * 0NSQFp * e4gvLN * **Y-axis Values:** The y-axis represents the timing in nanoseconds (ns), scaled by 10^4. The scale ranges from 0 to 1.5. * **Data Point Analysis:** * Most of the guesses (9Q7Lwt, IMP6Pv, 0NSQFp, e4gvLN) have timing values clustered around 0.2 * 10^4 ns (2000 ns). * The "SECRET" guess has a significantly higher timing value of approximately 1.55 * 10^4 ns (15500 ns). ### Key Observations * The "SECRET" guess has a significantly higher timing value compared to the other guesses. * The timing values for the other guesses are relatively consistent. ### Interpretation The scatter plot suggests that the "SECRET" guess takes significantly longer to process compared to the other guesses. This could indicate a vulnerability or a specific condition triggered by the "SECRET" input that causes a performance bottleneck. The consistent timing values for the other guesses suggest that they are processed in a similar manner. The outlier "SECRET" is a key point of interest. </details> of 100 possible guesses (dictionary), similar to previous PoCs. We assume the target page has a secret embedded the tophalf and the bottom half of the page is attacker-controlled and filled with repeated guesses from a dictionary. Figure 13 shows the median latency for attacker's data access from ZRAM (over 10 observations) for different guesses from the dictionary. The guesses consist of 99 randomly generated strings and the secret (cookie= SECRET ). We observe a high timing for decompression when the guess matches the secret as the page entropy is lowered (and the page compressed in ZRAM), thus leaking the secret; for all other guesses the decompression is fast. This dictionary attack completes in less than 10 minutes. But in general, the duration depends on the time required to force a target page into ZRAM, depending on the physical memory limit for a target process. Takeaway: We show that even if an application does not explicitly use compression, its data may still get compressed by the OS due to memory compression. Our case study of ZRAM shows that all compression algorithms in ZRAM have timing differences of 250 ns to 10 400 ns for accesses to compressed versus uncompressed pages. We also show that if secrets are co-located with attacker-controlled data in the same page and compressed with memory compression, they can be leaked in minutes. ## VII. MITIGATIONS Disabling LZ77. The most rudimentary solution is to completely disable compression or at least disable the LZ77 part. Karakostas et al. [31] showed for web pages that this adds an overhead between 91 % and 500 % . Furthermore, attacks have not been studied well enough on symbol compression to provide any security guarantees. Masking. Karakostas et al. [31] presented a generic defense technique called Context Transformation Extension (CTX). The general idea is to use context-hiding to protect secrets from being compressed with attacker-controlled data. Data is permuted on the server-side using a mask, and on the clientside, an inverse permutation is performed (JavaScript library). The overhead compared to the original algorithms decrease with the number of compressed data [31]. Randomization. Yang et al. [61] showed an approach with randomized input to mitigate compression side-channel attacks. The service would require adding an additional amount of random data to hide the size of the compressed memory. However, as the authors also show, randomization-based approaches can be defeated with at the expense of a higher execution time. Also, Karaskostas et al. [31] showed that size randomization is ineffective against memory compression attacks. It is also unclear if size randomization mitigates the timing-based side channel of the memory decompression. Keyword protection. Zieli´ nski demonstrated an implementation of DEFLATE called SafeDeflate [66]. SafeDeflate mitigates memory compression attacks by splitting the set of keywords into subsets of sensitive and non-sensitive keywords. Depending on the completeness of the sensitive keyword list, this approach can be considered as secure, but there is no guaranteed security. As Paulsen et al. [43] mention, it is easy to overlook a corner case. Furthermore, this approach leads to a loss of compression ratio of about 200 % to 400 % [31]. Taint tracking. Taint tracking tries to trace the data flow and mark input sources and their sinks. Paulsen et al. [43] use taint analysis to track the flow of secret data before feeding data into the compression algorithm. Their tool, Debreach, is about 2-5 times faster than SafeDeflate [43]. However, this approach is, so far, only compatible with PHP. Furthermore, for such an approach, the developer needs to flag the sensitive input which is being tracked. The probably best strategy in practice is to avoid sensitive data being compressed with potential attacker-controlled data. The aforementioned mitigations focus on mitigating compression ratio side channels. As the compression and decompression timings are not constant, a timing side channel is harder to mitigate, e.g., remote attacks might be mitigated by adding artificial noise or using network packet inspection to detect attacks [52]. Since the latency for a correct guess is in the region of microseconds, not many requests ( & 200 ) are required per guess to distinguish the latency. Therefore a simple DDoS detection might detect an attack but only after a certain amount of data being leaked. ## VIII. CONCLUSION In this paper, we presented a timing side channel attack on several memory compression algorithms. With Comprezzor, we developed a generic approach to amplify latencies for correct guesses when performing dictionary attack on different compression algorithms. Our remote covert channel achieves a transmission rate of 643 . 25 bit ♥ h over 14 hops in the internet. We demonstrated three cases studies on memory compression attacks. First, we showed a remote dictionary attack on a PHP application, hosted on an Nginx, storing zlibcompressed data in Memcached. We demonstrate bytewise leakage across the internet from a server located 14 hops away. Using Comprezzor, we explored high timing differences when performing a dictionary attack on PostgreSQLs PGLZ algorithm. and leveraged those to leak database records from PostgreSQL. Lastly, we also demonstrated a dictionary attack on ZRAM. ## REFERENCES - [1] Acıic ¸mez, Onur and Schindler, Werner and Koc, Cetin K. Cache based remote timing attack on the aes. In CT-RSA , 2006. - [2] Hassan Aly and Mohammed ElGayyar. Attacking aes using bernstein's attack on modern processors. In International Conference on Cryptology in Africa , 2013. - [3] Apple Insider, 2013. URL: https://appleinsider.com/articles/13/06/13/c ompressed-memory-in-os-x-109-mavericks-aims-to-free-ram-extendbattery-life. - [4] Tal Be'ery and Amichai Shulman. A Perfect CRIME? Only TIME Will Tell. Black Hat Europe , 2013. - [5] Bmemcached. bmemcached package, 2021. URL: https://python-binarymemcached.readthedocs.io/en/latest/bmemcached/. - [6] BTRFS. Compression, 2021. URL: https://btrfs.wiki.kernel.org/index. php/Compression#Why does not du report the compressed size.3F. - [7] Buckenhofer. PostgreSQL Columnar Storage, 2021. URL: https://www. buckenhofer.com/2021/01/postgresql-columnar-extension-cstore/. - [8] Euccas Chen. Understanding zlib, 2021. URL: https://www.euccas.me/ zlib/#deflate. - [9] Citus. Columnar: Distributed PostgreSQL as an extension., 2021. URL: https://github.com/citusdata/citus/blob/master/src/backend/columnar/RE ADME.md. - [10] Collett, Yann. Smaller and faster data compression with Zstandard, 2016. URL: https://engineering.fb.com/2016/08/31/core-data/smaller-andfaster-data-compression-with-zstandard/. - [11] DennyL. Enabling ZRAM in Chrome OS, 2020. URL: https://suppor t.google.com/chromebook/thread/75256850/enabeling-zram-doesn-twork-in-crosh?hl=en&msgid=75453860. - [12] P. Deutsch. Rfc1951: Deflate compressed data format specification version 1.3, 1996. - [13] Equinix, 2021. URL: https://www.packet.com/. - [14] Dmitry Evtyushkin and Dmitry Ponomarev. Covert channels through random number generator: Mechanisms, capacity estimation and mitigations. In CCS , 2016. - [15] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, and T. Berners-Lee. Rfc 2616, hypertext transfer protocol - http/1.1, 1999. URL: http://www.rfc.net/rfc2616.html. - [16] Andrea Fioraldi, Dominik Maier, Heiko Eißfeldt, and Marc Heuse. AFL++: Combining incremental steps of fuzzing research. In USENIX Workshop on Offensive Technologies (WOOT) , August 2020. - [17] Flask. Flask, 2021. URL: https://flask.palletsprojects.com/en/2.0.x/. - [18] Anders Fogh. Covert Shotgun: automatically finding SMT covert channels, 2016. URL: https://cyber.wtf/2016/09/27/covert-shotgun/. - [19] Gailly, Jean-loup and Adler, Mark. zlib.h - interface of the 'zlib' general purpose compression library, 2020. URL: https://github.com/torvalds/li nux/blob/5bfc75d92efd494db37f5c4c173d3639d4772966/include/linux/ zlib.h. - [20] Yoel Gluck, Neal Harris, and Angelo Prado. BREACH: reviving the CRIME attack. Unpublished manuscript , 2013. - [21] Ben Gras, Cristiano Giuffrida, Michael Kurth, Herbert Bos, and Kaveh Razavi. ABSynthe: Automatic Blackbox Side-channel Synthesis on Commodity Microarchitectures. In NDSS , 2020. - [22] Daniel Gruss, Erik Kraft, Trishita Tiwari, Michael Schwarz, Ari Trachtenberg, Jason Hennessey, Alex Ionescu, and Anders Fogh. Page Cache Attacks. In CCS , 2019. - [23] Daniel Gruss, Cl´ ementine Maurice, Klaus Wagner, and Stefan Mangard. Flush+Flush: A Fast and Stealthy Cache Attack. In DIMVA , 2016. - [24] Daniel Gruss, Raphael Spreitzer, and Stefan Mangard. Cache Template Attacks: Automating Attacks on Inclusive Last-Level Caches. In USENIX Security Symposium , 2015. - [25] Berk G¨ ulmezo˘ glu, Mehmet Sinan Inci, Thomas Eisenbarth, and Berk Sunar. A Faster and More Realistic Flush+Reload Attack on AES. In COSADE , 2015. - [26] Gupta, Nitin. zram: Compressed RAM-based block devices, 2021. URL: https://www.kernel.org/doc/html/latest/admin-guide/blockdev/zram.ht ml. - [27] Shaobo He, Michael Emmi, and Gabriela Ciocarlie. ct-fuzz: Fuzzing for timing leaks. In International Conference on Software Testing, Validation and Verification (ICST) , 2020. doi:10.1109/ICST46399.2020 .00063 . - [28] Hoffman, Chris. What Is Memory Compression in Windows 10?, 2017. URL: https://www.howtogeek.com/319933/what-is-memory-compressio n-in-windows-10/. - [29] holmeshe.me. Understanding the memcached source code - slab ii, 2020. URL: https://holmeshe.me/understanding-memcached-source-code-II/. - [30] Darshana Jayasinghe, Jayani Fernando, Ranil Herath, and Roshan Ragel. Remote cache timing attack on advanced encryption standard and countermeasures. In ICIAFs , 2010. - [31] Dimitris Karakostas, Aggelos Kiayias, Eva Sarafianou, and Dionysis Zindros. Ctx: Eliminating breach with context hiding. Black Hat EU , 2016. - [32] Dimitris Karakostas and Dionysis Zindros. Practical new developments on breach. Black Hat Asia , 2016. - [33] John Kelsey. Compression and information leakage of plaintext. In Fast Software Encryption , 2002. - [34] Larabel, Michael. What Is Memory Compression in Windows 10?, 2020. URL: https://www.phoronix.com/scan.php?page=news item&px=Fedo ra-33-Swap-On-zRAM-Proposal. - [35] Moritz Lipp, Daniel Gruss, Raphael Spreitzer, Cl´ ementine Maurice, and Stefan Mangard. ARMageddon: Cache Attacks on Mobile Devices. In USENIX Security Symposium , 2016. - [36] Fangfei Liu, Yuval Yarom, Qian Ge, Gernot Heiser, and Ruby B. Lee. Last-Level Cache Side-Channel Attacks are Practical. In S&P , 2015. - [37] Cl´ ementine Maurice, Manuel Weber, Michael Schwarz, Lukas Giner, Daniel Gruss, Carlo Alberto Boano, Stefan Mangard, and Kay R¨ omer. Hello from the Other Side: SSH over Robust Cache Covert Channels in the Cloud. In NDSS , 2017. - [38] Memcached, 2020. URL: https://memcached.org/. - [39] Microsoft. concepts-hyperscale-columnar, 2021. URL: https://docs.mic rosoft.com/en-us/azure/postgresql/concepts-hyperscale-columnar. - [40] Daniel Moghimi, Moritz Lipp, Berk Sunar, and Michael Schwarz. Medusa: Microarchitectural Data Leakage via Automated Attack Synthesis. In USENIX Security Symposium , 2020. - [41] Mozilla. Compression in HTTP, 2021. URL: https://developer.mozilla. org/en-US/docs/Web/HTTP/Compression. - [42] Shirin Nilizadeh, Yannic Noller, and Corina S Pasareanu. Diffuzz: differential fuzzing for side-channel analysis. In International Conference on Software Engineering (ICSE) , 2019. - [43] Brandon Paulsen, Chungha Sung, Peter AH Peterson, and Chao Wang. Debreach: mitigating compression side channels via static analysis and transformation. In IEEE/ACM International Conference on Automated Software Engineering (ASE) . IEEE, 2019. - [44] Peter Pessl, Daniel Gruss, Cl´ ementine Maurice, Michael Schwarz, and Stefan Mangard. DRAMA: Exploiting DRAM Addressing for CrossCPU Attacks. In USENIX Security Symposium , 2016. - [45] php.net. memcached.constants.php, 2021. URL: https://www.php.net/ manual/en/memcached.constants.php. - [46] PostgreSQL. TOAST Compression, 2021. URL: https://www.postgresql .org/docs/current/storage-toast.html. - [47] Hany Ragab, Alyssa Milburn, Kaveh Razavi, Herbert Bos, and Cristiano Giuffrida. CrossTalk: Speculative Data Leaks Across Cores Are Real. In S&P , 2021. - [48] Sanjay Rawat, Vivek Jain, Ashish Kumar, Lucian Cojocar, Cristiano Giuffrida, and Herbert Bos. Vuzzer: Application-aware evolutionary fuzzing. In NDSS , 2017. - [49] Juliano Rizzo and Thai Duong. The CRIME attack. In ekoparty security conference , volume 2012, 2012. - [50] Rostedt, Steven. ftrace - Function Tracer, 2017. URL: https://www.kern el.org/doc/Documentation/trace/ftrace.txt. - [51] Vishal Saraswat, Daniel Feldman, Denis Foo Kune, and Satyajit Das. Remote cache-timing attacks against aes. In Workshop on Cryptography and Security in Computing Systems , 2014. - [52] Michael Schwarz, Martin Schwarzl, Moritz Lipp, and Daniel Gruss. NetSpectre: Read Arbitrary Memory over Network. In ESORICS , 2019. - [53] Benjamin Semal, Konstantinos Markantonakis, Keith Mayes, and Jan Kalbantner. One covert channel to rule them all: A practical approach to data exfiltration in the cloud. In IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom) , 2020. - [54] Dongdong She, Kexin Pei, Dave Epstein, Junfeng Yang, Baishakhi Ray, and Suman Jana. Neuzz: Efficient fuzzing with neural program smoothing. In IEEE Symposium on Security and Privacy , 2019. doi: 10.1109/SP.2019.00052 . - [55] Po-An Tsai, Andres Sanchez, Christopher W. Fletcher, and Daniel Sanchez. Safecracker: Leaking secrets through compressed caches. In ASPLOS , 2020. - [56] Tom Van Goethem, Christina P¨ opper, Wouter Joosen, and Mathy Vanhoef. Timeless Timing Attacks: Exploiting Concurrency to Leak Secrets over Remote Connections. In USENIX Security Symposium , 2020. - [57] Tom Van Goethem, Mathy Vanhoef, Frank Piessens, and Wouter Joosen. Request and conquer: Exposing cross-origin resource size. In USENIX Security Symposium , 2016. - [58] Mathy Vanhoef and Tom Van Goethem. Heist: Http encrypted information can be stolen through tcp-windows. In Black Hat US Briefings, Location: Las Vegas, USA , 2016. - [59] Daniel Weber, Ahmad Ibrahim, Hamed Nemati, Michael Schwarz, and Christian Rossow. Osiris: Automated Discovery Of Microarchitectural Side Channels. In USENIX Security Symposium , 2021. - [60] Zhenyu Wu, Zhang Xu, and Haining Wang. Whispers in the Hyperspace: High-speed Covert Channel Attacks in the Cloud. In USENIX Security Symposium , 2012. - [61] Meng Yang and Guang Gong. Lempel-ziv compression with randomized input-output for anti-compression side-channel attacks under https/tls. In International Symposium on Foundations and Practice of Security . Springer, 2019. - [62] Yuval Yarom and Katrina Falkner. Flush+Reload: a High Resolution, Low Noise, L3 Cache Side-Channel Attack. In USENIX Security Symposium , 2014. - [63] Pavel Yosifovich, Alex Ionescu, Mark E. Russinovich, and David A. Solomon. Windows Internals Part 1 . Microsoft Press, 7 edition, 2017. - [64] Michał Zalewski. American Fuzzy Lop, 2021. URL: https://github.com /Google/AFL. - [65] Xin-jie Zhao, Tao Wang, and Yuanyuan Zheng. Cache Timing Attacks on Camellia Block Cipher. Cryptology ePrint Archive, Report 2009/354 , 2009. - [66] Michał Zielinski. Safedeflate: compression without leaking secrets. Technical report, Cryptology ePrint Archive, Report 2016/958. 2016, 2016. ## APPENDIX A TIMING DIFFERENCE FOR COMPRESSION TABLE IV: Different compression algorithms yield distinguishable timing differences when compressing content with a different entropy. ( n 100000 | Algorithm | Fully Compressible (ns) | Partially Compressible (ns) | Incompressible (ns) | |-----------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | FastLZ LZ4 LZ4 LZO PGLZ zlib zstd | 38619 . 07 ( 0 . 74% ) 606 . 82 ( 0 . 93% ) 44748 . 02 ( 0 . 15% ) 5645 . 86 ( 2 . 18% ) 44275 . 84 ( 0 . 13% ) 38479 . 53 ( 0 . 22% ) 3596 . 41 ( 0 . 42% ) | 58887 . 40 ( 0 . 50% ) 220 . 63 ( 1 . 00% ) 47731 . 08 ( 0 . 16% ) 5915 . 28 ( 2 . 78% ) 65752 . 55 ( 0 . 12% ) 80284 . 72 ( 0 . 23% ) 22288 . 14 ( 0 . 52% ) | 79384 . 89 ( 0 . 40% ) 231 . 76 ( 1 . 97% ) 47316 . 56 ( 0 . 16% ) 7928 . 21 ( 3 . 91% ) - 76973 . 82 ( 0 . 20% ) 29284 . 77 ( 0 . 34% ) | ## APPENDIX B KERNEL TRACE FOR ZRAM DECOMPRESSION To highlight the root-cause of the timing differences in ZRAM accesses, we trace the kernel functions called on accesses to ZRAM pages using the ftrace [50] utility. Listing 1 and Listing 2 show the trace of kernel functions called on an access to an incompressible and compressible page respectively in ZRAM (using deflate algorithm). The incompressible page contains random bytes, while the compressible page contains 2048 bytes of the same value and 2048 random bytes similar to the partially-compressible setting in Section VI-E1. We only show the functions which are called when the ZRAM page is swapped in to regular memory. The main difference between the two listings (colored in red) is that the functions performing decompression of the ZRAM page are only called when a compressible page is swapped-in, while these functions are skipped for the page stored uncompressed. Of these additional function calls, the main driver of the timing difference is the \_\_deflate\_decompress function in Listing 2 which consumes 12555 ns. This ties in with the characterization study in Section VI-E1 which showed the average timing difference between accesses to compressible and incompressible pages to be close to 10000 ns for ZRAM with deflate algorithm. These timings are for the deflate implementation in the Linux kernel, which is a modified version of zlib v1.1.3 [19]; hence these timings differ from the zlib timings measured in Table I for the more recent zlib v1.2.11 . Listing 1: Kernel function trace for ZRAM access to incompressible page <details> <summary>Image 14 Details</summary> ![c724d0cb](/v1/image/c724d0cbf218c90bd5d4984ce2b7421105f8580807d6ddf7f24157595fa18df5) ### Visual Description ## Call Trace: Function Execution Timeline ### Overview The image presents a call trace, displaying a sequence of function calls along with their execution times in nanoseconds. It provides a timeline of function execution, likely within a software system. ### Components/Axes * **Column 1:** Line Number (1 to 23) * **Column 2:** Time (ns) - Execution time in nanoseconds. Ranges from 0 to 1904 ns. * **Column 3:** Function - Name of the function being called. ### Detailed Analysis The table shows the execution timeline of various functions. The first column is the line number, the second column is the time in nanoseconds, and the third column is the function name. Here's a breakdown of the data: | Line | Time (ns) | Function | |------|-----------|--------------------------------| | 1 | | | | 2 | 0 | swap\_readpage | | 3 | 62 | page\_swap\_info | | 4 | 126 | \_frontswap\_load | | 5 | 195 | \_page\_file\_index | | 6 | 254 | bdev\_read\_page | | 7 | 326 | blk\_queue\_enter | | 8 | 395 | zram\_rw\_page | | 9 | 460 | zram\_bvec\_rw.isra.0 | | 10 | 527 | generic\_start\_io\_acct | | 11 | 590 | update\_io\_ticks | | 12 | 661 | part\_inc\_in\_flight | | 13 | 729 | \_zram\_bvec\_read.constprop.0 | | 14 | 813 | zs\_map\_object | | 15 | 967 | \_raw\_read\_lock | | 16 | 1407 | zs\_unmap\_object | | 17 | 1487 | generic\_end\_io\_acct | | 18 | 1550 | update\_io\_ticks | | 19 | 1617 | jiffies\_to\_usecs | | 20 | 1677 | part\_dec\_in\_flight | | 21 | 1753 | ktime\_get\_with\_offset | | 22 | 1835 | page\_endio | | 23 | 1904 | unlock\_page | ### Key Observations * The functions are executed sequentially, with increasing time values. * The time differences between function calls vary, indicating different execution durations. * Several functions relate to memory management (swap, zram, page). * I/O accounting functions (generic\_start\_io\_acct, generic\_end\_io\_acct) are present. ### Interpretation The call trace provides insights into the sequence and timing of function calls within a system. It can be used for performance analysis, debugging, and understanding the flow of execution. The presence of memory management and I/O related functions suggests that the trace might be related to a process involving disk or memory operations. The specific functions called and their timing can help identify bottlenecks or areas for optimization. </details> Listing 2: Kernel function trace for ZRAM access to compressible page <details> <summary>Image 15 Details</summary> ![853fe327](/v1/image/853fe32718055f8b909809e184ad2326f05d1b7c1c81f18c20aa703f9d3feb74) ### Visual Description ## Call Trace: Function Execution Times ### Overview The image presents a call trace, displaying the execution time (in nanoseconds) for a sequence of function calls. The data is organized in two columns: "Time (ns)" and "Function". Some of the time values and function names are highlighted in red, possibly indicating functions related to compression or decompression. ### Components/Axes * **Column 1 Header:** Time (ns) - Represents the time elapsed in nanoseconds. * **Column 2 Header:** Function - Represents the name of the function being called. * **Data:** The table contains a list of function calls and their corresponding execution times. ### Detailed Analysis or ### Content Details Here's a breakdown of the data, noting the red-highlighted entries: | Row | Time (ns) | Function | Color | |-----|-----------|--------------------------------|---------| | 1 | 0 | swap\_readpage | Black | | 2 | 61 | page\_swap\_info | Black | | 3 | 123 | \_frontswap\_load | Black | | 4 | 188 | \_page\_file\_index | Black | | 5 | 248 | bdev\_read\_page | Black | | 6 | 310 | blk\_queue\_enter | Black | | 7 | 379 | zram\_rw\_page | Black | | 8 | 442 | zram\_bvec\_rw.isra.0 | Black | | 9 | 505 | generic\_start\_io\_acct | Black | | 10 | 575 | update\_io\_ticks | Black | | 11 | 634 | part\_inc\_in\_flight | Black | | 12 | 755 | \_zram\_bvec\_read.constprop.0 | Black | | 13 | 838 | zs\_map\_object | Black | | 14 | 1040 | \_raw\_read\_lock | Black | | 15 | 1229 | zcomp\_stream\_get | Red | | 16 | 1306 | zcomp\_decompress | Red | | 17 | 1373 | crypto\_decompress | Red | | 18 | 1433 | deflate\_decompress | Red | | 19 | 1499 | \_deflate\_decompress | Red | | 20 | 14053 | zcomp\_stream\_put | Red | | 21 | 14117 | zs\_unmap\_object | Black | | 22 | 14195 | generic\_end\_io\_acct | Black | | 23 | 14255 | update\_io\_ticks | Black | | 24 | 14317 | jiffies\_to\_usecs | Black | | 25 | 14374 | part\_dec\_in\_flight | Black | | 26 | 14442 | ktime\_get\_with\_offset | Black | | 27 | 14518 | page\_endio | Black | | 28 | 14589 | unlock\_page | Black | ### Key Observations * The initial function calls (rows 1-14) have relatively small execution times, ranging from 0 to 1040 ns. * The functions highlighted in red (rows 15-20) are related to compression and decompression. * The `zcomp_stream_put` function (row 20) has a significantly larger execution time (14053 ns) compared to the other compression/decompression functions. * The remaining function calls (rows 21-28) have execution times in the range of 14117 to 14589 ns. ### Interpretation The call trace provides insights into the performance of different functions within the system. The red highlighting suggests a focus on compression/decompression operations. The significantly longer execution time of `zcomp_stream_put` compared to other compression-related functions indicates that this function might be a performance bottleneck or involve more complex operations. The data suggests that the initial setup and read operations are relatively fast, while the compression and decompression stages, particularly putting the stream, take considerably longer. The final set of functions appears to be related to cleanup and finalization, occurring after a significant delay. </details> ## APPENDIX C ## LAYOUT DISCOVERED BY COMPREZZOR Figure 14 shows the layout discovered by the Comprezzor that amplifies the timing difference for decompression of the correct and incorrect guesses in Zlib dictionary attack. Listing 3 and Listing 4 show the debug trace from the Zlib code for decompression with the correct and incorrect guesses to illustrate the root cause of the timing differences. Fig. 14: Incorrect guesses with the corner-case discovered by Comprezzor lead to a dynamic Huffman block creation for the partially compressible data, that is slow to decompress. <details> <summary>Image 16 Details</summary> ![8612f872](/v1/image/8612f872b7fe064264d82385ff0089a70aa0c8a26af3829b724314f4140fd1f1) ### Visual Description ## Diagram: Cookie Guessing Scenarios ### Overview The image illustrates two scenarios: a "Correct guess" and an "Incorrect guess" related to a cookie value. Each scenario depicts a stack of data blocks, with different types of blocks indicated by color and label. The diagram aims to show how different cookie values affect the structure of the data. ### Components/Axes * **Legend (Top-Right)**: * Tan: "New Dynamic Huffman Block" * Light Green: "Stored Block" * **Scenario Labels (Left)**: * "Correct guess" * "Incorrect guess" * **Data Blocks (Central)**: Each scenario consists of a stack of blocks, each representing a different type of data. ### Detailed Analysis **Correct Guess Scenario:** 1. **Top Block:** A tan block labeled "cookie=SECRET". This block represents the cookie value being correctly guessed. 2. **Second Block:** A large block with diagonal lines, representing a "New Dynamic Huffman Block". 3. **Third Block:** A light blue block labeled "ptr to cookie=SECRET x n". This block contains a pointer to the cookie value, multiplied by 'n'. 4. **Bottom Block:** A light green block labeled "Incompressible data". This block represents data that cannot be compressed. **Incorrect Guess Scenario:** 1. **Top Block:** A tan block labeled "cookie=SECRET". This block represents the cookie value being correctly guessed. 2. **Second Block:** A large block with diagonal lines, representing a "New Dynamic Huffman Block". 3. **Third Block:** Two tan blocks labeled "ptr to cookie=FOOBAR" and "ptr to cookie=FOOBAR x n". This block contains a pointer to the cookie value, which is incorrect ("FOOBAR"), multiplied by 'n'. 4. **Bottom Block:** A light green block labeled "Incompressible data". This block represents data that cannot be compressed. ### Key Observations * The "Correct guess" scenario has a "ptr to cookie=SECRET x n" block, while the "Incorrect guess" scenario has "ptr to cookie=FOOBAR" and "ptr to cookie=FOOBAR x n" blocks. * Both scenarios share the "cookie=SECRET" block at the top and the "Incompressible data" block at the bottom. * The "New Dynamic Huffman Block" is present in both scenarios. ### Interpretation The diagram illustrates how a correct or incorrect guess of a cookie value affects the structure of the data blocks. In the "Correct guess" scenario, the pointer correctly references the "SECRET" cookie value. In the "Incorrect guess" scenario, the pointer references an incorrect value ("FOOBAR"). This difference in pointer values likely has implications for how the data is processed or accessed. The presence of the "New Dynamic Huffman Block" suggests that compression is being used, and the "Incompressible data" block indicates a portion of the data that cannot be compressed. The diagram highlights the importance of correctly guessing the cookie value for proper data handling. </details> Listing 3: Trace for correct guess in zlib. Here the entire guess string is compressed and the remainder is incompressible (decompressed fast as a stored block). <details> <summary>Image 17 Details</summary> ![576d0243](/v1/image/576d024351c5eca99f7ce32dd4109414a6d522a7602631d3cc86042b371f4d3f) ### Visual Description ## Code Snippet: Inflate Process Log ### Overview The image shows a log of an inflate process, likely related to data compression or decompression. The log entries indicate the type of operation (length, distance, literal, stored block) and associated values. ### Components/Axes The log entries are structured as follows: - Line Number: Integer from 30 to 55. - Operation: "inflate:" followed by the type of operation (length, distance, literal, stored block, check matches trailer, end). - Value: A numerical value, a hexadecimal representation, or a character literal associated with the operation. ### Detailed Analysis Here's a breakdown of the log entries: - **Line 30:** `inflate: length 12` - **Line 31:** `inflate: distance 16484` - **Line 32:** `inflate: literal 0x17` - **Line 33:** `inflate: length 13` - **Line 34:** `inflate: distance 14` - **Line 35:** `inflate: literal 0xb3` - **Line 36:** `inflate: length 13` - **Line 37:** `inflate: distance 14` - **Line 38:** `inflate: literal 'x'` - **Line 39:** `inflate: length 13` - **Line 40:** `inflate: distance 14` - **Line 41:** `inflate: literal 0x05` - **Line 42:** `inflate: length 13` - **Line 43:** `inflate: distance 14` - **Line 44:** `inflate: literal 0xa9` - **Line 45:** `inflate: length 13` - **Line 46:** `inflate: distance 14` - **Line 47:** `inflate: literal 0x81` - **Line 48:** `inflate: length 13` - **Line 49:** `inflate: distance 14` - **Line 50:** `inflate: literal ' ['` - **Line 51:** `inflate: stored block (last)` - **Line 52:** `inflate: stored length 16186` - **Line 53:** `inflate: stored end` - **Line 54:** `inflate: check matches trailer` - **Line 55:** `inflate: end` ### Key Observations - The log shows a sequence of length, distance, and literal operations, which are typical in DEFLATE compression algorithms. - The process ends with a stored block, a trailer check, and an "end" marker. - There is a repeating pattern of "length 13" followed by "distance 14" from line 33 to line 49. ### Interpretation The log provides a trace of the inflate process, revealing the steps taken to decompress the data. The repeating pattern of length and distance suggests the use of Huffman coding or a similar technique where lengths and distances are used to reconstruct the original data. The "stored block" indicates a section of the data that was stored without compression. The "check matches trailer" suggests a checksum verification step to ensure data integrity. The log provides insight into the inner workings of a DEFLATE decompression algorithm. </details> Listing 4: Trace for incorrect guess in zlib. Here only part of the guess string ( cookie= ) is compressed and the remainder cookie=FOOBAR is separately compressed (decompression requires a slower code block for Huffman tree decoding). <details> <summary>Image 18 Details</summary> ![a7285029](/v1/image/a728502972758358aaca4535931449f7f92f63a02015d66dce4a1f39f140a858) ### Visual Description ## Log Data: Inflate Process ### Overview The image shows a log excerpt from an inflate process, detailing the sequence of operations performed during data decompression. The log entries indicate the type of operation (length, distance, or literal) and the corresponding value or character. ### Components/Axes The log entries are structured as follows: - **Line Number**: Sequential number of the log entry (e.g., 56, 57, ..., 94). - **Process**: "inflate:" indicates the process generating the log. - **Operation Type**: Specifies the type of operation performed (length, distance, or literal). - **Value**: The value associated with the operation (e.g., length value, distance value, or literal character). ### Detailed Analysis Here's a breakdown of the log entries: | Line | Process | Operation Type | Value | |------|-----------|----------------|----------------| | 56 | inflate: | length | 6 | | 57 | inflate: | distance | 16484 | | 58 | inflate: | literal | 'F' | | 59 | inflate: | literal | 'O' | | 60 | inflate: | literal | 'O' | | 61 | inflate: | literal | 'B' | | 62 | inflate: | literal | 'A' | | 63 | inflate: | literal | 'R' | | 64 | inflate: | literal | 0x17 | | 65 | inflate: | length | 13 | | 66 | inflate: | distance | 14 | | 67 | inflate: | literal | 0xb3 | | 68 | inflate: | length | 13 | | 69 | inflate: | distance | 14 | | 70 | inflate: | literal | 'x' | | 71 | inflate: | length | 13 | | 72 | inflate: | distance | 14 | | 73 | inflate: | literal | 0x05 | | 74 | inflate: | | dynamic codes block (last) | | 75 | inflate: | | table sizes ok | | 76 | inflate: | | code lengths ok | | 77 | inflate: | | codes ok | | 78 | inflate: | length | 13 | | 79 | inflate: | distance | 14 | | 80 | inflate: | literal | 0xa9 | | 81 | inflate: | length | 13 | | 82 | inflate: | distance | 14 | | 83 | inflate: | literal | 0x81 | | 84 | inflate: | length | 13 | | 85 | inflate: | distance | 14 | | 86 | inflate: | literal | '[' | | 87 | inflate: | length | 13 | | 88 | inflate: | distance | 14 | | 89 | inflate: | literal | 0xff | | 90 | inflate: | length | 13 | | 91 | inflate: | distance | 14 | | 92 | inflate: | literal | 0xa5 | | 93 | inflate: | length | 13 | | 94 | inflate: | distance | 14 | ### Key Observations - The log shows a mix of length, distance, and literal operations, which are typical in deflate algorithms. - There's a sequence of literal characters "FOOBAR" early in the log. - The log includes messages indicating the completion of dynamic codes block processing and verification of table sizes and code lengths. - There is a repeating pattern of "length 13" followed by "distance 14" from line 78 to 94. ### Interpretation The log excerpt provides a snapshot of the inflate process, showing how the compressed data is being decompressed. The sequence of length, distance, and literal operations indicates the structure of the compressed data and the steps taken to reconstruct the original data. The "dynamic codes block (last)" message suggests that the process is using dynamic Huffman coding, where the Huffman codes are generated based on the data being compressed. The "table sizes ok" and "code lengths ok" messages indicate that the Huffman tables have been successfully processed and verified. The repeating pattern of length 13 and distance 14 suggests a repetitive structure in the compressed data. </details> ## APPENDIX D BYTEWISE LEAKAGE Figure 15 illustrates the bytewise leakage of the secret (SECRET) for a PHP application using PHP-Memcached. Figure 16 shows the bytewise of the secret string for a Flask application that stores secret data together with attackercontrolled data into PostgreSQL. The prefix value can be shifted bytewise, which allows reusing the same memory layouts found by Comprezzor. Figure 17 shows the last two bytes leaked from a secret (SECRET) in a ZRAM page. This is a continuation of Figure 12 which showed the leakage of the first four bytes. Fig. 15: Bytewise leakage of the secret (S,E,C,R,E,T) from PHP-Memcached. In each plot, the lowest timing (shown in red) indicates the correct guess. <details> <summary>Image 19 Details</summary> ![0feab892](/v1/image/0feab892aef789d878c88a152e2088ddc46dbdcdf5b1f7446e9aee166d343281) ### Visual Description ## Scatter Plot Series: Timing Analysis of Cookie Bytes ### Overview The image presents a series of six scatter plots, each analyzing the timing behavior (in nanoseconds) for different byte positions (Byte 0 through Byte 5) within a cookie. The x-axis represents different "guesses" for the cookie value at that byte position, and the y-axis represents the timing observed for each guess. Each plot highlights a specific cookie value guess with a red dot, while all other guesses are marked with blue dots. ### Components/Axes * **Y-Axis (Timing [ns])**: The y-axis represents the timing in nanoseconds. The scale varies slightly between plots but generally ranges from approximately 1.0 x 10^6 ns to 1.1 x 10^6 ns. * **X-Axis (Guess)**: The x-axis represents the "guess" for the cookie value at the specific byte position. Each guess is labeled in the format "cookie=[value]". * **Plot Titles**: Each plot is titled "Byte [n]", where n is the byte position (0 to 5). * **Data Points**: Blue dots represent the timing for incorrect guesses. Red dots represent the timing for a specific guess, presumably the correct one. ### Detailed Analysis **Byte 0:** * X-Axis Labels: cookie=0, cookie=5, cookie=9, cookie=A, cookie=H, cookie=M, cookie=S, cookie=Z * Y-Axis Range: 1.0 x 10^6 to 1.06 x 10^6 ns * Trend: The timing values are relatively consistent across different guesses, with some minor variations. * Highlighted Point: cookie=S (red dot). Timing is approximately 1.03 x 10^6 ns. **Byte 1:** * X-Axis Labels: cookie=S0, cookie=S5, cookie=S9, cookie=SA, cookie=SE, cookie=SH, cookie=SM, cookie=SS, cookie=SZ * Y-Axis Range: 0.98 x 10^6 to 1.06 x 10^6 ns * Trend: Similar to Byte 0, the timing values are relatively consistent. * Highlighted Point: cookie=SE (red dot). Timing is approximately 0.98 x 10^6 ns. **Byte 2:** * X-Axis Labels: cookie=SE0, cookie=SE5, cookie=SE9, cookie=SEC, cookie=SEH, cookie=SEM, cookie=SES, cookie=SEZ * Y-Axis Range: 1.0 x 10^6 to 1.1 x 10^6 ns * Trend: The timing values are relatively consistent. * Highlighted Point: cookie=SEC (red dot). Timing is approximately 1.02 x 10^6 ns. **Byte 3:** * X-Axis Labels: cookie=SEC0, cookie=SEC5, cookie=SEC9, cookie=SECA, cookie=SECH, cookie=SECM, cookie=SECR, cookie=SECRZ * Y-Axis Range: 1.0 x 10^6 to 1.06 x 10^6 ns * Trend: The timing values are relatively consistent. * Highlighted Point: cookie=SECR (red dot). Timing is approximately 1.02 x 10^6 ns. **Byte 4:** * X-Axis Labels: cookie=SECR0, cookie=SECR5, cookie=SECR9, cookie=SECRA, cookie=SECRE, cookie=SECRM, cookie=SECRS, cookie=SECRZ * Y-Axis Range: 0.95 x 10^6 to 1.06 x 10^6 ns * Trend: The timing values are relatively consistent. * Highlighted Point: cookie=SECRE (red dot). Timing is approximately 0.98 x 10^6 ns. **Byte 5:** * X-Axis Labels: cookie=SECRE0, cookie=SECRE5, cookie=SECRE9, cookie=SECREA, cookie=SECREH, cookie=SECREM, cookie=SECRET, cookie=SECREZ * Y-Axis Range: 1.0 x 10^6 to 1.06 x 10^6 ns * Trend: The timing values are relatively consistent. * Highlighted Point: cookie=SECRET (red dot). Timing is approximately 1.02 x 10^6 ns. ### Key Observations * The timing values for the highlighted points (red dots) are consistently lower than the average timing for other guesses in Bytes 1, 4. * The timing values for the highlighted points (red dots) are consistently similar to the average timing for other guesses in Bytes 0, 2, 3, 5. * The x-axis labels increment sequentially, building upon the previous byte's value. ### Interpretation The plots likely represent a side-channel attack analysis, specifically a timing attack, on a system that processes cookies. The goal is to determine the correct cookie value by observing the time it takes for the system to process different cookie guesses. The red dots indicate the correct cookie value for each byte. The fact that some of these correct guesses (Bytes 1 and 4) have significantly lower timing values suggests that the system might be performing some kind of early exit or optimization when the correct byte is encountered. This timing difference can be exploited to infer the correct cookie value. The full cookie value, as determined by the red dots, is "SECRET". </details> Fig. 16: Bytewise leakage of the secret (S,E,C,R,E,T) from PostgreSQL. The known prefix (cookie=) is shifted left by 1 character in each step, allowing the same memory layout to be reused. In each plot, the highest timing (shown in red) indicates the correct guess. <details> <summary>Image 20 Details</summary> ![4a835b5f](/v1/image/4a835b5f788c971b8bd05c540e7196ff2be4ebe02ef378fb12d0d0f93441c150) ### Visual Description ## Timing Analysis of Cookie Guessing ### Overview The image presents six scatter plots, each representing the timing (in nanoseconds) for different guesses of a byte in a cookie. Each plot corresponds to a specific byte position (Byte 0 to Byte 5). The x-axis represents the different guesses made for that byte, and the y-axis represents the timing taken for each guess. A red dot highlights a specific guess in each byte plot, presumably indicating the correct guess. ### Components/Axes * **Y-axis:** "Timing [ns]" with a scale from approximately 1.18 x 10^6 to 1.3 x 10^6 ns. The scale varies slightly between plots. * **X-axis:** "Guess" - represents the different guesses made for the cookie byte. The labels on the x-axis indicate the specific guess being made (e.g., "cookie=0", "ookie=S0", "kie=SEC0", "ie=SECR0", "e=SECRE0"). * **Titles:** Each plot has a title indicating the byte number being analyzed (e.g., "Byte 0", "Byte 1", ..., "Byte 5"). * **Data Points:** Blue dots represent the timing for each guess. * **Highlighted Guess:** A red dot highlights a specific guess in each plot. ### Detailed Analysis **Byte 0:** * X-axis labels: `cookie=0`, `cookie=5`, `cookie=9`, `cookie=A`, `cookie=H`, `cookie=M`, `cookie=S`, `cookie=Z` * Y-axis range: Approximately 1.2 x 10^6 to 1.26 x 10^6 ns. * Trend: The timing values are relatively stable, with a slight increase for `cookie=9` and `cookie=S`. * Highlighted Guess: `cookie=S` is highlighted with a red dot. * Values: Most points are around 1.22 x 10^6 ns, with `cookie=9` and `cookie=S` around 1.24 x 10^6 ns. **Byte 1:** * X-axis labels: `ookie=S0`, `ookie=S5`, `ookie=S9`, `ookie=SA`, `ookie=SE`, `ookie=SH`, `ookie=SM`, `ookie=SS`, `ookie=SZ` * Y-axis range: Approximately 1.18 x 10^6 to 1.26 x 10^6 ns. * Trend: The timing values are relatively stable, with a noticeable increase for `ookie=SE`. * Highlighted Guess: `ookie=SE` is highlighted with a red dot. * Values: Most points are around 1.22 x 10^6 ns, with `ookie=SE` around 1.26 x 10^6 ns. **Byte 2:** * X-axis labels: `okie=SE0`, `okie=SE5`, `okie=SE9`, `okie=SEC`, `okie=SEH`, `okie=SEM`, `okie=SES`, `okie=SEZ` * Y-axis range: Approximately 1.19 x 10^6 to 1.27 x 10^6 ns. * Trend: The timing values are relatively stable, with a slight increase for `okie=SEC`. * Highlighted Guess: `okie=SEC` is highlighted with a red dot. * Values: Most points are around 1.22 x 10^6 ns, with `okie=SEC` around 1.26 x 10^6 ns. **Byte 3:** * X-axis labels: `kie=SEC0`, `kie=SEC5`, `kie=SEC9`, `kie=SECA`, `kie=SECH`, `kie=SECM`, `kie=SECR`, `kie=SECRZ` * Y-axis range: Approximately 1.2 x 10^6 to 1.3 x 10^6 ns. * Trend: The timing values are relatively stable, with a slight increase for `kie=SECR`. * Highlighted Guess: `kie=SECR` is highlighted with a red dot. * Values: Points for `kie=SEC0`, `kie=SEC5`, `kie=SEC9` are around 1.22 x 10^6 ns. `kie=SECR` is around 1.28 x 10^6 ns. **Byte 4:** * X-axis labels: `ie=SECR0`, `ie=SECR5`, `ie=SECR9`, `ie=SECRA`, `ie=SECRE`, `ie=SECRM`, `ie=SECRS`, `ie=SECRZ` * Y-axis range: Approximately 1.19 x 10^6 to 1.27 x 10^6 ns. * Trend: The timing values are relatively stable, with a noticeable increase for `ie=SECRE`. * Highlighted Guess: `ie=SECRE` is highlighted with a red dot. * Values: Most points are around 1.22 x 10^6 ns, with `ie=SECRE` around 1.26 x 10^6 ns. **Byte 5:** * X-axis labels: `e=SECRE0`, `e=SECRE5`, `e=SECRE9`, `e=SECREA`, `e=SECREH`, `e=SECREM`, `e=SECRET`, `e=SECREZ` * Y-axis range: Approximately 1.19 x 10^6 to 1.27 x 10^6 ns. * Trend: The timing values are relatively stable, with a slight increase for `e=SECRET`. * Highlighted Guess: `e=SECRET` is highlighted with a red dot. * Values: Most points are around 1.22 x 10^6 ns, with `e=SECRET` around 1.26 x 10^6 ns. ### Key Observations * For each byte, the highlighted guess (red dot) corresponds to a noticeable increase in timing compared to other guesses. * The timing differences between correct and incorrect guesses are relatively small, on the order of 0.04 x 10^6 ns (40,000 ps). * The x-axis labels show a progression in the guessed string as the byte position increases (e.g., `cookie`, `ookie`, `okie`, `kie`, `ie`, `e`). ### Interpretation The data suggests a timing attack vulnerability. The correct guess for each byte of the cookie results in a slightly longer execution time. This timing difference, although small, can be exploited to infer the correct cookie value. The red dots indicate the correct cookie bytes, which, when combined, form the string "SECRET". The timing variations likely arise from conditional branches or other operations that take slightly longer when the correct byte is guessed. This type of analysis is crucial for identifying and mitigating security vulnerabilities in software systems. </details> Timing [ns] . cookie=SECR0 cookie=SECR5 Timing [ns] , , cookie=SECRE9 cookie=SECRE0 cookie=SECRE5 cookie=SECREA cookie=SECR9 cookie=SECRA Byte 4 cookie=SECRS cookie=SECRE cookie=SECRM Byte 5 cookie=SECREH cookie=SECREM Guess Fig. 17: Bytewise leakage of the last two bytes of the secret (E,T) from ZRAM. Times for guesses (0-9, A-Z) for each of the bytes are shown. The highest value in each plot (shown in red) indicates the correct secret value for the byte. . cookie=SECRZ cookie=SECREZ cookie=SECRET

Rendering Paper...