Cashe Missing for Fun and Profit
By: Fatih • Research Paper • 4,221 Words • March 3, 2010 • 1,128 Views
Join now to read essay Cashe Missing for Fun and Profit
Abstract. Simultaneous multithreading ЎЄ put simply, the shar-ing of the execution resources of a superscalar processor betweenmultiple execution threads ЎЄ has recently become widespread viaits introduction (under the name Ў°Hyper-ThreadingЎ±) into IntelPentium 4 processors. In this implementation, for reasons of ef-ficiency and economy of processor area, the sharing of processorresources between threads extends beyond the execution units; ofparticular concern is that the threads share access to the memorycaches.We demonstrate that this shared access to memory caches pro-vides not only an easily used high bandwidth covert channel be-tween threads, but also permits a malicious thread (operating, intheory, with limited privileges) to monitor the execution of anotherthread, allowing in many cases for theft of cryptographic keys.Finally, we provide some suggestions to processor designers, op-erating system vendors, and the authors of cryptographic software,of how this attack could be mitigated or eliminated entirely.1. IntroductionAs integrated circuit fabrication technologies have improved, provid-ing not only faster transistors but smaller transistors, processor design-ers have been met with two critical challenges. First, memory latencieshave increased dramatically in relative terms; and second, while it iseasy to spend extra transistors on building additional execution units,many programs have fairly limited instruction-level parallelism, whichlimits the extent to which additional execution resources can be uti-lized. Caches provide a partial solution to the first problem, whileout-of-order execution provides a partial solution to the second.In 1995, simultaneous multithreading was revived1in order to com-bat these two difficulties [12]. Where out-of-order execution allowsinstructions to be reordered (subject to maintaining architectural se-mantics) within a narrow window of perhaps a hundred instructions,Key words and phrases. Side channels, simultaneous multithreading, caching.1Simultaneous multithreading had existed since at least 1974 in theory [10], evenif it had not yet been shown to be practically feasible.
--------------------------------------------------------------------------------
Page 2
simultaneous multithreading allows instructions to be reordered acrossthreads; that is, rather than having the operating system perform con-text switches between two threads, it can schedule both threads simul-taneously on the same processor, and instructions will be interleaved,dramatically increasing the utilization of existing execution resources.On the 2.8 GHz Intel Pentium 4 with Hyper-Threading processor,with which the remainder of this paper is concerned2, the two threadsbeing executed on each processor share more than merely the execu-tion units; of particular concern to us, they share access to the memorycaches [8]. Caches have already been demonstrated to be cryptograph-ically dangerous: Many implementations of AES [9] are subject to tim-ing attacks arising from the non-constancy of S-box lookup timings [1].However, having caches shared between threads provides a vastly moredangerous avenue of attack.2. Covert communication via pagingTo see how shared caches can create a cryptographic side-channel, wefirst step back for a moment to a simpler problem ЎЄ covert channels [7]ЎЄ and one of the classic examples of such a channel: virtual memorypaging.Consider two processes, known as the Trojan process and the Spyprocess, operating at different privilege levels on a multilevel securesystem, but both with access to some large reference file (naturally, ona multilevel secure system this access would necessarily be read-only).The Trojan process now reads a subset of pages in this reference file,resulting in page faults which load the selected pages from disk intomemory. Once this is complete (or even in the middle of this operation)the Spy process reads every page of the reference file and measures thetime taken for each memory access. Attempts to read pages whichhave been previously read by the Trojan process will complete veryquickly, while those pages which have not already been read will incurthe (easily measurable) cost of a disk access. In this manner, the Trojanprocess can repeatedly communicate one bit of information to the Spyprocess in the time it takes for a page to be loaded from disk intomemory, up to a total number of bits equal to the size (in pages) of theshared reference file.2We examine the 2.8 GHz Intel Pentium 4 with Hyper-Threading processor forreasons of availability, but expect that the results in this paper will apply equallyto all processors with the same simultaneous multithreading and memory cachedesign.
--------------------------------------------------------------------------------
Page 3
CACHE MISSING FOR