I understand the part of the paper where they trick the CPU to speculatively load the part of the victim memory into the CPU cache. Part I do not understand is how they retrieve
I would like to contribute one piece of information to the already existing answers, namely how the attacker can actually probe an array from the victim process in the probing phase. This is a problem, because Spectre (unlike Meltdown) runs in the victim's process and even through the cache the attacker cannot just query arrays from other processes.
In short: With Spectre the FLUSH+RELOAD attack needs KSM or another method for shared memory. That way the attacker (to my understanding) can replicate the relevant parts of the victim's memory in his own address space and thus will be able to query the cache for the access times on the probe array.
Long Explanation:
One big difference between Meltdown and Spectre is that in Meltdown the whole attack is running in the address space of the attacker. Thus, it's quite clear how the attacker can both cause changes to the cache and read the cache at the same time. With Spectre however, the attack itself runs in the process of the victim. By using so called gadgets the victim will execute code that writes the secret data into the index of a probe array, e.g. with a = array2[array1[x] * 4096]
.
The proof-of-concepts that have been linked in other answers implement the basic branching/speculation concept of Spectre, but all code seems to run in the same process. Thus, of course it is no problem to have gadget code write to array2
and then read array2
for probing. In a real-world scenario, however, the victim process would write to array2
which is also located in the victim process.
Now, the problem - which the paper in my opinion does not explain well - is that the attacker has to be able to probe the cache for the victim's address space array (array2
). Theoretically, this could be done either from within the victim again or from the attackers address space.
The original paper only describes it vaguely, probably because it was clear to the authors:
For the final phase, the sensitive data is recovered. For Spectre attacks using Flush+Reload or Evict+Reload, the recovery process consists of timing the access to memory addresses in the cache lines being monitored.
To complete the attack, the adversary measures which location in array2 was brought into the cache, e.g., via Flush+Reload or Prime+Probe.
Accessing the cache for array2
from within the victim's address space would be possible, but it would require another gadget and the attacker would have to be able to trigger execution of this gadget. This seemed quite unrealistic to me, especially in Spectre-PHT.
In the paper Detecting Spectre Attacks by identifying Cache Side-Channel Attacks using Machine Learning I found my missing explanation:
In order for the FLUSH+RELOAD attack to work in this case, three preconditions have to be met. [...] But most importantly the CPU must have a mechanism like Kernel Same-page Merging (KSM) [4] or Transparent Page Sharing (TPS) [54] enabled [10].
KSM allows processes to share pages by merging different virtual addresses into the same page, if they reference the same physical address. It thereby increases the memory density, allowing for a more efficient memory usage. KSM was first implemented in Linux 2.6.32 and is enabled by default [33].
KSM explains how the attacker can access array2
that normally would only be available within the victim's process.