Software-based Fault Isolation

Lecturer: Professor Fred B. Schneider

Lecture notes by Lynette I. Millett


We have been discussing protection measures that a single operating system can provide. One way to think of this is to view the operating system as a padded cell in which programs operate; we have been discussing the nature of the padding. Another way to get programs to behave in a manner consistent with a given security policy is by "brainwashing." That is, modify the programs so that they behave only in safe ways. This is embodied by a recent approach to security known as software-based fault isolation (SFI).

So far, the environment has been responsible for policy enforcement, where the environment is either the OS/kernel or the hardware. Hardware methods include addressing mechanisms (e.g. virtual memory); OS methods include having two modes (where the supervisor mode has access to everything). The new approach we discuss today is to construct a piece of software that transforms a given program p into a program p', where p' is guaranteed to satisfy a security policy of interest.

This SFI SW transformation could be any number of things. It could be a piece of the compiler or of the loader. It could also involve a separate pass over machine language code before execution commences. The point is that we are modifying the program before it is executed. (One easy realization of SFI SW is to always output a program that does nothing. However, there are likely to be properties of the original program that we are interested in preserving, and these properties might not be satisfied by a program that does nothing.)

When would the SFI approach be useful (as opposed to the other approaches we have been discussing this semester)?

SFI for Logical Fault Domains

We take a single address space and partition it into regions called logical fault domains (LFDs). We would like to enforce (without resorting to help from the operating system) programs in an LFD being isolated from reading or writing to memory outside their LFD. We also need a cheap way to communicate between LFDs. The policy we want is as follows:

To start, we will address the read/write constraints. Suppose that the LFD is memory address '0300'h to '03FF'h. Given a program's code, the software to modify it should look for instructions equivalent to 'write x' and ensure that x is located in the LFD's region. An instruction 'write x' can be transformed to

The same can be done for a read. One instruction is transformed to five, which is a five-fold code blow-up. We might be better off just accepting the cost of a context switch (IPC) rather than inflating the code (writes occur much more frequently than context switches).

In order to cut down on the number of additional instructions, we can assume that LFDs all start on integral boundaries (e.g. 0000-00FF, 0100-01FF, 0200-02FF, etc.) Also, note that if we control the writes, then we don't really need to worry about the reads: even if a program could read outside of its LFD, if it can't also write, secrecy and integrity will be maintained. Suppose we are modifying a program whose LFD falls between 0300 and 03FF. Then, if we see an address whose high-order bits are 04 we know the program is attempting to access something outside of its LFD. We will use a temporary register and masks to transform an instruction 'write x' as follows:

We have now decreased the code blow-up to 4x. However, we can still do better. We make a further assumption that if a program attempts to write outside of its LFD, the program is malfunctioning anyway so, instead of trying to detect bad addresses, we can just alter them so they are within the LFD. This allows for a cheaper implementation. How a bad program dies is immaterial, and we have effectively only altered the behavior of malfunctioning programs. We again use a separate temporary register (tmp) to transform an instruction 'write x' to: The cost for a write is now just 3 times what it was originally. Note that we could rewrite this without using the separate temporary register. However, this would allow a malicious programmer to exploit the fact that SFI was being done. For instance, if we transformed 'write x' to the following SFI sequence: then a programmer could take advantage of the fact that an SFI write is 3 instructions long and branch to the third (the new 'write x') instruction, thereby circumventing the SFI transformation. This is avoided by using the temporary register which the original program does not read or write and maintaining the invariant: the register 'tmp' always contains a valid address unless the program is in the midst of an SFI sequence. This technique is referred to as sandboxing.

We now turn to the question of how to transform jumps. We need to prevent jumps into regions that are outside of the given program's LFD. We also need to ensure that there are no jumps into the middle of an SFI sequence. There are two types of jump instructions: direct and indirect. Direct jumps are not a problem, since we can easily inspect the address to ensure that it is in the correct LFD.

There are three kinds of indirect jumps: jmp, call and return. The notation is: jmp (r) which means: take r and jump to what it points to. Handling indirect jumps is not all that different from the way we handle write instructions. We could add code that compares the branch/target address under consideration, or we could sandbox it as we did with writes. To pursue the sandboxing approach, we postulate a new register 'ctmp' and transform an instruction 'jmp (r)' as follows:

where SuffixMask is, for example, 00FF, and PrefixMask might be 0500 (if we are interested in staying within the LFD 0500-05FF). If 'ctmp' is manipulated only in SFI regions, then, as long as the program is executing outside an SFI sequence, ctmp has a value within the the appropriate LFD. Thus, even if a malicious program tries to jump to the middle of an SFI region, it will still not be able to jump outside of the current LFD.

The transformation for 'call (r)' is similar.

For a return (which pops the stack and branches) we need to make sure that the address at the top of the stack is reasonable. To do that, an instruction is prefixed to the above jump transformation: ctmp = PopStack.

Even with these transformations, there remain problems due to jumps. Complex opcodes may contain unsafe opcodes. For example, the instruction mov edx, 'C23BCB50'h assembles to BA 50 CB 3B C2. However, 50 CB 3B C2 can be disassembled to

In this instance, jumping into the middle of the original instruction means something very different, but is a legitimate instruction (due to compact instruction representations). SFI won't catch or correct this sort of problem. For example, it could possible to branch into the middle of an instruction, which would then be interpreted as an instruction to jump outside the LFD. A solution is to enforce our assumptions about where the control points are in a program. We design predicates that are true for instruction boundaries: ValidIndirectCallPoint, ValidIndirectRetPoint, and ValidIndirectJumpPoint. We augment the SFI sequence for a jump with a check to make sure that the target address is to a control point we know about. For instance, 'jmp (r)' now becomes:

There are two systems-level issues that should be mentioned. Multiple LFDs in a single address space might correspond to multiple programs. Then:

There are some problems with using SFI. Clearly, we would like the SFI transformation process to preserve some properties of the original program. The problem is in defining precisely what those properties are, since the transformed program obviously doesn't preserve all properties. For instance, the transformed program will not have the same number of instructions as the original. Library routines can also be used to circumvent SFI. In particular, we must prevent programs from invoking library routines that have not been SFI'd. Finally, denial of service attacks remain possible. For example, one LFD could acquire a shared lock and then never relinquish it.