Next: 5 Application to a Up: A Model for Decentralized Previous: 3 Decentralized Information Flow

4 Checking Labels

Labels can be used to annotate code, and the annotated code can be checked statically to verify that it contains no information leaks. In this section, we discuss some issues related to static analysis of annotated code, though we defer issues of how to extend a programming language till the next section. In Section 4.1, we explain the importance of static checking to information flow control, and the problem of implicit flows [DD77]. In Section 4.2, we describe a simple way to prevent implicit flows from leaking information by using static analysis.

4.1 Static vs. Dynamic Checking

Information flow checks can be viewed as an extension to type checking. For both kinds of static analysis, the compiler determines that certain operations are not permitted to be performed on certain data values. Type checks may be performed at compile time or at run time, though compile-time checks are obviously preferred when applicable because they impose no run-time overhead. Access control checks are usually performed at run time, although some access control checks may be performed at compile time [JL78, RSC92]. In general, it seems that some access control checks must be performed dynamically in order to give the system sufficient flexibility.

By contrast, fine-grained information flow control is practical only with some static analysis, which may seem odd; after all, any check that can be performed by the compiler can be performed at run time as well. The difficulty with run-time checks is exactly the fact that they can fail. In failing, they may communicate information about the data that the program is running on. Unless the information flow model is properly constructed, the fact of failure (or its absence) can serve as a covert channel. By contrast, the failure of a compile-time check reveals no information about the actual data passing through a program. A compile-time check only provides information about the program that is being compiled. Similarly, link-time and load-time checks provide information only about the program, and may be considered to be static checks for the purposes of this work.

            x := 0
            if b then
                x := 1
Figure 3: Implicit information flow

Implicit information flows [DD77] are difficult to prevent without static analysis. For example, consider the segment of code shown in Figure 3 and assume that the storage locations b and x belong to different security classes b and x, respectively. (We will follow the literature in using the notation e to refer to the label of the expression e.) In particular, assume b is more sensitive than x (more generally, b x), so data should not flow from b to x. However, the code segment stores 1 into x if b is true, and 0 into x if b is false; x effectively contains the value of b. A run-time check can easily detect that the assignment x := 1 communicates information improperly, and abort the program at this point. Consider, however, the case where b is false: no assignment to x occurs within the context in which b affects the flow of control. The fact of the program's aborting or continuing implicitly communicates information about the value of b, which can be used in at least the case where b is false.

We could imagine inspecting the body of the if statement at run time to see whether it contains disallowed operations, but in general this requires evaluating all possible execution paths of the program, which is clearly infeasible. Another possibility is to restrict all writes that follow the if statement on the grounds that once the process has observed b, it is irrevocably tainted. However, this approach seems too restrictive to be practical. The advantage of compile-time checking is that in effect, static analysis efficiently constructs proofs that no possible execution path contains disallowed operations.

To provide flexibility, some information flow control checks are desirable at run time; such checks are allowed so long as their success or failure does not communicate information improperly--which must itself be checked statically! We examine run-time information flow checks later, in Section 5.10.

4.2 Basic Block Labels

As described in Section 3.1, when a data value is extracted from a slot, it acquires the slot label. Furthermore, to ensure that writing to a slot does not leak information, the label on the slot must be more restrictive than the label on the data value being written: x := v is legal only if vx. However, while this restriction condition is necessary, it is not sufficient for avoiding information leaks, because of the possibility of implicit information flows. The code in Figure 3 provides an example. The variables x and b are both slots. The expressions 0 and 1 do not give any information about other data, so they are labeled by . Therefore the assignment x := 1 appears to be legal. However, earlier we showed that this assignment may be an information leak. Therefore, our simple rule is not sufficient. The problem becomes clearer if we rewrite the if statement:

x := (if b then 1 else x)

Clearly, the value of x after the statement completes is dependent on the value of b, and the assignment is legal only if bx.

In general, the flow of control within a program depends on the values of certain expressions. At any given point during execution, various values vi have been observed in order to decide to arrive at the current program location. Any mutation that occurs can potentially leak information about the observed values vi, so the slot that is being mutated must be at least as restricted as the labels on all these variables:

i vi = v1 v2 ... vn

The label i vi can be determined through straightforward static analysis of the program's basic block diagram, and will be called the basic block label, B. The basic block label indicates information that might be inferred by knowing that the program is executing this basic block. Using the basic block label, we can write the correct rule for checking assignment: assignment to a variable x with a value v is permitted only if vBx.

Figure 4: Basic blocks for an if statement

Intuitively, a basic block label must include the labels of all values that were observed to reach that point in the execution. For example, consider the basic block diagram shown in Figure 4, which corresponds to the code of Figure 3; each basic block, represented as a box in the diagram, is characterized by a single basic block label, and has one or two exit points. Here the basic block for x := 1 has label b because the value of b had to be observed to reach that point in the program. However, the label of the ``final'' block is because at that point the program has no knowledge of any values. It is true that the program could discover information by performing tests that read from slots (e.g., x); however, the basic block label captures what the program knows without any further reading.

Labels of basic blocks are derived as follows. The decision about which exit point to follow from a block Bi is made based on the observation of some value vi. The label B for a particular basic block B is the join of some of the labels vi. A label vi is included in the join if it is possible to reach B from Bi, and it is also possible to reach the final node from Bi without passing through B (if all paths from Bi to the final node pass through B, then arriving at B conveys no information about vi.) The set of vi that must be included in each B can be efficiently computed using standard compiler techniques. This rule for basic block label propagation is equivalent to the rule of Denning and Denning [DD77].

Figure 5: Basic blocks for a while statement

Now, consider the execution of a ``while'' statement, which creates a loop in the basic block diagram. This situation is illustrated in Figure 5. Note that for the final basic block, we obtain B =  by reasoning in the same way as we did for the ``if'' statement. This labeling might seem strange, since when we arrive at the final block, we know the value of b. However, arriving at the final block gives no information about the value of b before the code started, and there is no way to use code of this sort to improperly transmit information.

This labeling rule holds as long as all programs terminate, or at least as long as there is no way to derive information from the non-termination of a program [DD77, AR80]. The way one decides that a program has not terminated is to time its execution, either explicitly or through asynchronous communication with another thread. We do not address timing channels in this paper.

If the language allows the raising of exceptions and of return statements, the returned value must also be labeled by the label of the basic block that contains the return or raise. This fact can be seen clearly by converting a procedure that uses a return statement into one that uses boolean variables to keep track of control flow.

Next: 5 Application to a Language Up: A Model for Decentralized Previous: 3 Decentralized Information Flow

Andrew C. Myers, Barbara Liskov

Copyright ©1997 by the Association for Computing Machinery