Next: 6 Verification and Label Up: A Model for Decentralized Previous: 4 Checking Labels

5 Application to a Language

In this section we define a simple programming language that incorporates our model. The goal of this exposition is not to seriously propose a programming language, but to demonstrate that the information flow model can be applied practically to a rich computational model, providing sound and acceptably precise constraints on information flow. These annotations could be applied to other programming models, such as compiled code or JVM code [LY96].

The language supports some simple types: integers, strings, records, and arrays of any legal type including other array types. Procedures may contain variable declarations, assignments, if statements, and while statements; they return results by assigning to special return variables, as in Pascal. Variables of record or array types are references to storage on the heap, as in Java [GJS96] and CLU [LAB+84], so that assignment of a record or array (e.g., r1 := r2 or a1 := a2) makes the variables aliases for each other.

For simplicity, the language has no exceptions. Exceptions complicate the propagation of basic block labels, but can be handled using the basic-block propagation described in Section 4.2, assuming that procedures declare the exceptions they might raise. Not having exceptions makes some programs clumsy, but programs written with exceptions can be straightforwardly translated into programs without exceptions, so there is no loss of generality.

For simplicity of presentation, the language also lacks global variables. In our simple language, the first basic block in a procedure has label . Global variables could be supported by allowing procedures to accept a special parameter that defines the label of their first basic block.

The language is extended with a few unusual features to support information flow control:

We begin with an example of a program to illustrate some of the features in the language. Then we define the language in more detail, including how to check the constructs using label-checking rules.

5.1 An Example

Figure 6 shows the "check_password" procedure, which accepts a database of passwords, a password, and a user name, and returns a boolean indicating whether the string is the right password for that user. This example is simple, yet it uses declassification to control information flow in a fine-grained way that is not possible under any previous model of which we are aware.

pinfo = record [ names, passwords: string{chkr: chkr} ]

check_password (db: array[pinfo{}]{},
		user: string {},
		password: string{client: chkr})
    returns (ret: bool{client: chkr})
    % Return whether password is the password of user

    i: int {chkr: chkr} := 0            % 
    match: bool {client: chkr;          %
                 chkr: chkr} := false   % 
    while i < db.length() do            % 
      if db[i].names = user &           % 
        db[i].passwords = password then %
           match := true                % {client: chkr;
      end                               %   chkr: chkr}
      i := i + 1                        % 
    ret := false                               % 
    if_acts_for(check_password, chkr) then     % 
      ret := declassify(match, {client: chkr}) % 
end check_password
Figure 6: Annotated password checker

Two principals are mentioned in this code: chkr represents the password checker role, and client represents the principal of the calling client. The password database is an array of records giving the passwords for users; it should be protected by encapsulation (e.g., check_password should be a method), but this complexity is avoided here. In a real program, client would be a parameter to the routine rather than a single fixed principal; a more general (and more concise) version of check_password is shown later, in Figure 8.

In the figure, all labels are declared explicitly by annotating types with label expressions in braces. The type T{L} describes a value of type T that is restricted by the label L. For example, the password argument is readable by the checker but owned by the client, which ensures that the checker can examine the password but cannot distribute it further. The annotations are onerous in this case partly because variable labels are not being inferred; Section 6 shows how to infer labels for this procedure.

The comments (beginning with a ``%'') on the right-hand side of the example indicate the static value of the basic block label, B, and are not part of the code.

To perform its work, the procedure uses the db database to find the password of the user. As it looks at the data, its basic block label picks up any dependencies. For example, the if predicate examines password and the fields of db[i]; therefore, its body has the basic block label {client: chkr; chkr: chkr}. This means that the label of match must be at least this restrictive, or the assignment to match in the body of the if statement would leak information (i.e., would result in a compile-time error).

However, the client requires a result with label {client: chkr}. To provide such a result, the checker must explicitly declassify match, removing chkr from the owner set. The declassification can be carried out only if the procedure has the proper authority (i.e., can act for chkr). Therefore, the procedure checks whether it runs with this authority (in the if_acts_for statement); in the then clause, the code runs with the authority of the password checker; otherwise, it does not.

Code receives authority by being granted it in the principal hierarchy; each procedure has its own principal that can participate in acts-for relations. Of course, granting code the right to act for a particular principal can only be done by a process that acts for that principal, so chkr must have explicitly granted check_password the right to act for it. The expectation of the author of the procedure check_password is that it has been given this right. However, the procedure is constructed so that it does not leak information if the right to act for chkr is later revoked.

The need for explicit declassification is not just an artifact of our model. The return value of the procedure is owned by the client, which means that the client has complete control over the value. The procedure's result could conceivably be used to determine the contents of the password database through exhaustive search. In this case, the implementor of the procedure has made a conscious decision that the amount of information leaked by the boolean return value is small enough that declassification is acceptable. Note that the decision about declassification is made locally, by code that acts for the owner of the data; no appeal to an external trusted agent is required.

It is not necessary to run the entire procedure under the chkr authority. Instead this authority is used just where it is needed, for the declassification. Also, note that the procedure only needs to declassify its result. It is not required to declassify everything extracted from the password database, and is therefore protected against accidentally leaking the database contents (e.g., if it called a helping procedure that acted on the database).

5.2 Label-Checking Rules

Now, we explain the constructs of the language and the corresponding label-checking rules. The process of verifying a program according to these rules involves two major steps: first, basic block labels are propagated; then, each individual statement is verified in the context of the basic block that contains it. Verifying a statement requires that we check for the satisfaction of corresponding label constraints, which is discussed in more detail in Section 6.

If the language contained exceptions or gotos, B would need to be computed through the basic-block label propagation rule of Section 4.2. For our simple constructs, the rules for propagation of B can be stated directly; since control flow is determined only by if and while statements, the basic-block propagation rule takes particularly simple forms. Note that in each statement rule, represents the label for the basic block containing the statement. Many of these rules are similar in intent to those found in Denning and Denning [DD77], though these rules are different in that they are expressed without using the meet operator ().

5.3 Labeled Types

As described, values and slots have labels that restrict information flow. In statically-typed languages, values and slots also have static types that can be said to restrict flow. These two restrictions can be combined, so that labels are considered to be part of the type. This interpretation allows us to use standard type-checking and notions of subtyping to describe information flows.

Every variable, argument, and return type in the language has a labeled type, which consists of a base type such as int, plus a static label. For example, the labeled type int{L} represents an integer restricted by L. In general, label expressions, consisting of the join of several labels, may appear within the braces. For example, int{chkr: chkr L} is restricted by both {chkr: chkr} and L.

The type and label parts of a labeled type act independently. For any two types S and T where S is a subtype of T (ST), and for any two labels L1 and L2 where L1L2, S{L1}T{L2} [VSI96]. This formula also implies that S{L1}T{L1 L3} for any other label L3.

Parametric types such as arrays and records explicitly mention labels on their type parameters. For example, we can form the type array[int{L}]{L2}, which is an array of labeled integers, where the integers have label L, and the array reference has label L2. In record types, similarly, the individual fields have labeled types.

5.4 Assignment

Given an assignment of the form v := e, where v is a variable with type T{v} and e is an expression with type S{e}, the assignment is legal if e B v, where B is the label for the basic block containing the statement. This condition guarantees both that the information in e is not leaked by placing it in the variable v, and that performing the store operation does not leak information because it happens in this particular basic block.

Record and array assignments are similar to variable assignment, except that they may convey information by virtue of the particular record or array being assigned to. Consider record assignment of the form r.f := e. In this statement, r is a record expression with label r, f is a field of the record type with declared type T{f}, and e is an expression with type S{e}, where ST. The assignment is legal if e r B f. This rule is equivalent to the one in Denning and Denning [DD77]. The rule for assignment to an array element is similar, except that the label on the array index is included on the left-hand side. Because r appears on the left-hand side of the rule, fields and array elements become immutable if the variable referring to the record or array becomes more protected than the field or element. For example, a record field with r f is immutable, since otherwise information could be leaked by assigning to it.

5.5 if and while

The rules for if and while are similar to each other. Assume that e is a legal boolean expression, and that S is an arbitrary statement. The statement ``if e then S end'' is legal if S is legal given the basic block label Be. The same condition guarantees the legality of ``while e then S end''. The label e does not need to be part of the basic block label after the execution of the if or while statement, because we are not considering covert timing channels or covert channels arising from non-termination, as discussed in Section 4.2.

5.6 Authority

A procedure executes with some authority that has been granted to it. Authority may be granted through the principal hierarchy or because the procedure's caller grants the procedure the right to act for other principals.

At any given point within a program, the compiler understands the code to be running with the ability to act for some set of principals, which we call the effective authority of the code at that point. The effective authority can never exceed the true authority at any point during execution.

When a procedure starts running, it has no effective authority. It may increase its effective authority by testing whether it has the authority to act for a principal. If the test succeeds, the effective authority is increased to include that principal. This test is accomplished by using the if_acts_for statement:

if_acts_for(P1, P2) then S1 else S2

(The brackets around the else clause indicate that it is optional.)

In this statement, P2 names a principal in the principal hierarchy; P1 names the current procedure or the special keyword caller. If it names the current procedure, it means the procedure's principal, as discussed in Section 5.1. If it names caller, it denotes the principal(s) that the procedure's caller has granted it the right to act for, as discussed later in Section 5.11.

The effect of if_acts_for is to execute the then block if the specified acts-for relationship exists. If the if_acts_for test fails, the else block, if any, is executed with no additional authority. If the test succeeds, the effective authority of the then block is increased to include P2

5.6.1 Revocation

It is possible that while a procedure is executing the then part of an if_acts_for statement, the principal hierarchy may change in a way that would cause the test in the statement to fail. In this case, it may be desirable to revoke the code's permission to run with that authority, and we assume the underlying system can do this, by halting the code's process, at some point after the hierarchy changes.

If a running program is killed by a revocation, information may be leaked about what part of the program was being executed. This constitutes a timing channel, and one that can be made slow enough that it is impractical to use.

5.7 Declassification

A program can explicitly declassify a value. The operation

declassify(e, L)
relabels the result of an expression e with the label L, using relabeling Rules 1 and 2 as needed.

Declassification is checked statically, using the effective authority at the point of declassification; the authorization for declassification must derive from a containing if_acts_for control structure. Declassification is legal as long as e permits declassification to L, which implies the following rule. Let LA be a label in which every principal in the effective authority is an owner, but with an empty reader set. The most restrictive label that e could have and still be declassifiable to L is L LA, so the declassify() expression is legal if e L LA. For example, if the principal A is part of the effective authority, the label {A: B,C; D: E} can be declassified to {A: C; D: E}, since {A: C; D:E} {A: Ø} = {A: Ø; D: E}, which is more restrictive than {A: B,C; D: E}.

5.8 Label Polymorphism

Consider a library routine such as the cosine function (cos). It would be infeasible to declare separate versions of cos for every label in the system. Therefore, we allow procedures to be generic with respect to the labels on their arguments, which means that only one cos function need exist.

If a label is omitted on the type of a procedure argument a, the argument label becomes an implicit parameter to the procedure, and may be referred to as a elsewhere in the procedure signature and body. For example, the cosine function is declared as follows:

cos(x: float) returns (y: float{x})

cos is generic with respect to the label on the argument x, and x is an implicit argument to the routine.

This signature allows cos to be used on any argument, and the label on the return value is always the same as the label on the argument. Since the code of cos does not depend on what x really is, its code need not access the label, so there is no need either to recompile the code for each distinct label or to pass the label at runtime. Therefore, implicit label polymorphism has no run-time overhead.

5.9 Run-time Labels

Implicit labels allow code to be written that is generic with respect to labels on arguments. However, sometimes more power is needed: for example, to model the accounts database of the bank example, where every customer account has a distinct label. To allow such code to be written, we support run-time labels.

A variable of type label may be used both as a first-class value and as a label for other values. For example, procedures can accept arguments with unknown labels, as in the following procedure declaration:

compute(x: int, lb: label) returns (float{xlb})

To simplify static analysis, first-class label variables are immutable after initialization. When a label variable is used as a label, it represents an unknown but fixed label. Because labels form a lattice and obey the simple rules of a lattice, static reasoning about this label is straightforward. For example, if the procedure compute contains the assignment z := x + y, where y has type int{lb}, the assignment is valid as long as it can be statically determined that zlbx. This condition can be checked even when z and y do not declare their labels explicitly, as discussed in Section 6.

Since label is a type, it too must be labeled wherever it is used. Constant labels and implicit label parameters have type label{}. Declarations of run-time labels can indicate the label's label explicitly; if the label's label is omitted, it treated like any other type: it is inferred in the case of a local variable, and it is implicit in the case of an argument (this is the situation for the lb argument of compute).

In principle, code that is written in terms of implicit labels can be expressed in terms of run-time labels. We provide implicit labels because when they provide adequate power, they are easier and cheaper to use than run-time labels. For example, without implicit labels the signature of the cos function would be the following:

  cos(x: float{lx}, lx: label{})
    returns (y: float{lx})

5.10 Labelcase

The labelcase statement is used to determine the run-time label on a value. It effectively allows a program to examine variables of type label. For example, compute might use labelcase to match the label lx against other labels it has available. A labelcase statement has the following form:

labelcase e as v
  when L1 do S1
  when L2 do S2
  [ else S3 ]

The effect of this statement is to execute the first statement Si such that e Li, introducing a new variable v containing the same value as e, but with the label Li.

The block label in the arm of the labelcase does not reflect the label of e, but it does reflect the label of e's label. Suppose the type of e is T{Le} where Le has the type label{Le}. Similarly, suppose the labels Li have respective types Li. The labelcase statement is valid only if each of the statements Si is valid given a basic block label Bi:

This formula says that selecting which statement to execute reveals information about the labels, but does not reveal anything about information protected by the labels. Therefore, the basic block labels Bi depend on Lj but not on Lj. The reason for joining all the Lj up to i is that the arms of the labelcase are checked sequentially, so each arm that fails conveys some information to its successors.

In a labelcase, neither the tested expression e nor the labels Li on any of the arms may mention the implicit labels on procedure arguments; rather, the labelcase is limited to constant labels and run-time labels. Implicit procedure argument labels are only intended to help write code that doesn't care what the labels are on the arguments. This restriction on the labelcase avoids the need to pass implicit labels to the generic code that uses them, as discussed in Section 5.8. Since labels may be passed as explicit arguments, and values of type label may be used to label types, no power is lost through this restriction.

5.11 Procedures

A procedure definition has the following syntax:

The optional authority clause indicates whether the procedure may be granted some authority by its caller. A procedure with an authority clause may try to claim this authority by using caller in an if_acts_for statement. The ai are the names of the arguments to the procedure. The arguments have types Ti and optional labels Li. As in variable declarations, labels in the arguments and results of a procedure signature may be simple label expressions, including joins of other labels.

A call to a procedure is legal only if each actual argument to the procedure is assignable to the corresponding formal argument. This means that the formals must have labels that are more restrictive than the block label of the caller at the time of the call, i.e., the normal rule for assignment given in Section 5.4 applies here. Additionally, bindings must be found for all the implicit label parameters such that for each explicit or implicit formal argument label Li and actual argument label Lai, the constraint LaiB Li holds, where B denotes the basic block label of the call site. Determining the bindings for the implicit labels that will satisfy all these constraints is not trivial, since the formal argument labels may be join expressions that mention other implicit argument labels, as in the signature f(a: int, b:int {aX}), where X is some other label. The efficient solution of such constraint systems is considered in Section 6.

Furthermore, if the authority clause is present, the caller may provide one or more principals that it acts for. Such a call can occur only within the scope of an if_acts_for statement, since otherwise the effective authority of the caller is nil. For example, the following call from the procedure p grants the authority of the principal my_principal to the procedure doit:

if_acts_for(p, my_principal) then

This model for granting authority protects the caller of a procedure because it can select what part of its authority to grant to the procedure. The implementor of the called procedure is also protected, because the procedure uses only the authority its implementor wishes to claim, and only where needed. (This is similar to the CACL model [RSC92], but provides more protection for the implementor.)

5.12 protected[T]

Run-time label checking is conveniently accomplished by using the special type protected[T]. A protected[T] is an immutable object that contains two things: a value of type T, and a label that protects the value.

The type protected[T] is particularly useful for implementing structures like file systems, where the information flow policies of individual objects cannot be deduced statically. For example, a file system directory is essentially an array of protected file objects. Navigating the file system requires a series of run-time label checks, but this work is unavoidable.

protected[T] has two methods: get, which extracts the contained value, and get_label, which extracts the contained label. It also has a constructor, protect, which creates a new protected value. The get method requires an expected label as an argument. The value is returned only if the expected label is at least as restrictive as the contained label, which is determined at run time. The expected label must be either a constant label or a variable label; implicit labels are not allowed. The procedural signatures of these operations are as follows:

protect(T: type, lb: label, v: T{lb})
  returns (p: protected[T]{lb})
    % Create a new "protected[T]" containing
    % the value "v" and label "lb".

get_label(p: protected[T]) returns (lb: label{p})
    % Return the label that is contained in "p"

get (p: protected[T], expect: label)
	returns (success: bool{p  expect},
	         v: T{expect  expect  p})
    % Return the value that is contained in "p" if
    % the label "expect" matches the contained label.
    % Set "success" accordingly.

If the language is extended to support some form of data abstraction with encapsulation, the type protected can be implemented in a natural manner by using the labelcase statement. Without these extensions, the closest approximation to protected[T]{lb} is the following type:

record[lb: label, x: T{lb}]{lb}

Like this type, the type protected[T] has the special property that the label on the reference to a protected[T] (the label lb) is assumed to be the label on the contained label, lb. This constraint can be maintained because protected[T] is immutable, and because L1L2 implies protected[T]{L1} protected[T]{L2}.

A protected[T] allows information flow via its contained label: one could pass information by storing different labels in protected[T] objects, and then seeing whether get operations succeed. However, the signature of get prevents an information leak because the success result is labeled both by the label on the label being passed in (lb), and by p's label (p), which is the same as the label on the contained label. Therefore, this information flow does not create a leak.

5.13 The Bank Example

The bank example from Section 2 is a good example of the need for run-time labels. Each customer account is owned by the individual customer, rather than by the bank, which gives the customer more confidence in the privacy offered by the bank. This example can be conveniently written using protected[T] to protect each customer's account with a label specific to the customer, as shown in Figure 7.

account = record [
    name: string{bank_label},
    balance: protected[float]{bank_label} ]

get_balance (name: string, customer: label,
	     accts: array[account{}]{})
        returns (success: bool {name  customer 
				    customer },
                 balance: float {name  customer 
    ... % find element "i" containing the right customer
    s: bool{name  customer  customer  bank_label}
    b: bool{name  customer  customer  bank_label}
    s, b := get(accts[i].balance, customer)
    if_acts_for (get_balance, Bank) then
       success := declassify(s, {name  customer 
       balance := declassify(b, {name  customer 
end get_balance

Figure 7: Bank Example

For example, the customer's account can be a simple record type, where the customer's name is protected by (and accessible to) the bank, but balance is protected by both the bank label and a customer label that is stored inside the protected[float]. This design gives the customers protection against dissemination of their balance information.

In this code, Bank represents a principal with the ability to declassify the bank_label label. The current balance of an account is obtained by the procedure get_balance, which accepts a first-class label as an argument. If the label customer that is passed to get_balance is at least as restrictive as the label in the account, the balance is returned as a float labeled by customer. The procedure fails either if no customer exists by that name, or if the label passed in is insufficiently protected.

5.14 Output Channels

Output channels show up as the special opaque type ``channel'' in the language. The type channel denotes an output channel with a hidden reader set, whose members denote the principals who are reading from the channel. Information can be written to a channel using the ``write'' procedure:

write(c: channel{lb}, s: string{lb}, lb: label{lb})

When a write is attempted, the label lb is compared against the hidden reader set within the channel. If it passes the tests of the effective reader set that are described in Section 3.5, the write is successful. Otherwise, it silently fails.

It is important that lb capture all the possible information flows that the write will cause, since otherwise write would not perform a sufficiently stringent test against the channel's reader set. Because lb is used to label s, the channel cannot leak information through the contents of the data that is sent out. Because lb is used to label the argument c as well, the channel cannot be used to leak information by choosing among a set of channels to write to. Finally, because lb labels itself, the channel cannot be used to leak information by changing the label that is passed in as the lb argument, and transmitting information by the fact of the write's success or failure.

Next: 6 Verification and Label Up: A Model for Decentralized Previous: 4 Checking Labels

Andrew C. Myers, Barbara Liskov

Copyright ©1997 by the Association for Computing Machinery