Information Flow for Web Applications

Lecturer: K. Vikram

Lecture notes by K. Vikram


This lecture is based in part on the following paper: Stephen Chong, Jed Liu, Andrew C. Myers, Xin Qi, K. Vikram, Lantian Zheng, Xin Zheng. Secure Web Applications via Automatic Partitioning. In Proceedings of the 21st ACM Symposium on Operating Systems Principles (SOSP'07), pages 31-44, October 2007.

More details and the download link for the Swift compiler can be found on the Swift website. Those interested in developing web applications using Swift can check out the Swift tutorial.

In previous lectures, we have discussed the concept and techniques of information flow control. One class of applications that can greatly benefit from information flow techniques is web applications. A web application is a program that runs on a server and is accessed by a client running a program called a browser. Typically the client-server communication happens using a standard protocol such as HTTP. Security is a very important concern for web applications, since such programs are visible to and accessible by a larger and potentially malicious audience, compared to centralized programs running on individual machines. Vulnerabilities such as cross-site scripting and SQL injection are common, but they are just instances of a larger class of vulnerabilities involving inappropriate use of information.

The early days of the web saw only static web pages, where the browser did nothing more than render page content in an easily consumed format. Any modification of the content required a round trip to the server. In recent years, a trend dubbed Web 2.0/AJAX has seen more active content move to the client in the form of JavaScript, significantly enhancing the interactivity and responsiveness properties of these applications. However, this design further complicates the problem of security.

We increasingly rely on web applications for important activities such as banking, financial transactions, email, social networking, online shopping and auctions. In addition, web applications are being used for tasks that were traditionally handled by desktop programs, such as word processing, spreadsheets and gaming. Given obvious advantages such as portability, ease of sharing and collaboration, and lower backup and hardware maintenance costs, the trend is only likely to continue. However, without adequate security controls, full-scale adoption is unlikely. In this lecture we shall see how information flow techniques can be applied to ease the development of robust, secure and performant web applications. We also discuss how the program security analysis techniques we saw in previous lectures can be extended to a real-world setting such as web applications and a general-purpose language like Java.

Programming Model

This work observes that current development practices for web applications are at a nascent stage. They are inspired by models and languages developed for environments where programs ran on individual machines. As a result, any web application today is a mishmash of HTML, CSS and DOM code describing the UI, JavaScript/Flash code or Java Applets managing active content in the browser, PHP/Perl/Ruby/JSP code programming the server and SQL code managing fetching and storing of persistent data. This state of affairs not only makes development difficult, since expertise on a variety of technologies is needed to build an entire application, it also makes the code harder to read, understand and maintain. Even worse, it makes such code prone to security vulnerabilities. Despite their use in critical environments such as finance and e-commerce, current development techniques offer no understanding of the security properties of web applications.

Swift is built on the philosophy of a unified programming and data model. All of the source code is written as one standalone sequential program in a Java-like programming language. Using a unified model not only makes it easier to develop and maintain web applications; more importantly, it makes reasoning about their security a lot easier. For example, we can apply the typing rules we discussed in an earlier lecture to achieve end-to-end enforcement of the desired security policies. The source code therefore also contains the security policies associated with the types of program variables. We will discuss this in detail in the next section.

Swift thus preserves useful features of conventional languages such as object-oriented programming, static typing/subtyping and exceptions. In addition, it provides features useful for developing web applications, such as a unified programming/data model and the expression of declarative security policies in the source program. The heavy lifting is done by the compiler, which performs the information flow type analysis and then translates the program into a combination of HTML, JavaScript and Java Servlet code. The compiler decides, for each individual statement, whether it is placed on the client, on the server, or replicated on both. As a result, Swift applications are secure by construction; Swift makes interactive web applications secure and easier to write.

As a quick introduction, let us discuss the simple Hello World Swift application shown below. On executing it, it prints a string "Hello World" in the browser.

package hello;
 
import webil.ui.*;
 
public class Hello {
 
    final label{*<-*} cl = new label{Client->; Client<-};

    Text[cl, cl]{Client->; Client<-} message;
    final Panel[cl, cl]{Client->; Client<-} mainPanel;
 
    public Hello{Client->; Client<-}()
    {
        message = new Text("Hello World");
        mainPanel = RootPanel.getRootPanel(Client);

        if(mainPanel != null) {
            mainPanel.addChild(cl, cl, message);
        }
    }
 
    public static void main{Client->; Client<-}()
    {
        final Hello h = new Hello();
    }
}

As you can see, the entire application is written as if it were executing on a single machine, in a language that looks like Java once the labels are ignored (a label is anything within braces on a single line: {...}). Package declarations, import statements, and class, field and method declarations have the same meaning as in Java. One of the classes needs to contain the main method, which is the entry point into the application. Its invocation corresponds to the client browser's first request to the server, which triggers the start of the application. The UI model is similar to that of the AWT or Swing packages. An instance of RootPanel represents the top-level UI widget, similar to window.document in the DOM.

Note that the programmer does not need to specify which code runs on the client and which code on the server. In general, placing code on the client is good because it reduces server load and reduces the number of high-latency round trips to the server. However, security sensitive code needs to be placed on the server. This tension between security and performance is resolved by the compiler automatically based on its analysis of the source code and its security policies. In the above example, since there is no security sensitive code or data, the compiler automatically places all code on the client.

The Swift Language

Now that we have a feel for the language, let us dive into a few more details so that we can then build a bigger and more interesting application.

Security Model

Information security concerns are expressed as labels. A label is a set of one or more confidentiality and one or more integrity policies. These policies are specified by principals, which are entities that have a stake in the security of the system. For example a social network application might have one principal corresponding to each user, and one principal corresponding to each group of users. The deployer of the application can also be represented by a principal. The Swift language itself has two built-in principals: * and Client, representing the server and the client browser of the current session. More principals can be created, if required, as instances of a class that implements the Principal interface. In the Hello example above, we used only the built-in principals.

A confidentiality policy is expressed using a right arrow ->. A policy in which a principal o allows principals r1, r2, ..., rn to read is written as o->r1,r2,...,rn. By default the principal always allows itself to read, and is therefore omitted from the right side of the arrow. An integrity policy, on the other hand, is written using a left arrow <-. A policy in which a principal o allows principals w1,w2,...,wn to write is written as o<-w1,w2,...,wn. In the integrity policy too, o appears on the right side of the arrow implicitly. A label is composed of multiple confidentiality policies c1,c2,...,cn and integrity policies i1,i2,...,in, enclosed in braces and separated by semicolons as in {c1;c2;...;cn;i1;i2;...;in}. For example, the label {Client->; Client<-} means that the Client principal dictates that reads can be performed only by the Client principal and that writes can be performed only by the Client principal. This model is popularly known as the decentralized label model. The notion of security is decentralized and relative to particular principals.
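To make the notation concrete, here is a minimal Python sketch that parses a label written in the syntax above into a set of policies. It is only an illustration of the notation, not part of Swift; the names Policy and parse_label are hypothetical, and the sketch assumes well-formed input in which each policy contains exactly one arrow.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Policy:
    owner: str
    kind: str            # "conf" for ->, "integ" for <-
    others: frozenset    # readers (conf) or writers (integ), owner included

def parse_label(text):
    """Parse a label such as '{alice->bob; bob<-alice}' into a set of policies."""
    policies = set()
    body = text.strip().strip("{}")
    for part in body.split(";"):
        part = part.strip()
        if not part:
            continue
        if "->" in part:
            kind, (owner, rest) = "conf", part.split("->")
        elif "<-" in part:
            kind, (owner, rest) = "integ", part.split("<-")
        else:
            raise ValueError(f"not a policy: {part!r}")
        owner = owner.strip()
        names = {n.strip() for n in rest.split(",") if n.strip()}
        names.add(owner)   # the owner is always implicitly a reader/writer
        policies.add(Policy(owner, kind, frozenset(names)))
    return frozenset(policies)
```

For instance, parse_label("{Client->; Client<-}") yields one confidentiality policy and one integrity policy, each owned by Client and naming only Client.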

Swift also supports expression of trust relationships between principals in the system. There is a single relation between principals called actsfor. If a principal P actsfor principal Q, it means that P is always allowed to do anything (e.g. read or write data) that Q is allowed to do. In the social network example, a user principal actsfor the group principal it is a member of. This allows it to do anything that the group principal is allowed to do. Also, an implicit relationship exists between the built-in principals, in that * actsfor Client. This mirrors reality, since any client that uses the server, implicitly trusts it to perform any computation on any of its data, to service its request. The threat model in this system is that the client is buggy or malicious, but the server is trusted.
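The acts-for relation can be pictured as reachability in a directed graph of principals. The Python sketch below assumes, as in the decentralized label model, that the relation is reflexive and transitive; the class name ActsFor and the principal names alice and friends are made up for illustration.

```python
class ActsFor:
    """Acts-for hierarchy stored as direct edges; queries follow chains of edges."""

    def __init__(self):
        self._direct = {}   # principal -> set of principals it directly acts for

    def add(self, p, q):
        """Record that p acts for q."""
        self._direct.setdefault(p, set()).add(q)

    def holds(self, p, q):
        """True if p acts for q, directly, transitively, or because p == q."""
        seen, stack = set(), [p]
        while stack:
            cur = stack.pop()
            if cur == q:
                return True
            if cur in seen:
                continue
            seen.add(cur)
            stack.extend(self._direct.get(cur, ()))
        return False

hierarchy = ActsFor()
hierarchy.add("*", "Client")        # the built-in relationship: the server acts for the client
hierarchy.add("alice", "friends")   # a user acts for a group she belongs to
```

With this hierarchy, hierarchy.holds("*", "Client") is true while hierarchy.holds("Client", "*") is false, matching the threat model in which the server is trusted and the client is not.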

The set of labels forms a lattice, where the partial order (⊑) is defined on two labels as follows: l1 ⊑ l2 if l2 allows no more principals to read than l1 does and l2 allows no fewer principals to influence than l1 does. In other words, l2 is at least as restrictive as l1. For example {alice->bob,chuck} ⊑ {alice->bob} and {alice<-bob} ⊑ {alice<-bob,chuck}. The security policy we saw in the previous lecture allowed only two security levels: public (P) and secret (S). The model here adds integrity levels and generalizes them to a potentially infinite lattice. This allows expression of rich security policies for a highly dynamic setting such as web applications. As before, we enforce termination-insensitive noninterference. Whereas earlier it meant that information cannot flow from secret variables to public variables, in this setting it means that information can flow from a variable labeled l1 to a variable labeled l2 only if l1 ⊑ l2.
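The ordering can be checked mechanically. The following Python sketch is a simplification that ignores the acts-for relation and compares only policies with the same owner. A label is represented, as an assumption of this sketch, by a pair of dictionaries (confidentiality, integrity) mapping each owner to its reader or writer set with the owner included.

```python
def flows_to(l1, l2):
    """Return True if l1 ⊑ l2, comparing only policies that share an owner."""
    conf1, integ1 = l1
    conf2, integ2 = l2
    # Confidentiality: each policy of l1 must be enforced at least as strictly by l2,
    # so l2 must not add readers.
    for owner, readers1 in conf1.items():
        if owner not in conf2 or not conf2[owner] <= readers1:
            return False
    # Integrity is dual: every guarantee l2 claims must already hold for l1,
    # so l1 must not have writers that l2 rules out.
    for owner, writers2 in integ2.items():
        if owner not in integ1 or not integ1[owner] <= writers2:
            return False
    return True

# The two examples from the text.
a = ({"alice": {"alice", "bob", "chuck"}}, {})
b = ({"alice": {"alice", "bob"}}, {})
c = ({}, {"alice": {"alice", "bob"}})
d = ({}, {"alice": {"alice", "bob", "chuck"}})
```

Here flows_to(a, b) and flows_to(c, d) hold, while the reverse directions do not.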

Language Constructs

Swift is a security-typed language, meaning that security labels are attached to types. For example, the following declaration uses a label containing two policies separated by a semicolon:

int{alice->bob,alice; bob<-alice} y;

It means that the information in y is considered sensitive by alice, who considers that it can be released securely only to bob and alice; and further that it is considered trustworthy by bob, who believes that only alice should be allowed to affect it. In general, y could be a local variable or a field. These policies might look like access control policies, but in fact these information flow policies generalize access control policies. Where access control only enforces security at the point where information from y is being released, information flow enforces security throughout the system.

As before, both explicit and implicit flows are controlled by the type system. Consider the following code snippet, continuing from the one above:

int {bob->bob} x;
int {alice->bob; bob->alice} z;
if (x == 0)
  z = y;

This code causes an explicit information flow from y to z. This is a secure flow w.r.t. confidentiality since {alice->bob,alice} ⊑ {alice->bob}, i.e. conf-policy(y) ⊑ conf-policy(z). It is also secure w.r.t. integrity because z has no integrity policy; the integrity of y (bob<-alice) is extra. More subtly, the code also causes an implicit information flow from x to z, because inspecting z after the code runs may impart information about x, even if the assignment from y never happens. The implicit flow from x to z is secure if alice->bob is at least as restrictive as bob->bob, meaning {bob->bob} ⊑ {alice->bob}. In general, this condition does not hold, because the policy bob->bob is owned by bob, who does not necessarily trust alice, the owner of alice->bob, to enforce it on his behalf. However, the implicit flow would be secure if alice acts for bob, meaning that bob trusts alice completely, and as a result, alice->bob is at least as restrictive as bob->bob.
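The reasoning about x, y and z can be replayed with a small Python sketch of the confidentiality check. It uses the standard decentralized-label-model rule that a policy owned by o2 covers a policy owned by o1 when o2 acts for o1 and every reader of the covering policy acts for some reader of the covered one. The representation (lists of (owner, readers) pairs, acts-for as a set of (superior, inferior) pairs) and all function names are assumptions of this sketch, and integrity is left out.

```python
def acts_for(p, q, edges):
    """Reflexive, transitive acts-for over a set of (superior, inferior) pairs."""
    seen, todo = set(), [p]
    while todo:
        cur = todo.pop()
        if cur == q:
            return True
        if cur not in seen:
            seen.add(cur)
            todo.extend(b for a, b in edges if a == cur)
    return False

def covers(c2, c1, edges):
    """True if confidentiality policy c2 is at least as restrictive as c1."""
    (o1, r1), (o2, r2) = c1, c2
    return acts_for(o2, o1, edges) and all(
        any(acts_for(reader2, reader1, edges) for reader1 in r1)
        for reader2 in r2)

def conf_flows_to(l1, l2, edges=frozenset()):
    """l1 ⊑ l2 for confidentiality: every policy in l1 is covered by one in l2."""
    return all(any(covers(c2, c1, edges) for c2 in l2) for c1 in l1)

def assignment_ok(source, pc, target, edges=frozenset()):
    """Check the explicit flow (source) and the implicit flow (branch condition)."""
    return conf_flows_to(source, target, edges) and conf_flows_to(pc, target, edges)

# Confidentiality parts of the labels in the snippet, owners included as readers.
x = [("bob", {"bob"})]
y = [("alice", {"alice", "bob"})]
z = [("alice", {"alice", "bob"}), ("bob", {"bob", "alice"})]
```

Under these assumptions, assignment_ok(y, x, z) is false because of the implicit flow from x, while assignment_ok(y, x, z, {("alice", "bob")}) is true once alice acts for bob.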

Labels and principals can be first class objects in the program; they are called dynamic labels/principals. This enhances the expressiveness of the language, since it can express security policies that vary according to runtime conditions. In the Hello example, the field cl is a dynamic label. For them to be meaningful, they have to be declared final, so that the associated policies cannot change after creation. Since dynamic labels are themselves values of type label, they have labels associated with them. In the Hello example, cl has a label indicating that it is a public and trusted value.

The label in main is called a begin label. It is similar to the τ cmd type we saw in the previous lecture. A method with a begin label l assigns only to variables of security level l or higher.

Swift allows classes to be parameterized with labels or principals, similar to Java generics. In the Hello example, the Text widget is parameterized with two labels.

You may have noticed that the variable h in the main method does not have a declared label. Labels can be omitted only for local variables. The compiler automatically infers a suitable label for them, based on the context.

Example Swift Application

Let us now consider a slightly more sophisticated example. Suppose we want to build a web application where a random number between 1 and 10 is picked by the system. The user has three chances to guess the number and wins if a guess is correct. For each guess, the application performs a bounds check to ensure that the number is between 1 and 10. Then it compares it with the stored random number and notifies the user if the guess was correct. Using current technologies such as PHP, a simple-minded programmer might place all code and data on the server. Each guess from the user would require a round trip to the server.

In an attempt to improve the performance of the application, the programmer tries to avoid these network messages to get a responsive implementation. He does that by moving most of the code and data to the client, as JavaScript.

Placing the entire code on the client would result in a very responsive application but would not be secure. A malicious user can peek into his browser and look at the true number. This violates the confidentiality requirement: the user should not learn the true number until after the game is over. A malicious user can also influence the execution to take extra guesses or just lie about whether a guess was correct. This violates the integrity requirements: the match between the guess and the true number should be computed in a trustworthy manner and the number of attempts should not be more than three.

However, suppose guesses that are not valid numbers between 1 and 10 do not count against the user. Then it is secure and indeed preferable to perform the bounds check on the client side. This saves the roundtrip to the server in case the guess is out of bounds. Let us see what the source code for this application looks like and how the compiler ends up placing the code.

public class GuessNumber authority (*) {
    final label{*<-*} cl = new label {*->Client};
    private int{*->*; *<-*} secret;
    int{*->Client; *<-*} tries;
    ...
    private void setupUI{*->Client}() {
      guessbox = new NumberTextBox[cl, cl]("");
      message = new Text[cl, cl]("");
      button = new Button[cl, cl]("Guess");
      GuessListener{*<-Client} guessLi = new GuessListener(this);
      button.addListener(guessLi);
      ...
      rootpanel.addChild(cl, cl, guessbox);
      rootpanel.addChild(cl, cl, button);
      rootpanel.addChild(cl, cl, message);
    }

    void makeGuess{*->Client;*<-Client}(Integer{*->Client;*<-Client} num)
        where authority (*), endorse ({*->Client; *<-*})
        throws NullPointerException
    {
        int i = 0;
        if (num != null) i = num.intValue();
        endorse(i, {*->Client;*<-Client} to {*->Client; *<-*}) 
            if (i >= 1 && i <= 10) {
                if (this.tries > 0 && i == secret) {
                    declassify ({*->*; *<-*} to {*->Client; *<-*}) {
                        this.tries = 0;
                        finishApp("Bingo. You Win!");
                    }
                }
                else {
                    declassify ({*->*; *<-*} to {*->Client; *<-*}) {
                        this.tries = this.tries - 1;
                        if (this.tries > 0) {
                            if (message != null && guesses != null) {
                                message.setText("Try again");
                                guesses.addChild(lb, lb, new Text(Integer.toString(i)));
                            }
                        } else {
                            finishApp("Sorry! Tries Exhausted. Game Over");
                        }
                    }
                }
            }
            else {
                if (message != null) {
                    message.setText("Number out of Range");
                }
            }
    }
}
...
class GuessListener 
    implements ClickListener[{*->Client;*<-Client}, {*->Client;*<-Client}] {
    ...
    public void onClick{*->Client;*<-Client} (
        Widget[{*->Client;*<-Client}, {*->Client;*<-Client}]{*->Client;*<-Client} w)
    {
        if (this.guessApp != null) {
           NumberTextBox tb = guessApp.guessbox;
           if (tb != null) {
               guessApp.makeGuess(tb.getInteger());
               tb.setText("");
               tb.setFocus(true);
           }
        }
    }
}

More Language Constructs

The Guess example shows a more sophisticated use of the UI library. A listener is registered on the "guess" button. On clicking it, the onClick method is invoked by the system. This ends up invoking the makeGuess method, which contains the core application logic.

Note how easy it is to write this code. Implementing the same functionality using current technologies would require hundreds of lines of code in various languages, with explicit control transfers between the client and server. The application security policy is expressed as label annotations on the fields secret and tries. The act of writing these two labels constrains many of the other labels in the program, so that the information flow analysis by the compiler can succeed.

Labels are sometimes not expressive enough to specify application information flow policies. For example, a simple secret label on the password in a login program will never pass the information flow analysis. This is because each login attempt reveals some information about the secret password, thus violating the policy on the password. A declassify statement lets the programmer specify such information flow requirements. Correspondingly, an endorse statement allows special integrity information flows.

In the Guess example, if the input from the client satisfies the bounds check, it is endorsed to make it trusted. This allows modification of the tries field. This is legal as per application semantics. The result of the match between the guess and the secret is declassified so that the user can learn about it. Again, this is allowed by the application even though some information about the secret is leaked to the user. The declassify and endorse statements are called downgrade statements and they form part of the security policies associated with the application.

Since these downgrade statements are dangerous and can be abused, their use is controlled using the notion of authority. A downgrade statement requires the authority of the principal whose policy is being downgraded. In the Guess example, this is the server principal, which is why the makeGuess method carries the authority (*) clause.
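As a rough run-time analogue of what Swift checks statically, the following Python sketch models values that carry two bits (secret and trusted) and downgrade operations that are refused without the authority of the server principal *. The names Labeled, endorse, declassify, make_guess and AuthorityError are hypothetical, and the two-bit labels are a deliberate simplification of the label model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Labeled:
    value: object
    secret: bool    # True if the client may not read the value
    trusted: bool   # True if the client cannot have influenced the value

class AuthorityError(Exception):
    pass

def endorse(item, authority):
    """Raise the integrity of a value; needs the authority of the server principal."""
    if "*" not in authority:
        raise AuthorityError("endorse requires the authority of *")
    return Labeled(item.value, item.secret, True)

def declassify(item, authority):
    """Lower the confidentiality of a value; needs the authority of *."""
    if "*" not in authority:
        raise AuthorityError("declassify requires the authority of *")
    return Labeled(item.value, False, item.trusted)

def make_guess(guess, secret, authority):
    """Mirror the endorse, bounds check, and declassified comparison of makeGuess."""
    i = endorse(guess, authority)
    if not 1 <= i.value <= 10:
        return None   # an out-of-range guess does not count
    # The raw comparison is as secret as its inputs and as trusted as both of them.
    match = Labeled(i.value == secret.value,
                    i.secret or secret.secret,
                    i.trusted and secret.trusted)
    return declassify(match, authority)
```

Calling make_guess with an empty authority set raises AuthorityError, which corresponds to the compiler rejecting a downgrade in a method that lacks the authority clause.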

The Swift Programming System Internals

The Swift Compiler

Although understanding the compiler is not necessary to develop Swift programs, it can help appreciate what is going on and can help deepen understanding of debug messages. Let us focus on how the compiler processes the code for the makeGuess method.

Label Projection. If the information flow analysis succeeds, the compiler generates placement constraints on fields and statements based on their labels. For instance, a secret field has to be placed on the server and not on the client. A trusted field has to be placed on the server and can optionally be placed on the client. Downgrade statements are removed since they do not correspond to any runtime computation. The labels themselves are also removed after having been projected to either the client or the server. The projection is done according to the following map:

                                            client can write            client cannot write
                                            (low integrity)             (high integrity)
client can read (low confidentiality)       S?C? (client or server)     ShC? (server and maybe client)
client cannot read (high confidentiality)   S (server only)             Sh (server only)

The entire space of labels is mapped to one of the four quadrants. At the end of the label projection phase, the code looks something like this:

1  public class GuessNumber authority (*) {
2  Sh:     private int secret;
3  C?Sh:   int tries;
4          ...
5          auto void makeGuess(Integer num)
6          {
7  C?S?:     int i = 0;
8  C?S?:     if (num != null) i = num.intValue();
9  C?Sh:     if (i >= 1 && i <= 10) {
10 Sh:         if (this.tries > 0 && i == secret) {
11 C?Sh:         this.tries = 0;
12 C?S?:         finishApp("Bingo. You Win!");
13             } else {
14 C?Sh:         this.tries = this.tries - 1;
15 C?S?:         if (this.tries > 0) {
16 C:              if (message != null && guesses != null) {
17 C:                message.setText("Try again");
18 C:                guesses.addChild(lb, lb, new Text(Integer.toString(i)));
19                 } else {
20 C?S?:             finishApp("Sorry! Tries Exhausted. Game Over");
21                 }
22               }
23             }
24           } else {
25 C:          if (message != null) {
26 C:            message.setText("Number out of Range");
27             }
28           }
29         }
30 }
...

The secret and tries fields have placement constraints as discussed. Lines 7 and 8 compute using the untrusted input from the client and thus can happen on either the client or the server. By the time we reach Line 9, the client input has been endorsed and is now trusted. Thus the bounds check on Line 9 has to happen on the server. It can optionally also happen on the client. Line 10 computes using a secret value and also checks whether attempts have been exhausted. Both of these checks need to happen only on the server, denoted by Sh. Lines 11 and 14 update the trusted tries field, and thus have to happen on the server and optionally on the client. Line 15 involves only the read of a trusted, public field and so has no constraints. Lines 16-18 and 25-26 update the UI and have to happen on the client.
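The projection map used above can be read as a small function of two facts about a label: whether the client can read the data and whether the client can write it. The Python sketch below encodes the four quadrants; the function names are hypothetical, and the sketch does not cover the C annotation, which comes from UI statements that must run in the browser rather than from labels.

```python
def project(client_can_read, client_can_write):
    """Map a label's two client-relevant bits to a placement annotation."""
    if client_can_read:
        return "S?C?" if client_can_write else "ShC?"
    return "S" if client_can_write else "Sh"

def allowed_hosts(placement):
    """Hosts on which data or code with this placement may reside."""
    hosts = {"server"}
    if "C" in placement:
        hosts.add("client")
    return hosts

# Labels from the Guess example, reduced to the two bits.
secret_placement = project(client_can_read=False, client_can_write=False)
tries_placement = project(client_can_read=True, client_can_write=False)
input_placement = project(client_can_read=True, client_can_write=True)
```

Under this reading, secret projects to Sh, tries to ShC? and the untrusted client input to S?C?.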

Program Partitioning. In the next phase, the compiler splits the program between the client and the server to optimize for performance while preserving the above constraints. In optimizing for performance, we seek to minimize the number of network messages. The compiler constructs a control flow graph to model program execution. It then runs a min-cut algorithm on this graph to generate the client and server partitions of the original program. If the weights on the control flow graph are accurately related to the probability of each edge being traversed, the min-cut algorithm minimizes the number of client-server round-trip messages. In reality, the weights are approximate and the min-cut algorithm runs on a somewhat different graph to allow for the possibility of replicating code on both the client and the server. This works well for typical web applications. At the end of this phase, the two partitions would be:

Client partition:

public class GuessNumber authority (*) {
  
  int tries;
  ...
  void makeGuess(Integer num)
  {
    int i = 0;
    if (num != null) i = num.intValue();
    if (i >= 1 && i <= 10) {
 
        this.tries = 0;
        finishApp("Bingo. You Win!");

        this.tries = this.tries - 1;
        if (this.tries > 0) {
          if (message != null && guesses != null) {
            message.setText("Try again");
            guesses.addChild(lb, lb, new Text(Integer.toString(i)));
          } else {
            finishApp("Sorry! Tries Exhausted. Game Over");
          }
        }

    } else {
      if (message != null) {
        message.setText("Number out of Range");
      }
    }
  }
}
...

Server partition:

public class GuessNumber authority (*) {
  private int secret;
  int tries;
  ...
  void makeGuess(Integer num)
  {


    if (i >= 1 && i <= 10) {
      if (this.tries > 0 && i == secret) {
        this.tries = 0;

      } else {
        this.tries = this.tries - 1;
        if (this.tries > 0) {






        }
      }
    } else {



    }
  }
}
...

Note that the input validation code is replicated on the client and the server. This is an improvement over current development techniques, which require the programmer to duplicate this code by hand. Swift places it on the client for responsiveness and on the server for integrity.
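The partitioning step can be illustrated with a small min-cut computation. The Python sketch below is a simplification under several assumptions: it ignores replication (so the bounds check is simply pinned to the server), the node names and edge weights are invented, and it uses the Edmonds-Karp max-flow algorithm, which is one standard way to obtain a min cut and is not necessarily what the Swift compiler uses. Statements pinned to a host are tied to it with infinite-capacity edges, and the cut weight stands in for the expected number of client-server messages.

```python
from collections import deque

INF = float("inf")

def min_cut(capacity, source, sink):
    """Edmonds-Karp max-flow; returns (cut weight, nodes on the source side)."""
    # residual[u][v] is the remaining capacity on edge u -> v.
    residual = {u: dict(vs) for u, vs in capacity.items()}
    for u, vs in capacity.items():
        for v in vs:
            residual.setdefault(v, {}).setdefault(u, 0)
    total = 0
    while True:
        # Breadth-first search for an augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total, set(parent)   # nodes still reachable form the source side
        # Find the bottleneck along the path, then push that much flow.
        bottleneck, v = INF, sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck

def build(pinned_server, pinned_client, control_edges):
    """Build a capacity map with SERVER as the source and CLIENT as the sink."""
    cap = {"SERVER": {}}
    for n in pinned_server:
        cap["SERVER"][n] = INF
    for n in pinned_client:
        cap.setdefault(n, {})["CLIENT"] = INF
    for u, v, w in control_edges:   # control transfers can cross in either direction
        cap.setdefault(u, {})[v] = w
        cap.setdefault(v, {})[u] = w
    return cap

# Hypothetical statements and weights loosely modelled on makeGuess.
edges = [("on_click", "read_input", 10), ("read_input", "bounds_check", 5),
         ("bounds_check", "compare", 8), ("compare", "format_guess", 3),
         ("format_guess", "show_result", 6), ("bounds_check", "range_msg", 2)]
graph = build({"bounds_check", "compare"},
              {"on_click", "show_result", "range_msg"}, edges)
messages, server_side = min_cut(graph, "SERVER", "CLIENT")
```

In this toy graph the two unpinned statements, read_input and format_guess, both end up on the client, because cutting the lighter edges next to the server-pinned statements is cheaper than cutting the heavier edges next to the client-pinned ones.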

Code Generation. In the next phase, the compiler generates the client and server classes and the supporting code that lets each side invoke statements on the other. Each block of contiguous statements on a host is encapsulated into its own method. The client-side classes are then translated into JavaScript using the GWT compiler. The server-side classes stay as Java code and are executed atop a servlet engine on the server.
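The idea of encapsulating each block of contiguous statements on a host into its own method can be sketched in a few lines of Python. The input format, a list of (host, statement) pairs in program order, and the function name are assumptions made for this illustration and do not reflect the compiler's actual intermediate representation.

```python
from itertools import groupby

def split_into_blocks(placed):
    """Group consecutive (host, statement) pairs that share a host."""
    return [(host, [stmt for _, stmt in group])
            for host, group in groupby(placed, key=lambda pair: pair[0])]

placed = [
    ("client", "int i = 0;"),
    ("client", "if (num != null) i = num.intValue();"),
    ("server", "check 1 <= i <= 10"),
    ("server", "compare i with secret"),
    ("client", 'message.setText("Try again");'),
]
blocks = split_into_blocks(placed)
```

Each resulting block would become one generated method, and every boundary between adjacent blocks is a point where control is transferred between the client and the server.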

The Swift Runtime

The Swift Runtime is a layer of code on which the client and server subprograms execute concurrently, simulating the execution of the original Swift program while enforcing its security requirements. The runtime manages client-server communication and state synchronization.

Let us consider a run of the program and observe how the client and server communicate. Let us assume the true number is 7 and the user guesses 6. The client executes the code up to the bounds check. At this point, the client sends a message to the server asking it to execute this statement. The check succeeds on the server and execution proceeds to compare the guess with the true number. This check fails and a message is sent back to the client to execute the code in the else branch. Any updates made to local variables since control arrived on the server are piggybacked on this control transfer message. The decrement of the tries field is replicated on the client and the server.

Control Flow Integrity. The above run suggests that the client can always send a message to the server with values of local variables, asking the server to execute any code. This can be abused by a malicious user. To protect against this, the server maintains enough state about expected control flow and enforces those expectations. In addition, the server does not accept updates to local variables with high integrity.
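A rough Python sketch of these two server-side checks follows. The class ServerGuard, its method names, and the choice to reject an offending request with an exception are all assumptions of this sketch; the actual Swift runtime tracks richer control-flow state than a flat set of expected entry points.

```python
class ServerGuard:
    """Accept a control transfer only if it is expected and touches no trusted state."""

    def __init__(self, high_integrity_vars):
        self.high_integrity_vars = set(high_integrity_vars)
        self.expected = set()   # entry points the client may legitimately request
        self.locals = {}        # server copy of local variables

    def expect(self, *entry_points):
        """Record where the client is allowed to transfer control next."""
        self.expected = set(entry_points)

    def handle(self, entry_point, updates):
        """Validate a request from the client and apply its variable updates."""
        if entry_point not in self.expected:
            raise PermissionError(f"unexpected control transfer to {entry_point!r}")
        for name in updates:
            if name in self.high_integrity_vars:
                raise PermissionError(f"client may not update {name!r}")
        self.locals.update(updates)
        return entry_point

guard = ServerGuard(high_integrity_vars={"tries"})
guard.expect("bounds_check")
guard.handle("bounds_check", {"i": 6})   # a legitimate request carrying the guess
```

A request to jump straight to the comparison, or one that tries to overwrite tries, would raise PermissionError in this model.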