Changes:

11/21: The desired interval in 1(d) was corrected and the problem part was clarified.

CS 4120: Introduction to Compilers
Fall 2011

Problem Set 5: Program Analysis and Advanced Language Features

due: Monday, November 21, 11:59PM

  1. Bounding intervals

    We want to design a dataflow analysis that computes conservative intervals bounding the values of all integer variables. An analysis along these lines could be used for eliminating bounds checks, for example. We extend the set Z of integers with plus and minus infinity: Z = Z ∪ {+∞, −∞}, such that −∞ < n, and n < +∞ for any integer n. We then define a lattice over the set L = {[a, b] | a, bZab} ∪ {⊤}.

    1. Explain what the element ⊤ represents and why we need it for this analysis. Define the partial order and the meet operator for elements in this lattice (including ⊤).
    2. Using this lattice to compute ranges of variables will fail in general. Explain why.
    3. To solve the problems from part (b), we define a lattice L′ = {[a, b] | a, b ∈ {−∞, -k, -2, −1, 0, 1, 2,...,k, +∞} ∧ ab } ∪ {⊤} (with the same partial order as before, and for some constant k, and build a dataflow analysis that computes ranges in L′ . Show the transfer functions for the following nodes:
      • assignment to a constant x = n
      • addition x = y + z
      • multiplication x = y * z
    4. This is an example of an analysis where it makes sense to propagate different information along different out edges from a node. Give a suitable transfer function for each of the out edges of a comparison x < n, and show that with this analysis we can derive a bound [10, 10] for variable i at the exit from the following example:
      i = 0
      while (i < 10) {
          i = i + 1
      }
      
  2. Defending against zombies with dataflow analysis

    Let us define "undead" code as code which depends on a variable that is always uninitialized. When such undead code is removed, additional program regions may become undead due to the disappearance of variable declarations. The goal of this exercise is to remove all undead code from a function using only a single analysis pass.

    1. Design a dataflow analysis that can be used for cascading undead code removal. Describe its domain, the meet operator, the meet of the empty set, as well as the transfer function. Where necessary, be conservative. You only need to specify the transfer function for the following cases:
      • start
      • x=c
      • if x
      • x=y+z
      • x=f(y)
      • x=[y]
      • [x]=y
      Here, c denotes a constant and x,y,z are variables.
    2. Show that the transfer functions you defined are monotonic, and either show that they are distributive or construct a counterexample.
    3. Show that one run of your analysis leads to the removal of the following grayed-out undead code:
      1. a = 1
      2. if (f(a) > 0) {
      3.   c = c+1
      4.   d = f(c)
      5. }
      6. [d] = a+c
      7. g = a+d
  3. Object layout

    Consider the following class declarations in a Java-like language with subtyping, inheritance, and method overloading, but with a simple type hierarchy in which no class has two parents:

    class A {
        int a;
        int f() { code 1 }
    }
    
    class B extends A {
        int b;
        int f() { code 2 }
        int g(int x) { code 3 }
        int h() { code 4 }
    }
    
    class C extends A {
        int c;
        void f(float x) { code 5 }
        int j() { code 6 }
        int g() { code 7 }
    }
    

    Assuming the simple dispatch table scheme with sequentially assigned method indices, draw the memory layout of three objects of classes A, B, and C respectively, including the objects themselves, their dispatch tables and the code for all their methods.

    Now, suppose instead that the language supported multiple inheritance and there was a class D defined as follows.

    class D extends B, C {
        int d; 
        int f() { code 8 }
    }
    

    Assuming the language implementation is based on sparse dispatch tables, draw what objects of all four classes might look like, and give a table showing the indices of each method.

  4. Suppose Xi were extended with higher-order lexically-scoped functions. Then we could write the following code:
    f(n:int, g1:()→int, g2:()→int) = {
        x: int = n + 12;
        g(): int = { return x; }
        if (n == 1)
    	return g() + g1() + g2(); // X
        else
    	return f(n-1, g, g1);
    }
    

    Note that the function call f(3, g0, g0) returns 42 regardless of what function g0 is passed in. Assuming heap-allocated activation records, draw what the stack and heap would look like at the point where the statement marked X is executed. Identify any assumptions you are making about the implementation.