Brief comments re: Prelim #2.

 

Today:

preserving the semantics of the language

 

 

Compilers versus Evaluators (or: all about PS#6!)

The evaluator takes a program as input and runs it, returning its value

Evaluator contract is

Program ---> [ Evaluator ] ---> Value

Compiler contract is

Program ---> [ Compiler ] ---> Program' ---> [ Evaluator ] ---> Value

where we preserve the semantics of the language:

(eval P env) = (eval (compile P) env)
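As a hedged illustration of this contract (a sketch only; the names compile and the-global-environment stand in for whatever PS#6 actually provides), a test like this should always return #t:

(define (check-compiler-contract program)
  ;; the compiler must preserve semantics: running the compiled
  ;; program must give the same value as running the original
  (equal? (eval program the-global-environment)
          (eval (compile program) the-global-environment)))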

Typically the output of the compiler is a different language (such as

PPC assembler, which is interpreted by the PPC chip). In CS212, the

output of the compiler will be a subset of Scheme.

Our compiler thus performs a source-to-source transformation; such compilers are sometimes called "translators".

The input is a Scheme program (represented as a list); the output is a Scheme

program. In many respects the compiler and evaluator are similar --

they are programs that "walk over" source code. An evaluator computes

a value on each recursive call, while a compiler computes code which

will eventually compute a value.

Why bother? Program' is just like Program, only faster. To do

this, the compiler reasons about Scheme programs (although the

reasoning is quite simple).

Combining 2 themes behind 212: reasoning about programs, and efficiency. Note that the efficiency gained never shows up in asymptotic analysis, but it's still important.

To see why this might be useful, consider defining

(define useless (lambda (x) (+ (* 3 5) x)))

(map useless '(1 2 3 ... 100))

How does this work in the evaluator?

... extend global env by [x: 1] ... evaluate (+ (* 3 5) x) ... evaluate (* 3 5)...

... extend global env by [x: 2] ... evaluate (+ (* 3 5) x) ... evaluate (* 3 5)...

...

... extend global env by [x: 100] ... evaluate (+ (* 3 5) x) ... evaluate (* 3 5)...

That's a lot of evaluations of (* 3 5)

Note: you might not write code like this yourself, but a macro could produce it (see the lecture in one week). Or inlined functions could (suppose you call someone else's code).

Usually there is a Program' that is a *lot* faster (typically 100-1000 times). The compiler's job is to get from Program to Program'.

In this example, we want to get from

(lambda (x) (+ (* 3 5) x)) to (lambda (x) (+ 15 x))

This is (a simple) part of PS#6.

 

To make life easier, we will consider only compiling a subset of Scheme programs.

Note: we’d really need LETREC too (why?)

Even this language subset includes very complicated expressions. Our strategy is to produce an intermediate form from an expression and then optimize that intermediate form. [Note: this is how all compilers work.]

To see why this is necessary, consider the expression

(f (g x) (g x))

We want to turn this into something like (let ((temp (g x))) (f temp temp))

To evaluate this expression, we evaluate (g x), then we evaluate (g

x), then we invoke f on the first result and the second result. But

in Scheme, these intermediate results are implicit. We need to make

them explicit, through a process we call LINEARIZATION.

It's a little hair raising in places (we'll provide the code for those

who want to look at it). You should know what a linearized expression

is, but not necessarily how to write code to linearize one.
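For the curious, here is a minimal sketch of a linearizer, assuming the input contains only atoms and combinations (no special forms; the code we provide handles the rest, and the helper names here are ours). It threads a continuation that receives an atomic name for each subexpression's value:

(define name-counter 0)

(define (new-name)                       ; generates val1, val2, ...
  (set! name-counter (+ name-counter 1))
  (string->symbol
   (string-append "val" (number->string name-counter))))

(define (atomic? e) (or (symbol? e) (number? e)))

(define (linearize e)
  (linearize-expr e (lambda (name) name)))

;; linearize e, then call k with an atomic name for e's value
(define (linearize-expr e k)
  (if (atomic? e)
      (k e)
      (linearize-args e '()
        (lambda (names)                  ; a SIMPLE combination
          (let ((v (new-name)))
            `(let ((,v ,names)) ,(k v)))))))

;; linearize each piece of a combination, collecting the atomic names
(define (linearize-args exprs names k)
  (if (null? exprs)
      (k (reverse names))
      (linearize-expr (car exprs)
        (lambda (name)
          (linearize-args (cdr exprs) (cons name names) k)))))

;; (linearize '(f (g x) (g x)))
;; => (let ((val1 (g x)))
;;      (let ((val2 (g x)))
;;        (let ((val3 (f val1 val2)))
;;          val3)))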

 

Linearization will produce an intermediate form like:

(let ((val1 (g x)))
  (let ((val2 (g x)))
    (let ((val3 (f val1 val2)))
      val3)))

We will then optimize this intermediate form to produce

(let ((val1 (g x)))
  (let ((val2 val1))
    (let ((val3 (f val1 val2)))
      val3)))

which optimizes out one call to g [NOTE: this is only safe if g has no side effects!]

You will write this optimization in PS#6

 

 

In the linearized form, two things are made explicit: the intermediate values, and the order in which they are computed.

There are thus 2 parts to the compilation process:

Program ---> [Linearizer] ---> Linearized Program ---> [Optimizer] ---> Program'

As before, everything will be a Scheme subset [label the languages above]

 

 

 

The output of the linearizer, which will also be the output of the optimizer (and hence of the compiler), will be a very restricted subset of Scheme, called Linear Scheme (Linear-S for short).

Scheme subset ---> [Linearizer] ---> Linear-S ---> [Optimizer] ---> Linear-S

The key property of Linear-S is that all combinations are SIMPLE.

A combination is SIMPLE if the operator and the operands are all

atomic (i.e., symbols or numbers). For example, (f a 23) is simple,

while ((f) (g)) is not.
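A sketch of this test in Scheme (the predicate names are ours):

(define (atomic? e) (or (symbol? e) (number? e)))

;; e is a combination, i.e. a nonempty list
(define (simple-combination? e)
  (if (null? e)
      #t
      (and (atomic? (car e))
           (simple-combination? (cdr e)))))

;; (simple-combination? '(f a 23))  => #t
;; (simple-combination? '((f) (g))) => #f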

In addition, in conditionals the test is required to be atomic. So

(if x 1 2) is simple, while (if (not x) 1 2) is not. In fact, the

latter expression would be linearized to

(let ((val1 (not x)))
  (let ((val2 (if val1 1 2)))
    val2))

[Note: we cannot turn (if e1 e2 e3) into

(let ((v1 e1))
  (let ((v2 e2))
    (let ((v3 e3))
      (let ((v4 (if v1 v2 v3)))
        v4))))

Why?]

A Linear-S expression is essentially a giant series of LETs which

eventually returns a value in the body. Every let involves a single

simple computation (no nesting).

An important part of linearization is called ALPHA-renaming.

Basically, whenever we see a LAMBDA we need to give its parameters

unique names, or we will get confused. [Note that this would have simplified the change! prelim question considerably…]

 

For example, consider

((lambda (f) (f x)) (lambda (f) f))

(which applies the identity function to x)

will be alpha-renamed to

((lambda (f1) (f1 x)) (lambda (f2) f2))

After alpha-renaming, we can be sure that any two variables with the same name are the same variable.
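A minimal sketch of alpha-renaming, assuming the input contains only lambdas (with one-expression bodies) and combinations; the helper names are ours:

(define rename-counter 0)

(define (fresh name)                     ; f -> f1, f2, ...
  (set! rename-counter (+ rename-counter 1))
  (string->symbol
   (string-append (symbol->string name)
                  (number->string rename-counter))))

;; env is an association list from old names to new names
(define (alpha-rename e env)
  (cond ((symbol? e)
         (let ((binding (assq e env)))
           (if binding (cdr binding) e)))   ; free variables pass through
        ((not (pair? e)) e)                 ; numbers pass through
        ((eq? (car e) 'lambda)
         (let* ((params (cadr e))
                (new-params (map fresh params))
                (new-env (append (map cons params new-params) env)))
           `(lambda ,new-params
              ,(alpha-rename (caddr e) new-env))))
        (else                               ; a combination
         (map (lambda (sub) (alpha-rename sub env)) e))))

;; (alpha-rename '((lambda (f) (f x)) (lambda (f) f)) '())
;; => ((lambda (f1) (f1 x)) (lambda (f2) f2))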

 

 

OK, we now have linearized code. How do we optimize it? The

optimizations we will consider are all fairly simple, although they

can improve your code a lot. Remember: compile Program to Program’ (once), then run Program’ (possibly, many times).

To understand optimization, we need to go back to our first example

and think about the relationship between compilation and evaluation

(define useless (lambda (x) (+ (* 3 5) x)))

We can try and turn this into a better piece of code, but we have to

bear in mind that we have no idea what the value of x is. In fact,

we won't know until we actually apply this procedure to something

(i.e., at run time).

On the other hand, we know what the value of (* 3 5) is, irrespective

of the value of x (i.e., at compile time).

Important lesson: some things are known at compile time, while others are known only at run time. The compiler can compute with the former, but can only emit code for the latter.

Obvious consequence: if you want to optimize a program that doesn't

contain any procedures, you can simply compute the value. If the

compiler is given some complex arithmetic expression, it should simply

return the value.

[Non-obvious consequence: compiler writers can (and do!) "cheat" on various benchmarks, by emitting the answer or special purpose code.]

 

Four optimization rules to live by:

 

Here is a short description of the optimizations we will look at (and implement!). Note that we are always doing substitutions of some kind.

OPTIMIZATION                        DESCRIPTION
Constant folding                    Replace VARIABLES with VALUES
Common subexpression elimination    Replace EXPRESSIONS with VARIABLES
Inlining                            Replace PROCEDURE CALLS with EXPRESSIONS
Dead code elimination               Replace EXPRESSIONS with NOTHING

 

Part of what a compiler does can be described as PARTIAL EVALUATION.

We take code like:

(lambda (x) (+ (* 3 5) x))

and return code somewhat like

(lambda (x) (+ 15 x))

In essence, anything that can be computed at compile time should be computed.

The simplest such optimization is called CONSTANT FOLDING, which

replaces operations by constants where possible.

Given a Linear-S expression like

(let ((val1 (* 3 5)))
  (let ((val2 (+ val1 x)))
    val2))

constant folding will produce a Linear-S expression like

(let ((val1 15))
  (let ((val2 (+ val1 x)))
    val2))

 

 

 

Basic idea:

(let ((val1 (* 3 5)))
  BODY)

---> replace with a new BODY in which val1 is replaced by 15 wherever it occurs (and the let is dropped)
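Here is a minimal sketch of this transformation on Linear-S, assuming only the arithmetic primitives shown and relying on alpha-renaming to make blind substitution safe (the helper names are ours, not the PS#6 interface):

(define (constant? e) (number? e))

(define (all-constants? args)
  (or (null? args)
      (and (constant? (car args)) (all-constants? (cdr args)))))

;; a call to a known primitive whose arguments are all constants
(define (foldable? e)
  (and (pair? e)
       (memq (car e) '(+ - * /))
       (all-constants? (cdr e))))

(define (fold-value e)                   ; compute the value at compile time
  (apply (case (car e) ((+) +) ((-) -) ((*) *) ((/) /))
         (cdr e)))

;; replace every occurrence of var by val; safe because alpha-renaming
;; guarantees distinct variables have distinct names
(define (substitute var val e)
  (cond ((eq? e var) val)
        ((pair? e) (map (lambda (s) (substitute var val s)) e))
        (else e)))

(define (constant-fold e)
  (if (and (pair? e) (eq? (car e) 'let))
      (let* ((binding (car (cadr e)))
             (var (car binding))
             (expr (cadr binding))
             (body (caddr e)))
        (if (foldable? expr)
            (constant-fold (substitute var (fold-value expr) body))
            `(let ((,var ,expr)) ,(constant-fold body))))
      e))

;; (constant-fold '(let ((val1 (* 3 5)))
;;                   (let ((val2 (+ val1 x)))
;;                     val2)))
;; => (let ((val2 (+ 15 x))) val2)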

Note: what if we have a call (f g h) where f is known to be a lambda at compile time and g and h are constants? We need something like an evaluator in our compiler! A real partial evaluator requires a lot of work...

Sample:

(let ((val1 14))
  (let ((val2 (lambda (val3) (* val1 3))))
    (let ((val3 (val2 run-time-variable)))
      val3)))

==> 42

We’ll come back to this kind of constant folding later – it’s called inlining, and is more or less what macros do. [One way to think of a macro is as code run at compile time, which produces code run at run time…]

 

 

There are other related optimizations which aren't quite partial

evaluation, but which are similar in flavor.

Example: algebraic simplification. The simplest examples can be

handled by pattern matching -- look for (let ((x (* y 0))) ...) and

the like. We've done something like this already, just not as part of

a compiler.
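A sketch of pattern-based simplification (the rules shown are illustrative, and rewriting (* y 0) to 0 is only safe if y has no side effects):

(define (simplify e)
  (if (not (pair? e))
      e
      (let ((e (map simplify e)))                        ; simplify bottom-up
        (cond ((and (eq? (car e) '*) (memv 0 (cdr e)))   ; (* ... 0 ...) -> 0
               0)
              ((and (eq? (car e) '*) (equal? (cddr e) '(1)))  ; (* y 1) -> y
               (cadr e))
              ((and (eq? (car e) '+) (equal? (cddr e) '(0)))  ; (+ y 0) -> y
               (cadr e))
              (else e)))))

;; (simplify '(+ (* x 1) 0))  =>  x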

More complex examples involve quite non-trivial computation; do two

arbitrary expressions compute the same value? For arithmetic

expressions there is actually a pretty simple algorithm, based on the fact that zeros of polynomials are sparse.

 

 

 

An interesting example is COMMON SUBEXPRESSION ELIMINATION. Compute

something once - why compute it again?

Consider an expression like (f (+ a b) (+ a b))

Linearizer produces

(let ((val1 (+ a b)))
  (let ((val2 (+ a b)))
    (let ((val3 (f val1 val2)))
      val3)))

We'd like to avoid computing (+ a b) twice.

This is actually pretty similar to constant folding --

if we know something at compile time we don't need to

recompute it. However, while in constant folding we replace

variables with values, in common subexpression elimination

we replace expressions with variables.

This needs to be converted to

(let ((val1 (+ a b)))
  (let ((val2 (f val1 val1)))
    val2))

The key is to see that val1 and val2 are bound to the same expression, and to compute it only once.
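A minimal sketch of this pass over Linear-S (the helper names are ours; it is only safe if the repeated expressions have no side effects, and it leaves behind a trivial let for a later pass to clean up):

;; seen is an association list from expressions to the variable holding them
(define (cse e) (cse-walk e '()))

(define (cse-walk e seen)
  (if (and (pair? e) (eq? (car e) 'let))
      (let* ((binding (car (cadr e)))
             (var (car binding))
             (expr (cadr binding))
             (body (caddr e))
             (earlier (assoc expr seen)))   ; equal? comparison of expressions
        (if earlier
            ;; already computed: reuse the earlier variable
            `(let ((,var ,(cdr earlier))) ,(cse-walk body seen))
            `(let ((,var ,expr))
               ,(cse-walk body (cons (cons expr var) seen)))))
      e))

;; (cse '(let ((val1 (+ a b)))
;;         (let ((val2 (+ a b)))
;;           (let ((val3 (f val1 val2)))
;;             val3))))
;; => (let ((val1 (+ a b)))
;;      (let ((val2 val1))
;;        (let ((val3 (f val1 val2)))
;;          val3)))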

 

Another example: dead code elimination

When an IF's test value is known at compile time, we can eliminate the

consequent or alternate.

A similar procedure can get rid of useless lets like (let ((var1

var2)) ...)
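A sketch of both transformations, assuming no quoted data in the input so that walking every sublist is safe (atomic? and substitute are the helpers from the sketches above):

;; (if #t e1 e2) -> e1, and (if #f e1 e2) -> e2
(define (eliminate-dead-code e)
  (cond ((not (pair? e)) e)
        ((and (eq? (car e) 'if) (boolean? (cadr e)))
         (eliminate-dead-code (if (cadr e) (caddr e) (cadddr e))))
        (else (map eliminate-dead-code e))))

;; (let ((var1 var2)) BODY) -> BODY with var1 replaced by var2
(define (eliminate-trivial-lets e)
  (if (and (pair? e) (eq? (car e) 'let))
      (let* ((binding (car (cadr e)))
             (var (car binding))
             (expr (cadr binding))
             (body (eliminate-trivial-lets (caddr e))))
        (if (atomic? expr)
            (substitute var expr body)
            `(let ((,var ,expr)) ,body)))
      e))

;; (eliminate-dead-code '(if #t (y) (f x)))  =>  (y)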

 

 

Procedure inlining

There is overhead involved in a procedure call. If we have

(define foo (lambda () (* a c)))

then

(+ (foo) (foo))

is slower than

(+ (* a c) (* a c))

[not much, but it can matter inside a loop!]

One solution is to write your code as macros. Disadvantages?

* Kind of painful (macros are hard to debug)

* Space versus time

Alternative: inlining

Note that in Linear-S we leave calls to lambdas alone. Thus

(+ (foo) (foo))

is linearized into

(let ((val1 (foo)))
  (let ((val2 (foo)))
    (let ((val3 (+ val1 val2)))
      val3)))

which after eliminating common subexpressions becomes:

(let ((val1 (foo)))
  (let ((val2 val1))
    (let ((val3 (+ val1 val2)))
      val3)))

but we might be better off inlining the call to foo...
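A minimal sketch of inlining a known zero-argument procedure (the names are ours; inlining a procedure with parameters additionally requires substituting the arguments, which alpha-renaming makes safe):

;; replace every call (name) by body-expr
(define (inline-calls name body-expr e)
  (cond ((equal? e (list name)) body-expr)
        ((pair? e)
         (map (lambda (s) (inline-calls name body-expr s)) e))
        (else e)))

;; (inline-calls 'foo '(* a c) '(+ (foo) (foo)))
;; => (+ (* a c) (* a c))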

This depends, of course, on what foo does. Can you tell what a

function does without running it? NO! See last lecture of CS212.

Summary: reasoning about programs has some inherent limits.

For some simple functions, we can "inline" (or "open-code") them.

When is this a fatal error? Recursion!

Still, this can be useful. C++ and C (since C99) support an "inline" declaration. Use it at your own risk (the time-space tradeoffs are not always obvious!)

Limited inlining of recursive code can be very useful (it's called loop

unrolling).
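For instance, unrolling one level of a (hypothetical) counting loop does two iterations of work per call, halving the call overhead:

(define (sum-to n acc)                   ; the original loop
  (if (= n 0)
      acc
      (sum-to (- n 1) (+ acc n))))

(define (sum-to-unrolled n acc)          ; the same loop, unrolled once
  (if (= n 0)
      acc
      (let ((n2 (- n 1))
            (acc2 (+ acc n)))
        (if (= n2 0)
            acc2
            (sum-to-unrolled (- n2 1) (+ acc2 n2))))))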

 

 

Final note: many of these optimizations enable each other. Doing inlining can enable common subexpression elimination, for example. This combination can be pretty similar to simply memoizing (which we talked about in streams).

(+ (foo) (bar)) ==>                [inlining]
(+ (* a c) (* c a)) ==>            [CSE, using the commutativity of *]
(let ((val1 (* a c)))
  (let ((val2 (+ val1 val1)))
    val2))

Or eliminating dead code can enable further partial evaluation:

(lambda (x) (if (foo) (y) (f x))) ==> [inlining] (lambda (x) (if #t (y) (f x))) ==> [dead code elimination] (lambda (x) (y)) ==> etc…

A typical compiler makes several passes over the code, doing a bunch

of different optimizations. Some passes need to be done more than

once. How this is done is beyond the scope of this course. Some of it is beyond the scope of the instructor! Typically, they don’t do all possible optimizations – this is one of the things that compiler flags control!

 

 

 

Just-in-time compilation (a.k.a. dynamic compilation). Examples: Java, Apple's 68K emulator on the PowerPC.

Compiler:

advantage: runs "offline".

disadvantage: doesn't know the values of run-time parameters

Interpreter tradeoffs are the opposite.

Suppose someone runs a big piece of code in the interpreter (why would anyone do this?). At run time, the parameters are known [a lookup rule failure in the compiler ==> "oh well"; in the interpreter ==> the debugger]. It may well be worthwhile to compile the user's code, especially if there are loops. But we can now do even better than the regular compiler, because we know the values of all the parameters!

This can be done even if the original code is compiled.

Big lessons: