Control Flow in Assembly

So far, all the assembly programs we’ve written have been straight-line code, in the sense that they always run one instruction after the other. That’s like writing C without any control flow: no if, for, while, etc. The remainder of this lecture is about the instructions that exist in RISC-V to implement control-flow constructs.

Jump

For most instructions, when the processor is done running that instruction, it proceeds onto the next instruction (incrementing the program counter by 4 on RISC-V, because every instruction is 4 bytes). A jump instruction lets you set the program counter to a value you choose. A jump is like a goto in C: it instantly teleports execution to a different point in the program.

In assembly, you can write a jump instruction as j <PC>. The operand is an immediate that should be the address of the next instruction to execute. (j label is a pseudo-instruction and a shorthand for jal, which we’ll cover in the next lecture.) Try this assembly to see it in action:

li   x1, 0
addi x1, x1, 1
j    16
addi x1, x1, 1
addi x1, x1, 1

To understand what this program will do, it can be helpful to label each instruction with its address. Remember that every instruction in RISC-V is 4 bytes. So here are the same instructions annotated with their address (in decimal):

 0   li   x1, 0
 4   addi x1, x1, 1
 8   j    16
12   addi x1, x1, 1
16   addi x1, x1, 1

Executing the j instruction sets the program counter to 16, so the next instruction to execute after that is the one at address 16 (the last line in this program). The result is that we skip over the instruction at address 12.

Note

In RISC-V assembly, when you use a constant integer as the operand to j and other control-flow instructions, that integer is an absolute instruction address. However, in machine code, the instruction has an immediate field that is a relative offset: the hardware adds this offset to the current value of the PC to determine the new PC value. The assembler takes care of translating the absolute address to a relative offset for you.
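To make the translation concrete, here is a minimal C sketch of the arithmetic involved. The function names jump_offset and next_pc are hypothetical helpers invented for this illustration; they are not part of any real assembler or simulator.

```c
#include <stdint.h>

/* Hypothetical helper: given the address of a jump instruction and the
 * absolute address it should reach, compute the PC-relative offset that
 * would be stored in the instruction's immediate field. */
int32_t jump_offset(uint32_t jump_pc, uint32_t target) {
    return (int32_t)(target - jump_pc);
}

/* What the hardware does when the jump executes: add the offset to the
 * address of the jump instruction itself. */
uint32_t next_pc(uint32_t jump_pc, int32_t offset) {
    return jump_pc + (uint32_t)offset;
}
```

For the program above, the j at address 8 targets address 16, so jump_offset(8, 16) gives an offset of 8, and next_pc(8, 8) gives back 16. A backward jump produces a negative offset in the same way.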

Labels

A good way to understand any assembly program is to try translating it backward into C. Here’s a possible translation of the code above:

  x1 = 0;
  x1 += 1;
  goto somewhere;
  x1 += 1;
somewhere:
  x1 += 1;

C’s goto and labels let us accomplish something like j.

To make your assembly code more readable, the assembler also supports labels. So we can also write this:

  li   x1, 0
  addi x1, x1, 1
  j    somewhere
  addi x1, x1, 1
somewhere:
  addi x1, x1, 1

Labels only appear in assembly code. The assembler doesn’t generate any instructions for them. Instead, whenever the assembler sees you use a label in an instruction like j, it computes the signed offset to the labeled instruction. Those offsets go into the machine code for control-flow instructions.

To emphasize this point about labels “disappearing” when assembled, let’s again label each instruction with its address:

 0  li   x1, 0
 4  addi x1, x1, 1
 8  j    somewhere
12  addi x1, x1, 1
somewhere:
16  addi x1, x1, 1

Notice that the somewhere: label doesn’t get its own address, and it doesn’t affect the address of the following instruction. The assembler fills in the address for each label for us, producing code that looks more like the original code above that contains the constant 16.

Tip

Using labels instead of numerical offsets can make your assembly code much easier to read. Use them! Replacing labels with offsets is a job better left to the assembler.

Branch If Equal

goto-like functionality is nice, but it is not enough to implement C’s if. For that, we need something like a jump that can decide which of two different instructions to execute next.

A branch instruction is like a jump, but it checks a condition first to decide whether or not to jump to a given location. One example is the beq instruction, which means branch if equal:

beq rs1, rs2, label

The first two operands are registers, and beq checks whether the values are equal. The third operand is a label: where to jump if they’re equal. If they’re not equal, then we just proceed on to the next instruction as usual.

Here’s an example:

  beq  x1, x2, later
  addi x3, x3, 42
later:
  addi x3, x3, 27

This program checks whether x1 == x2. If so, then it immediately executes the last instruction, skipping the second instruction. Otherwise, it runs all 3 instructions in this listing in order (it adds 42 and then adds 27 to x3).

In other words, you can imagine this assembly code implementing an if statement in C:

if (x1 != x2) {
  x3 += 42;
}
x3 += 27;
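Notice that the C condition is the opposite of the branch condition: beq jumps over the addition when the values are equal, so the body of the if runs when they are not equal. One way to check this is to write the assembly as C with a goto and compare it against the if version. The sketch below does that; the function names with_goto and with_if are made up for this illustration, and the registers are passed in as ordinary int parameters.

```c
/* Mirrors the beq listing: jump over the first addition when x1 == x2. */
int with_goto(int x1, int x2, int x3) {
    if (x1 == x2) goto later;   /* beq  x1, x2, later */
    x3 += 42;                   /* addi x3, x3, 42    */
later:
    x3 += 27;                   /* addi x3, x3, 27    */
    return x3;
}

/* The structured version, with the condition inverted. */
int with_if(int x1, int x2, int x3) {
    if (x1 != x2) {
        x3 += 42;
    }
    x3 += 27;
    return x3;
}
```

Both functions return the same value for any inputs: starting from x3 = 0, they return 27 when x1 and x2 are equal and 69 when they differ.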

Other Branches

You should read the RISC-V spec to see an exhaustive list of branch instructions it supports. Here are a few, beyond beq:

  • bne rs1, rs2, label: Branch if the registers are not equal.
  • blt rs1, rs2, label: Branch if rs1 is less than rs2, treated as signed (two’s complement) integers.
  • bge rs1, rs2, label: Like blt, but branches if rs1 is greater than or equal to rs2 (again treated as signed integers).
  • bltu and bgeu are similar to blt and bge but do unsigned integer comparisons.
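The difference between signed and unsigned comparison matters because the same 32 bits can represent two different numbers. Here is a small C sketch of the idea, using blt and bltu as the example pair. The function names less_signed and less_unsigned are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* blt-style comparison: interpret both bit patterns as signed
 * two's complement integers. */
int less_signed(uint32_t rs1, uint32_t rs2) {
    return (int32_t)rs1 < (int32_t)rs2;
}

/* bltu-style comparison: interpret the same bit patterns as unsigned. */
int less_unsigned(uint32_t rs1, uint32_t rs2) {
    return rs1 < rs2;
}
```

With rs1 holding the bits 0xFFFFFFFF and rs2 holding 1, the signed comparison sees -1 < 1 and would take the branch, while the unsigned comparison sees 4294967295 < 1 and would not.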

Implementing Loops

We have already seen how branches in assembly can implement the if control-flow construct. They are also all you need to implement loops, like the for and while constructs in C. Let’s start with a little do/while loop that computes the factorial:

int fact(int n) {
    int res = 1;
    do {
        res *= n;
        n -= 1;
    } while (n != 0);
    return res;
}

To translate this into assembly, it can help to simplify it first while keeping it as valid C code. For example, we can break down the do/while loop and replace it with an if and a goto:

int fact(int n) {
    int res;

    res = 1;
loop:
    res *= n;
    n += -1;
    if (n != 0) goto loop;

    return res;
}

Take a moment to convince yourself that this program does the same thing as the do/while loop above. The point of simplifying the code like this is that, now, we can translate the code line-by-line into assembly instructions. Here’s the result:

  li x2, 1
loop:
  mul x2, x1, x2
  addi x1, x1, -1
  bne x1, zero, loop

(We had to pick registers during the translation: n goes in x1 and res goes in x2. To try this out in the online RISC-V simulator, consider putting an initial value in x1 with something like li x1, 4.)
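If you want to convince yourself that the goto version really matches the do/while version, you can compile both and compare them. The sketch below renames the two functions to fact_loop and fact_goto (names chosen here just so both can live in one file). It assumes n is at least 1 and small enough that the result fits in an int; with n = 0, the do/while loop would not stop at the right place.

```c
/* The original do/while version. */
int fact_loop(int n) {
    int res = 1;
    do {
        res *= n;
        n -= 1;
    } while (n != 0);
    return res;
}

/* The simplified version, annotated with the matching instructions
 * (n in x1, res in x2). */
int fact_goto(int n) {
    int res;

    res = 1;                    /* li   x2, 1         */
loop:
    res *= n;                   /* mul  x2, x1, x2    */
    n += -1;                    /* addi x1, x1, -1    */
    if (n != 0) goto loop;      /* bne  x1, zero, loop */

    return res;
}
```

Both functions return 24 for n = 4, which is also the value you should find in x2 after running the assembly with li x1, 4.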

A More Interesting Loop

Let’s try that again with a for loop that sums up the values in an array:

int accum(int A[20]) {
    int sum = 0;
    for (int i = 0; i < 20; i++) {
      sum += A[i];
    }
    return sum;
}

As always, it is helpful to try simplifying this C program step by step until it is almost an assembly program. As a first step, let’s convert the for to a while:

int accum(int A[20]) {
    int sum = 0;
    int i = 0;
    while (i < 20) {
      sum += A[i];
      i += 1;
    }
    return sum;
}

Next, let’s replace that while with a conditional goto:

int accum(int A[20]) {
    int sum = 0;
    int i = 0;
loop:
    if (i >= 20) goto end;
    sum += A[i];
    i += 1;
    goto loop;
end:
    return sum;
}

Next, we can make our assembly-programming life a little easier by recognizing that the array indexing expression A[i] can also be implemented by incrementing a pointer variable. Here’s the idea:

int accum(int A[20]) {
    int sum = 0;
    int i = 0;
    int *A_i = A;
loop:
    if (i >= 20) goto end;
    sum += *A_i;
    A_i += 1;
    i += 1;
    goto loop;
end:
    return sum;
}

The line A_i += 1 performs pointer arithmetic to move the pointer variable A_i by 4 bytes (because, on our RISC-V platform, an int is 4 bytes). Finally, because the bge instruction takes two registers, we’ll want to store the value 20 in a variable:

int accum(int A[20]) {
    int sum = 0;
    int i = 0;
    int *A_i = A;
    int twenty = 20;
loop:
    if (i >= twenty) goto end;
    sum += *A_i;
    A_i += 1;
    i += 1;
    goto loop;
end:
    return sum;
}

At last, let’s translate this code line-by-line into assembly. Let’s use this register map:

  • Let’s assume A is in x8.
  • We’ll put sum in x1.
  • i in x2.
  • A_i in x3.
  • twenty in x4.

  li x1, 0
  li x2, 0
  mv x3, x8
  li x4, 20
loop:
  bge x2, x4, end
  lw x5, 0(x3)
  add x1, x1, x5
  addi x3, x3, 4
  addi x2, x2, 1
  j loop
end:
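As a sanity check on the whole chain of simplifications, you can compile the first and last C versions side by side and confirm that they agree. In the sketch below, the names accum_for and accum_goto are invented here so that both versions fit in one file. It assumes the function returns sum at the end and that A_i is declared as a pointer to int. The comments show the matching assembly instruction for each line.

```c
/* The original for-loop version. */
int accum_for(const int A[20]) {
    int sum = 0;
    for (int i = 0; i < 20; i++) {
        sum += A[i];
    }
    return sum;
}

/* The fully simplified version, one line per instruction. */
int accum_goto(const int A[20]) {
    int sum = 0;                 /* li   x1, 0       */
    int i = 0;                   /* li   x2, 0       */
    const int *A_i = A;          /* mv   x3, x8      */
    int twenty = 20;             /* li   x4, 20      */
loop:
    if (i >= twenty) goto end;   /* bge  x2, x4, end */
    sum += *A_i;                 /* lw   x5, 0(x3)
                                    add  x1, x1, x5  */
    A_i += 1;                    /* addi x3, x3, 4   */
    i += 1;                      /* addi x2, x2, 1   */
    goto loop;                   /* j    loop        */
end:
    return sum;
}
```

Note how the single C line A_i += 1 becomes addi x3, x3, 4 in assembly: C scales pointer arithmetic by the element size automatically, while in assembly you have to write the byte count yourself.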

Another Example

Now it’s your turn to try. Translate this loop into RISC-V assembly.

#define LENGTH 8

// Count the number of entries in the A and B arrays that are equal.
int count_eq(int* A, int* B) {
    int count = 0;
    for (int i = 0; i < LENGTH; ++i) {
        if (A[i] == B[i]) {
            count += 1;
        }
    }
    return count;
}