Assembly Programming

Instructions: Remember, all assignments in CS 3410 are individual. You must submit work that is 100% your own. Remember to ask for help from the CS 3410 staff in office hours or on Ed! If you discuss the assignment with anyone else, be careful not to share your actual work, and include an acknowledgment of the discussion in a collaboration.txt file along with your submission.

The assignment is due via Gradescope at 11:59pm on the due date indicated on the schedule.

Submission Requirements

You will submit the following files to Gradescope. From the lab:

lab.txt: Contains all work from the lab exercises.

From Part I;

arrays.s: implementation of the array manipulation program
mult.s: implementation of multiplication
prime.s: implementation of primality checker

From Part II:

mystery1.c: C translation of mysterious function 1
mystery2.c: C translation of mysterious function 2

Overview

This assignment will level-up your skills as an assembly language programmer: both reading and writing RISC-V assembly.

Part 0: Lab

During lab section, we will start with some warm-up exercises to get you familiar with writing RISC-V assembly and to help you start with the assignment. To familiarize yourself with the available instructions, see the RISC-V instruction set manual. As you write assembly, you will also likely find it helpful to use 3410 RISC-V interpreter to execute and validate your code.

Submit all your answers to this part as a text file: lab.txt. This does not need to be formatted in any specific way; just make it readable to a human. We are just looking for complete answers in this part.

Writing Assembly Programs

Your task in lab is to write RISC-V assembly programs to implement several functions.

1. Arithmetic

We begin with implementing arithmetic functions. The binomial theorem lets you expand the powers of a binomial as the following sum of terms:

\[ (x + y)^n = \sum_{k=0}^{n}{n\choose k}x^{k}y^{n-k} \]

We’ll implement both the right- and the left-hand side of this equation for \(n = 4\).

Let’s consider what these programs might look like in C. The LHS would look like:

z = pow(x + y, 4)

And you could write the RHS as:

z = 1 * 1 * pow(y, 4) + 4 * x * pow(y, 3) + 6 * pow(x, 2) * pow(y, 2) + 4 * pow(x, 3) * y + 1 * pow(x, 4) * 1

Write two RISC-V assembly programs: one that computes the value of the LHS of the equation and another that computes the RHS. Then, check that the values given by both are the same for x = 5 and y = 7.

For each program, assume that:

register x1 holds the value of x
x2 holds y
x3 holds z, the final value of the expression

Hint: You can use the mul instruction to implement the calls to pow in the code above. As an even better alternative, you can use shift instructions to multiply by a number that is a power of two. So when you need multiply by a constant, see if you can instead write a sum of shifts.

2. Load and Store

Consider this function in C, which swaps the values at indices 1 and 3 in an array of ints:

void swap(int* arr) {
    int temp = arr[1];
    arr[1] = arr[3];
    arr[3] = temp; 
}

Assume that the arr pointer is in register x1. (Also, don’t worry about out-of-bounds accesses: assume that we allocated enough space for the arr array.) Write the RISC-V assembly code to implement this swap.

3. Conditional Control Flow

Consider this code with a simple if statement:

if (x < y)
    y = (x - y) * 2;
else
    y--;

Assume that:

register x16 holds x
x17 holds y

You may use all other registers to store temporary values if you like. Write a RISC-V assembly program to implement this code.

4. Loops

Consider this for loop in C:

for (int i = 0; i < y; i++) {
  x = x + 2;
}
return x;

Assume that x and i start at 0, and that we use these register mappings:

y is in register a0
x is in register a1
i is in register t0

Which of these RISC-V assembly translations are correct? For the incorrect translations, write a brief explanation of why they are incorrect.

Option 1:

for:
blt t0, a0, end
body:
addi a1, a1, 2
addi t0, t0, 1
beq x0, x0, for
end:

Option 2:

for:
beq t0, a0, end
addi a1, a1, 2
addi t0, t0, 1
beq x0, x0, for
end:

Option 3:

bge x0, a0, end
for:
bge t0, a0, end
addi a1, a1, 2
addi t0, t0, 1
beq x0, x0, for
end:

Option 4:

bge x0, a0, end
for:
bge t0, a0, end
body:
addi a1, a1, 2
addi t0, t0, 1
end:

Option 5:

ble a0, x0, end
for:
addi a1, a1, 2
addi t0, t0, 1
blt t0, a0, for
end:

5. Putting Everything Together

Finally, let’s translate the following C program that calculates the product of an array:

void product(int* arr, int size) {
  int product = 1;
  // --- START HERE ---
  for (int i = 0; i < size; i++) {
    product *= arr[i];
  }
  // --- END HERE ---
  printf("The product is %d\n", product);
}

Translate the indicated section of code—just the loop—to RISC-V assembly. Assume that:

x1 holds arr pointer
x2 holds size
x3 holds product, and it is already initialized to 1 (outside of your code)
x4 is uninitialized, but will hold i

Feel free to use any other registers as you see fit.

Reading Assembly

Next, we’ll try understanding assembly code. A good strategy for understanding assembly code is to try reverse translation: write out a C program (or a “pseudo-C program”) that corresponds to the assembly code and then try to understand that code.

6. Branches

Consider the following RISC-V assembly:

addi t0, x0, 0
addi t1, x0, 5
blt t1, x0, label
addi t0, t0, 5
label:
addi t0, t0, 6

What is the value of register t0 after running this code? To answer this question, you can try writing out the corresponding C program.

If blt were replaced by bge, what would the value of register t0 be?

7. Accessing Memory

Consider the following assembly:

addi t1, x0, 4
addi s2, x0, 7
sw s2, 8(t1)
lw s3, 12(x0)

What is the value of s3 after this code runs?

Again, it can be very helpful to first write the corresponding pseudo-C code. Here’s one way to do that:

int* t1 = 4;
int s2 = 7;
*(t1 + 8) = s2;
int s3 = *(12 + 0);

Why are the constants in those last two lines 2 and 3? (You may want to refresh your memory about the rules of pointer arithmetic in C.)

8. Loop to C

Let’s translate this assembly code back to C:

addi t0, x0, 7
addi t1, x0, 0
loop: 
bge x0, t0, end
addi t0, t0, -1
add t1, t1, t0
beq x0, x0, loop
end:

Assume that the value of variable x is held in register t0 and y is held in register t1. Here’s a partial translation:

int x = 7;
int y = 0;
while (A) {
  x = B;
  y = C;
}

The placeholders A, B, and C mark expressions that are up to you. All of these should be C expressions.

Part I: From C to RISC-V

In this first part, you’ll translate three C programs written to RISC-V assembly. Consider trying out your implementations using the online RISC-V simulator to check that it behaves like the original C.

Array Accesses

Imagine we have variables of these types:

int x;   // x10
int y;   // x11
int* A;  // x12
int* B;  // x13

Assume that the two pointer variables, A and B, point to large arrays of ints. The code you need to translate is:

x += (x + y) * 2 - A[4];
B[3] = x;

Assume:

x is stored in register x10
y is in x11
the base address of array A is in register x12
B is in x13

Use x5 and x6 (and no more) as the temporary registers. Write your assembly code in a file named arrays.s.

Multiplication

Let’s implement the integer multiplication instruction in RISC-V using other instructions! The instruction mul rd, rs1, rs2 multiplies rs1 and rs2 and stores the result in rd. Here is an implementation in C for 64-bit integers:

unsigned long intmul(unsigned long rs1, unsigned long rs2) {
  unsigned long rd = 0;
  for (int i = 0; i < 64; i++) {
    if (rs2 & 0x1) {
      rd += rs1;
    }
    rs1 <<= 1;
    rs2 >>= 1;
  }
  return rd;
}

Translate the above code to assembly. Do not use the mul instruction. Assume:

the variable rs1 is stored in register a0
rs2 is in register a1
the return value rd goes in t0

Use t0, t1, and t2 for any temporary values. Please name your submission file mult.s.

Primality Test

The following function prime gives a rudimentary algorithm for checking whether a number (p) is prime:

bool prime(int p) {
  if (p < 2) {
    return false;
  }

  for (int i = 2; i < p; i++) {
    int rem = p % i;
    if (rem == 0) {
      return false;
    }
  }
  return true;
}

Translate this function to RISC-V. Submit your file as prime.s.

Please label the entry block to your assembly with .prime.

Imagine that there are two labels .ret_tru and .ret_fls that already exist; translate the return true and return false lines into jumps to these labels.

Assume p is stored in a2 (a.k.a. x12).

To implement the % operation, you will need to use mul and div instructions. Please use t3–t6 (a.k.a. x28–x31) for temporary values, and try to minimize how many of these you use.

Part II: Mysterious RISC-V

Your friend, Sia, is a great C programmer. But she doesn’t understand RISC-V assembly, unfortunately. She is trying to understand some mysterious RISC-V programs so she comes to find you, a RISC-V assembly programmer, to help her translate those RISC-V programs to C so that she can understand what they do.

Mysterious Function 1

Here’s one assembly program Sia is trying to understand:

loop:
  lw   x5, 0(x12)
  mul  x5, x5, x15
  lw   x6, 0(x13)
  add  x6, x6, x5
  sw   x6, 0(x11)
  addi x11, x11, 4
  addi x12, x12, 4
  addi x13, x13, 4
  addi x14, x14, -1
  bne  x14, x0, loop
  ret

Sia has already written a function signature:

void mystery1(int *arr1, int *arr2, int *arr3, int size, int num) {
  // ???
}

Assume that the function arguments are in registers x11 through x15, a.k.a. a1 through a5. Also assume that any array length given as an input is greater than zero. Complete this C function so it behaves the same way as the above assembly.

Follow these guidelines in your translation:

Prioritize readability. Comments are optional, but use them if you think it makes the code easier to understand.
Do not use goto. Use C’s if, for, while, etc. instead.
Prefer for loops over while loops. It is always possible to use while to implement any loop, but we want you to use for if the control flow fits the typical for (i = 0; i < max; i++) pattern.

It is possible to implement this function in only 2 lines of straightforward, readable C. Your solution does not need to be that short, but try to make it reasonably compact and understandable. (Sia will be grateful!)

Submit your completed implementation of the mystery1 function in mystery1.c.

Hint: Once you have a working C program, consider writing some tests for it. You can write a main function that calls the mystery1 function a few times on different inputs, for example, so you can compare the results to running the original RISC-V code. But please only submit the mystery1 function alone.

Mysterious Function 2

Sia asks you about a second mysterious assembly program:

addi x10, x0, 0

loop:
  lw x6, 0(x12)
  bne x6, x0, foo
  j bar

foo:
  sw x6, 0(x11)
  addi x11, x11, 4
  addi x10, x10, 1

bar:
  addi x12, x12, 4
  addi x13, x13, -1
  bne x13, x0, loop

ret

She already has this function signature:

int mystery2(int* arr1, int* arr2, int size) {
  // ...
}

The function arguments are again in registers a1 through a3 (a.k.a. x11 through x13). Register x10 is used to store the result of mystery2. Complete this function body. Use the same guidelines as in the previous part. You can also assume that any array length given as an input is greater than zero. It is possible to implement this code in about 6 lines of readable C but, again, your solution does not need to be that short.

Submit your solution in a file named mystery2.c.

Rubric

We will test all submitted code by running it on several test cases to check that it behaves correctly, i.e., equivalent to the original code. We will also manually read to the assembly code to check that the required registers are used, and we’ll read the C to see that it obeys the guidelines.

Lab
- lab.txt: 10
Part I
- arrays.s: 10
- mult.s: 10
- prime.s: 10
Part II
- mystery1.c: 10
- mystery2.c: 10