The RISC-V ISA

Via our “bottom-up” trajectory, we built up from logic gates to the two-instruction FemtoProc. We will now take a leap to a full-featured processor and a standard, popular ISA: RISC-V.

Like FemtoProc’s ISA (and any ISA), RISC-V is an extremely primitive programming language made of bits, and it has a textual assembly format that makes it easier to read and write than entering binary values manually. Each instruction is like an extremely simple statement in a different programming language, and it describes a single small action that the processor can take.

Unlike FemtoProc, it has many more than two instructions—it has enough so that arbitrary C programs be translated to RISC-V code. In fact, that’s what happened every time you have typed gcc during this whole semester.

Why Learn Assembly Programming?

Understanding assembly is important because it is the language that the computer actually speaks. So while it would be infeasible in the modern age to write entire large software projects entirely in assembly, it remains relevant for the small handful of exceptional cases where higher levels of abstraction obscure important information. Here are some examples:

People hand-write assembly for extremely performance-sensitive loops. A classic example is audio/video encoding/decoding: the popular FFmpeg library, for example, is mostly written in C but contains hand-written RISC-V assembly for performance-critical functions. While modern compiler optimizations are amazing, humans can still sometimes beat them.
Operating system internals typically need some platform-specific assembly to deal with the edge cases that arise with controlling user processes.
Code that must be secure, such as encryption and decryption routines, are often written directly in assembly to avoid timing channels. If an encryption routine takes different amounts of time depending on the key, an attacker can learn the key by repeatedly measuring the time taken to encrypt or decrypt. By taking direct control over which instructions get executed, humans can sometimes ensure that the code takes a constant amount of time, so that the attacker can’t learn anything by timing it. This is hard to do by writing C because the compiler tries to be clever: by optimizing your code, it can “accidentally” make its timing input-dependent.
Even more commonly: reading assembly is an important diagnostic skill. When something goes wrong, sometimes reading the assembly is the only way to track down the root cause. If it’s a performance problem, for example, understanding the source code only gets you so far. If it’s a compiler bug (and compilers do have bugs!), then debugging is hopeless unless you can read assembly.

For these reasons and others, it is important to know how to read and write assembly code. We will program in RISC-V during this semester, but the skills you learn as a RISC-V programmer will translate to other ISAs such as ARM and x86.

Recap: Assembly Principles

First, recall some programs we wrote in our two-instruction FemtoProc ISA, like this:


wrimm A, 8
wrimm B, 3
add A
add B

The important thing to keep in mind is that there are two ways of thinking about each assembly program: it is either a shorthand for machine code, or it is a programming language. The first interpretation says that each line is a mnemonic for some bits that control how the processor works. The second says you can forget about the processor and pay attention only to the ISA manual.

Under the first interpretation, remember that each FemtoProc instruction was 8 bits, so we can also represent this program as a string of binary numbers. There is a 1-1 correspondence between lines in our assembly program and 1-byte instructions in the machine code. In hex: c8 33 80 20. Sometimes, it is helpful to see both the text form and the hex representation of the bytes side by side, like this:


c8  wrimm A, 8
33  wrimm B, 3
80  add A
20  add B

Under the second interpretation, we just want to understand what this program computes. My #1 recommendation for understanding assembly code is to translate it into pseudocode first and then read it. For example, we can translate our example line by line into pseudo-Python:


a = 8
b = 3
a = a + b
b = a + b

where the a and b variables correspond to the two FemtoProc registers. To understand what this program actually does, we can try simplifying the Python code:


a = 8
b = 3
a = 8 + 3
b = 11 + b

So we know that this program puts $8+3=11$ into register A and $11+3=14$ into register B.

Let’s See Some RISC-V Assembly

To get started, let’s look at some RISC-V assembly code. I mentioned already that, every time you have typed gcc so far this semester, you have been invoking a compiler whose job it is to translate your C into machine code. We can ask it to instead stop at the assembly and print that out using the -S command-line flag.

Let’s start with an extremely simple C program:


unsigned long mean(unsigned long x, unsigned long y) {
    return (x + y) / 2;
}

To see the assembly code, try a command like this:


$ rv gcc -O1 -S mean.c -o mean.s

The -S tells GCC to emit assembly, and -o mean.s determines the output file. I’m also using some optimizations, with -O1, that clean up the code somewhat (in addition to making the code faster, it also makes the assembly more readable). This is just a text file, so you can open it in the same editor you use to write C code. Try opening it up.

There’s a lot going on in this output, but let’s zoom in on these 3 lines:


add     a0,a0,a1
srli    a0,a0,1
ret

This is a sequence of 3 assembly instructions. Just like in our FemtoProc ISA, each one works like a statement in a “real” programming language, and it describes a single, small action for the program to take. Even though we don’t know what these instructions do, we can puzzle through what this code does:

add probably adds two numbers together. Which is good, because that’s what our original C program does first.
srli is a little more mysterious. It turns out that this mnemonic stands for shift right logical immediate. The important part is that this is a bitwise right shift. So the compiler has cleverly decided to use something like >> 1 instead of / 2.
ret returns from the function.

The takeaway here is that our “second interpretation” of assembly code works for RISC-V too. We can think of it as an extremely primitive programming language and understand the code that way, forgetting about the fact that each instruction corresponds to some control bits that orchestrate the circuitry in a processor.

A Look at the Bits

Now let’s return to the first interpretation of assembly code: it’s a roughly 1-1 reflection of the (binary) machine code for a program that actually executes. Let’s look at those bits.

Object Files and Disassembly

We can translate our .s assembly code into machine by assembling it. Try this command:


$ rv gcc -c mean.s -o mean.o

The -c flag instructs GCC to just compile the code to an object file (with the .o extension), and not to link the result into an executable. (You can also ask GCC to go all the way from C to a .o in one step if you want; just provide the .c file as the input and remember to use -c.)

You could look directly at this object file with xxd mean.o if you want, but that’s not very informative. It’s more useful to disassemble the code in this file so you can see the text form of the instructions. (Disassembling is the opposite of assembling: it’s a translation from machine code back to assembly code.) Our container comes with a tool called objdump that can do this:


$ rv objdump -d mean.o

The important part of the output is:


0000000000000000 <mean>:
   0:   00b50533                add     a0,a0,a1
   4:   00155513                srli    a0,a0,0x1
   8:   00008067                ret

Here’s how to read this output:


function address <function name>:
 addr:  machine code           assembly instruction

On the right, we see the same three instructions in the textual assembly format. On the left the tool is also printing out the hex form of the machine code (and the corresponding address). For example, the first instruction consists of the bytes 00b50533, starting at address 0. In RISC-V, every instruction is exactly 4 bytes long, so the next instruction starts at address 4.

Raw Machine Code

The .o object files that our compiler produces don’t just contain machine code; they also contain other metadata to make linking possible. Sometimes (like on this week’s assignment), it is useful to have a “raw” binary file just containing the instructions. In the CS 3410 container, we have provided a convenient command that makes it easy to produce these raw files, called asbin.

Let’s put just the instructions we want into a new file:


add a0, a0, a1
srli a0, a0, 1
ret

Try this command:


$ rv asbin mean.s

Then take a look at the bytes:


$ xxd mean.bin
00000000: 3305 b500 1355 1500 6780 0000            3....U..g...

You can see the bits for same 4-byte instructions here, with a twist. The bytes are backward, for a reason we’ll explain next (named endianness).

For the curious only: our little asbin script just runs a couple of commands. You can run them yourself too:


$ as something.s -o something.o
$ objcopy something.o -O binary something.bin

The objcopy command is a powerful tool for converting between binary file formats, but we just need it to do this one thing. We just thought this was common enough in CS 3410 that it would be handy to have a single command to do it all.

Endianness

The reason the instruction bytes appear backward in the file is because of a concept called endianness or byte order. Different computers have different conventions for how to order the bytes within a multi-byte value. For example, in RISC-V, both int and instructions are 4 bytes—which order should we put those bytes into memory?

The options are:

Big endian: The “obvious” order. The most-significant byte goes at the lowest address.
Little endian: The other order. The least-significant byte goes at the lowest address.

Fortunately or unfortunately, most modern computers use little endian. That includes all of x86, ARM, and RISC-V (in their most common modes). That’s why the lowest byte in our instructions appears first when we look at the binary file with xxd. File I/O routines will hide this different from you, so if you read an int from a file, it will put the bytes in the right order by the time your program sees the bytes.

Why are these called big and little “endian”? It’s one of the all-time great examples of computer scientists being terrible at naming things: these names come from the 1726 novel Gulliver’s Travels by Jonathan Swift, from a part about a war between people who believe you should crack an egg on the big end or the little end.

RISC-V Assembly Basics

Let’s cover a few fundamental concepts that RISC-V will use for every instruction. We will break down this instruction from our example:


add a0, a0, a1

Registers

There are 32 registers. RISC-V names them x0 through x31. We’re using the 64-bit version of the RISC-V ISA, so each register holds a 64-bit value.

Alternative Names for Registers

While all the registers just hold bits, there are conventions about how each one is usually used. To help remind you of these purposes, RISC-V also gives the instructions alternative symbolic names. Wikipedia has a detailed table with all of these names that I won’t reproduce here. Here are some register names that will be relevant immediately:

x0 is also known as zero. It is unique among all RISC-V registers because it cannot be written: it always holds the all-0s value. If you try to update this register, the write is ignored. Having quick access to “64 zeroes” turns out to be useful for many programs.
x10 through x17 are also known as a0 through a7.
x5, x6, x7, and x28 through x31 are also known as t0 through t6.
x8, x9, and x18 through x27 are also known as s0 through s11.

The latter 3 sets of registers (aN, tN, and sN) have subtly different conventions that have to do with function calls, which we’ll cover later. For now, however, you can think of them as interchangeable places to put values when we’re operating on them. You absolutely do not need to memorize the alternative names for every register—you just need to know that there are multiple names. This way, you know that our instruction above is exactly equivalent to:


add x10, x10, x11

…because it just uses different names for the same registers. These alternate names are just an assembly language phenomenon (i.e., for human readability), and the machine code for these two versions looks exactly the same.

Three-Operand Form

Most RISC-V instructions take three operands, so they look like this:


<name> <operand>, <operand>, <operand>

The name tells us what operation the instruction should do, and the three operands tell us what values it will operate on. So our example is an add instruction, with three register operands: a0, a0, and a1.

In these three-operand instructions, the first one is the destination register and the second two are the source registers. You’ll sometimes see the format off the add instruction written like this:


add rd, rs1, rs2

The mnemonic is that r* are register operands, d means destination, and s means source. So our instruction add a0, a0, a1 adds the values in a0 and a1 and puts the result in a0. It is allowed, and extremely common, for the same register to be used both as a source and a destination.

Using the Manual

Working with assembly code entails reading the manual. A lot. In other languages, you can quickly build up an intuition for what all the basic components mean. In assembly languages, there are usually so many instructions that you need to look them up continuously. Expect to work with assembly with your code in one hand and the ISA manual in the other.

Navigate to this site’s RISC-V Assembly resource page. I recommend using the RISC-V reference card linked there all the time. In rare circumstances where you need more details, you can use the (very long) specification document. I’ll refer to the reference card here.

The first page of the reference card tells us what each instruction means. To understand our add instruction, we can find it on the list to see the format, a short English description, and a somewhat cryptic pseudocode description of the semantics.

The second page tells us how to encode the instruction as actual machine-code bits. We’ll cover the encoding strategy next.

Instruction Encodings

Every assembly instruction corresponds to a 32-bit value. This correspondence is called the instruction encoding.

For example, we know that the add instruction we’re working with, when assembled, encodes to the value 0x00b50533. Why those particular bits?

In RISC-V, instruction encodings use one of a few different formats, which it calls “types.” You can see a list of all the formats on the second page of the reference card: R-, I-, S-, B-, U-, and J-type (another list that you should not attempt to memorize). Each format comes with a little diagram mapping out the purpose of each bit in the 32-bit range.

Add Instruction

add is an R-type instruction (so named because all the operands are registers). Reading from the least-significant to most-significant bits, the map of the bits in an R-type instruction consists of:

7 bits for the opcode. The opcode determines which instruction this is. The reference card tells us that the opcode for add is 0110011, in binary.
5 bits for rd, the destination register. It makes sense that the register is 5 bits because there are a total of $2^5=32$ possible registers. So to use destination register x10, we’d put the binary value 01010 into this field.
3 function bits. (We’ll come back to this in a moment.)
The first source register operand, rs1. Also 5 bits.
The second source register, rs2. 5 bits again.
7 more function bits.

In RISC-V, the function bit fields—labeled funct3 and funct7—specify more about how the instruction should work. They’re kind of a supplement to the opcode. For example, the table tells us that add and sub (and many others) actually share an opcode, and the bits in funct3 and funct7 tell us which operation to perform. To encode an add, set all the bits are zero.

So now we can describe exactly how to encode our example instruction, add x10, x10, x11. Again starting with the least-significant bits:

The opcode (7 bits): 0110011.
rd (5 bits): decimal 10, binary 01010.
funct3 (3 bits): 000.
rs1 (5 bits): decimal 10, binary 01010 (again).
rs2 (5 bits): decimal 11, binary 01011.
funct7 (7 bits): 0000000.

Try stringing these bits together and converting to hex. You should get the hex value the assembler produced for us, 0x00b50533. Some handy tools for doing these conversions include:

Bitwise, an interactive tool that runs in your terminal for experimenting with data encodings.
The macOS Calculator app. Press ⌘3 to switch to “programmer mode.”

Add-Immediate Instruction

To try another format, consider this instruction:


addi a0, a1, 42

This add-immediate instruction is different from add because one of the operands isn’t a register, it’s an immediate integer. The reference card tells us that this instruction uses a different format: I-type (the I is for immediate). The distinguishing feature in this format is that the most-significant 11 bits are used for this immediate value. (This field replaces the funct7 and rs2 fields from the R-type format.)

If we assemble this instruction, we get the 32-bit value 0x02a58513. The interesting part is the top 12 bits, which are 00000010 1010 or, in decimal, 42.

Let’s Write an Assembly Program

Let’s try out our new reading-the-manual skills to write an assembly program from scratch. Our program will compute $(34-13) \times 2$ . We’ll implement the multiplication with a left shift, so our program will work like the C expression (34 - 13) << 1.

When writing assembly, it can help to start by writing out some pseudocode where each statement is roughly the complexity of an instruction and all the variables are named like registers. Here’s a Python-like reformatting of that expression:


a0 = 34
a1 = a0 - 13
a2 = a1 << 1

I’ve used three different registers just for illustrative purposes; we could definitely have just reused a0.

Let’s translate this program to assembly one line at a time:

We need to put the constant value 34 into register a0. Remember the add-immediate instruction? And remember the special x0 register that is always zero? We can combine these to do something like a0 = 0 + 34, which works just as well. The instruction is addi a0, x0, 34.
Now we need to subtract 13. Let’s look at the reference card. There is no subtract-immediate instruction… but we can add a negative number. Let’s try the instruction addi a1, a0, -13.
Finally, let’s look for a left-shift instruction in the reference card. We can find slli, for shift left logical immediate. The final instruction we need is slli a2, a1, 1.

Here’s our complete program:


addi a0, x0, 34
addi a1, a0, -13
slli a2, a1, 1

To try this out, we could compile it to machine code, but this would be a little hard to work with because we’d need to craft the assembly code to print stuff out. (We’ll cover more about how to do this over the coming weeks.) Instead, a handy resource that you can find linked from our RISC-V assembly resources page is this online RISC-V simulator. Try pasting this program into the web interface and clicking the “Run” or “Step” buttons to see if we got it right: i.e., that the program puts the result $(34-13) \times 2$ into register a2.