ECE/CS 314: Computer Organization Fall 2004

Project 4: MIPS Processor (Part 3: The Completed Processor)
Due: December 14, 2004


  Path
 

You will be using mipsgen and irsim.314 for this part, so make sure that /usr/class/ececs314/bin is in your PATH.

  Basics
 

Please read this document completely before starting.

  • The bootcode test for this final part was included with the rest of the files you should already have.
  • You will turn in this project electronically. See the end of this document for specific details.
  • The consultants can help you when you encounter problems. The consulting schedule is posted on the course web page. If you still have problems after talking to the consultants, you can try the course newsgroup, e-mail your questions to one of the TAs, or schedule an appointment.

  Part 3: The Complete Processor

In this final part of the project you will complete the implementation of your pipelined processor. Implement the control logic and the remainder of your datapath. The processor must be able to execute the boot code, jump to the assembly code you've written, and execute instructions without any intervention (other than stepping through cycles).

Template file. The template file for this project is cpu.cast, which was one of the files provided as part of Project 4, part 2. Read this file carefully. It contains posflops for the program counter and instruction register IR. It also instantiates a register file, and contains flops for the output of the register file. The definition of CPU contains the nodes that are used by the instruction and data memory interface (see below).

You may modify cpu.cast by filling in the missing parts, so that instantiating a CPU creates the complete datapath.

You may not alter the prototype of cpu.cast or remove or rename any lines or nodes that were included in the file.

Note that the code that instantiates the CPU is at the end of cpu.cast. Do not modify this part because it contains the memory interface logic. Your register file must be referenceable as c.r.  The PC must be referenceable as c.PC, and the pipeline registers must be referenceable as c.PC1, c.PC2, c.IR1, c.IR2, etc.

We have supplied some additional definitions and tools that you can use in your implementation of this part of the project. We have also made available a correct (but slow) ALU in

/usr/class/ececs314/alu.cast

and

/usr/class/ececs314/alu_include.cast

If your ALU has problems, feel free to use ours instead (it's slow, though).

Processor-Memory Interface: The instruction and data memory are supplied by external modules. In order to interface to them in your CAST files, you will need to follow the steps below:

  • Interface to external memory: The following nodes should be defined in your top level CAST file (cpu.cast):
    • Inputs:
      node[32] Iin; // Bus for transferring instruction word from memory
      node[32] Din; // Bus for transferring data word from memory
    • Outputs:
      node[32] Iaddr; // Instruction address
      node[32] Daddr; // Data address
      node[32] Dout; // Bus containing data to write to memory
      node DMC0, DMC1; // Data memory control
      node IMC0, IMC1; // Instruction memory control
      Drive DMC0 and DMC1 with the following values to access memory:
       
      DMC0  DMC1  Action
      0     0     read word
      0     1     write byte
      1     0     write half-word
      1     1     write word

      The instruction control bits behave in an identical fashion, but you can leave them set to 0 since all you have to do is read instruction words.
    • Write-pulse Circuit: We have provided 314/memIF.cast to help you with the data memory control. The definition takes memory control signals as inputs, and produces clean memory signals (no glitches) as outputs that can be connected to the data memory. Make sure the inputs to memIF are generated from the pipeline registers (i.e. positive-edge triggered flops). The write signals will go high for only part of the cycle (after the negative edge) as discussed in class. Look at the comments in $CAST_HOME/cast/314/memIF.cast to see how the block should be used.
    • Creating a test file: Use the supplied Makefile to compile your own assembly file (test.s) into a memory image. Name the image file it generates mem.image and place it in the directory where you have your CAST files. The image file is a plain text file consisting of pairs of address and data. You can write your own image files to test the memory operations in your datapath.
    • Compiling and Execution. To compile the cast file use the command:
      mipsgen
      (This is similar to running prs2sim on cpu.cast, but it also adds the memory interface.) This will produce cpu.sim and cpu.al. To run the simulation use:
      irsim.314 cpu.al cpu.sim
      The instructions and data will now be taken from your mem.image file.

      Note: Both mipsgen and irsim.314 are in /usr/class/ececs314/bin. You will need to add this directory to your PATH, or use full pathnames for these commands.
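
The DMC0/DMC1 table above can be captured as a small lookup, which may be handy as a reference model when checking your control logic by hand. This is only a sketch: the operation labels and the helper dmc_for are names invented for this example, and the mapping from sb/sh/sw to the three write actions is an assumption based on the instruction names.

```python
# Data memory control encoding, copied from the DMC0/DMC1 table.
# The operation labels ("read_word", ...) are made up for this sketch.
DMC_ENCODING = {
    "read_word": (0, 0),
    "write_byte": (0, 1),
    "write_half": (1, 0),
    "write_word": (1, 1),
}

# Assumed mapping from store mnemonics to memory actions.
STORE_OPS = {"sb": "write_byte", "sh": "write_half", "sw": "write_word"}


def dmc_for(mnemonic):
    """Return (DMC0, DMC1) for an instruction mnemonic.

    Anything that is not a store leaves the data memory in read-word mode.
    """
    return DMC_ENCODING[STORE_OPS.get(mnemonic, "read_word")]
```

For example, dmc_for("sh") gives (1, 0). Since the table offers only a word read, sub-word loads such as lb and lh presumably have to select and extend the right byte or half-word inside the datapath.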

HINT: Instruction memory always reads words.  IMC0 and IMC1 should therefore always be set to ???.

  Testing and Debugging
 

The basic testing strategy is as follows:

  1. Generate .sim and .al files by running 'mipsgen'. 
  2. Write a .s file that will test some set of instructions.  Look at basic_test.s for an example.
  3. Copy this file to test.s and run 'gmake' to generate test.image.
  4. Copy test.image to mem.image.
  5. Create a .cmd file that will display all the important signals you wish to watch (like pc, inst, registers, etc.).  Use basic_test.cmd for an example.  Asserts are very useful for debugging.
  6. Run 'irsim.314 cpu.sim cpu.al'.  This runs the latest cpu with the contents of mem.image in memory.  There will be about 40 instructions' worth of boot code included at the beginning of the image.
  7. Run 'gmipc', along with irsim as in the last step, and open test.image.  Compare the results in irsim to the correct functionality shown in gmipc. 
  8. Make the appropriate corrections to your cpu.
  9. Repeat steps 1, 6, 7, and 8 until this test passes (and no corrections need to be made for this test).
  10. Repeat starting from step 2.

In general, you need to rerun step 1 any time you make corrections to your cast files.  However, you should be able to run different image files without running mipsgen each time, assuming that no changes have been made to the .cast files.  You can copy any image file to mem.image to run it with your processor.  For instance, you may want to run msort from Project 1.  Feel free to edit the Makefile so that it makes more than just test.s.  Repeat the above steps until your processor works.
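
The rule for when mipsgen has to be rerun can be checked mechanically by comparing file modification times. The sketch below is not part of the supplied tools: needs_mipsgen is a hypothetical helper, and it assumes that cpu.sim and all of your .cast files live in the same directory. It does not look at shared files such as those under $CAST_HOME.

```python
import os


def needs_mipsgen(directory="."):
    """Return True if cpu.sim is missing or older than any .cast file."""
    sim_path = os.path.join(directory, "cpu.sim")
    if not os.path.exists(sim_path):
        return True
    sim_mtime = os.path.getmtime(sim_path)
    for name in os.listdir(directory):
        if name.endswith(".cast"):
            cast_mtime = os.path.getmtime(os.path.join(directory, name))
            if cast_mtime > sim_mtime:
                return True
    return False
```

Calling needs_mipsgen() from your project directory before starting irsim.314 is one way to avoid simulating a stale cpu.sim.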

You should first try to get your processor working without bypassing.  Once the basic implementation works, make a backup copy, and then try adding bypassing.  You're better off submitting a cpu without bypassing that works, rather than a broken one with bypassing.  Make a backup of your processor before adding any extra credit components as well.

  Grading Criteria
 

Most of your Project 4c grade will be computed automatically, based on correctness. We released a basic test script that tests the startup and a few basic instructions on your chip.  This guarantees that we are able to test your chip automatically.

If your chip does not pass the basic test you could get no (0) correctness points!
The files basic_test.s and basic_test.cmd are provided to you.  Please check for irsim errors indicating that basic_test.cmd cannot attach to the buses or nodes.  These files are also a good starting point when writing your own test cases.

(Hint: The boot test does not thoroughly test the instructions it contains, but our real tests will!  The boot test only checks that your boot code works and that we will be able to grade your project.)

Extra Credit

Here are a couple of extra-credit extensions you can make to your processor.  You need to provide the appropriate documentation (make sure the file name is exactly as asked) if it is requested below.

Note: If you want to do one of the extra credit options, it might be wise to get the basic processor working first and submit it to CMS. If you then implement an extra credit option, you can always submit again. In the absence of desperate emails, we use your latest submission for grading purposes.

LWL and LWR Instructions (4 points)

These two instructions are used together to load an unaligned word from memory. The description in the text (p. A-66) is altogether inadequate, but the online MIPS documentation is fine.

Note that LWL and LWR use register rt as both a source and destination. Think about your forwarding/bypass paths!
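
As a reference for what the pair is supposed to compute, here is a small software model. It is a sketch under the assumption of big-endian byte ordering; the function names and the read_word callback (which returns the aligned 32-bit word at a given address) are invented for this example and are not part of the project files. Note how both functions take the old value of rt as an input, which is why rt is a source as well as a destination.

```python
MASK32 = 0xFFFFFFFF


def lwl(rt, addr, read_word):
    """Big-endian LWL: merge the bytes from addr to the end of its aligned
    word into the high-order end of rt."""
    shift = 8 * (addr & 3)
    word = read_word(addr & ~3)
    keep = (1 << shift) - 1  # low-order bits of rt that stay unchanged
    return ((word << shift) & MASK32) | (rt & keep)


def lwr(rt, addr, read_word):
    """Big-endian LWR: merge the bytes from the start of the aligned word
    through addr into the low-order end of rt."""
    shift = 8 * (3 - (addr & 3))
    word = read_word(addr & ~3)
    keep = (MASK32 << (32 - shift)) & MASK32  # high-order bits of rt kept
    return (word >> shift) | (rt & keep)
```

Under this model, an unaligned word at address A is loaded with lwl at A followed by lwr at A + 3, each call passing in the register value left by the previous one.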

If you implement these instructions, you don't need to do anything special to let us know. Our auto testing script assumes that if the LWL and LWR instructions work you must have implemented them for extra credit.

Carry-Lookahead Adder (5 points)

We cannot tell by testing whether your ALU implements carry-lookahead or not. This is the one case in which we grade manually. You must include a documentation file named CLA.pdf describing your carry-lookahead implementation (including which files the grader needs to look in) if you want credit for the carry-lookahead option. See the submission instructions below.
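
For reference, the idea behind carry-lookahead can be modeled in a few lines. The sketch below is a 4-bit software model, not CAST and not a description of how your ALU must be organized; cla4 is a name chosen for this example. Each carry is written directly in terms of the generate and propagate signals and the carry-in, instead of rippling from bit to bit.

```python
def cla4(a, b, cin=0):
    """4-bit carry-lookahead add; returns (sum, carry_out)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]    # generate
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(4)]  # propagate
    # Every carry is a two-level function of g, p and cin (no ripple).
    c1 = g[0] | (p[0] & cin)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & cin)
    c3 = (g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
          | (p[2] & p[1] & p[0] & cin))
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0])
          | (p[3] & p[2] & p[1] & p[0] & cin))
    carries = [cin, c1, c2, c3]
    total = sum((p[i] ^ carries[i]) << i for i in range(4))
    return total, c4
```

A 32-bit adder would typically combine several such blocks, either by rippling between blocks or with a second level of lookahead; which one you choose is a design decision worth describing in CLA.pdf.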

Performance (5 points)

In order to get the full 5 points, your processor should run at 5 ns.  Anything better than 12 ns will get some extra credit.  Your bypassing logic must work in order to qualify for this extra credit (if it doesn't, then your cpu is artificially faster, since the bypassing is part of the longest delay).  No documentation is needed.

Exceptions (5 points)

Here is a simple design for precise exceptions, in which there is no distinction between Kernel mode and User mode, and no TLB. You need to detect only two kinds of exceptions: arithmetic overflow for add and sub instructions, and reserved instruction, which should be raised for any unimplemented instruction (e.g. multiply or divide or FPU operations).
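
The overflow condition itself depends only on the sign bits of the operands and the result. The following is a sketch of the standard two's-complement test, written as a software model; the function names are chosen for this example, and operands are assumed to be 32-bit register values given as non-negative integers.

```python
MASK32 = 0xFFFFFFFF


def sign(x):
    """Sign bit (bit 31) of a 32-bit value."""
    return (x >> 31) & 1


def add_overflows(a, b):
    """Signed overflow on a + b: operands agree in sign, result does not."""
    result = (a + b) & MASK32
    return sign(a) == sign(b) and sign(result) != sign(a)


def sub_overflows(a, b):
    """Signed overflow on a - b: operands differ in sign and the result's
    sign differs from that of a."""
    result = (a - b) & MASK32
    return sign(a) != sign(b) and sign(result) != sign(a)
```

For instance, add_overflows(0x7FFFFFFF, 1) is True, while add_overflows(0xFFFFFFFF, 1) (that is, -1 + 1) is False.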

You need to implement only two registers from CP0:

  • EPC, the Exception PC register, which is CP0 register number 14.
  • Cause, which is CP0 register number 13. You need to implement the following bits:
    • BD, the Branch Delay bit (bit 31, the high-order bit).
    • ExcCode, the Exception Code field (bits 6-2). The only values that will occur here are 10 (RI, Reserved Instruction) and 12 (Ov, arithmetic Overflow). Both these values are decimal.
    • All other bits should be 0 when read with the mfc0 instruction described below.
You can just implement these registers “in place”; there is no need to create a separate CP0 module.

When one of the exception conditions is detected, your processor should cancel instructions as necessary to give precise interrupt behavior. It should store the appropriate values in the EPC and Cause registers, and then continue fetching instructions from the exception handler location 0x80000180. The “appropriate values” for EPC and Cause are:

  • Arithmetic overflow not in a delay slot:
    EPC = instruction address, Cause = (BD=0, ExcCode=12).
  • Reserved instruction not in a delay slot:
    EPC = instruction address, Cause = (BD=0, ExcCode=10).
  • Arithmetic overflow in a delay slot:
    EPC = address of branch instruction (i.e., instruction address - 4), Cause = (BD=1, ExcCode=12).
  • Reserved instruction in a delay slot:
    EPC = address of branch instruction (i.e., instruction address - 4), Cause = (BD=1, ExcCode=10).
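
The four cases above reduce to one small computation, sketched here as a software model. The function and constant names are chosen for this example; the bit positions (BD in bit 31, ExcCode in bits 6-2) and the codes 10 and 12 are the ones given in this section.

```python
EXC_RI = 10  # Reserved Instruction
EXC_OV = 12  # Arithmetic Overflow


def exception_state(instr_addr, exc_code, in_delay_slot):
    """Return (EPC, Cause) for an exception raised by the instruction
    at instr_addr."""
    bd = 1 if in_delay_slot else 0
    # In a delay slot, EPC points at the branch, one word earlier.
    epc = (instr_addr - 4) if in_delay_slot else instr_addr
    cause = (bd << 31) | (exc_code << 2)  # BD in bit 31, ExcCode in bits 6-2
    return epc & 0xFFFFFFFF, cause
```

For example, an overflow at address 0x00400010 outside a delay slot gives EPC = 0x00400010 and Cause = 0x00000030; a reserved instruction at the same address inside a delay slot gives EPC = 0x0040000C and Cause = 0x80000028.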

You will receive partial extra credit even if you do not get around to handling exceptions in branch delay slots correctly.

You need to implement (a subset of the functionality of) the mfc0 instruction

      mfc0  rt,rd
which reads CP0 register rd into general register rt. You only need to handle EPC and Cause, so, for example, it is fine to read EPC if rd is even and to read Cause if rd is odd.
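
The decoding involved can be sketched as follows. The field layout used here is the standard MIPS encoding of mfc0 (opcode 0x10 in bits 31-26, zero in the rs field, rt in bits 20-16, rd in bits 15-11); the requirement that the low 11 bits be zero, and the function names, are assumptions of this example rather than something the handout specifies.

```python
def decode_mfc0(word):
    """Return (rt, rd) if the 32-bit word encodes mfc0, else None."""
    opcode = (word >> 26) & 0x3F
    rs = (word >> 21) & 0x1F
    # Assumption: the low 11 bits must be zero for a valid mfc0.
    if opcode != 0x10 or rs != 0x00 or (word & 0x7FF) != 0:
        return None
    return (word >> 16) & 0x1F, (word >> 11) & 0x1F


def read_cp0(rd, epc, cause):
    """Simplified selection permitted above: even rd reads EPC,
    odd rd reads Cause."""
    return cause if rd & 1 else epc
```

With this layout, mfc0 $26,$14 assembles to 0x401A7000 and decodes to (26, 14), and read_cp0(14, epc, cause) returns epc.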

Because there is no Kernel/User Mode distinction, and no way to mask or disable interrupts, you do not need to implement the rfe instruction. Your processor can return from an exception using a jr instruction, jumping to the address that was placed in the EPC register when the exception was raised.

If you implement exceptions (either with or without branch-delay-slot detection), please include a file EXC.pdf describing your implementation (in particular, tell us whether you handle branch delay slots correctly) and your testing strategy.

Note: Because you have not implemented the Interrupt Enable bit(s) in the CP0 Status Register, this design is not quite able to do interrupt-driven I/O. But it's very close!

Note 2: This is a nontrivial extension, and requires substantial additions to the datapath. We suggest you get the rest of your project working (and submitted) before tackling this.

  Submitting Your Project
 

Submission will be done using CMS.

In your ececs314/proj4 directory create a file named README. As usual, the first few lines of this file must contain

NAME: username1 username2
Name of person1
Name of person2
PROJECT 4.3
where username1 and username2 are the netids of you and your partner. The remainder of your README file should contain documentation explaining any special features of your design.

If your documentation is not suitable for a plain text file, create a separate file DOCUMENTATION.pdf and put your documentation there. Make a note in your README file that you have done this.

You will submit a single file: a tar archive containing all your files. Construct this as follows:

proj4ctar.314
This will create your solution file proj4c_valid_sub.tar.gz, which you should submit to CMS.  This time the script takes all .cast, .cmd, .s, and .pdf files.

  List of Instructions
 

The complete list of required MIPS instructions is: j, jal, beq, bne, blez, bgtz, addi, addiu, slti, sltiu, andi, ori, xori, lui, lb, lh, lw, lbu, lhu, sb, sh, sw, sll, srl, sra, sllv, srlv, srav, add, addu, sub, subu, and, or, xor, nor, slt, sltu, jr, jalr, bltz, bgez, bltzal, bgezal.