PortOS Project 6

File System


Overview

For the final project, you will implement a virtual file system to work with your minithreads package. We have provided a block-based disk interface in disk.h, which simulates a disk by translating block reads and writes to accesses to a single Windows NT file.

You should implement a hierarchical, Unix-like file system on top of the disk emulator that supports the following operations:

minifile_t minifile_creat(char *filename)
minifile_t minifile_open(char *filename, char *mode)
int minifile_read(minifile_t file, char *data, int maxlen)
int minifile_write(minifile_t file, char *data, int len)
int minifile_close(minifile_t file)
int minifile_unlink(char *filename) (delete file)
int minifile_mkdir(char *dirname)
int minifile_rmdir(char *dirname)
int minifile_stat(char *path) 
int minifile_cd(char *path) 
char **minifile_ls(char *path) 
char *minifile_pwd()

To get an idea of what these functions should do, look them up (omitting the "minifile_" prefix) via man on a Unix system, or in the Visual Studio help. The only exception is that our minifile_open takes arguments like the fopen call, instead of open. Don't worry about reporting detailed error codes when something goes wrong; returning -1 from the function (or some other appropriate error value) is good enough. Your file system should support variable-sized files via a Unix-like inode mechanism, and should reuse blocks from unlinked (deleted) files. It is vital that your file system has concurrency control, so that it can cope with simultaneous accesses by multiple threads.
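As a rough illustration of how an inode can support variable-sized files, the sketch below maps a byte offset within a file to the pointer slot that would hold its block number: either one of the inode's direct pointers or an entry in a single indirect block. The block size, the number of direct pointers, and every name here are assumptions made for the example. None of them come from disk.h, and your layout may well differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout constants; choose your own for the real design. */
#define BLOCK_SIZE      4096
#define NUM_DIRECT      11
#define PTRS_PER_BLOCK  ((long)(BLOCK_SIZE / sizeof(int32_t)))

typedef enum { SLOT_DIRECT, SLOT_INDIRECT, SLOT_OUT_OF_RANGE } slot_kind_t;

typedef struct {
    slot_kind_t kind;
    long index;   /* index into the direct array or into the indirect block */
} slot_t;

/* Find which pointer slot covers the given byte offset. */
slot_t inode_locate(long offset)
{
    slot_t s = { SLOT_OUT_OF_RANGE, -1 };
    long blk;

    if (offset < 0)
        return s;

    blk = offset / BLOCK_SIZE;            /* logical block number in the file */
    if (blk < NUM_DIRECT) {
        s.kind = SLOT_DIRECT;
        s.index = blk;
    } else if (blk - NUM_DIRECT < PTRS_PER_BLOCK) {
        s.kind = SLOT_INDIRECT;
        s.index = blk - NUM_DIRECT;
    }
    /* Postcondition: any slot we report has a valid index. */
    assert(s.kind == SLOT_OUT_OF_RANGE || s.index >= 0);
    return s;
}
```

With these assumed constants, a file can grow to (11 + 1024) blocks before a double-indirect level would be needed.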

If you stick to the interface above, you should be able to compile and run the shell program included with this version of the code. You should then be able to create files and directories from the command line.

The Details

The disk simulator is relatively straightforward. To create a disk, call the disk_create() function and provide a name for your disk. You can also specify some disk flags to control disk behavior, and give a maximum size for the disk. Use disk_startup() to spin up the disk you have created.

To begin using the disk, just issue disk requests through the disk_send_request() function. The format of the requests is shown in disk.h. When (and if) requests complete, the disk controller signals the completion by raising an interrupt. As with previous assignments, you need to write an interrupt handler that will handle these interrupts appropriately.

Recall from the course discussion that the disk controller may reorder your requests arbitrarily. In fact, an efficient controller will reorder requests quite aggressively. Consequently, if you have a series of blocks that must be written in a well-defined order (e.g. block A before B before C), then your file-system code must not issue the request for B until the request for A has completed.
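One way to enforce such an ordering is to keep at most one request of the chain outstanding at a time. The sketch below is a small state machine along those lines. It deliberately does not call disk_send_request(), whose request format is defined in disk.h. Instead, chain_poll() only tells the caller which block may be submitted next, and chain_complete() is what a disk interrupt handler might call when the outstanding request finishes. The type and function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical bookkeeping for a sequence of writes that must hit the disk in order. */
typedef struct {
    const int *blocks;  /* block numbers, in required order */
    int count;          /* number of entries in blocks */
    int next;           /* index of the next block to submit */
    int in_flight;      /* 1 while a request of this chain is outstanding */
} ordered_chain_t;

void chain_init(ordered_chain_t *c, const int *blocks, int count)
{
    c->blocks = blocks;
    c->count = count;
    c->next = 0;
    c->in_flight = 0;
}

/* Returns the block number that may be submitted now, or -1 if the
 * caller must wait (a request is outstanding) or the chain is finished. */
int chain_poll(ordered_chain_t *c)
{
    if (c->in_flight || c->next >= c->count)
        return -1;
    c->in_flight = 1;
    return c->blocks[c->next];
}

/* To be called when the outstanding request has completed. */
void chain_complete(ordered_chain_t *c)
{
    assert(c->in_flight);
    c->in_flight = 0;
    c->next++;
}

int chain_done(const ordered_chain_t *c)
{
    return !c->in_flight && c->next >= c->count;
}
```

In a threaded implementation the caller would more likely block until the completion arrives rather than poll. The state machine is only meant to make the "never issue B before A completes" rule concrete.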

Make sure to test your code extensively. Simple sequential tasks, such as creating files, creating directories, removing directories, etc. should be easy. But you should also test your code with concurrent accesses as well as failure cases, e.g. five threads are concurrently writing to the disk when a system crash occurs (someone presses control-c). The file-system should not be left in an inconsistent state. You can set the failure rates to non-zero values to have the disk controller experience such errors occasionally, just like a real disk.

Note that not all of the functions are file operations. For instance, the concept of the "current working directory" is not a global abstraction that applies to the file-system, but a piece of local state kept with each process (minithread). Similarly, minifile_pwd() returns the path to the current working directory associated with the calling thread. Your file-system should NOT have any such state as global variables shared across independent processes.
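To make the idea of per-thread working directories concrete, here is a minimal sketch in which each thread owns a small state record, and paths are resolved against that record instead of against a global variable. The struct, the length limit, and the function names are illustrative assumptions. How you attach such state to a minithread is up to your design.

```c
#include <assert.h>
#include <string.h>

#define MAX_PATH_LEN 256   /* hypothetical limit */

/* Hypothetical per-thread file-system state. */
typedef struct {
    char cwd[MAX_PATH_LEN];   /* always an absolute path, e.g. "/" or "/usr" */
} thread_fs_state_t;

/* Apply one path component to the absolute path in out.
 * Handles "", "." and ".."; returns -1 if the result would not fit. */
static int push_component(char *out, size_t cap, const char *comp, size_t len)
{
    size_t n = strlen(out);

    if (len == 0 || (len == 1 && comp[0] == '.'))
        return 0;
    if (len == 2 && comp[0] == '.' && comp[1] == '.') {
        char *slash = strrchr(out, '/');   /* out always starts with '/' */
        if (slash == out)
            out[1] = '\0';                 /* never climb above the root */
        else
            *slash = '\0';
        return 0;
    }
    if (n + (n > 1 ? 1 : 0) + len + 1 > cap)
        return -1;
    if (n > 1)
        out[n++] = '/';                    /* separator, except right after "/" */
    memcpy(out + n, comp, len);
    out[n + len] = '\0';
    return 0;
}

/* Resolve path (absolute or relative) against the calling thread's cwd. */
int path_resolve(const thread_fs_state_t *st, const char *path,
                 char *out, size_t cap)
{
    const char *p = path;

    assert(st->cwd[0] == '/');
    if (cap < 2 || (path[0] != '/' && strlen(st->cwd) + 1 > cap))
        return -1;
    strcpy(out, path[0] == '/' ? "/" : st->cwd);

    while (*p != '\0') {
        const char *end = strchr(p, '/');
        size_t len = end ? (size_t)(end - p) : strlen(p);
        if (push_component(out, cap, p, len) != 0)
            return -1;
        p += len;
        if (*p == '/')
            p++;
    }
    return 0;
}
```

Because the state is passed in explicitly, two threads with different working directories resolve the same relative path to different absolute paths, which is the behavior described above.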

Since you are not asked to support mount points, there should be only one file-system in your implementation. This unique file-system should reside on a virtual disk that uses the file MINIFILESYSTEM in the current directory. Make sure you write a C program, called mkfs.exe, that creates an empty file-system in this virtual disk. This initial file-system created by mkfs.exe should contain only one directory (the root directory) with no entries.
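A formatter also has to record which blocks are free, so that blocks from deleted files can be reused later. The following sketch shows one possible approach, a bitmap with one bit per block, kept in memory for simplicity. The disk size, the idea of reserving the first few blocks for metadata such as a superblock and the root directory, and all names are assumptions for illustration only. The assignment does not prescribe a free-space structure.

```c
#include <assert.h>
#include <string.h>

#define NUM_BLOCKS 64   /* hypothetical disk size, in blocks */

typedef struct {
    unsigned char bits[(NUM_BLOCKS + 7) / 8];   /* bit set = block in use */
} block_bitmap_t;

/* What a mkfs-style tool might do: mark everything free, then reserve
 * the first `reserved` blocks for file-system metadata. */
void bitmap_format(block_bitmap_t *bm, int reserved)
{
    int i;

    assert(reserved >= 0 && reserved <= NUM_BLOCKS);
    memset(bm->bits, 0, sizeof bm->bits);
    for (i = 0; i < reserved; i++)
        bm->bits[i / 8] |= (unsigned char)(1u << (i % 8));
}

/* Allocate the lowest-numbered free block, or return -1 if the disk is full. */
int bitmap_alloc(block_bitmap_t *bm)
{
    int i;

    for (i = 0; i < NUM_BLOCKS; i++) {
        unsigned char mask = (unsigned char)(1u << (i % 8));
        if ((bm->bits[i / 8] & mask) == 0) {
            bm->bits[i / 8] |= mask;
            return i;
        }
    }
    return -1;
}

/* Return a block to the free pool, e.g. when a file is unlinked. */
void bitmap_free(block_bitmap_t *bm, int blk)
{
    assert(blk >= 0 && blk < NUM_BLOCKS);
    bm->bits[blk / 8] &= (unsigned char)~(1u << (blk % 8));
}
```

A real implementation would also need to write the bitmap (or whatever structure you choose) to disk in an order that keeps the file system consistent across crashes.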

Applications may access the filesystem concurrently. There are a few reasonable approaches to how your FS can deal with open/read/write/close concurrency:

Similarly, your FS must define a reasonable set of semantics when multiple applications concurrently access the file and delete or rename it at the same time. For example, suppose a thread is writing to a file when another thread unlinks that file. Once again, you have a few implementation options, with different degrees of desirability:

We will favor Unix semantics in our grading.

It goes without saying that a system crash in response to concurrent operations must be avoided at all costs.

General guidelines for file system design:

How to Get Started

Make a backup copy of your code from the previous project, download the project code (portos6.zip), and merge the two.

We've included a simple command shell in shell.c, which you can link against your minifile implementation. It should enable you to test your code from the command line.

Submissions

Consult the submission guidelines to find out how to submit your work.

For the Adventurous

Note: These suggestions for an extra challenge will be examined but not graded. They will have no impact on the class grades. They are here to provide some direction to those who finish their assignments early and are looking for a way to impress friends and family.

Implement hard links to files. The Unix delete operation is called "unlink" because storing file information in inodes, separate from the directory hierarchy, allows a file to have multiple names, even within the same directory. Every "name" simply points to the one inode for the file. To add an additional directory entry for a file (i.e. give it another name), the "link" system call is used. Unlink is the opposite of link: it removes a name. The implementation is complicated by the fact that you need to keep track of how many links exist to a file, so that you know when to remove it completely.
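The bookkeeping can be sketched with a link count stored in the inode. The version below also tracks open handles, on the assumption that you follow the usual Unix behavior of keeping an unlinked file's storage until its last open handle is closed. The struct fields and function names are hypothetical, and the locking any real implementation needs is omitted.

```c
#include <assert.h>

/* Hypothetical in-memory view of the counters kept for one inode. */
typedef struct {
    int nlink;        /* number of directory entries naming this inode */
    int open_count;   /* number of currently open handles */
} inode_counts_t;

/* True when no name and no open handle refers to the inode any more. */
static int can_reclaim(const inode_counts_t *ino)
{
    return ino->nlink == 0 && ino->open_count == 0;
}

/* A new directory entry now points at this inode. */
void inode_link(inode_counts_t *ino)
{
    ino->nlink++;
}

/* A directory entry was removed; returns 1 if the blocks can be freed now. */
int inode_unlink(inode_counts_t *ino)
{
    assert(ino->nlink > 0);
    ino->nlink--;
    return can_reclaim(ino);
}

void inode_open(inode_counts_t *ino)
{
    ino->open_count++;
}

/* A handle was closed; returns 1 if the blocks can be freed now. */
int inode_close(inode_counts_t *ino)
{
    assert(ino->open_count > 0);
    ino->open_count--;
    return can_reclaim(ino);
}
```

In this sketch, removing the last name of a file that is still open does not free it. The final close does.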

Final Word

If you need help with any part of the assignment, we are here to help. You may also find the FAQ useful.
Emin Gün Sirer, May 2002