CS4411

Practicum in Operating Systems

Project 1: Non-preemptive Multitasking (FAQ)

FAQ Categories

Indirection in the queue functions

Why is it that the queue_prepend() and queue_append() functions accept an any_t parameter, while queue_dequeue() accepts an any_t * item? Since any_t is already a void *, are we supposed to return a pointer to a pointer?

This is C's method of returning multiple results from a function. Remember that C only uses call-by-value, and the return keyword only returns a single type. queue_dequeue() is supposed to return true or false based on its success. It is also supposed to return a pointer to an element. It does the former with a statement like return 0;. It does the latter by taking a pointer to a location, and writing the return value to that location. As we wish to return an any_t, we must pass in an any_t *.

For example, let's say you have a function that is supposed to return height (which in our example is 10) and width (which in our example is 20). If you do the obvious:
int getheightwidth(int width) {
        width = 20;
        return 10;
}

int main() {
        int h = 0,w = 0;

        h = getheightwidth(w);
}
The change you made to width will not be visible to the calling routine - w will still be 0 after the call, because C uses call-by-value parameters.

You can get around this problem by passing in a pointer:
int getheightwidth(int *width) {
        *width = 20;
        return 10;
}

int main() {
        int h = 0,w = 0;

        h = getheightwidth(w);
}
Working through an example and convincing yourself that this approach works is the best thing to do. If you have further questions, please see the course staff during one of the TA hours.

Thread setup and initialization

What to do with the initial context

We were told that we could represent the thread that NT gives us as a minithread. But what I don't see is where we get the NT stack's base and top to initialize our minithread structure. How do we get such information and more importantly, what is the exact purpose of representing the kernel thread as a minithread?

Bootstrapping is a problem for any OS. In minithreads, just like in a real system, you have the problem of going from the initial bootloader context to switching between threads that you have created. In our case, the initial stack assigned to us by NT is the bootloader stack, and you would like to then start context switching between minithreads. There are a couple of things one could do here. For instance, you could take a context_switch out of the initial stack, and never return there. This is OK, but you would in effect be throwing away the entire initial stack, which is wasteful.

A better approach is to use the initial stack provided to you by the bootloader (NT in our case) as if it were a minithread stack. The problem is that you don't know the base or the top of the stack assigned to you by NT. But why do you need to know the base or the top ? You need the base if you want to ever free the stack, and you need the top when you need to initialize the stack. But the boot stack is already initialized, and if you turn the initial boot context into the idle thread or the stack cleaner thread, then it will never terminate. Hence, you'll never need to free its stack.

So it's perfectly OK to create a special TCB for the initial context. In effect, you have the context already, and you are legitimizing it in your threads package by wrapping it in a TCB. Unlike every other TCB, this one may have NULLs for stack base and top, but that's ok. You get the nice property that you do not lose or discard any of the memory available to you, including the initial stack.

Queue functions and behavior

`queue_delete` signature

Also, with the new definition of queue_delete(queue_t, any_t* item), what is the purpose of the any_t * item? Is it so that we can return a ptr to the data of what we are deleting? If this is the case, then how do we identify what it is we are to delete?

In queue_delete, *item is a pointer to the queue element the calling application wants you to delete. queue_delete does not return a pointer, do the (any_t *) is not strictly necessary. So why is there a level of indirection ? For no fundamental reason, only to keep the queue_dequeue interface identical to queue_delete, so people will not make mistakes when they use either of the routines that take items off of a queue.

`queue_iterate` failure cases

Is it a failure case to call queue_iterate on an empty queue?

No. General convention with C is to only consider something a failure case if it results in unexpected or surprising behavior. For example, iterating over a set naturally involves examining each element. Iterating over an empty set would intuitively involve doing nothing.

queue_iterate should return -1 if there is an error with the application of the PFany object.

Should I continue to iterate if a call to the provided function returned -1?

This is a question to which there is no definite answer (however, some are certainly better). We recommend to continue iteration over the elements and return -1. It is up to the caller of queue_iterate to decide whether the result is valid.

queue_iterate does not need to necessarily check for errors in queue structure (for these directly translate to bugs in your code). It is up to the caller to conform to the interface and provide a valid queue.

`queue_free` recursive free

When freeing my queue, should I also free the any_t values stored in the queue?

No. A call to queue_free should only free the datastructures that the queue interface has created. If you wish to free the objects in the queue, you should explicitly dequeue them and then free them.

Error Reporting

What should we do about errors (in non-C jargon, "exceptions") which occur in the functions we write?

It depends on what causes the error. If it's because of bad parameters passed by the caller, or because a data structure is empty when the caller thought it was full, or some similar condition, then you should return an error code indicating what the error is, if necessary. If the error is one which the function or the caller can't deal with gracefully (memory corruption, not enough memory for a malloc, etc), then it may be better to print out as descriptive an error message as possible, and terminate the program, since the error is likely to be unrecoverable.

Assertions

What is the assert() function for? When should I use it?

assert() is for detecting mistakes and inconsistencies when you're writing and debugging code. If the condition in the assert fails, the program prints an error and dumps core (writes the contents of its memory to a file, for non-Unix people). For instance, if at at the start of function A, the value of X should be 2, and the value of Y should be 8, you could put an assert(X == 2 && Y == 8) at the top.

Asserts are useful for detecting when you made an incorrect assumption about a program invariant, or you violated the invariant elsewhere in your code. You should not really use them for catching any caller-caused errors.

Memory Management Errors in C

I seem to be getting some kind of corruption of memory pointed to by pointers. What could be the cause?

There are many causes for memory errors. We list a few common errors here, and a way to find and locate any memory error quite efficiently.

There are several common causes of memory errors:

Using a pointer to an object on a stack frame, which is then deallocated.

Writing past the end of an allocated structure, often caused by allocating too little memory to fit an object.

Freeing the same object more than once.

Here's how the first case can arise in practice: you use a pointer to a local variable on the stack, which you returned from a function. For instance:
char* return_a_string(...) {
    char string[32];
    ...
    return string;
}

int main() {
    char* x = return_a_string(...);
    ...
}
This is a serious, but unfortunately common C mistake, which is often hard to spot. The problem is that a function allocates its local variables on the stack, which is shared by subsequent function invocations. The pointer x ends up pointing to some location on the stack, which may get used for some other data and overwritten by a later function invocation. Returning this kind of a pointer is even more dangerous because you might not notice the bug immediately. If in place of return_a_string we had:

then we'd have a different version of the same problem. Be aware that the problem still occurs if you store one of these pointers-to-the-stack in a data structure which gets used after the function finishes, as well. You just have to be careful.

The second case of memory errors is often caused by code like this:
struct foo { int x; int y};
typedef struct foo *foo_t;

void function(...) {
    ...
    /* this code allocates too small a space for a struct foo */
   foo_t myptr = (foo_t) malloc(sizeof(foo_t));
   myptr->y = 13; // writing past the end of allocated memory
   ...
}
Here, foo_t is a pointer to an object. The amount of memory that has to be allocated to hold a struct foo object is sizeof(struct foo). sizeof(foo_t) will always return the size of a single pointer, (4 bytes on 32-bit architectures, no matter how big the actual object might be).

Finally, if you free the same object several times, the memory allocator's internal data structures may become inconsistent, and may lead to a crash during a subsequent malloc or free. Some memory allocators will catch this kind of memory errors and alert you, especially if you build within a debug as opposed to a release environment; others might silently fail later.

Luckily, debugging memory errors is fairly straightforward most of the time. When the crash occurs, you need to identify which virtual address caused the crash. Then you need to examine the instruction at the crashed program counter to see what type of an access was performed. Then work backwards from that point to determine what kind of an inconsistency in memory is the root cause of the problem (e.g. location 0x4003fdc has an erroneous value of 0xdeadbeef stored in it, which causes a subsequent store to location 0xdeadbeef). Finally, set up a watchpoint such that you can see the value stored at that address, and perform a binary search through the program execution to find the instruction that is creating the inconsistency in the first place. For instance, stop the program halfway through to the crash to see the value stored at location 0x4003fdc. If it's not 0xdeadbeef, the memory access is occuring in the second half of the execution. Using binary search, you should be able to pinpoint the root cause of the problem in log N steps.

Some memory errors require a bit more detective work, especially if the crash is not easily reproducible or "non-deterministic" (by that, most people actually mean that the crash is triggered by external events over which they have little control, such as the timing of interrupts. But keep in mind that most crashes can be made "deterministic" with some effort. For example, recording the PC values at which interrupts occurred, and disabling interrupts at all PC's except those recorded in the crash, will often enable the deterministic reproduction of timing-related errors). Finally, it helps to do the debugging as a team.

Guard Pages

I am getting a "guard page exception" when running my system, what is the reason?

It could be because you are declaring an array as a local variable in a function, which is too big to fit on the stack. Threads have quite small stacks (about 64k). Either increase the stack size (defined by the STACKSIZE constant in minithread_md.h), or make the array smaller.

The Build

Why `nmake`?

Why is it necessary to use "make" to build the PortOS system? Why can't I just add all the sources to a Visual Studio Project?

The emulation of interrupts requires that we specify the order object files are linked in order to allow us to prevent preemptions in NT library code. Visual Studio doesn't allow specifying the order of linking without a makefile. If you do not use the Makefile, you may get random crashes when preemption catches your code in an NT library.

Linux

Can I build with Linux/GCC?

Our emulation of various facilities of the OS is deeply tied to Windows' architecture/API because CSUGLAB machines run Windows. As a result, PortOS must be built on Windows.

Cannot open include file: 'windows.h'

I get a linker error like:

fatal error C1083: Cannot open include file: 'windows.h': No such file or directory

How do I fix this?

Are you compiling from the command prompt? Several of the CSUGLAB machines will not build within Visual Studio, and so we recommend building from the command prompt.

syntax error : identifier 'semaphore_create'

I see the following error:

synch.c(19) : error C2016: C requires that a struct or union has at least one member
synch.c(26) : error C2061: syntax error : identifier 'semaphore_create'
synch.c(26) : error C2059: syntax error : ';'
synch.c(26) : error C2059: syntax error : ')'

What's going on?

You need to populate struct semaphore (even if you do so with a temp variable).
struct semaphore {
    int temp;
};

ml64.exe

I see the following error:

"ml64.exe /c /Zi /Zf machineprimitives_x86_64_asm.S
'ml64.exe' is not recognized as an internal or external command, operable program or batch file.
NMAKE : fatal error U1077: 'ml64.exe' : return code '0x1'"

Is my Visual Studio broken?

Project 1 description has a section on how to get started. Follow these instructions, paying particular attention to configuring the Makefile.

Managing Stacks

Stack Allocation

How does stack allocation work?

Each new thread you create needs its own stack to store function arguments and return addresses on. You can obtain a stack by caling minithread_allocate_stack(), and initialise it by calling minithread_initialize_stack(). Initialising the stack requires two functions: the function the new thread should run, and a "final" function, which runs after the main function has returned. The final function need not be unique for each thread. Both functions have arguments, so that the order of invocation looks like: main_proc(main_arg); finally_proc(finally_arg). The exception is the first thread you create, which gets the original stack for the process, and runs inside minithread_system_initialize(). It still needs a thread control block, however, so that it can participate in context switches correctly. This first thread cannot terminate in the same way as the others: since it's running on the NT stack, when it returns from main(), it will destroy the whole process.

Freeing Stacks

Why can't a thread free its own stack?

When a thread announces it wants to terminate, by calling the thread destroy routine, it is still running on its own stack. In order to transfer control to another thread in a context switch, it pushes some data onto its stack. So if it freed its stack and then context-switched, it would end up writing to an invalid stack. The solution is to have the terminating thread "notify the system" (set a flag, put itself on a queue, etc.) so that a thread which runs after it will know it has to free the stack.

Managing Threads

`final_proc` return

Why can't final_proc return? Don't all functions return?

final_proc cannot return because otherwise you will pop the last stack frame from the stack. By definition of "last stack frame", there is no other frame to return to; instead, we would leave the process in an undefined state, possibly executing undefined code (it's more likely that we'd just crash).

final_proc should do all necessary bookkeeping to let your idle thread cleanup the thread. Then it should perform a context switch out of the now-terminated context.

Thread stop and start

What are minithread_start() and minithread_stop() for?

They're really only useful for implementing semaphores, it's unlikely that your "user programs" will need to use these low-level calls.

Thread identifiers

What are minithread_self() and minithread_id() for?

A thread needs to be able to get a pointer to its TCB in case it wants to add it to some data structure, for instance: then another thread could call minithread_start() on it. Of course, this is only useful outside minithread.c. Thread IDs are really only helpful for debugging.

System termination

What do I do if all the threads terminate? What if there's no thread avaible to run when a thread does a stop or yield?

Some students have the tendency to exit from the system when there are no threads to schedule. This is a mistake. Keep in mind that you are writing an operating system, a long-lived process that is never supposed to terminate. If, for some reason, there are no more threads to schedule, your system should just loop in the idle loop. A subsequent interrupt might kick off a previously blocked thread, and cause a thread to start up at an arbitrary time in the future.

Semaphores

Reading Semaphore Internals

Is it ok to read (not write) the value of a semaphore without executing a P on it?

No, you should stick to the interface in synch.h. Semaphores do not provide an interface by which the internal semaphore count can be read. There are two good reasons for this:

Reading internals of a semaphore breaks the semaphore abstraction and encapsulation

That value will be out of date the moment it is read, and therefore code that depends on reading the internal count of a semphore cannot be correct.

Tracking When To Block

Is it OK to use the queue to know when to block/unblock threads?

No. You must maintain an internal count variable. Failure to do so will lead to not properly tracking the number of calls to V above and beyond the calls to P.

Semaphore Behavior

How does semaphore_P work?

semaphore_P does the following:

Spin on the semaphore's TAS lock. Proceed when it is obtained.

Decrement the internal count.

If the resulting value is negative, then:

Add the current thread to the semaphore's wait queue.

Atomically unlock the TAS lock and stop the thread.

else:

Unlock the TAS lock.

How does semaphore_V work?

semaphore_V does the following:

Spin on the semaphore's TAS lock. Proceed when it is obtained.

Increment the internal count.

If the resulting value is non-positive, then

Remove one thread from the wait queue, and start it.

Unlock the TAS lock.

Do we need to use TAS locks for project 1?

We haven't given you the minithread_unlock_and_stop function declaration (this comes in project two). Therefore any TAS locking in semaphores for project one would not be a correct implementation.

For Project 1 we expect you to have a correct implementation (without TAS locks). For Project 2 we expect that you will use TAS locks to ensure correctness of the semaphores.