Lab 10: The Arraylist

CS 3410 Spring 2017


Due: Sunday, April 16, 2017 at 11:59 PM. Submit all required files on CMS.


Overview

In this lab we will implement 3 functions in the file lab10.c for an arraylist of ints: arraylist_add, arraylist_insert, and arraylist_free. You will also gain experience with structs, dynamic memory and debugging code using GDB.

References. Searching the Internet will generally find an answer to nearly any conceivable question about the C programming language. However, information on the Web is not always accurate or complete. We recommend reference books over web sites. In particular, C: A Reference Manual (5th Edition) by Samuel P. Harbison and Guy L. Steele Jr. is a great textbook for this class and should be the source for reliable information on C.

Structs

Just like in your previous languages, C allows you define your own types. Somewhat like classes (but without methods!), structs are a way to define a group of fields to be stored together. They're often the basis of data structures.
typedef struct {
  void** buffer;
  unsigned int buffer_size;
  unsigned int length;
} arraylist;
You can use a . to access the fields of a struct (again, like classes). If you have a pointer to a struct, you can use -> to both dereference and access.
arraylist a;
a.length = 5;
arraylist *a_ptr = &a;
a_ptr->length = 10;
(*a_ptr).length = 15;
That is, a->b is the same as (*a).b, but much easier to read and write.

Dynamic Memory

In C, primitive values and even arrays are often declared on the stack. This is known as static memory or static allocation. This is fast and efficient, but a major drawback is that the size of objects on the stack must be known at compile-time. Therefore, such common tasks as dealing with variable-length arrays cannot be done using only static memory.

C also allows the programmer to use dynamic memory, which is asking the operating system for more memory at runtime, and then using that newly allocated memory. While allocating memory does take time, dynamic memory is incredibly flexible and powerful, since any amount of memory can be requested and used at runtime (only limited by the amount of memory on the system).

The two functions in C to deal with dynamic memory (defined in stdlib.h) are malloc and free. malloc allocates new dynamic memory. It takes a single argument, the amount of bytes to allocate, and returns a void* pointer to the newly allocated memory. A void* pointer can point to objects of any type (e.g. int, float, or even a struct potato_salad) and can be cast to the appropriate type the programmer wants to use. Commonly used with malloc is the sizeof operator, which returns the size in bytes of any type defined in the program.

int *int_ptr = (int*)malloc(sizeof(int)); //allocates room for one int
int *int_ptr = (int*)malloc(5 * sizeof(int)); //allocates room for five ints

Any memory allocated using malloc must eventually be freed by a corresponding call to free. The free function takes a single argument: a pointer that was allocated using malloc. It releases the associated memory back to the operating system. After a pointer has been freed, accessing or using that pointer again leads to undefined and unstable behavior. Don't do this!

The code below is a simple example of dynamic memory in C, including the use of pointers as arrays.

int *int_ptr;
int_ptr = (int*)malloc(sizeof(int));
*int_ptr = 5;
printf("I stored the int %d at address %p\n", *int_ptr, int_ptr);
free(int_ptr); // don't forget to free memory when you're done with it!

/* Since arrays in C are just contiguous regions of bytes, pointers can also
 * be used to point to arrays. */
int *int_arr;
int i;

// here, we malloc enough space for 5 ints - creating an int array of length 5!
int_arr = (int*)malloc(5 * sizeof(int));
for (i = 0; i < 5; i++) {
  int_arr[i] = i;
}

// now, int_arr points to the int array [0;1;2;3;4]
free(int_arr);

The Arraylist

You can find the lab10.c file in your github repository.

Step 1: Implement the arraylist_add(arraylist* a, void* x) function

The elements of an arraylist are stored consecutively in a dynamically allocated array. This function takes a pointer to an arraylist and a value, and appends that value to the end of the arraylist. It should also increment the length of the arraylist to include the new element.

If this is a conceptual drawing of what the arguments to the function look like at the beginning of a call to arraylist_add:

Then this is a conceptual drawing of what the arguments should look like at the end of the function arraylist_add:

If the arraylist is currently at capacity when arraylist_add is called, you should create a new dynamic array that is twice the size of the old one and copy over all the elements currently in the arraylist. You may find that the standard library function realloc is useful for doing this.

Step 2: Implement the arraylist_insert(arraylist* a, unsigned int index, void* x) function

This function should take advantage of your arraylist_add to insert an element at an arbitrary location in your arraylist. You do not have to worry about attempting to insert to locations beyond the end of the arraylist (doing so it undefined behavior). Hint: Use memmove()

Step 3: Implement the arraylist_free(arraylist* a) function

The arraylist_free simply has to free all the resources used by an arraylist. Note that the arraylist struct contains a pointer!

Step 4: Test

When you are done implementing the functions, compile your program using gcc and run it.

$ gcc -ggdb -Wall -Werror -std=c99 -o lab10 lab10.c
$ ./lab10

The program is meant to output:

[0, 1, 2, 3, 4, 5]
Insert position 0: [100, 0, 1, 2, 3, 4, 5]
Insert position 1: [0, 100, 1, 2, 3, 4, 5]
Insert position 2: [0, 1, 100, 2, 3, 4, 5]
Insert position 3: [0, 1, 2, 100, 3, 4, 5]
Insert position 4: [0, 1, 2, 3, 100, 4, 5]
Insert position 5: [0, 1, 2, 3, 4, 100, 5]
Clean: [0, 1, 2, 3, 4, 5]

Unfortunately, you will discover that the program contains a bug that causes it to print slightly different output. Advance to the next step to discover how to use GDB to find it.

Step 5: Debugging

Start by running GDB with

$ gdb ./lab10
GDB will start by printing some license information and loading your program. You will then be presented by a prompt. You can find detailed documentation and tutorials online. Some of the most useful commands are: After starting GDB, the first thing you'll want to do is place a breakpoint. Type break arraylist_print to place a breakpoint at the beginning of the arraylist_print() function. Then type run to start your program. Soon you will find that execution comes grinding to a halt. Type backtrace to see the call stack. You can now use print to view the values of variables. (Go ahead and type print a, which prints a pointer, print a->length, which prints a particular field in the struct that a points to, and print *a, which dereferences the pointer and prints the entire structure it points to.) Pay special attention to the contents of the arraylist. You can continue execution by typing continue. Once you have a theory of what might be going wrong, you'll want to run your program again. This time before doing so, set a different breakpoint in the code where you think the issue might be. Again inspect the program state to see if you've tracked down the problem. If not, try somewhere else in the program and repeat.

Step 6: Valgrind

You should use valgrind to check your code for memory leaks.

$ valgrind -q --leak-check=full ./lab10

When validating memory leaks, make sure you're using valgrind to verify that there are no leaks. However, if you're struggling to figure out the output from valgrind or other errors in your code, we offer another tool on the VM called scan-build (Clang static analyzer). To run this, call (scan-build -k -V ./lab10). It will generate an HTML page that will show existing issues in the code if it can detect some. This tool will not detect all the same issues though that valgrind will, so you cannot rely on it as proof of no leaks in your code.

Step 7: Submit your final program on CMS

Congratulations, you've completed Lab 10 and now have a even deeper understanding of arrays, pointers, and memory management in C!