CS410, Summer 1998
Lecture 1 Outline
Dan Grossman

Goals:
* Introduction of course flavor, material, structure, and instructor
* Deal with administrative details
* Review the concept and usefulness of Abstract Data Types (ADTs)
* Review Stacks in the ADT framework
* Learn asymptotic notation: usefulness, practicality, formal definition

Reading:
* CLR, Chapters 1 & 2: Just skim 2.2 (it's more for reference)
* The Standish appendix has a gentler presentation of asymptotic notation
  (what is in CLR 2.1)
* Start learning Java yesterday!!!

Announcements:
* The course homepage http://www.cs.cornell.edu/cs410-su98 is the official
  source for everything.  Check it at least once a day, every day (remember
  to hit reload).
* Turn in the "Who I am" form at the end of class.  If you missed the first
  day, contact me immediately so you get a CSUGLAB computer account.
* Homework 1 handed out, due Thursday.
* Homework 2 available on the Web (actually, I forgot to announce this)

Course organization:
* This is a summer course -- it's gonna fly!
* Textbook -- CLR, not used the previous two semesters
* Lectures -- ask questions, because I already know this stuff!
* Quizzes (lots of easy points)
* Homeworks (8 total, 3-4 programming)
* Programming language: Java
* Exams
* Academic Integrity -- do not cheat in my class; you will fail the course.

Course Introduction:
* What is a data structure, what do we do with it, and why is it a useful
  concept?
* The standard techniques are decades old, apply in a wide range of
  computing environments, and are the workhorses of computation.
* Claim: You cannot be a competent programmer without knowing the material
  of this course in your sleep.
* Approximate syllabus available: basically foundations, trees, hashtables,
  sorting, graphs, and some other "one-day" topics

Abstract Data Types:
* Separate specification from implementation.  We will use the former as
  the requirement, and study how to do the latter efficiently.
* The Stack example: define the operations push, pop, and size without
  appealing to an implementation.  Now consider multiple implementations:
  a linked list, an array of fixed size, an array that resizes by adding
  10, and an array that resizes by doubling.
* An incomplete list of advantages of ADTs: documentation, code re-use,
  code evolution, code readability.  For us, it defines our problems
  clearly.
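One way to write the "specification without implementation" idea down in Java is an interface.  This is a sketch of ours, not code from the lecture (the names StackADT and ListStack are illustrative); the point is that each implementation above could declare "implements StackADT", and client code written against the interface works with any of them:

```java
// A specification of the Stack ADT: operations only, no implementation.
// (Interface and class names here are illustrative, not from the lecture.)
interface StackADT {
  void push(Object obj); // add obj to the top of the stack
  Object pop();          // remove and return the top element
  int size();            // number of elements currently stored
}

// One possible implementation; a fixed-size array or resizing array
// version would implement the same interface, and clients couldn't tell.
class ListStack implements StackADT {
  private java.util.LinkedList<Object> elems = new java.util.LinkedList<Object>();
  public void push(Object obj) { elems.addFirst(obj); }
  public Object pop()          { return elems.removeFirst(); }
  public int size()            { return elems.size(); }
}
```

Swapping in a different implementation then means changing only the `new ListStack()` line in the client, which is exactly the "code evolution" advantage listed above.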
* Capture all this with big-O as an upper bound and big-Omega as a lower
  bound
* Appeal to graphical versions of the functions for intuition
* Formal definitions of O, Omega, and Theta:
    f(n) is O(g(n)) if there exist positive constants c and n_0 such that
      f(n) <= c*g(n) for all n > n_0
    f(n) is Omega(g(n)) if there exist positive constants c and n_0 such
      that f(n) >= c*g(n) for all n > n_0
    f(n) is Theta(g(n)) if f(n) is O(g(n)) and f(n) is Omega(g(n))
  (Notice that we can use different c's in showing each -- we can use
  different n_0's too, but we can also just take the maximum of the two)
* Common examples: 1, log n, n, n log n, n^2, n^3, 2^n, nm, n+m
* Example algorithms that take time Theta(each thing listed above):
    1:       add/multiply small numbers, index into an array, write
             variables
    log n:   add big numbers, cut the problem in half at each iteration
    n:       search through unsorted records
    n log n: some sorting algorithms
    n^2:     some sorting algorithms, anything that takes time proportional
             to the sum of 1, 2, ..., n
    n^3:     matrix multiplication
    2^n:     nothing practical, but lots of things we'd like to do

Next time: We will express the running times of algorithms recursively and
then bound them with asymptotic notation.
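To make the formal definition concrete, here is a small worked example of ours (not from the lecture) for the "sum of 1, 2, ..., n" function listed under n^2 above.  Since f(n) = n(n+1)/2, the constants c = 1, n_0 = 0 witness f(n) being O(n^2), and c = 1/4, n_0 = 0 witness it being Omega(n^2), so f(n) is Theta(n^2).  The code just checks both inequalities at many values of n:

```java
// Illustrative check of the big-O / big-Omega definitions (our example):
// f(n) = 1 + 2 + ... + n is Theta(n^2), witnessed by
//   O:     f(n) <=       n^2  for all n > 0   (c = 1,   n_0 = 0)
//   Omega: f(n) >= (1/4)*n^2  for all n > 0   (c = 1/4, n_0 = 0)
class AsymptoticDemo {
  // Counts the work done by a "sum of 1..n" loop; equals n*(n+1)/2.
  static long f(long n) {
    long total = 0;
    for (long i = 1; i <= n; i++) total += i;
    return total;
  }

  // The big-O inequality with c = 1: f(n) <= n^2.
  static boolean bigOHolds(long n) {
    return f(n) <= n * n;
  }

  // The big-Omega inequality with c = 1/4, written without fractions:
  // 4*f(n) >= n^2.
  static boolean bigOmegaHolds(long n) {
    return 4 * f(n) >= n * n;
  }
}
```

Note the different c's for the two bounds, exactly as the parenthetical remark above allows.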
Here are the code fragments used in the lecture (by fragments I mean I just
tried to get the idea across -- these won't run as is):

///////////// Linked list stack implementation

class List {
  Object val;
  List next; // null for empty list
}

class Stack {
  private List theStack;
  private int stackSize;

  public Stack() {
    theStack = null;
  }

  public void push(Object obj) {
    stackSize++;
    // a constructor in the List class could do this "all at once"
    // but this is what is really going on
    List l = new List();
    l.val = obj;
    l.next = theStack;
    theStack = l;
  }

  public Object pop() {
    if (stackSize == 0) {
      // error -- see Thursday's lecture
    }
    stackSize--;
    Object ans = theStack.val;
    theStack = theStack.next;
    return ans;
  }

  public int size() {
    return stackSize;
  }
}

////////////// Array stack implementation

class Stack {
  private static final int START_SIZE = 10;

  private Object[] theStack;
  private int stackSize;

  public Stack() {
    theStack = new Object[START_SIZE];
    stackSize = 0;
  }

  public void push(Object obj) {
    if (stackSize == theStack.length) {
      // in different versions we did the new size differently -- either
      // added ADD_SIZE, or multiplied by 2 (doubling shown here)
      Object[] newStack = new Object[theStack.length * 2];
      for (int i = 0; i < theStack.length; i++)
        newStack[i] = theStack[i];
      theStack = newStack; // don't forget to install the new array!
    }
    theStack[stackSize] = obj;
    stackSize++;
  }

  public Object pop() {
    if (stackSize == 0) {
      // error -- see Thursday's lecture
    }
    stackSize--;
    return theStack[stackSize];
  }

  public int size() {
    return stackSize;
  }
}
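A quick experiment of ours (not from the lecture) connects the two resizing
strategies mentioned above to the asymptotic notation: count how many
element copies each strategy performs over n pushes, starting from capacity
10.  Adding a fixed amount per resize gives a sum-of-1..n pattern, so
Theta(n^2) total copies, while doubling gives only O(n):

```java
// Count the element copies done by a resizing-array stack over n pushes.
// (Class and method names are ours; capacities start at 10 as in the
// lecture's START_SIZE idea.)  Doubling copies O(n) elements in total;
// growing by a constant 10 copies roughly 10*(1 + 2 + ... + n/10), which
// is Theta(n^2).
class ResizeCost {
  static long copies(int n, boolean doubling) {
    int capacity = 10;
    long total = 0;
    for (int size = 0; size < n; size++) {
      if (size == capacity) {      // array full: copy into a bigger one
        total += size;             // one copy per element already stored
        capacity = doubling ? capacity * 2 : capacity + 10;
      }
    }
    return total;
  }
}
```

For n = 10000 pushes, doubling performs 10 + 20 + 40 + ... + 5120 = 10230
copies, while adding 10 performs 10 + 20 + ... + 9990 = 4995000 copies --
the constant-overhead and lower-order-term points above in action.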