3/06: 1. introduction to the maximum subsequence sum problem
_______________________________________________________________________________
1.0 problem statement -- english specification:

  find the maximum subsequence sum of a 0-terminated input sequence
_______________________________________________________________________________
1.1 example:

  for the input sequence:

    +---+----+----+----+---+
    | 6 | -2 | -1 | -2 | 7 | 0
    +---+----+----+----+---+

  the subsequence sums are:

    []                          =  0    empty subsequence
    [ 6 ]                       =  6    subsequences ending on 6
    [ 6 + -2 ]                  =  4    subsequences ending on (first) -2
    [ -2 ]                      = -2
    [ 6 + -2 + -1 ]             =  3    subsequences ending on -1
    [ -2 + -1 ]                 = -3
    [ -1 ]                      = -1
    [ 6 + -2 + -1 + -2 ]        =  1    subsequences ending on (second) -2
    [ -2 + -1 + -2 ]            = -5
    [ -1 + -2 ]                 = -3
    [ -2 ]                      = -2
    [ 6 + -2 + -1 + -2 + 7 ]    =  8    subsequences ending on 7
    [ -2 + -1 + -2 + 7 ]        =  2
    [ -1 + -2 + 7 ]             =  4
    [ -2 + 7 ]                  =  5
    [ 7 ]                       =  7

  note: the stopping value 0 is *not* part of the sequence

  we want the maximum of the sums above (here 8, from the whole sequence)
_______________________________________________________________________________
1.2 elaboration of english specification (terminology review & introduction):

  0-terminated input sequence:
    + an *input sequence* is a sequence of numbers typed in by the user
      (usually in response to input prompts)
    + a *0-terminated* input sequence is an input sequence containing no 0s,
      where the user indicates the sequence is over by entering the
      *stopping value* 0, which is thus *not* considered to be part of the
      input sequence

  subsequence:
    + a *subsequence* ("subsequence" = "subset" + "sequence") of a vector $s$
      is a sequence $s(i:j)$ of consecutive elements of $s$ in the same order.
    + above, we did not put any conditions on $i$ and $j$:
      when conditions (e.g. bounds) on positions are (partly) omitted,
      the omission is shorthand for "can be any ``legal'' value",
      i.e. "allowed to vary over the largest ``reasonable'' set of values".
      note: if you think we accidentally omitted conditions, then ask!
      it is possible we intentionally omitted them, but it is also possible
      we accidentally omitted them.
    + in this case, the largest set of legal values that would not produce
      an error is:
        $i>j$ or $1<=i,j<=length(s)$
      where $1<=i,j<=length(s)$ is shorthand for
        $1<=i$ and $1<=j$ and $i<=length(s)$ and $j<=length(s)$
    + when $i>j$, $i:j = []$, so $s(i:j) = s([]) = []$ is the empty sequence.
    + thus, the empty sequence $[]$ is a subsequence of every sequence,
      including itself, just as the empty set {} is a subset of every set,
      including itself.

  subsequence sum:
    + by itself, the phrase "subsequence sum" means
      "the sum $sum(s(i:j))$ of a subsequence $s(i:j)$ of some sequence $s$"
    + since $s,i,j$ are omitted, again this is shorthand for
      "allowed to vary as widely as possible (without error)"
    + "subsequence sum of a (0-terminated) input sequence" does specify the
      sequence $s$, but since $i,j$ are omitted, that means $i,j$ are allowed
      to vary as much as possible

  "find the maximum $blah$":
    + this means "find the maximum value that $blah$ can attain for all
      legal values of the associated variables (explicit and implicit)".
_______________________________________________________________________________
1.3 problem statement -- technical specification:

  compute  $max sum(s(i:j))$  for a given sequence $s$ of input values
            i,j

    + the notation

          max f(i,j)  =  "max of $f(i,j)$ over all legal values of $i$ and $j$"
          i,j

      is just like

          ----
          \
           )  f(i,j)  =  "sum of $f(i,j)$ over all legal values of $i$ and $j$"
          /
          ----
          i,j

      i.e. an operation (e.g. $max$ or $sum$) applied to values $f(i,j)$
      as $i,j$ range over all allowed values.
    + again, the lack of conditions on $i$ and $j$ means they are allowed to
      vary as much as possible (so long as there is no error)
    + since the empty sequence $[]$ is always a subsequence and has sum
      $sum([]) = 0$, this tells us the maximum subsequence sum:
      + is 0 for the empty sequence $s = []$
      + is greater than or equal to zero (i.e.
        is always non-negative)
        (since the empty sequence is a subsequence, the maximum subsequence
        sum is greater than or equal to the subsequence sum for the empty
        subsequence)
      + is 0 if every element of $s$ is negative
        + note: $s = []$ is a (vacuous) special case of
          "every element of $s$ is negative"

  note:
    + mathematically, $max([]) = -inf$
      + but matlab returns $[]$ for $max([])$ -- grr!
    + in lecture, someone asked "how can $sum(T)$ be bigger than $max(T)$?"
      ("how can the sum of a sequence or set be bigger than the max of the
      same sequence or set?"). this is actually not so surprising:
      + consider $T = [1 1 1 1]$: then $sum(T) = 4 > 1 = max(T)$.
      + consider $T = []$: then $sum(T) = 0 > -inf = max(T)$.
      + of course, it is not true that $sum(T)$ is always bigger than
        $max(T)$ -- consider $T = [-1 -1]$:
        then $sum(T) = -2 < -1 = max(T)$.
    + also, observe that the sums and max are being computed over
      different sets:
      + the sums are of subsequences of the input sequence
      + conceptually, the $max$ operation in the definition:
        + IS the max of a set of the *sums* (of subsequences)
        + IS NOT the max of a subsequence or set of subsequences
      + e.g. if $s = [3 7]$
        + conceptually, the max is computed on the set
            { sum(s([])), sum(s(1:1)), sum(s(2:2)), sum(s(1:2)) }
            = { sum([]), sum([3]), sum([7]), sum([3 7]) }
            = { 0, 3, 7, 10 }
          yielding the answer 10
        + conceptually, the max is not computed on any of these sequences:
          + the sequence s([])  = []
          + the sequence s(1:1) = [3]
          + the sequence s(2:2) = [7]
          + the sequence s(1:2) = [3 7]
      + e.g. if $s = []$
        + conceptually, the max is computed on the set
            { sum(s([])) } = { sum([]) } = { 0 },
          yielding the answer 0
        + conceptually, the max is not computed on the sequence
          $s = s([]) = []$, which would yield $-inf$
_______________________________________________________________________________
1.4 goal: a short and efficient solution to max subsequence sum

  + i.e.
    code that is *concise*, *small*, and *fast*
    + *concise* means the code we write/type is short
    + *small* means the amount of computer memory required --both to store
      the program and to store data-- is low,
      e.g. 3 numbers versus $n$ numbers.
    + *fast* means the amount of time to run the program is short,
      e.g. $2n$ steps versus $n^3/6$ or $n^2$ steps.
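
to make the technical specification in 1.3 and the goal in 1.4 concrete, here is a minimal sketch in python (an assumption: the notes themselves use matlab notation, so positions here are 0-based rather than 1-based, and the input is taken as a list with the stopping value 0 already removed). the function names are hypothetical. the first function applies the definition directly, taking the max over the sums of all $s(i:j)$ plus the empty subsequence, in roughly $n^2/2$ additions. the second is one well-known single-pass approach (often called kadane's algorithm) that keeps only a couple of numbers; it is shown for comparison and is not necessarily the solution these notes go on to develop.

```python
def brute_force_max_subsequence_sum(s):
    """Max of the sums of all consecutive runs of s, including the empty run."""
    best = 0  # the empty subsequence [] is always allowed and has sum 0
    n = len(s)
    for i in range(n):          # start position (0-based here)
        running = 0
        for j in range(i, n):   # end position, inclusive
            running += s[j]     # running is now the sum of s[i], ..., s[j]
            best = max(best, running)
    return best


def linear_max_subsequence_sum(s):
    """Single pass over s that stores two numbers regardless of len(s)."""
    best = 0              # best sum found so far (the empty subsequence gives 0)
    best_ending_here = 0  # best sum of a subsequence ending at the current element
    for x in s:
        # either extend the best run ending at the previous element,
        # or fall back to the empty subsequence (sum 0)
        best_ending_here = max(0, best_ending_here + x)
        best = max(best, best_ending_here)
    return best


if __name__ == "__main__":
    example = [6, -2, -1, -2, 7]  # the sequence from 1.1, without the stopping 0
    print(brute_force_max_subsequence_sum(example))
    print(linear_max_subsequence_sum(example))
```

both functions start from 0 instead of $-inf$, which encodes the observation in 1.3 that the empty subsequence is always a candidate, so the result is 0 for $s = []$ and for any all-negative $s$.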