Lecture 21: Division and number bases

Notation
- ℤ, quot(a, b), rem(a, b)
- bad (but common) notation: a/b, a % b, a mod b
Euclidean division
- statement, existence, uniqueness
Base b representation
- interpreting a string of digits in base b
- converting to base b

Notes on Notation

ℤ denotes the set of all integers: ℤ = {…, − 2, −1, 0, 1, 2, …}.
For this section of the course, unless stated otherwise, variables will be elements of ℤ. I will typically use a, b, c and n, m for integers, and will reserve variables like x and y for other types of things.
Don't use division when working with the integers, even if you believe that things divide evenly. Find a way to write what you are saying using only multiplication and addition. It will make your life easier.
quot(a, b) and rem(a, b) denote the quotient and remainder of a by b, as defined by the Euclidean Division Algorithm below.
Many programming languages use a/b to denote quot(a, b). This is a bad idea and will lead to confusion. Use the same notation as in the division algorithm below.
Many languages use a % b to refer to rem(a, b). This is reasonable notation, except that programming languages disagree on what this symbol means when given negative numbers.
Many books write a mod b to refer to rem(a, b). This notation leads to massive confusion when we discuss modular arithmetic. Don't use it.
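
To make the warning about negative numbers concrete, here is a small Python sketch comparing two common conventions. The variable names are illustrative only. Python's built-in `//` and `%` round the quotient down (floor), while `math.fmod` follows the C-style convention of rounding the quotient toward zero.

```python
import math

a, b = -7, 2

# Floor convention (Python's built-in operators): the quotient is rounded
# down, so the remainder has the same sign as the divisor b.
floor_q, floor_r = a // b, a % b

# Truncation convention (C-style, as in math.fmod): the quotient is rounded
# toward zero, so the remainder has the same sign as the dividend a.
trunc_r = int(math.fmod(a, b))
trunc_q = (a - trunc_r) // b  # exact, because a - trunc_r is a multiple of b

print(floor_q, floor_r)  # -4 1
print(trunc_q, trunc_r)  # -3 -1
```

Both pairs satisfy a = qb + r, but only the first has a remainder in the range [0, b).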

Euclidean division

Claim (the Euclidean Division Algorithm): For any a and any b ≠ 0, there exist integers q and r with 0 ≤ r < |b| and a = qb + r. Moreover, q and r are unique.

Notation: q is called the quotient of a by b, and r is called the remainder. They are written quot(a, b) and rem(a, b) respectively.
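
As a rough illustration, quot and rem can be sketched in Python on top of the built-in `divmod`. The function name `quot_rem` is made up for this example, and the sketch assumes the remainder is required to lie in the range 0 ≤ r < |b|, so that negative divisors are also covered.

```python
def quot_rem(a: int, b: int) -> tuple[int, int]:
    """Return (q, r) with a == q*b + r and 0 <= r < abs(b)."""
    if b == 0:
        raise ValueError("b must be nonzero")
    # divmod uses floor division, so r has the same sign as b.
    q, r = divmod(a, b)
    if r < 0:
        # This only happens when b < 0. Shift r into [0, |b|) and
        # compensate in q so that a == q*b + r still holds.
        q, r = q + 1, r - b
    return q, r


print(quot_rem(7, 2))    # (3, 1)
print(quot_rem(-7, 2))   # (-4, 1)
print(quot_rem(7, -2))   # (-3, 1)
print(quot_rem(-7, -2))  # (4, 1)
```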

Proof (existence): We prove this only for a ≥ 0 and b > 0. You can prove the negative cases by using the existence in the positive cases.

The proof is by induction on a. Let P(a) be the statement that for all b > 0, there exist q and r as in the statement above.

To prove P(0), choose q = r = 0.

To prove P(a + 1), assume P(a). Then we know a = q′b + r′ for some q′ and r′ with 0 ≤ r′ < b. Then a + 1 = q′b + (r′ + 1). If r′ < b − 1 then r′ + 1 < b, so we can choose q = q′ and r = r′ + 1, which gives a + 1 = qb + r.

However, if r′ = b − 1, then this choice of r doesn't satisfy the requirements of the theorem. In this case, though, we have a + 1 = q′b + b = (q′ + 1)b + 0. Thus choosing q = q′ + 1 and r = 0 yields a + 1 = qb + r as required.
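
The induction above can be read as a (slow) algorithm: start from the base case and take a steps, updating q and r as in the inductive step. The following Python sketch does exactly that; the function name `divide_by_induction` is made up for this example, and it only handles the case a ≥ 0, b > 0 treated in the proof.

```python
def divide_by_induction(a: int, b: int) -> tuple[int, int]:
    """Return (q, r) with a == q*b + r and 0 <= r < b, for a >= 0 and b > 0."""
    if a < 0 or b <= 0:
        raise ValueError("this sketch only covers a >= 0 and b > 0")
    q, r = 0, 0  # base case P(0): 0 = 0*b + 0
    for _ in range(a):  # each iteration goes from P(n) to P(n + 1)
        if r < b - 1:
            r += 1  # first case: keep q, increase r
        else:
            q, r = q + 1, 0  # second case (r == b - 1): increase q, reset r
    return q, r


print(divide_by_induction(17, 5))  # (3, 2)
```

This takes a steps, so it is useful only as a way of checking the proof, not as a practical way to divide.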

Proof (uniqueness): To show uniqueness, we must show that if a = qb + r and a = q′b + r′ (with both r and r′ in the range [0, b)), then q = q′ and r = r′.

Suppose that qb + r = q′b + r′. Then we have r − r′ = (q′ − q)b (*). Now, we know 0 ≤ r < b and −b < −r′ ≤ 0. Adding these inequalities yields −b < r − r′ < b. But we also know that r − r′ is a multiple of b by (*), so r − r′ could be (for example) −2b, −b, 0, b, 2b, etc. But the only multiple of b strictly between −b and b is 0, so r − r′ must be 0. Thus r = r′.

Plugging this into equation (*), we see that (q′ − q)b = 0. Since b > 0, we have that q′ − q = 0, so q = q′, as required.

Working in base b

Base b representation is a way to write numbers using the digits {0, 1, …, (b − 1)}.

Common bases:

- You use base 10 (decimal) every day (digits are {0, 1, ..., 9}).
- Base 2 (binary) uses digits {0, 1}. It is convenient for digital logic: a digit (called a bit) can be represented using a single wire, with high voltage for 1 and low voltage for 0. Binary numbers are often designated by a trailing b: for example 1101b.
- Base 16 (hexadecimal) uses the digits {0, 1, 2, ..., 9, A, B, C, D, E, F}. It is useful because a single digit can be represented using 4 bits. Hex numbers are often written with a prefix of "0x": for example 0xFC39.
- Base 8 (octal) uses the digits {0, 1, 2, ..., 7}, and is occasionally used when 3-bit numbers are useful.

A string of digits in base b, written (a_n a_{n−1} ... a₃ a₂ a₁ a₀)_b, represents the number a₀b⁰ + a₁b¹ + a₂b² + ⋯ + a_n bⁿ.
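
As a sketch of how this definition can be evaluated, the Python function below reads the digits from most significant to least significant, multiplying the running total by b at each step (Horner's rule), which gives the same sum as the formula above. The name `from_digits` and the choice to pass digits as a list of integers are assumptions made for this example.

```python
def from_digits(digits: list[int], b: int) -> int:
    """Interpret digits (most significant first) as a number in base b."""
    value = 0
    for d in digits:
        if not 0 <= d < b:
            raise ValueError(f"{d} is not a valid digit in base {b}")
        # Shift everything read so far one place to the left, then add d.
        value = value * b + d
    return value


print(from_digits([1, 1, 0, 1], 2))              # 13
print(from_digits([15, 12, 3, 9], 16) == 0xFC39)  # True
```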

Arithmetic in base b

There are two ways to work with numbers in base b. The hard way: convert to base 10, apply the algorithms you already know for addition and multiplication, convert back.

The easy way: use the algorithms you already know for addition, multiplication, division, but remember that (10)_b stands for b and not 10.

We did examples with long addition. Long multiplication and division work the same way.
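
For reference, here is one way long addition in base b might be sketched in Python. The function name `add_base_b` and the convention of storing digits least significant first are assumptions made for this example. The only change from the base 10 algorithm is that a carry happens when a column sum reaches b rather than 10.

```python
def add_base_b(x: list[int], y: list[int], b: int) -> list[int]:
    """Add two numbers given as digit lists (least significant digit first)."""
    result = []
    carry = 0
    for i in range(max(len(x), len(y))):
        dx = x[i] if i < len(x) else 0  # pad the shorter number with zeros
        dy = y[i] if i < len(y) else 0
        total = dx + dy + carry
        # total is less than 2b, so the carry is always 0 or 1.
        carry, digit = divmod(total, b)
        result.append(digit)
    if carry:
        result.append(carry)
    return result


# (1101)_2 + (111)_2 = (10100)_2, that is, 13 + 7 = 20.
print(add_base_b([1, 0, 1, 1], [1, 1, 1], 2))  # [0, 0, 1, 0, 1]
```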

Writing a number in base b

Theorem: for any a ≥ 0 and b ≥ 2, you can write a in base b. That is, there are digits d₀, d₁, ..., d_n such that (d_n d_{n−1} ... d₂ d₁ d₀)_b = a.

Note: as with the division algorithm, the proof of this theorem contains the algorithm used to construct the base-b representation.

Proof: by strong induction on a. In the base case a = 0, we can choose d₀ = 0. Then since b > 1, b > d₀, and a = d₀ = (d₀)_b.

For the inductive step, assume that any number k with 0 ≤ k < a can be written in base b. We wish to write a in base b. To do so, use Euclidean division to write a = qb + r. Let d₀ = r. By the inductive hypothesis, we can write q in base b: q = (d′_l d′_{l−1} ... d′₂ d′₁ d′₀)_b. (We have to check that q < a. If q = 0 this holds because a ≥ 1; otherwise q < 2q ≤ qb ≤ qb + r = a.)

It turns out that these digits of q are also the higher digits of a. That is, d₁ = d′₀ and d₂ = d′₁ and so on. To check this, we know:

$$q = \sum_{i=0}^l d_i'b^i$$

We also know that a = qb + r, so

$$a = qb + r = b(\sum_{i=0}^l d_i'b^i) + r = d_0 + \sum_{i=0}^l d_i'b^{i+1} = d_0 + \sum_{i=0}^l d_{i+1}b^{i+1}$$

By changing i to j − 1 in the summation this becomes

$$a = d_0 + \sum_{j=1}^{l+1} d_jb^j = \sum_{j=0}^{l+1} d_jb^j$$

which shows that (d_i) is the base b representation of a.
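
The algorithm hidden in this proof can be sketched in Python as repeated Euclidean division: each division peels off the lowest digit r and leaves the quotient q to be converted in turn. The function name `to_digits` is made up for this example. One small difference from the proof: the loop stops as soon as the quotient is 0, so it does not emit a leading zero.

```python
def to_digits(a: int, b: int) -> list[int]:
    """Return the base-b digits of a (most significant first), for a >= 0, b >= 2."""
    if a < 0 or b < 2:
        raise ValueError("this sketch only covers a >= 0 and b >= 2")
    digits = []  # collected least significant first
    while True:
        q, r = divmod(a, b)  # a = q*b + r with 0 <= r < b
        digits.append(r)     # r is the next digit
        if q == 0:
            break
        a = q                # the remaining digits are the digits of q
    digits.reverse()
    return digits


print(to_digits(13, 2))       # [1, 1, 0, 1]
print(to_digits(0xFC39, 16))  # [15, 12, 3, 9]
```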

Note: we stated (but did not prove) that the base b representation is unique (just as the quotient and remainder are). Proving this is a good exercise; it follows directly from the uniqueness of the quotient and remainder.