# Tokens

We've talked about authentication of humans as involving something you know, have, or are. The prototypical example of "have" is *authentication tokens*: the human is issued a machine called a *token*, that machine becomes an attribute of the human's identity, and authentication of the human reduces to authentication of the machine. So you can think of authentication tokens as being an intermediate step between authenticating humans and authenticating machines.

User convenience is an important issue. The machines being used as tokens should be small and light, require no maintenance (especially of a battery), and cost very little. So issuing laptops isn't realistic. Rather, real-world examples include ATM cards, prox cards (like your Cornell ID card), RSA SecurID tokens, and IronKey tokens.

## Local authentication

**Goal:** Authenticate a human Hu to a local system L using a token T.

**Threat model:** The adversary might read and replay messages but can't change them in the middle of a protocol execution. So we're considering only a limited form of man-in-the-middle (MitM) attacks. This matches well with real-world scenarios, in which wired keyboards or short-range radios (e.g., RFID) are used.

**Enrollment:** At enrollment, associate identifier id_T with identifier id_Hu.

The protocol we give next is an example of a *challenge&ndash;response* protocol. These are well-known from old spy movies: two agents who haven't met before authenticate each other by speaking an appropriate challenge and response, for example,

> - Bond: Can I borrow a match?
> - Driver: I use a lighter.
> - Bond: That's better still.
> - Driver: Until they go wrong.
>
> &mdash;*From Russia with Love*

Here's our first attempt at a protocol. We assume that T is preprogrammed with a fixed response to a fixed challenge:

```
1. Hu->T: I want to authenticate to L
2. T->L: id_T
3. L: look up challenge question for id_T
4. L->T: challenge
5. T->L: response
6. L: look up id_Hu and response associated with id_T;
      id_Hu is authenticated if response is correct
```

Note that the human never declares her identity to either T or L. Instead, the token sends its identifier to L, and L is responsible for looking up the association between id_Hu and id_T.

Also note that, although the protocol is written such that L and T send messages directly to one another over a channel, Hu might in fact be involved in implementing that channel. For example, T might display a message on a screen, and Hu might type that message into L's keyboard.

But that protocol is vulnerable to replay attacks: because the challenge and response are fixed, an attacker need only eavesdrop once to learn the response. Let's fix that by using unpredictable challenges and digital signatures.

**Authentication with Token and Asymmetric-key Cryptography**

We assume that T stores a signing key k_T for which L knows the verification key K_T:

```
1. Hu->L: I want to authenticate with T
2. L: invent unique nonce N_L
3. L->T: N_L
4. T: compute s=Sign(N_L; k_T)
5. T->L: id_T, s
6. L: look up id_Hu and K_T associated with id_T;
      id_Hu is authenticated if Ver(N_L; s; K_T)
```

This protocol fixes the replay vulnerability, because nonce N_L is different in every execution. But historically, tokens haven't used this protocol, in part because asymmetric-key cryptography hasn't always been available on mobile devices. There's no security reason you couldn't use it now with modern devices, for example, an Android phone or a USB flash drive. But for performance (time and power) reasons, less expensive symmetric-key cryptography seems to be preferred.

**Authentication with Token and Symmetric-key Cryptography**

Assume that T and L share a secret MAC key k_T:

```
1. Hu->L: I want to authenticate with T
2. L: invent unique nonce N_L
3. L->T: N_L
4. T: compute t=MAC(N_L; k_T)
5. T->L: id_T, t
6. L: look up id_Hu and k_T associated with id_T;
      id_Hu is authenticated if t=MAC(N_L; k_T)
```

We just introduced a key distribution problem, in that we now have to arrange for T and L to share a key. But we already have to solve the problem of physically issuing tokens; piggybacking installation of a shared key isn't hard.

A disadvantage of symmetric cryptography is that L must now store the shared key k_T. Hashing and salting that key wouldn't work, because L needs the plaintext key to recompute the MAC. Alternative methods of storage include using secure co-processors (e.g., trusted platform modules, hardware security modules) to protect the secrecy of those keys on disk.

## Two-factor authentication

There's a risk that T might be stolen. With the protocols above, a stolen T could be used by anyone to authenticate as id_Hu. We can use a second factor, a PIN (something you know), as a countermeasure:

```
1. Hu->L: I want to authenticate with T
2. L: invent unpredictable nonce N_L
3. L->T: N_L
4. T->Hu: Enter PIN on my keyboard
5. Hu->T: pin
6. T: compute t=MAC(N_L, pin; k_T)
7. T->L: id_T, t
8. L: look up id_Hu, pin, and k_T associated with id_T;
      id_Hu is authenticated if t=MAC(N_L, pin; k_T)
```

The PINs stored by L could be salted and hashed, as long as T can also be configured to store the salt. Then the tag would be computed as `MAC(N_L, H(pin, salt); k_T)`, and L would store `H(pin, salt)` for each user.

Alternatively, the PIN could be stored locally on the token, and the token would authenticate the human, followed by the local system authenticating the token. In this design (as well as the original), the token must be careful not to enable online guessing attacks that reveal the PIN.

## Remote Authentication

**Goal:** Authenticate a human Hu to a remote system S using a token T and local system L.

**Threat model:** On the channel between T and L, the adversary might read and replay messages but can't change them in the middle of a protocol execution.
But on the channel between L and S, the adversary is Dolev-Yao.

In this new threat model, we have to worry about the channel between L and S. So, first, we can establish a secure channel between L and S. Then we can run an adaptation of the two-factor authentication protocol. Any messages that were exchanged between T and L in the original protocol are now relayed between T and S by way of L.

**Remote Two-factor Authentication with Token and PIN**

```
1. Hu->L: I want to authenticate with T to S
2. L and S: establish secure channel
3. S: invent unpredictable nonce N_S
4. S->L->T: N_S
5. T->Hu: Enter PIN on my keyboard
6. Hu->T: pin
7. T: compute t=MAC(N_S, pin; k_T)
8. T->L->S: id_T, t
9. S: look up id_Hu, pin, and k_T associated with id_T;
      id_Hu is authenticated if t=MAC(N_S, pin; k_T)
```

## Case Study: RSA SecurID Tokens

<img width="25%" src="rsa_token.gif"/>

The RSA SecurID token is a commercial product with an LCD display and internal clock, but with no means for input. It can compute hashes, has a secret stored on it, and is *tamper resistant*, such that it's hard to physically extract the secret. The LCD displays a *code* that changes every 60 seconds.

Consider what follows to be a hypothetical protocol that could work with hardware similar to the SecurID token. The main ideas behind this protocol are (i) to replace the nonce used in remote two-factor authentication by the time at the clocks of T and S, and (ii) to use L rather than T to input the PIN.

```
1. Hu->L: I want to authenticate as id_Hu to S
2. L and S: establish secure channel
3. L->Hu: Enter PIN and code on my keyboard
4. T->Hu: code = MAC(current time at T, id_T; k_T)
5. Hu->L: pin, code
6. L: compute h = H(pin, code)
7. L->S: id_Hu, h
8. S: look up pin, id_T, and k_T associated with id_Hu;
      id_Hu is authenticated if h=H(pin, MAC(current time at S, id_T; k_T))
```

The clocks on T and S might not be perfectly synchronized (and messages might get delayed, etc.).
So S might need to check a few different time values in the final step, as well as store additional information about the last time T was successfully used to authenticate, and about the observed drift of T's clock; Schneider [section 5.2] provides details.

## Case Study: S/KEY

S/KEY is an authentication protocol based on *one-time passwords* (OTPs): passwords that are valid only once for authentication. It's useful in situations where a user might not trust a local machine sufficiently to reveal their main password to it, or for backup passwords in case a main password is lost, or (less commonly now) when passwords need to be sent in cleartext across an untrusted network.

The passwords used in S/KEY are generated algorithmically using *hash chains*, which are sequences constructed by iterating a hash function. Let `H^i(s)` denote the iteration of function `H` on input `s`, for a total of `i` applications. That is, `H^0(s) = s`, and `H^(i+1)(s) = H(H^i(s))`. Hence `H^1(s) = H(s)`.

One possibility would be to establish some shared secret seed `s` between the system and the user, then let the user's first OTP be `H^1(s)`, second OTP be `H^2(s)`, and so on. But an untrusted local machine that learns any one of the OTPs in the chain would be able to compute all the future passwords in the chain, just by applying `H` again. So that doesn't work.

A clever idea, due to Leslie Lamport (1981), is to construct the hash chain backwards. For the sake of example, let's suppose the user wants 10 OTPs. Then the first OTP is `H^10(s)`, the second OTP is `H^9(s)`, ..., and the last OTP is `H^1(s)`. This limits the number of OTPs (since the user must decide in advance where they want the chain to begin), but it prevents the local machine from computing future passwords: because cryptographic hash functions are one-way, it's computationally infeasible to compute `H^(i-1)(s)` from `H^i(s)`, which is `H(H^(i-1)(s))`.

To make this idea usable by humans, we need to offer the human assistance in computing the hash chain.
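As a minimal sketch of the backward hash chain described above, the following Python computes the chain once and hands out OTPs from the far end first. SHA-256 stands in for `H`, and the names `H`, `hash_chain`, and `otps_in_use_order` are hypothetical, chosen only for illustration; S/KEY's own hash function and output encoding are not reproduced here.

```python
import hashlib


def H(data: bytes) -> bytes:
    """One application of the hash function (SHA-256 is an assumption here)."""
    return hashlib.sha256(data).digest()


def hash_chain(seed: bytes, n: int) -> list[bytes]:
    """Return [H^0(seed), H^1(seed), ..., H^n(seed)]."""
    chain = [seed]
    for _ in range(n):
        chain.append(H(chain[-1]))
    return chain


def otps_in_use_order(seed: bytes, n: int) -> list[bytes]:
    """Return the n OTPs in the order they are used: H^n(seed) first, H^1(seed) last."""
    chain = hash_chain(seed, n)
    return chain[n:0:-1]


# Hypothetical seed, for illustration only.
otps = otps_in_use_order(b"example seed", 10)

# Anyone who has seen an earlier OTP can check the next one by hashing it once;
# going the other way would require inverting H.
for earlier, later in zip(otps, otps[1:]):
    assert H(later) == earlier
```

The loop at the end illustrates why the backward order helps: a party that remembers only the most recently accepted OTP can validate the next one with a single hash, yet learning an OTP gives no practical way to derive the ones that will be used after it.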
One way to help the human with the hash chain is to issue them an electronic token that can compute hashes and stores the secret `s`. But another possibility is to instead issue the human the list of OTPs on paper; the human stores that paper securely somehow (perhaps in their wallet, perhaps in another trusted physical location), and accesses it when necessary. The paper is essentially a token in this scenario. It looks like this:

```
50: H^50(s)
49: H^49(s)
48: H^48(s)
...
```

Of course, the paper doesn't literally have `H^50(s)`, but rather the bit string corresponding to that value. Humans aren't good at typing bit strings without introducing errors, though. So for usability, S/KEY chunks the bit string into several blocks, treats each block as an integer, and uses that integer as an index into a dictionary of short words. Such words are easier to type without errors. With that modification, we get what the S/KEY paper actually looks like:

```
50: MEND VOTE MALE HIRE BEAU LAY
49: PUG LYRA CANT JUDY BOAR AVON
48: LOAM OILY FISH CHAD BRIG NOV
47: RUE CLOG LEAK FRAU CURD SAM
46: COY LUG DORA NECK OILY HEAL
...
```

When the user authenticates, the system prompts them to enter the next unused OTP; the user does so, then crosses it off their list. Neither the user nor the system should ever be willing to "go back in time" and provide or accept a password that one or both of them believes has been used before.

Furthermore, network messages sometimes get dropped, or users might accidentally cross off extra passwords, etc. So the index of the next password that each principal expects could become desynchronized. For that reason, principals must be willing to move forward to "future" passwords and skip over passwords that might have been lost in transit.

Eventually the user will run out of passwords. (In fact, DoS could even become an issue, with an adversary causing the principals to believe that all passwords have been used up.)
When the user runs out of passwords, one possibility is that they can go back to the system administrator and ask for a new printout of passwords, which will be based on a different `s`. In this case, the user never really needs to learn what that seed is; it's implicitly present on the paper, but the user doesn't have to know it, to type it, etc.

But having to involve the sysadmins is not ideal. So another possibility is to make `s` a passphrase chosen by the user and never changed, and to incorporate what S/KEY calls a "salt", which functions more like a unique nonce or sequence number. The OTPs are constructed as `H^i(pass, salt)`. Whenever the user wants a new sheet of passwords, they contact the server, request a new salt, and use a trusted local machine to generate and print the new passwords themselves.

The authentication protocol is thus as follows. Assume that S stores `(id_Hu, n_S, salt, last)` for each user, and that the user stores `(pass, n_Hu)`.

```
1. Hu->L->S: id_Hu
2. S: look up (n_S, salt, last) for id_Hu
3. S->L->Hu: n_S
4. Hu: n = min(n_Hu, n_S) - 1;
       if n<=0 then abort
       else let p = H^n(pass, salt);  // look up on paper
            n_Hu := n                 // cross off on paper
5. Hu->L->S: n, p
6. S: if n<n_S and H^(n_S-n)(p)=last
      then n_S := n; last := p; id_Hu is authenticated
```

Back in the early 2000s, the Cornell CS department used S/KEY for logging in to our network over dial-up telephone lines. You went to an office on the third floor of Gates to ask for a sheet of OTPs. They instructed you to protect it like a $20 bill.

## Case Study: Remote Keyless Entry

Authentication tokens are often used for entry into cars, garages, and buildings. In these situations, usability and deployability often trump security. So the protocols and tokens themselves tend to be quite simple.

One common design for remote keyless entry (RKE) is based on a counter that (contrary to our discussion about building secure channels) is permitted to overflow. It works as follows.
There is a *barrier* B to which the token T authenticates. Think of the barrier as being a door or a gate. The token usually has a physical button that is pressed, or is physically held near a wireless receiver. That physical action is what triggers the authentication protocol.

The barrier stores a master key `mk`. Every token has an identifier `id_T` and a symmetric key `k_T` that it stores. That key is not a truly random number, but instead is `H(mk,id_T)`. Further, each token stores the current value `n_T` of its counter. Every time authentication is attempted, the value of that counter is incremented. The barrier stores, for each token, the last successfully authenticated counter value `last_T` for that token.

The authentication protocol is as follows:

```
1. Hu -> T: I want to authenticate to B   // physical action
2. T: t = MAC(id_T, n_T; k_T);
      n_T := n_T + 1
3. T -> B: id_T, t
4. B: look up id_T to find last_T;
      compute t' = MAC(id_T, last_T + 1; H(mk,id_T));
      if t=t' then Hu is authenticated and last_T := last_T + 1
```

A practical problem is that `last_T` and `n_T` might become desynchronized; for example, the user accidentally pushes the button on T when they are not in proximity for B to receive the message in step 3. One solution is for B to accept not just `last_T + 1` in step 4 but `last_T + k` for any integer `k` from 1 up to some bound. That provides a *window* of nonces that can be accepted at any given time. There is now an engineering tradeoff between usability and security: make the window large, and it's easier to use but less secure. Another solution is to force the user to manually re-synch T and B after gaining access past B by some other means (e.g., a physical key).

The increment operations on `n_T` and `last_T` above are permitted to overflow; they "wrap around" back to zero. Hence this kind of protocol is known as a *rolling window*.

## Exercises

1. KeeLoq is a technology used in some RKE systems.
Based on [this technical note][keeloq] from the manufacturer, reverse engineer the authentication protocol as best you can. How does it handle the issues of overflow, desynchronization, and manual programming?

2. Suppose that the password at index i in S/KEY were defined to be `H(pass, salt, i)` instead of `H^i(pass, salt)`. What tradeoffs would this create?

3. Theft of the sheet of S/KEY passwords could enable an attacker to impersonate a user. Design a countermeasure. Write down the new protocol and password construction mechanism. *Hint: be inspired by the discussion of theft of tokens.*

4. Google Authenticator is a software implementation of an authentication token. The *soft token* needs to know its secret, which on a hardware token would presumably be stored in tamper-resistant hardware. How does a user install the secret into a Google Authenticator soft token? How might the soft token protect the secret? How does this [public implementation][ga] attempt to protect the secret?

[keeloq]: http://ww1.microchip.com/downloads/cn/AppNotes/cn010992.pdf
[ga]: https://github.com/google/google-authenticator