Authentication of Humans

People aren't computers. They don't have a computer's computational or storage capacity, so the mechanisms to authenticate humans are considerably different from the mechanisms to authenticate machines. Both do share the notion of secrets, though.

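Mechanisms for authenticating humans are commonly grouped into three categories: something you know, something you have, and something you are. These categories aren't always clear-cut. A sheet of passwords, each valid only once, could be "know" or "have." A finger could be "have" or "are."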

Frequently these are combined. Use independent methods from each of two categories, and you have two-factor authentication, e.g., using an ATM card requires "have" (card) and "know" (PIN). The general case is called multi-factor authentication.
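The requirement that factors come from different categories can be sketched in a few lines of Python. This is only an illustration: the Category enum and the function names are hypothetical, and each factor is assumed to have been verified elsewhere.

```python
from enum import Enum

class Category(Enum):
    KNOW = "something you know"
    HAVE = "something you have"
    ARE = "something you are"

def distinct_categories(verified):
    """Return the set of categories among successfully verified factors."""
    return {category for category, ok in verified if ok}

def is_multi_factor(verified, required=2):
    """True if verified factors cover at least `required` distinct categories."""
    return len(distinct_categories(verified)) >= required

# ATM example from the text: card ("have") plus PIN ("know").
atm = [(Category.HAVE, True), (Category.KNOW, True)]
# Two passwords are still a single factor: both are "know".
two_passwords = [(Category.KNOW, True), (Category.KNOW, True)]

print(is_multi_factor(atm))            # True
print(is_multi_factor(two_passwords))  # False
```

The check counts categories, not methods, which is why two passwords do not add up to two-factor authentication.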

Identity

What is an identity? A name? A netid? An email? A URL? An IP address? An X.500 distinguished name (/C=US/O=CORNELL/OU=CS/CN=Michael Clarkson)? Other attributes, like your citizenship, your credit score, your political party?

In this course, we'll say that an identity is a set of attributes; each attribute is a statement about or property of a principal. You have many identities that you present to those around you. Some of them might uniquely identify you, others might not. An identifier is an attribute that is associated with exactly one principal, perhaps within a given population.
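To make the definition of an identifier concrete, here is a minimal Python sketch. It assumes a principal is modeled as a dictionary of attributes; the population, names, and values are made up for illustration.

```python
from collections import Counter

def is_identifier(population, attribute):
    """True if every principal has a distinct value for `attribute`.

    A principal is modeled as a dict of attributes; if any principal lacks
    the attribute, it is not an identifier for this population.
    """
    values = [p.get(attribute) for p in population]
    if any(v is None for v in values):
        return False
    return all(count == 1 for count in Counter(values).values())

# Hypothetical population; the names and values are made up.
students = [
    {"netid": "abc1", "name": "Alex Kim", "citizenship": "US"},
    {"netid": "abc2", "name": "Alex Kim", "citizenship": "CA"},
    {"netid": "xyz9", "name": "Sam Lee", "citizenship": "US"},
]

print(is_identifier(students, "netid"))  # True
print(is_identifier(students, "name"))   # False
```

Whether an attribute is an identifier depends on the population: a name that is unique in a small class may collide in a whole university.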

Enrollment is the process of establishing an identity. We go through enrollment protocols all the time, e.g.,

The amount of work that the principal enrolling us does varies widely. Websites rarely verify many of our attributes, but governments issuing travel documents usually do. And we can pay to get various levels of verification from companies like Verisign.

Enrollment is tricky to design. It's where the digital world interfaces with the real world, so there's no fully technical solution.

Privacy

When authenticating humans, privacy is an important concern:

So authentication of humans must be handled carefully.

Here are some guidelines for privacy in human authentication:

Something You Are

"Something you are" is authentication based on biometrics. A biometric is a measurement of your physical or behavioral traits, e.g., your fingerprint, face, iris, retina, hands, or DNA. To be usable for authentication, a biometric must be (i) an identifier within the population; (ii) invariant over time (N.B., kids' fingerprints change); (iii) difficult to spoof (proof of life?); and (iv) easy to measure.

Biometric measurement suffers from the problems of false positives and false negatives, so biometric authentication mechanisms can incorrectly accept (false positive) or incorrectly reject (false negative) an authentication request. Which error is worse depends on context. And both are bad: on commercial flights, a false negative or false positive rate of just 1% could have serious consequences.
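The two error rates can be sketched as follows. The sketch assumes a matcher that produces a similarity score and accepts when the score reaches a threshold; the scores below are made up for illustration and come from no real system.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false_accept_rate, false_reject_rate) at a match threshold.

    A score >= threshold is treated as a match.
    """
    false_accepts = sum(s >= threshold for s in impostor_scores)
    false_rejects = sum(s < threshold for s in genuine_scores)
    return (false_accepts / len(impostor_scores),
            false_rejects / len(genuine_scores))

# Made-up match scores in [0, 1], for illustration only.
genuine = [0.91, 0.85, 0.78, 0.66, 0.95]   # same person as enrolled
impostor = [0.10, 0.35, 0.52, 0.70, 0.20]  # different person

for threshold in (0.5, 0.6, 0.8):
    far, frr = error_rates(genuine, impostor, threshold)
    print(f"threshold={threshold}: FAR={far:.0%}, FRR={frr:.0%}")
```

With these scores, raising the threshold lowers the false accept rate and raises the false reject rate, so choosing a threshold means choosing which error the context can better tolerate.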

Another problem with biometrics is updating identities. If a fingerprint is disclosed, how do you issue the human a new finger? What about a new retina?

But despite these problems, biometrics are attractive. You can't lose them, forget them, or share them.

Something You Know

"Something you know" is authentication based on a human's knowledge of a secret. The secret is usually a PIN (short numeric code), password (short string), or passphrase (longer string). I'll write "password" from now on, but everything we talk about is relevant to all three.

Passwords have a life cycle:

Unfortunately, passwords sometimes do get disclosed to attackers. How might that happen? They could be found on Post-it notes, revealed by the users themselves (social engineering), guessed by the attacker (online guessing), or cracked by an attacker who obtains the password database (offline guessing).
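How hard guessing is depends partly on the size of the space the attacker has to search. The following Python sketch does the rough arithmetic; the secret formats are hypothetical, and the bit counts assume secrets chosen uniformly at random, which human-chosen passwords usually are not.

```python
import math

def guess_space(alphabet_size, length):
    """Number of candidate secrets of exactly `length` symbols."""
    return alphabet_size ** length

def bits(alphabet_size, length):
    """Equivalent size in bits, if secrets were chosen uniformly at random."""
    return length * math.log2(alphabet_size)

# Hypothetical secret formats: (alphabet size, length).
examples = {
    "4-digit PIN": (10, 4),
    "8-char lowercase password": (26, 8),
    "8-char printable-ASCII password": (95, 8),
}
for name, (alphabet, length) in examples.items():
    print(f"{name}: {guess_space(alphabet, length):,} candidates, "
          f"about {bits(alphabet, length):.1f} bits")
```

These figures are upper bounds on the attacker's work. Online guessing can be slowed by the system being attacked, whereas offline guessing proceeds as fast as the attacker's hardware allows.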

Prompting for passwords. Implicitly, there's always some kind of mutual authentication going on. Before the human enters a password, she has to decide whether to enter it. (Many people don't think about this very carefully, contributing to the success of phishing attacks.) Some systems go to greater lengths to achieve mutual authentication:

Do visual secrets really work? The basic assumption underlying their use is that users can't discern whether they're really interacting with their bank or not (e.g., by looking at the browser's title bar). Consider a man-in-the-middle (MitM) attack, in which the attacker interposes between the human and the bank. The attacker learns the human's uid, forwards it on to the bank, receives back the user's visual secret, and displays it to the user. The user then enters her password for the attacker. GAME OVER.

What would prevent this attack? Only the human noticing that the website isn't really the bank's. But that contradicts our initial assumption. So visual secrets don't fully prevent phishing attacks. QED.
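The relay at the heart of this attack can be sketched as a toy Python simulation. All class names, the uid, the visual secret, and the password are hypothetical; the plain string comparison stands in for a real password check.

```python
class Bank:
    """Toy bank that shows a per-user visual secret before asking for a password."""
    def __init__(self):
        # Hypothetical enrollment data.
        self._visual_secret = {"alice": "picture-of-a-red-kite"}
        self._password = {"alice": "correct horse"}

    def visual_secret_for(self, uid):
        return self._visual_secret.get(uid)

    def login(self, uid, password):
        return self._password.get(uid) == password


class Attacker:
    """Man in the middle: relays the uid to the bank and the secret to the user."""
    def __init__(self, bank):
        self.bank = bank
        self.stolen = {}

    def visual_secret_for(self, uid):
        return self.bank.visual_secret_for(uid)  # fetched from the real bank

    def login(self, uid, password):
        self.stolen[uid] = password              # capture, then relay
        return self.bank.login(uid, password)


def user_logs_in(site, uid, password, expected_secret):
    """The user enters her password only if the expected secret is shown."""
    if site.visual_secret_for(uid) != expected_secret:
        return False
    return site.login(uid, password)


bank = Bank()
attacker = Attacker(bank)
ok = user_logs_in(attacker, "alice", "correct horse", "picture-of-a-red-kite")
print(ok, attacker.stolen)  # True {'alice': 'correct horse'}
```

From the user's side, the session with the attacker looks the same as one with the bank: the right secret appears and the login succeeds, yet the attacker now holds the password.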

Visual secrets do raise the bar, though, by making it (a little) harder for attackers to mount phishing attacks. Maybe every little bit helps.

Finally, note that visual secrets are trying to solve a UI problem in the browser—enabling the user to easily identify the remote server. Perhaps that's the problem more worth solving, but it's a really hard one. Users are quite bad at detecting counterfeits!

What should happen after an authentication attempt? I.e., how should an application react to successful or failed password-based authentication? It could be useful to tell successfully authenticated users when they last logged in, the number of failed attempts since then, and maybe even when those attempts were. When authentication fails, the identity might be under online attack. Possible responses include rate limiting future attempts to authenticate under that identity, and eventually disabling the identity (though that creates an availability attack, since an attacker can deliberately lock out a legitimate user). Informative error messages tend to hurt security here.
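One way these responses might fit together is sketched below in Python. The class name, the doubling delay, and the thresholds are assumptions chosen for illustration, not recommended values.

```python
import time

class AttemptTracker:
    """Track failed logins per identity: throttle first, then lock out."""
    def __init__(self, max_failures=5, base_delay=1.0, clock=time.monotonic):
        self.max_failures = max_failures   # hypothetical lockout threshold
        self.base_delay = base_delay       # seconds; doubles with each failure
        self.clock = clock
        self.failures = {}                 # uid -> failures since last success
        self.next_allowed = {}             # uid -> earliest next attempt time

    def may_attempt(self, uid):
        if self.failures.get(uid, 0) >= self.max_failures:
            return False                   # disabled until reset by other means
        return self.clock() >= self.next_allowed.get(uid, 0.0)

    def record_failure(self, uid):
        count = self.failures.get(uid, 0) + 1
        self.failures[uid] = count
        delay = self.base_delay * 2 ** (count - 1)
        self.next_allowed[uid] = self.clock() + delay

    def record_success(self, uid):
        """Reset state and return the failure count to report to the user."""
        self.next_allowed.pop(uid, None)
        return self.failures.pop(uid, 0)


# Usage with a fake clock so the example does not need to sleep.
now = [0.0]
tracker = AttemptTracker(max_failures=3, clock=lambda: now[0])
tracker.record_failure("alice")
print(tracker.may_attempt("alice"))  # False: must wait 1 second
now[0] = 1.0
print(tracker.may_attempt("alice"))  # True
```

The value returned by record_success is what the application could show as "failed attempts since your last login." The lockout branch is exactly the availability risk noted above.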

A related question is: should an application prompt for a username first, or at the same time as the password? Prompting first creates a new vulnerability, in that it enables attackers to guess valid usernames. So it's better to prompt for both at the same time and, if authentication fails, not give any indication of which was wrong, the username or the password. (Note how visual secrets might disobey this rule.)
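A minimal Python sketch of a login check that gives the same answer for an unknown username and a wrong password follows. The user table, the dummy record, the message text, and the iteration count are assumptions for illustration; the hashing uses the standard library's hashlib.pbkdf2_hmac and hmac.compare_digest.

```python
import hashlib
import hmac
import os

def hash_password(password, salt):
    """Derive a hash with PBKDF2-HMAC-SHA256 (iteration count is illustrative)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

def make_record(password):
    """Return (salt, hash) for a newly set password."""
    salt = os.urandom(16)
    return salt, hash_password(password, salt)

# Hypothetical user table, plus a dummy record used for unknown usernames.
USERS = {"alice": make_record("correct horse")}
DUMMY = make_record("not a real password")

GENERIC_ERROR = "Invalid username or password."

def login(username, password):
    """Return (ok, message); the failure message never says which field was wrong."""
    known = username in USERS
    salt, expected = USERS[username] if known else DUMMY
    # Hash even for unknown usernames so both failure paths do similar work.
    candidate = hash_password(password, salt)
    if hmac.compare_digest(candidate, expected) and known:
        return True, "Welcome."
    return False, GENERIC_ERROR

print(login("alice", "wrong"))    # (False, 'Invalid username or password.')
print(login("mallory", "wrong"))  # (False, 'Invalid username or password.')
```

Hashing against the dummy record is meant to keep response times roughly similar for both kinds of failure, so that timing does not reveal what the message hides. It is a sketch of the idea, not a guarantee.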

Recovering or changing passwords. Users forget passwords. Systems cause passwords to expire. And users want to change passwords. So recovery and change are really important parts of the password life cycle.

Unfortunately, these parts tend to receive less attention than the rest of the system and consequently might be poorly designed and tested. That makes them attractive targets: an attacker can go after them instead of the (presumably well-engineered) primary authentication mechanism.

Recognize that recovery is the authentication problem all over again: the system must authenticate the user by some means other than a password. Standard solutions on the web today exemplify that: