Privacy Primer

Lecturer: Professor Fred B. Schneider

Lecture notes by Fred B. Schneider

For further reading, see:

Authentication is useful in connection with authorization, because permission to perform an operation often depends on which principal is requesting that operation (or, more generally, on whose behalf that operation is being requested). However, knowing that an operation has been requested on behalf of some person could reveal things about that person---possibly even things that person would prefer to keep secret. We defined privacy as the "right of an individual to decide for himself/herself on what terms his/her attributes should be revealed", so we conclude that authentication is not necessarily privacy preserving.

In fact, any program or service that processes data about a person could reveal information that person would wish to keep secret. An authentication service is but one example, so privacy should be a concern to the developers of many systems. This lecture---a primer on privacy---is intended for those developers. Specifically, we discuss

Thus we cover not only a context for implementing privacy but also a set of concrete steps that most would accept as constituting a "good faith" effort at supporting privacy in a system.

The Right to Privacy

An obligation to protect individual privacy predates the existence and challenges of cyberspace. Over 2000 years ago, Hebrew Law imposed restrictions on erecting a structure opposite the windows of your neighbor's house; and the Talmud stated that a person should not look into his neighbor's house. English common law (in 1603) restricted the crown from invading the privacy of subjects ("The house of every one is to him as his castle and fortress.") and also provided for the punishment of eavesdroppers on conversations.

The secrecy of postal mail in the U.S. has roots in the postal system the British government created in 1710 for the colonies [sic]. In 1825, the U.S. Congress made prying into another person's (postal) mail illegal, and an 1878 U.S. Supreme Court ruling held that (even) the U.S. Government, which operates the postal service, requires a search warrant in order to open first-class mail.

The U.S. Constitution nowhere explicitly mentions a "right to privacy". Scholars and the courts, though, do find elements of a "right to privacy" scattered throughout various amendments, as follows.

There is, for sure, a tension between a "right to privacy" and society's desire to discourage illegal activity by successfully prosecuting crime, since successful prosecution usually requires thorough investigation (which, by its nature, must supersede an individual's right to decide whether information about him/her is revealed). This tension is typically resolved differently in the U.S. than in other countries, reflecting different views of privacy. Europeans, for example, tend to believe your data remains yours even after you have released it to somebody else; Americans tend to believe that whoever owns data has a right to disseminate it.

Moreover, at any given time, different countries and even different communities within a country may resolve this tension between the individual and society differently, selecting different points in what amounts to a spectrum of possibilities. Life changed rather significantly in the U.S., for example, after the 9/11 terrorist attacks: authentication became mandatory for airline passengers, and various individual rights to control information were relinquished (e.g., a new legal process allows the government to learn what books you check out of a library) in the name of facilitating investigations that anticipate future terrorist activity.

Finally, note that the definition of privacy used in this lecture, while intended and well suited for cyberspace, is decidedly information-centric, hence narrow. Most see privacy as being far broader. Some argue that privacy encompasses rights to solitude, intimacy, and autonomy. The United Nations' 1948 Universal Declaration of Human Rights stipulates a right of privacy in Article 12:

"No one shall be subjected to arbitrary interference with his privacy, family, home or correspondence, nor to attacks upon his honour and reputation. Everyone has the right to the protection of the law against such interference or attacks."

Guidelines for Privacy in Cyberspace

Various sets of guidelines have been proposed and adopted by national and international bodies. These guidelines outline obligations that ensure a software system does not compromise an individual's right of privacy. They are rarely codified as laws (though they certainly do inform the content of legislation and the interpretation of existing law), so a developer's obligation to satisfy the guidelines is ("only") a moral one.

One of the better-known sets of guidelines is the one adopted by the Organization for Economic Cooperation and Development (OECD), instantiated in the U.S. as the Fair Information Principles and Practices (FIPP). These principles refine what it means for an individual to control what information is revealed, and how, into a set of obligations; each obligation concerns some dimension of a system, its relationship with the individual, or its relationship with other systems.

Some Developer Guidelines

Besides the legal, moral, or ethical reasons for enforcing privacy rights, there are also good business reasons for doing so. Ignoring privacy erodes customer trust and exposes the developer to negative press. A good underlying strategy, then, is:
Collect personal information only if there is a compelling business and customer value proposition.
Here, we consider personal information to be

The different kinds of personal information warrant different treatments. It is worthwhile to distinguish between:

The Principle of Collection Limitation says that information should be collected only with the consent of users. Not collecting information is obviously best, since no consent would then be needed. So give very careful thought to whether a piece of information is really needed before you build software to collect it. And instead of viewing the question from your perspective as a software developer, put yourself in the shoes of a paranoid privacy advocate who isn't paid by your employer.

When information must be collected, three general modes of consent exist:

Opt-in consent should be preferred, because it increases the chances that consent is being given in a thoughtful and deliberate way (even knowing that many users click "yes" to anything, which is neither thoughtful nor deliberate).
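The preference for opt-in can be made concrete in code: under opt-in, the absence of a recorded choice must be treated as "no". Here is a minimal sketch; the names `ConsentRecord` and `may_collect` are assumptions for illustration, not an API from any real system:

```python
from dataclasses import dataclass

# Hypothetical sketch: gating data collection on recorded consent.
# The names here are illustrative, not from any real library.

@dataclass
class ConsentRecord:
    opted_in: bool = False  # opt-in: collection stays off until the user says yes

def may_collect(consent: ConsentRecord) -> bool:
    # Under opt-in, the absence of a recorded "yes" means no collection.
    return consent.opted_in
```

Making `False` the default value is the whole point: a user who never sees (or never answers) the consent question is, by construction, not collected from.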

The Principle of Collection Limitation also says that users should be given notice about what information is being collected. Here, we can distinguish between, on the one hand, prominent notice, which is designed to catch the user's attention, and, on the other hand, discoverable notice, where the user may have to take actions in order to find the notice. In theory, prominent notice should be preferred, but it is not difficult to imagine settings where (in practice) prominent notice would become a nuisance.

There is also the matter of timing: when is a notice to be displayed? With just-in-time display, notice is given when the data is about to be collected. This is useful if the act of collection is not itself predictable (e.g., collecting data in response to a system crash), and it gives the user an opportunity to examine exactly which data is being collected. Under first-run disclosure, the notice is displayed either the first time a user runs the program or when the program is first started up. This affords an opportunity to paint a broad picture of data-collection choices in the context of what the program will do; however, a user who has little experience with a program might not understand the ramifications of choices made at this point. Finally, in some settings it is sensible to provide installation-time disclosure. If the system installer is different from the system user, then the user does not see the disclosure---a problem, unless the system administrator is authorized to make privacy choices on behalf of all users (as is frequently the case in a corporate setting).
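The three timing choices above amount to a simple dispatch on system state. The following sketch is hypothetical; the names `NoticeTiming` and `notice_due` are this note's, not the lecture's or any real system's:

```python
from enum import Enum, auto

# Illustrative sketch of the three notice-timing modes discussed above.

class NoticeTiming(Enum):
    JUST_IN_TIME = auto()   # shown right when data is about to be collected
    FIRST_RUN = auto()      # shown the first time the user runs the program
    INSTALL_TIME = auto()   # shown to whoever installs the software

def notice_due(timing, first_run=False, installing=False, about_to_collect=False):
    """Decide whether the notice should be displayed right now."""
    if timing is NoticeTiming.JUST_IN_TIME:
        return about_to_collect   # e.g., just before a crash report is sent
    if timing is NoticeTiming.FIRST_RUN:
        return first_run
    return installing             # the installer may answer for all users
```

Note how the sketch makes the installation-time caveat visible: the `installing` branch is taken by whoever runs the installer, who may not be the person whose data is collected.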

Authentication and Privacy

If we lived in a world with exactly one form of identification, then all actions by an individual could be linked and privacy would be quite limited. One way to prevent the correlation of actions is to associate multiple identities with each person. We each might (and most of us do) carry multiple credit cards, membership cards, etc. Each card gives a different number for our identity, and that number is what is employed to track our actions. Using just one of these forms of identification prevents the agency with which we are interacting from associating an action with actions we might have taken using a different form of identification. The member of the "NRA" is never seen by that organization to also be a member of the "Pacifists Club", for example.
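A system can mimic this multiple-cards arrangement by deriving a distinct, stable pseudonym for each organization a person deals with, so that identifiers seen by two organizations cannot be linked. Below is a minimal sketch using an HMAC keyed by a user-held secret; the key handling and names are assumptions for illustration, not a protocol from the lecture:

```python
import hashlib
import hmac

# Hypothetical sketch: derive a distinct, stable pseudonym per organization
# from one user-held master secret. Each organization sees a different
# identifier, and without the secret the identifiers cannot be correlated.

MASTER_SECRET = b"user-held secret key"  # assumed: never shared with organizations

def pseudonym(organization: str) -> str:
    # HMAC-SHA256 makes each per-organization identifier unlinkable
    # to the others by anyone who lacks the master secret.
    mac = hmac.new(MASTER_SECRET, organization.encode(), hashlib.sha256)
    return mac.hexdigest()[:16]  # truncated hex digest serves as the "card number"
```

The same input always yields the same pseudonym, so each organization can still track its own interactions with the person, just as each membership card carries one stable number.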

Keeping the various separate identities uncorrelated is important for supporting privacy but, unfortunately, it is becoming more and more difficult to do. For one thing, it is not all that convenient to have to carry multiple cards or to remember all those different identifiers. And many people find the idea of carrying fewer identity cards an appealing proposition, despite the reductions in privacy it implies.

In addition, the current drive for fully-mediated access brings an increase in the need for and prevalence of authentication systems. These authentication systems erode privacy by compelling individuals to reveal ever more about what they do and when. The desire of businesses to collect personally identifiable information (PII) cheaply and unobtrusively (for marketing and advertising), leveraging the prevalence of authentication systems and the resilience of digital information, fosters increased collection and retention of PII. And governments, too, are under pressure to streamline their interactions and reduce costs; linked PII collection provides a means to respond.

As a system developer, you may well be caught in these currents. Be mindful that adding an authentication system is not without risks to privacy. Ponder the possibilities of: