Chapter 18

Common Knowledge of Rationality

In this chapter, we turn to using our model of knowledge to reason about games.

18.1  Knowledge and Games

We must first define what it means for a knowledge structure to be “appropriate” for a game 𝒢:

Definition 18.1. We say that a knowledge structure tuple M = (n,Ω,X,s,P) is appropriate for a finite game 𝒢 = (n,A,u) if:

  • The number of players n is the same in M and 𝒢 (the players are the same);
  • X = A (the outcome in each world is an action profile in the game);
  • For all ω ∈ Ω, all i ∈ [n], and all ω′ ∈ Pi(ω), si(ω′) = si(ω), where si(ω) denotes the ith component of s(ω). (That is, players always know their own strategy.)

In other words, each world models what action each of the players takes; players have beliefs about what the others are doing and believing, and we assume that players always know their own action. See Figure 18.1 for a simple knowledge structure appropriate for the Prisoner’s Dilemma game.

Figure 18.1: A knowledge network for the Prisoner’s Dilemma. This graph looks similar to the one we considered for the muddy children puzzle, but notice that there each player knows the other player’s state but not their own, whereas here each player knows their own strategy but not the other’s. Also observe that, while in this network there is exactly one state per action profile, this need not hold in general; other networks for the same game may have several nodes with the same action profile but different beliefs.
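To make Definition 18.1 concrete, here is a minimal Python sketch (not part of the text) of the knowledge structure of Figure 18.1 for the Prisoner’s Dilemma, together with a check of its defining conditions; all identifiers (Omega, s, P, is_appropriate) are our own illustrative names.

```python
from itertools import product

# Worlds: one per action profile, as in Figure 18.1.
A = [["C", "D"], ["C", "D"]]          # action sets A_1, A_2 for the Prisoner's Dilemma
Omega = list(product(*A))             # Omega = {(C,C), (C,D), (D,C), (D,D)}
s = {w: w for w in Omega}             # s(w): the action profile played in world w

# P[i][w]: the set of worlds player i considers possible at w.
# Each player knows their own action but not the opponent's.
P = [{w: {v for v in Omega if v[i] == w[i]} for w in Omega} for i in range(2)]

def is_appropriate(Omega, s, P, n):
    """Check that players know their own strategy (Definition 18.1) and that
    every world is considered possible at itself (a knowledge-structure condition)."""
    for i in range(n):
        for w in Omega:
            if w not in P[i][w]:                           # w must be possible at w
                return False
            if any(s[v][i] != s[w][i] for v in P[i][w]):   # players know their own strategy
                return False
    return True

print(is_appropriate(Omega, s, P, n=2))   # True
```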

We can now define an event RATi, which denotes the event that player i is acting “rationally.” We take a conservative view and provide a minimal (and weak) definition of what it means to be rational. Basically, we define what it means to not be “crazy” or “totally irrational”: following the discussion in chapter 1, we say that player i is (minimally) rational at a world ω if there does not exist some strategy a′i that would give i strictly better utility in every world i considers possible. (Clearly, if such a strategy were to exist, then i would be stupid not to use it.) More precisely, let

RATi = {ω ∈ Ω ∣ ∄a′i ∈ Ai ∀ω′ ∈ Pi(ω) : ui(a′i, s−i(ω′)) > ui(s(ω′))},

and let RAT denote the event that everyone is rational:

RAT = ⋂i∈[n] RATi.
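As an illustration of these definitions, the following sketch computes RATi and RAT for the Figure 18.1 structure of the Prisoner’s Dilemma; the payoff numbers are the usual (assumed) ones, and all function names are ours.

```python
from itertools import product

A = [["C", "D"], ["C", "D"]]
PAYOFF = {("C", "C"): (-1, -1), ("C", "D"): (-3, 0),
          ("D", "C"): (0, -3), ("D", "D"): (-2, -2)}   # assumed Prisoner's Dilemma payoffs

Omega = list(product(*A))
s = {w: w for w in Omega}
P = [{w: {v for v in Omega if v[i] == w[i]} for w in Omega} for i in range(2)]

def deviate(profile, i, b):
    p = list(profile); p[i] = b
    return tuple(p)

def RAT_i(i):
    """Worlds where NO deviation b is strictly better in every world i considers possible."""
    rational = set()
    for w in Omega:
        exists_better = any(
            all(PAYOFF[deviate(s[v], i, b)][i] > PAYOFF[s[v]][i] for v in P[i][w])
            for b in A[i])
        if not exists_better:
            rational.add(w)
    return rational

RAT = RAT_i(0) & RAT_i(1)
print(sorted(RAT))   # [('D', 'D')]
```

In this structure, cooperating is strictly better replaced by defecting in every world a player considers possible, so only the all-defect world lies in RAT.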

Note that if a player i is not rational at some state ω, this effectively means that there is some action a′i that gives i strictly higher utility than si(ω) in every world i considers possible. Based on this observation, we can get the following simple characterization of the states where RAT holds:

Theorem 18.1. Let 𝒢 = (n,A,u) be a finite normal-form game, and let a ∈ A be an action profile. Then the following statements are equivalent:

  1. a is not strictly dominated; that is, no player i’s action ai is strictly dominated (equivalently, a survives one round of iterative strict dominance).
  2. There exists a knowledge structure M = (n,Ω,X,s,P) that is appropriate for 𝒢 and a state ω ∈ Ω such that s(ω) = a and ω ∈ RAT.

Proof. We first prove that (1) implies (2) and then that (2) implies (1).

(1) ⇒ (2). Consider an action profile b that survives one step of iterative strict dominance. Construct a knowledge structure M where we have a state ωa for every possible action profile a ∈ A such that:

  • Let s(ωa) = a;
  • Let Pi(ωa) = {ω(ai,a′−i) ∣ a′−i ∈ A−i} (that is, i knows their own action, but has no idea what any other player is doing).

It is easily verified that in every world, players know their beliefs and their strategy; they know their strategy by definition, and the fact that they know their beliefs follows from the fact that the beliefs of a player i are determined solely by their own strategy. We now claim that ωb is the world we are looking for. By our definition of M, s(ωb) = b. Furthermore, since no bi is strictly dominated, no deviation a′i can give player i strictly higher utility in every world in Pi(ωb) (the opponents’ actions in these worlds range over all of A−i); hence, by the definition of RAT, ωb ∈ RAT.

(2) ⇒ (1). Assume there exists some knowledge structure M appropriate for 𝒢 and some state ω ∈ Ω such that s(ω) = a and ω ∈ RAT. Assume for contradiction that there exists some player i and some strategy a′i that strictly dominates ai. Since ω ∈ RATi, there must exist some state ω′ ∈ Pi(ω) such that:

ui(a′i, s−i(ω′)) ≤ ui(s(ω′)).

But, because players “know their own strategy” (since M is appropriate for 𝒢), we have that si(ω′) = si(ω) = ai. It follows that

ui(a′i, s−i(ω′)) ≤ ui(ai, s−i(ω′)),

which contradicts the fact that a′i strictly dominates ai.

18.2  An Epistemic Characterization of ISD

In this section and the next, we show how to get an “epistemic” (i.e., knowledge-based) characterization of ISD and PNE. As we saw in chapter 16, it is often natural to not only assume that everyone is rational, but also that it is commonly known that everyone is rational (i.e., everyone is rational, everyone knows that everyone is rational, everyone knows that everyone knows that everyone is rational, and so on)—this is referred to as common knowledge of rationality (CKR). The following theorem shows that CKR characterizes exactly the set of strategies surviving ISD.
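Before stating the theorem, it may help to see the knowledge operators spelled out as set operations. The sketch below (our own encoding and names, with the same assumed Prisoner’s Dilemma payoffs as before) computes Ki, the “everyone knows” operator E, and common knowledge C on the Figure 18.1 structure; note that in that particular structure CRAT is empty, which illustrates why the theorem only asserts the existence of some appropriate structure in which CKR holds.

```python
from itertools import product

# The Prisoner's Dilemma structure of Figure 18.1 (our own encoding; assumed payoffs).
A = [["C", "D"], ["C", "D"]]
PAYOFF = {("C", "C"): (-1, -1), ("C", "D"): (-3, 0),
          ("D", "C"): (0, -3), ("D", "D"): (-2, -2)}
Omega = list(product(*A))
s = {w: w for w in Omega}
P = [{w: {v for v in Omega if v[i] == w[i]} for w in Omega} for i in range(2)]

def K(i, E):
    """K_i E: worlds where everything player i considers possible lies in E."""
    return {w for w in Omega if P[i][w] <= E}

def everyone_knows(E):
    """The E-operator: the event that every player knows E."""
    result = set(Omega)
    for i in range(2):
        result &= K(i, E)
    return result

def common_knowledge(E):
    """C E: intersection of E, EE, EEE, ...; computed as a fixed point."""
    current = set(E)
    while True:
        nxt = current & everyone_knows(current)
        if nxt == current:
            return current
        current = nxt

# RAT in this structure (computed as in the previous sketch) is {(D, D)}.
RAT = {("D", "D")}
print(everyone_knows(RAT))     # set(): nobody *knows* that the other player is rational here
print(common_knowledge(RAT))   # set(): so CKR fails at every world of *this* structure
```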

Theorem 18.2. Let 𝒢 = (n,A,u) be a finite normal-form game, and let a ∈ A be an action profile. Then the following statements are equivalent:

  1. a survives iterative strict dominance.
  2. There exists a knowledge structure M = (n,Ω,X,s,P) that is appropriate for 𝒢 and a state ω ∈ Ω such that s(ω) = a and ω ∈ CRAT (i.e., common knowledge of rationality holds at ω).

Proof. [Advanced] Again, we separately show each direction.

(1) ⇒ (2). Let ISDi be the set of strategies for player i surviving ISD, and let ISD = ISD1 × ISD2 × ⋯ × ISDn be the set of strategy profiles surviving ISD. We construct a knowledge structure M = (n,Ω,X,s,P) appropriate for 𝒢 where we have a state ωa ∈ Ω for every action profile a ∈ ISD such that:

  • s(ωa) = a;
  • Pi(ωa) = {ω(ai,a′−i) ∣ a′−i ∈ ISD−i} (that is, i knows their own action, but otherwise considers possible every profile of actions for the other players that survives ISD).

Just as in the proof of Theorem 18.1, we can easily verify that in every world, players know their strategy and beliefs. We additionally have the following claim:

Claim 18.1. Consider the knowledge structure M defined above. For every ω ∈ Ω, we have that ω ∈ RAT.

The claim holds because every ai ∈ ISDi is, by the definition of ISD, not strictly dominated with respect to ISD−i; thus no deviation a′i can be strictly better for i in every world i considers possible. Given this claim, it directly follows that for every player i and state ω ∈ Ω, ω ∈ KiRAT; thus, ω ∈ ERAT. Consequently, for every ω ∈ Ω, ω ∈ EERAT, and so on. By induction it follows that for every ω ∈ Ω and every k, ω ∈ EkRAT, and thus we also have that ω ∈ CRAT.

Thus for every strategy profile a that survives ISD, we have that M and the state ωa satisfy the required conditions.

(2) ⇒ (1). Consider some knowledge structure M appropriate for 𝒢. Let ISDik denote the set of strategies for player i surviving k rounds of ISD (and define ISDk as the set of strategy profiles surviving k rounds of ISD). We shall prove by induction on k that for any state ω ∈ EkRAT (i.e., where everybody knows that everybody knows … (k times) … that everyone is rational), we have s(ω) ∈ ISDk. The base case (k = 0) is trivial, since ISD0 = A (no strategies have yet been eliminated); note that the case k = 1 follows from Theorem 18.1 above.

For the inductive step, assume the statement is true for k, and let us prove it for k + 1. Consider some ω ∈ Ek+1RAT and some player i. Note that if ω ∈ Ek+1RAT, then ω ∈ EkRAT (since, in a knowledge structure, ω ∈ Pi(ω)); consequently, ω ∈ RAT as well. So, by the induction hypothesis, si(ω) ∈ ISDik. Furthermore, since Ek+1RAT ⊆ Ki(EkRAT), every ω′ ∈ Pi(ω) satisfies ω′ ∈ EkRAT, and so by the induction hypothesis s−i(ω′) ∈ ISD−ik. Now, if some action a′i strictly dominated si(ω) with respect to ISD−ik, then a′i would be strictly better than si(ω) in every world i considers possible, contradicting ω ∈ RATi. Hence si(ω) is an action inside ISDik that is not strictly dominated with respect to ISD−ik, and so si(ω) ∈ ISDik+1 (i.e., it survives one more round). Since the above holds for all players, we have s(ω) ∈ ISDk+1. Finally, for any ω ∈ CRAT, we have s(ω) ∈ ISDk for every k, implying that s(ω) survives ISD.
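For completeness, the ISDk sets used in the proof can be computed directly by iterated elimination of strictly dominated strategies (pure-strategy domination, matching the possibilistic setting used here). The following sketch uses our own function names and an assumed two-player example game in which elimination takes two rounds.

```python
from itertools import product

A = [["T", "M"], ["L", "R"]]                       # action sets A_1, A_2
U = {("T", "L"): (2, 1), ("T", "R"): (0, 0),       # assumed payoffs (u_1, u_2)
     ("M", "L"): (1, 1), ("M", "R"): (3, 0)}

def strictly_dominated(i, a, own, others):
    """Is action a of player i strictly dominated, given the surviving action sets?"""
    def profiles(ai):
        rest = others[:i] + others[i + 1:]         # opponents' surviving sets
        return [tuple(p[:i]) + (ai,) + tuple(p[i:]) for p in product(*rest)]
    return any(all(U[pb][i] > U[pa][i] for pa, pb in zip(profiles(a), profiles(b)))
               for b in own if b != a)

def isd(A):
    """Return the list ISD^0, ISD^1, ... of surviving action sets per player."""
    surviving = [list(ai) for ai in A]
    history = [tuple(tuple(x) for x in surviving)]
    while True:
        new = [[a for a in surviving[i]
                if not strictly_dominated(i, a, surviving[i], surviving)]
               for i in range(len(A))]
        if new == surviving:
            return history
        surviving = new
        history.append(tuple(tuple(x) for x in surviving))

print(isd(A))
# [(('T', 'M'), ('L', 'R')),   ISD^0: nothing eliminated yet
#  (('T', 'M'), ('L',)),       ISD^1: R is strictly dominated by L
#  (('T',), ('L',))]           ISD^2 = ISD: once R is gone, M is dominated by T
```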

18.3  An Epistemic Characterization of PNE

Can we hope to also get an epistemic characterization of PNE? To do this, we need to make additional assumptions about the knowledge of the players. In particular, we will need to assume that everyone knows not only their own strategy, but also the strategies of others! Specifically, let KS be the event that all players know all players’ strategies; formally,

KS = {ω ∈ Ω ∣ ∀i ∈ [n], ∀ω′ ∈ Pi(ω) : s(ω′) = s(ω)}.

Theorem 18.3. Let 𝒢 = (n,A,u) be a finite normal-form game, and let a ∈ A be an action profile. Then the following statements are equivalent:

  1. a is a PNE.
  2. There exists a knowledge structure M = (n,Ω,X,s,P) that is appropriate for 𝒢 and a state ω ∈ Ω such that s(ω) = a and ω ∈ KS ∩ CRAT.
  3. There exists a knowledge structure M = (n,Ω,X,s,P) that is appropriate for 𝒢 and a state ω ∈ Ω such that s(ω) = a and ω ∈ KS ∩ RAT.

Proof. [Advanced] We prove equivalence by showing that (1) implies (2) implies (3) implies (1).

(1) ⇒ (2). Consider a structure M with just one state ω such that s(ω) = a and Pi(ω) = {ω} for every player i. Clearly, KS holds at ω; and since a is a PNE, no player has a deviation that improves their utility against a−i, so RAT holds at ω as well. Consequently, by induction, CRAT also holds at ω.

(2) ⇒ (3). Trivial (because CRAT implies RAT).

(3) ⇒ (1). KS and RAT taken together imply that for every player i, player i’s strategy at ω is a best response to s−i(ω): by KS, every world i considers possible has the same action profile s(ω), so by RAT no deviation a′i satisfies ui(a′i, s−i(ω)) > ui(s(ω)). Thus, s(ω) = a is a PNE.
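The (1) ⇒ (2) construction is easy to check mechanically; here is a small sketch (our own code, with the usual assumed Prisoner’s Dilemma payoffs) that builds the one-state structure for the PNE (D, D) and verifies that KS and RAT hold at its single world.

```python
# One-state structure from the proof of (1) => (2), for the PNE a = (D, D) of the
# Prisoner's Dilemma (assumed payoffs; all names are ours).
PAYOFF = {("C", "C"): (-1, -1), ("C", "D"): (-3, 0),
          ("D", "C"): (0, -3), ("D", "D"): (-2, -2)}
A = [["C", "D"], ["C", "D"]]

a = ("D", "D")                       # the PNE we start from
s = {"w": a}
P = [{"w": {"w"}}, {"w": {"w"}}]     # P_i(w) = {w}: each player knows the full profile

def deviate(profile, i, b):
    p = list(profile); p[i] = b
    return tuple(p)

# KS at w: every world i considers possible has the same action profile as w.
KS = all(s[v] == s["w"] for i in range(2) for v in P[i]["w"])

# RAT at w: no player has a deviation strictly better in every world they consider possible.
RAT = all(not any(all(PAYOFF[deviate(s[v], i, b)][i] > PAYOFF[s[v]][i] for v in P[i]["w"])
                  for b in A[i])
          for i in range(2))

print(KS, RAT)   # True True: since (D, D) is a PNE, no deviation helps against the other's action
```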

18.4  Knowledge vs. Belief: Explaining Bubbles in the Market

So far in our discussion we have only considered a model of knowledge. We may also use a similar model to reason about belief; we can define a belief structure in exactly the same way as a knowledge structure, except that we remove the second-to-last condition in Definition 17.1—that is, we no longer require that in every state ω, players always consider ω possible. We say that a player i believes ϕ if ϕ holds in every world i considers possible, and we define a belief event Bi in exactly the same way as Ki (except that it is defined on a “belief structure” rather than a “knowledge structure”).

Note that once we drop the second-to-last condition in Definition 17.1, the two graph properties of knowledge networks discussed in Section 17.2 (i.e., self-loops and bidirectional edges) no longer need to hold for “belief networks”: self-loops are no longer required and as a consequence, neither are bidirectional edges. To see how this makes things different, let us once more consider an auction scenario with two players and a single item and the belief structure depicted in Figure 18.2. Assume that the true state of the world is ω1. Player 1 here knows the true state of the world. However, player 2 has an incorrect belief: he believes that ω2 is the true state of the world. (Notice that there is no self-loop for player 2 at ω1, so he does not even consider the true state possible; this could not have happened in a knowledge structure.) Thus, player 2 believes that player 1 believes that player 2 values the item at 100, when player 1 (correctly) believes that player 2 values it at zero!

Figure 18.2: A belief network for a second-price auction. Some states in a belief network (here, ω1 and ω2) lead to players having incorrect beliefs.
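Formally, the only change relative to a knowledge structure is dropping the requirement that ω ∈ Pi(ω). The toy sketch below (our own encoding, abstracting away the auction details of Figure 18.2) shows how this allows a player to believe an event that is actually false.

```python
# A toy belief structure (our own encoding): two worlds, and at the true world w1
# player 2 considers only w2 possible -- there is no "self-loop" at w1 for player 2.
Omega = {"w1", "w2"}
P = {1: {"w1": {"w1"}, "w2": {"w2"}},     # player 1's beliefs: always correct
     2: {"w1": {"w2"}, "w2": {"w2"}}}     # player 2's beliefs: at w1 he believes w2

def B(i, E):
    """B_i E: worlds where everything player i considers possible lies in E."""
    return {w for w in Omega if P[i][w] <= E}

the_state_is_w2 = {"w2"}                  # the event "the true state of the world is w2"
print(B(2, the_state_is_w2))              # {'w1', 'w2'}: at w1, player 2 believes a false event
print(B(1, the_state_is_w2))              # {'w2'}: player 1, whose beliefs include the true world,
                                          # only believes the event where it actually holds
```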

In other words, player 2 (incorrectly) believes that he is smarter than player 1, who (according to player 2) incorrectly believes that player 2 is foolishly valuing the good at 100. How much do you think an auctioneer could sell this item for in such a situation? Remember that not only does everyone think that the item is worthless, but everyone also believes that everyone thinks the item is worthless! But, as it turns out, just the fact that somebody believes that somebody else believes that the item is worth 100 means that a sufficiently clever auctioneer can sell it for close to 100, assuming that common belief of rationality holds (i.e., it is common belief that everyone is rational, where common belief is defined exactly as common knowledge but in belief structures). In essence, we get a “bubble” in the market where the item gets sold at a high price despite nobody actually having a high valuation for it.

We will briefly and informally explain how this can happen. Consider a standard second-price auction, but change it slightly so that the seller decides to forgo some small amount of money R < 1 and distribute it back to the players as a “participation reward”; the money R gets split among the players based on how much they bid (so that a player who bids a lot gets a larger fraction of R). Intuitively, adding such a participation reward should make things worse for the seller, as he now loses R out of the profit he makes from selling the item. But, as we shall see (and as is common in practice), adding small “welcome gifts” can be a way to extract more money from the buyers!

More precisely, in an auction with n players, player i gets R·bi/(n(1 + bi)) as a participation reward, where bi is the player’s bid; note that each such participation reward is smaller than R/n, and thus, in total, the auctioneer never spends more than R. Intuitively, however, this incentivizes players to bid high, especially if they are sure to never win the auction (since in that case a higher bid only increases their participation reward). As we shall argue, in any outcome that is consistent with common belief of rationality, the object is sold for close to 100. The reasoning works as follows:

We conclude that in the true state of the world, ω1, player 1 will bid at least 97 and player 2 will bid at least 98; thus, the item will be sold for at least 97, and the auctioneer is guaranteed to receive a utility of at least 97 − R ≥ 96 for a worthless item!
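The arithmetic behind this conclusion is easy to check. The sketch below (our own code; the budget value R = 0.5 is an assumption, while the bids 97 and 98 are the lower bounds stated above) computes the participation rewards R·bi/(n(1 + bi)) and the seller’s net revenue under the second-price rule.

```python
# Seller's net revenue under the modified second-price auction described above.
R, n = 0.5, 2                    # assumed reward budget R < 1, and two players
bids = [97.0, 98.0]              # the lower bounds on the bids derived in the text

rewards = [R * b / (n * (1 + b)) for b in bids]   # R * b_i / (n * (1 + b_i)) per player
price = sorted(bids)[-2]                          # second-price rule: pay the second-highest bid

assert all(r < R / n for r in rewards)            # each reward is below R/n ...
assert sum(rewards) <= R                          # ... so the seller pays out at most R in total

revenue = price - sum(rewards)
print(price, round(sum(rewards), 3), round(revenue, 3))   # 97.0, about 0.495, about 96.5
```

So even though every player values the item at zero, the seller nets more than 96: a “bubble” price sustained purely by (incorrect) higher-order beliefs.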

Notes

Our treatment of the knowledge-based characterizations of ISD and PNE follows ideas from [TW88, AB95, HP09], but we explore them here in a somewhat different setting that enables a significantly simpler analysis. As far as we know, these characterizations are new (and our formalism is closest in spirit to [HP09, CMP16]).

Closely related characterization results were first proven by [TW88] and [AB95], relying on a different (stronger) notion of rationality. More precisely, in our treatment we consider a weak “possibilistic” notion of rationality. A stronger definition of rationality can be obtained by considering knowledge structures with probabilities: we no longer just have a set of worlds each player considers possible at each world, but we additionally assign probabilities to the worlds a player considers possible (e.g., a player may consider some worlds more likely than others). In such a setting, the most popular notion of rationality requires player i’s strategy at ω to maximize i’s expected utility given the distribution over the strategies of the players −i induced by i’s beliefs. CKR is characterized by ISD in this setting as long as we change the definition of ISD to also allow for domination by “mixed strategies” (that is, we remove actions that are strictly dominated by a probability distribution over actions)—this result was first proved by [TW88]. An analogous characterization of (mixed-strategy) Nash equilibria was obtained in [AB95].

The bubbles-in-the-market example originates from the treatment in [CMP16], where a very similar auction mechanism is analyzed (but the actual example is new).