Chapter 18

Common Knowledge of Rationality

In this chapter, we turn to using our model of knowledge to reason about games.

18.1  Knowledge and Games

We must first define what it means for a knowledge structure to be “appropriate” for a game 𝒢:

Definition 18.1. We say that a knowledge structure tuple M = (n,Ω,X,s,P) is appropriate for a finite game 𝒢 = (n,A,u) if:

  • The number of players n is the same in M and 𝒢 (the players are the same);
  • X = A (the outcome in each world is an action profile in the game);
  • For all ω ∈ Ω, all i ∈ [n], and all ω′ ∈ Pi(ω), si(ω′) = si(ω), where si(ω) denotes the ith component of s(ω). (That is, players always know their own strategy.)

In other words, each world models what action each of the players takes; players have beliefs about what the others are doing and believing, and we assume that players always know their own action. See Figure 18.1 for a simple knowledge structure appropriate for the Prisoner’s Dilemma game.

Figure 18.1: A knowledge network for the Prisoner’s Dilemma. This graph looks similar to the one we considered for the muddy children puzzle, but notice that there each player knows the other player’s state but not their own, whereas here each player knows their own strategy but not the other’s. Also observe that, while in this network there is exactly one state per action profile, this need not hold in general; other networks for the same game may have several nodes with the same action profile but different beliefs.
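To make Definition 18.1 concrete, here is a minimal Python sketch (not part of the text) of the knowledge structure of Figure 18.1 for the Prisoner’s Dilemma, together with a check of its defining conditions; all identifiers (Omega, s, P, is_appropriate) are our own illustrative names.

```python
from itertools import product

# Worlds: one per action profile, as in Figure 18.1.
A = [["C", "D"], ["C", "D"]]          # action sets A_1, A_2 for the Prisoner's Dilemma
Omega = list(product(*A))             # Omega = {(C,C), (C,D), (D,C), (D,D)}
s = {w: w for w in Omega}             # s(w): the action profile played in world w

# P[i][w]: the set of worlds player i considers possible at w.
# Each player knows their own action but not the opponent's.
P = [{w: {v for v in Omega if v[i] == w[i]} for w in Omega} for i in range(2)]

def is_appropriate(Omega, s, P, n):
    """Check that players know their own strategy (Definition 18.1) and that
    every world is considered possible at itself (a knowledge-structure condition)."""
    for i in range(n):
        for w in Omega:
            if w not in P[i][w]:                           # w must be possible at w
                return False
            if any(s[v][i] != s[w][i] for v in P[i][w]):   # players know their own strategy
                return False
    return True

print(is_appropriate(Omega, s, P, n=2))   # True
```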

We can now define an event RATi, which denotes the event that player i is acting “rationally.” We take a conservative view and provide a minimal (and weak) definition of what it means to be rational. Basically, we define what it means to not be “crazy” or “totally irrational”: following the discussion in chapter 1, we say that player i is (minimally) rational at a world ω if there does not exist some strategy a′i that would give i strictly better utility in every world i considers possible. (Clearly, if such a strategy were to exist, then i would be stupid not to use it.) More precisely, let

RATi = {ω ∈ Ω ∣ ∄a′i ∈ Ai ∀ω′ ∈ Pi(ω) : ui(a′i, s−i(ω′)) > ui(s(ω′))},

and let RAT denote the event that everyone is rational:

RAT = ⋂i∈[n] RATi.
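As an illustration of these definitions, the following sketch computes RATi and RAT for the Figure 18.1 structure of the Prisoner’s Dilemma; the payoff numbers are the usual (assumed) ones, and all function names are ours.

```python
from itertools import product

A = [["C", "D"], ["C", "D"]]
PAYOFF = {("C", "C"): (-1, -1), ("C", "D"): (-3, 0),
          ("D", "C"): (0, -3), ("D", "D"): (-2, -2)}   # assumed Prisoner's Dilemma payoffs

Omega = list(product(*A))
s = {w: w for w in Omega}
P = [{w: {v for v in Omega if v[i] == w[i]} for w in Omega} for i in range(2)]

def deviate(profile, i, b):
    p = list(profile); p[i] = b
    return tuple(p)

def RAT_i(i):
    """Worlds where NO deviation b is strictly better in every world i considers possible."""
    rational = set()
    for w in Omega:
        exists_better = any(
            all(PAYOFF[deviate(s[v], i, b)][i] > PAYOFF[s[v]][i] for v in P[i][w])
            for b in A[i])
        if not exists_better:
            rational.add(w)
    return rational

RAT = RAT_i(0) & RAT_i(1)
print(sorted(RAT))   # [('D', 'D')]
```

In this structure, cooperating is strictly better replaced by defecting in every world a player considers possible, so only the all-defect world lies in RAT.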

Note that if a player i is not rational at some state ω, this effectively means that there is some action a′i that gives i strictly higher utility than si(ω) in every world i considers possible. Based on this observation, we can get the following simple characterization of the states where RAT holds:

Theorem 18.1. Let 𝒢 = (n,A,u) be a finite normal-form game, and let a ∈ A be an action profile. Then the following statements are equivalent:

  1. a is not strictly dominated; that is, no player i’s action ai is strictly dominated (equivalently, a survives one round of iterative strict dominance).
  2. There exists a knowledge structure M = (n,Ω,X,s,P) that is appropriate for 𝒢 and a state ω ∈ Ω such that s(ω) = a and ω ∈ RAT.

Proof. We first prove that (1) implies (2) and then that (2) implies (1).

(1) ⇒ (2). Consider an action profile b that survives one step of iterative strict dominance. Construct a knowledge structure M where we have a state ωa for every possible action profile a ∈ A such that:

  • Let s(ωa) = a;
  • Let Pi(ωa) = {ω(ai,a′−i) ∣ a′−i ∈ A−i} (that is, i knows their own action, but has no idea what any other player is doing).

It is easily verified that in every world, players know their beliefs and their strategy; they know their strategy by definition, and the fact that they know their beliefs follows from the fact that the beliefs of a player i are determined solely by their own strategy. We now claim that ωb is the world we are looking for. By our definition of M, s(ωb) = b. Furthermore, since no bi is strictly dominated, no deviation a′i can give player i strictly higher utility in every world in Pi(ωb) (the opponents’ actions in these worlds range over all of A−i); hence, by the definition of RAT, ωb ∈ RAT.

(2) ⇒ (1). Assume there exists some knowledge structure M appropriate for 𝒢 and some state ω ∈ Ω such that s(ω) = a and ω ∈ RAT. Assume for contradiction that there exists some player i and some strategy a′i that strictly dominates ai. Since ω ∈ RATi, there must exist some state ω′ ∈ Pi(ω) such that:

ui(a′i, s−i(ω′)) ≤ ui(s(ω′)).

But, because players “know their own strategy” (since M is appropriate for 𝒢), we have that si(ω′) = si(ω) = ai. It follows that

ui(a′i, s−i(ω′)) ≤ ui(ai, s−i(ω′)),

which contradicts the fact that a′i strictly dominates ai.

18.2  An Epistemic Characterization of ISD

In this section and the next, we show how to get an “epistemic” (i.e., knowledge-based) characterization of ISD and PNE. As we saw in chapter 16, it is often natural to not only assume that everyone is rational, but also that it is commonly known that everyone is rational (i.e., everyone is rational, everyone knows that everyone is rational, everyone knows that everyone knows that everyone is rational, and so on)—this is referred to as common knowledge of rationality (CKR). The following theorem shows that CKR characterizes exactly the set of strategies surviving ISD.
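Before stating the theorem, it may help to see the knowledge operators spelled out as set operations. The sketch below (our own encoding and names, with the same assumed Prisoner’s Dilemma payoffs as before) computes Ki, the “everyone knows” operator E, and common knowledge C on the Figure 18.1 structure; note that in that particular structure CRAT is empty, which illustrates why the theorem only asserts the existence of some appropriate structure in which CKR holds.

```python
from itertools import product

# The Prisoner's Dilemma structure of Figure 18.1 (our own encoding; assumed payoffs).
A = [["C", "D"], ["C", "D"]]
PAYOFF = {("C", "C"): (-1, -1), ("C", "D"): (-3, 0),
          ("D", "C"): (0, -3), ("D", "D"): (-2, -2)}
Omega = list(product(*A))
s = {w: w for w in Omega}
P = [{w: {v for v in Omega if v[i] == w[i]} for w in Omega} for i in range(2)]

def K(i, E):
    """K_i E: worlds where everything player i considers possible lies in E."""
    return {w for w in Omega if P[i][w] <= E}

def everyone_knows(E):
    """The E-operator: the event that every player knows E."""
    result = set(Omega)
    for i in range(2):
        result &= K(i, E)
    return result

def common_knowledge(E):
    """C E: intersection of E, EE, EEE, ...; computed as a fixed point."""
    current = set(E)
    while True:
        nxt = current & everyone_knows(current)
        if nxt == current:
            return current
        current = nxt

# RAT in this structure (computed as in the previous sketch) is {(D, D)}.
RAT = {("D", "D")}
print(everyone_knows(RAT))     # set(): nobody *knows* that the other player is rational here
print(common_knowledge(RAT))   # set(): so CKR fails at every world of *this* structure
```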

Theorem 18.2. Let 𝒢 = (n,A,u) be a finite normal-form game, and let a ∈ A be an action profile. Then the following statements are equivalent:

  1. a survives iterative strict dominance.
  2. There exists a knowledge structure M = (n,Ω,X,s,P) that is appropriate for 𝒢 and a state ω ∈ Ω such that s(ω) = a and ω ∈ CRAT (i.e., common knowledge of rationality holds at ω).

Proof. [Advanced] Again, we separately show each direction.

(1) ⇒ (2). Let ISDi be the set of strategies for player i surviving ISD, and let ISD = ISD1 × ISD2 × ⋯ × ISDn be the set of strategy profiles surviving ISD. We construct a knowledge structure M = (n,Ω,X,s,P) appropriate for 𝒢 where we have a state ωa ∈ Ω for every action profile a ∈ ISD such that:

  • s(ωa) = a;
  • Pi(ωa) = {ω(ai,a′−i) ∣ a′−i ∈ ISD−i} (that is, i knows their own action, but otherwise considers possible every profile of actions for the other players that survives ISD).

Just as in the proof of Theorem 18.1, we can easily verify that in every world, players know their strategy and beliefs. We additionally have the following claim:

Claim 18.1. Consider the knowledge structure M defined above. For every ω ∈ Ω, we have that ω ∈ RAT.

The claim holds because every ai ∈ ISDi is, by the definition of ISD, not strictly dominated with respect to ISD−i; thus no deviation a′i can be strictly better for i in every world i considers possible. Given this claim, it directly follows that for every player i and state ω ∈ Ω, ω ∈ KiRAT; thus, ω ∈ ERAT. Consequently, for every ω ∈ Ω, ω ∈ EERAT, and so on. By induction it follows that for every ω ∈ Ω and every k, ω ∈ EkRAT, and thus we also have that ω ∈ CRAT.

Thus for every strategy profile a that survives ISD, we have that M and the state ωa satisfy the required conditions.

(2) ⇒ (1). Consider some knowledge structure M appropriate for 𝒢. Let ISDik denote the set of strategies for player i surviving k rounds of ISD (and define ISDk as the set of strategy profiles surviving k rounds of ISD). We shall prove by induction on k that for any state ω ∈ EkRAT (i.e., where everybody knows that everybody knows … (k times) … that everyone is rational), we have s(ω) ∈ ISDk. The base case (k = 0) is trivial, since ISD0 = A (no strategies have yet been eliminated); note that the case k = 1 follows from Theorem 18.1 above.

For the inductive step, assume the statement is true for k, and let us prove it for k + 1. Consider some ω ∈ Ek+1RAT and some player i. Note that if ω ∈ Ek+1RAT, then ω ∈ EkRAT (since, in a knowledge structure, ω ∈ Pi(ω)); consequently, ω ∈ RAT as well. So, by the induction hypothesis, si(ω) ∈ ISDik. Furthermore, since Ek+1RAT ⊆ Ki(EkRAT), every ω′ ∈ Pi(ω) satisfies ω′ ∈ EkRAT, and so by the induction hypothesis s−i(ω′) ∈ ISD−ik. Now, if some action a′i strictly dominated si(ω) with respect to ISD−ik, then a′i would be strictly better than si(ω) in every world i considers possible, contradicting ω ∈ RATi. Hence si(ω) is an action inside ISDik that is not strictly dominated with respect to ISD−ik, and so si(ω) ∈ ISDik+1 (i.e., it survives one more round). Since the above holds for all players, we have s(ω) ∈ ISDk+1. Finally, for any ω ∈ CRAT, we have s(ω) ∈ ISDk for every k, implying that s(ω) survives ISD.
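For completeness, the ISDk sets used in the proof can be computed directly by iterated elimination of strictly dominated strategies (pure-strategy domination, matching the possibilistic setting used here). The following sketch uses our own function names and an assumed two-player example game in which elimination takes two rounds.

```python
from itertools import product

A = [["T", "M"], ["L", "R"]]                       # action sets A_1, A_2
U = {("T", "L"): (2, 1), ("T", "R"): (0, 0),       # assumed payoffs (u_1, u_2)
     ("M", "L"): (1, 1), ("M", "R"): (3, 0)}

def strictly_dominated(i, a, own, others):
    """Is action a of player i strictly dominated, given the surviving action sets?"""
    def profiles(ai):
        rest = others[:i] + others[i + 1:]         # opponents' surviving sets
        return [tuple(p[:i]) + (ai,) + tuple(p[i:]) for p in product(*rest)]
    return any(all(U[pb][i] > U[pa][i] for pa, pb in zip(profiles(a), profiles(b)))
               for b in own if b != a)

def isd(A):
    """Return the list ISD^0, ISD^1, ... of surviving action sets per player."""
    surviving = [list(ai) for ai in A]
    history = [tuple(tuple(x) for x in surviving)]
    while True:
        new = [[a for a in surviving[i]
                if not strictly_dominated(i, a, surviving[i], surviving)]
               for i in range(len(A))]
        if new == surviving:
            return history
        surviving = new
        history.append(tuple(tuple(x) for x in surviving))

print(isd(A))
# [(('T', 'M'), ('L', 'R')),   ISD^0: nothing eliminated yet
#  (('T', 'M'), ('L',)),       ISD^1: R is strictly dominated by L
#  (('T',), ('L',))]           ISD^2 = ISD: once R is gone, M is dominated by T
```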

18.3  An Epistemic Characterization of PNE

Can we hope to also get an epistemic characterization of PNE? To do this, we need to make additional assumptions about the knowledge of the players. In particular, we will need to assume that everyone knows not only their own strategy, but also the strategies of others! Specifically, let KS be the event that all players know all players’ strategies; formally,

KS = {ω ∈ Ω ∣ ∀i ∈ [n], ∀ω′ ∈ Pi(ω) : s(ω′) = s(ω)}.

Theorem 18.3. Let 𝒢 = (n,A,u) be a finite normal-form game, and let a ∈ A be an action profile. Then the following statements are equivalent:

  1. a is a PNE.
  2. There exists a knowledge structure M = (n,Ω,X,s,P) that is appropriate for 𝒢 and a state ω ∈ Ω such that s(ω) = a and ω ∈ KS ∩ CRAT.
  3. There exists a knowledge structure M = (n,Ω,X,s,P) that is appropriate for 𝒢 and a state ω ∈ Ω such that s(ω) = a and ω ∈ KS ∩ RAT.

Proof. [Advanced] We prove equivalence by showing that (1) implies (2) implies (3) implies (1).

(1) ⇒ (2). Consider a structure M with just one state ω such that s(ω) = a and Pi(ω) = {ω} for every player i. Clearly, KS holds at ω; and since a is a PNE, no player has a deviation that improves their utility against a−i, so RAT holds at ω as well. Consequently, by induction, CRAT also holds at ω.

(2) ⇒ (3). Trivial (because CRAT implies RAT).

(3) ⇒ (1). KS and RAT taken together imply that for every player i, player i’s strategy at ω is a best response to s−i(ω): by KS, every world i considers possible has the same action profile s(ω), so by RAT no deviation a′i satisfies ui(a′i, s−i(ω)) > ui(s(ω)). Thus, s(ω) = a is a PNE.
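The (1) ⇒ (2) construction is easy to check mechanically; here is a small sketch (our own code, with the usual assumed Prisoner’s Dilemma payoffs) that builds the one-state structure for the PNE (D, D) and verifies that KS and RAT hold at its single world.

```python
# One-state structure from the proof of (1) => (2), for the PNE a = (D, D) of the
# Prisoner's Dilemma (assumed payoffs; all names are ours).
PAYOFF = {("C", "C"): (-1, -1), ("C", "D"): (-3, 0),
          ("D", "C"): (0, -3), ("D", "D"): (-2, -2)}
A = [["C", "D"], ["C", "D"]]

a = ("D", "D")                       # the PNE we start from
s = {"w": a}
P = [{"w": {"w"}}, {"w": {"w"}}]     # P_i(w) = {w}: each player knows the full profile

def deviate(profile, i, b):
    p = list(profile); p[i] = b
    return tuple(p)

# KS at w: every world i considers possible has the same action profile as w.
KS = all(s[v] == s["w"] for i in range(2) for v in P[i]["w"])

# RAT at w: no player has a deviation strictly better in every world they consider possible.
RAT = all(not any(all(PAYOFF[deviate(s[v], i, b)][i] > PAYOFF[s[v]][i] for v in P[i]["w"])
                  for b in A[i])
          for i in range(2))

print(KS, RAT)   # True True: since (D, D) is a PNE, no deviation helps against the other's action
```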

18.4  Knowledge vs. Belief: Explaining Bubbles in the Market

So far in our discussion we have only considered a model of knowledge. We may also use a similar model to reason about belief; we can define a belief structure in exactly the same way as a knowledge structure, except that we remove the second-to-last condition in Definition 17.1—that is, we no longer require that in every state ω, players always consider ω possible. We say that a player i believes ϕ if ϕ holds in every world i considers possible, and we define a belief event Bi in exactly the same way as Ki (except that it is defined on a “belief structure” rather than a “knowledge structure”).

Note that once we drop the second-to-last condition in Definition 17.1, the two graph properties of knowledge networks discussed in Section 17.2 (i.e., self-loops and bidirectional edges) no longer need to hold for “belief networks”: self-loops are no longer required and as a consequence, neither are bidirectional edges. To see how this makes things different, let us once more consider an auction scenario with two players and a single item and the belief structure depicted in Figure 18.2. Assume that the true state of the world is ω1. Player 1 here knows the true state of the world. However, player 2 has an incorrect belief: he believes that ω2 is the true state of the world. (Notice that there is no self-loop for player 2 at ω1, so he does not even consider the true state possible; this could not have happened in a knowledge structure.) Thus, player 2 believes that player 1 believes that player 2 values the item at 100, when player 1 (correctly) believes that player 2 values it at zero!

Figure 18.2: A belief network for a second-price auction. Some states in a belief network (here, ω1 and ω2) lead to players having incorrect beliefs.
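Formally, the only change relative to a knowledge structure is dropping the requirement that ω ∈ Pi(ω). The toy sketch below (our own encoding, abstracting away the auction details of Figure 18.2) shows how this allows a player to believe an event that is actually false.

```python
# A toy belief structure (our own encoding): two worlds, and at the true world w1
# player 2 considers only w2 possible -- there is no "self-loop" at w1 for player 2.
Omega = {"w1", "w2"}
P = {1: {"w1": {"w1"}, "w2": {"w2"}},     # player 1's beliefs: always correct
     2: {"w1": {"w2"}, "w2": {"w2"}}}     # player 2's beliefs: at w1 he believes w2

def B(i, E):
    """B_i E: worlds where everything player i considers possible lies in E."""
    return {w for w in Omega if P[i][w] <= E}

the_state_is_w2 = {"w2"}                  # the event "the true state of the world is w2"
print(B(2, the_state_is_w2))              # {'w1', 'w2'}: at w1, player 2 believes a false event
print(B(1, the_state_is_w2))              # {'w2'}: player 1, whose beliefs include the true world,
                                          # only believes the event where it actually holds
```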

In other words, player 2 (incorrectly) believes that he is smarter than player 1, who (according to player 2) incorrectly believes that player 2 is foolishly valuing the good at 100. How much do you think an auctioneer could sell this item for in such a situation? Remember that not only does everyone think that the item is worthless, but everyone also believes that everyone thinks the item is worthless! But, as it turns out, just the fact that somebody believes that somebody else believes that the item is worth 100 means that a sufficiently clever auctioneer can sell it for close to 100, assuming that common belief of rationality holds (i.e., it is common belief that everyone is rational, where common belief is defined exactly as common knowledge but in belief structures). In essence, we get a “bubble” in the market where the item gets sold at a high price despite nobody actually having a high valuation for it.

We will briefly and informally explain how this can happen. Consider a standard second-price auction, but change it slightly so that the seller decides to forgo some small amount of money R < 1 and distribute it back to the players as a “participation reward”; the money R gets split among the players based on how much they bid (so that a player who bids a lot gets a larger fraction of R). Intuitively, adding such a participation reward should make things worse for the seller, as he now loses R out of the profit he makes from selling the item. But, as we shall see (and as is common in practice), adding small “welcome gifts” can be a way to extract more money from the buyers!

More precisely, in an auction with n players, player i gets R·bi/(n(1 + bi)) as a participation reward, where bi is the player’s bid; note that each such participation reward is smaller than R/n, and thus, in total, the auctioneer never spends more than R. Intuitively, however, this incentivizes players to bid high, especially if they are sure to never win the auction (since in that case a higher bid only increases their participation reward). As we shall argue, in any outcome that is consistent with common belief of rationality, the object is sold for close to 100. The reasoning works as follows:

We conclude that in the true state of the world, ω1, player 1 will bid at least 97 and player 2 will bid at least 98; thus, the item will be sold for at least 97, and the auctioneer is guaranteed to receive a utility of at least 97 − R ≥ 96 for a worthless item!
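The arithmetic behind this conclusion is easy to check. The sketch below (our own code; the budget value R = 0.5 is an assumption, while the bids 97 and 98 are the lower bounds stated above) computes the participation rewards R·bi/(n(1 + bi)) and the seller’s net revenue under the second-price rule.

```python
# Seller's net revenue under the modified second-price auction described above.
R, n = 0.5, 2                    # assumed reward budget R < 1, and two players
bids = [97.0, 98.0]              # the lower bounds on the bids derived in the text

rewards = [R * b / (n * (1 + b)) for b in bids]   # R * b_i / (n * (1 + b_i)) per player
price = sorted(bids)[-2]                          # second-price rule: pay the second-highest bid

assert all(r < R / n for r in rewards)            # each reward is below R/n ...
assert sum(rewards) <= R                          # ... so the seller pays out at most R in total

revenue = price - sum(rewards)
print(price, round(sum(rewards), 3), round(revenue, 3))   # 97.0, about 0.495, about 96.5
```

So even though every player values the item at zero, the seller nets more than 96: a “bubble” price sustained purely by (incorrect) higher-order beliefs.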

Notes

Our treatment of the knowledge-based characterizations of ISD and PNE follows ideas from [TW88, AB95, HP09], but we explore them here in a somewhat different setting that enables a significantly simpler analysis. As far as we know, these characterizations are new (and our formalism is closest in spirit to [HP09, CMP16]).

Closely related characterization results were first proven by [TW88] and [AB95], relying on a different (stronger) notion of rationality. More precisely, in our treatment we consider a weak “possibilistic” notion of rationality. A stronger definition of rationality can be obtained by considering knowledge structures with probabilities: we no longer just have a set of worlds each player considers possible at each world, but we additionally assign probabilities to the worlds a player considers possible (e.g., a player may consider some worlds more likely than others). In such a setting, the most popular notion of rationality requires player i’s strategy at ω to maximize i’s expected utility given the distribution over the strategies of the players −i induced by i’s beliefs. CKR is characterized by ISD in this setting as long as we change the definition of ISD to also allow for domination by “mixed strategies” (that is, we remove actions that are strictly dominated by a probability distribution over actions)—this result was first proved by [TW88]. An analogous characterization of (mixed-strategy) Nash equilibria was obtained in [AB95].

The bubbles-in-the-market example originates from the treatment in [CMP16], where a very similar auction mechanism is analyzed (but the actual example is new).