First published Sat Jan 25, 1997; substantive revision Fri Mar 8, 2019
Game theory is the study of the ways in which interacting choices of economic agents produce outcomes with respect to the preferences (or utilities) of those agents, where the outcomes in question might have been intended by none of the agents. The meaning of this statement will not be clear to the non-expert until each of the italicized words and phrases has been explained and featured in some examples. Doing this will be the main business of this article. First, however, we provide some historical and philosophical context in order to motivate the reader for the technical work ahead.
- 2. Basic Elements and Assumptions of Game Theory
- 3. Uncertainty, Risk and Sequential Equilibria
- 8. Game Theory and Behavioral Evidence
1. Philosophical and Historical Motivation
The mathematical theory of games was invented by John von Neumann and Oskar Morgenstern (1944). For reasons to be discussed later, limitations in their mathematical framework initially made the theory applicable only under special and limited conditions. This situation has dramatically changed, in ways we will examine as we go along, over the past seven decades, as the framework has been deepened and generalized. Refinements are still being made, and we will review a few outstanding problems that lie along the advancing front edge of these developments towards the end of the article. However, since at least the late 1970s it has been possible to say with confidence that game theory is the most important and useful tool in the analyst’s kit whenever she confronts situations in which what counts as one agent’s best action (for her) depends on expectations about what one or more other agents will do, and what counts as their best actions (for them) similarly depend on expectations about her.
Despite the fact that game theory has been rendered mathematically and logically systematic only since 1944, game-theoretic insights can be found among commentators going back to ancient times. For example, in two of Plato’s texts, the Laches and the Symposium, Socrates recalls an episode from the Battle of Delium that some commentators have interpreted (probably anachronistically) as involving the following situation. Consider a soldier at the front, waiting with his comrades to repulse an enemy attack. It may occur to him that if the defense is likely to be successful, then it isn’t very probable that his own personal contribution will be essential. But if he stays, he runs the risk of being killed or wounded—apparently for no point. On the other hand, if the enemy is going to win the battle, then his chances of death or injury are higher still, and now quite clearly to no point, since the line will be overwhelmed anyway. Based on this reasoning, it would appear that the soldier is better off running away regardless of who is going to win the battle. Of course, if all of the soldiers reason this way—as they all apparently should, since they’re all in identical situations—then this will certainly bring about the outcome in which the battle is lost. Of course, this point, since it has occurred to us as analysts, can occur to the soldiers too. Does this give them a reason for staying at their posts? Just the contrary: the greater the soldiers’ fear that the battle will be lost, the greater their incentive to get themselves out of harm’s way. And the greater the soldiers’ belief that the battle will be won, without the need of any particular individual’s contributions, the less reason they have to stay and fight. If each soldier anticipates this sort of reasoning on the part of the others, all will quickly reason themselves into a panic, and their horrified commander will have a rout on his hands before the enemy has even engaged.
Long before game theory had come along to show analysts how to think about this sort of problem systematically, it had occurred to some actual military leaders and influenced their strategies. Thus the Spanish conqueror Cortez, when landing in Mexico with a small force who had good reason to fear their capacity to repel attack from the far more numerous Aztecs, removed the risk that his troops might think their way into a retreat by burning the ships on which they had landed. With retreat having thus been rendered physically impossible, the Spanish soldiers had no better course of action than to stand and fight—and, furthermore, to fight with as much determination as they could muster. Better still, from Cortez’s point of view, his action had a discouraging effect on the motivation of the Aztecs. He took care to burn his ships very visibly, so that the Aztecs would be sure to see what he had done. They then reasoned as follows: Any commander who could be so confident as to willfully destroy his own option to be prudent if the battle went badly for him must have good reasons for such extreme optimism. It cannot be wise to attack an opponent who has a good reason (whatever, exactly, it might be) for being sure that he can’t lose. The Aztecs therefore retreated into the surrounding hills, and Cortez had the easiest possible victory.
These two situations, at Delium and as manipulated by Cortez, have a common and interesting underlying logic. Notice that the soldiers are not motivated to retreat just, or even mainly, by their rational assessment of the dangers of battle and by their self-interest. Rather, they discover a sound reason to run away by realizing that what it makes sense for them to do depends on what it will make sense for others to do, and that all of the others can notice this too. Even a quite brave soldier may prefer to run rather than heroically, but pointlessly, die trying to stem the oncoming tide all by himself. Thus we could imagine, without contradiction, a circumstance in which an army, all of whose members are brave, flees at top speed before the enemy makes a move. If the soldiers really are brave, then this surely isn’t the outcome any of them wanted; each would have preferred that all stand and fight. What we have here, then, is a case in which the interaction of many individually rational decision-making processes—one process per soldier—produces an outcome intended by no one. (Most armies try to avoid this problem just as Cortez did. Since they can’t usually make retreat physically impossible, they make it economically impossible: they shoot deserters. Then standing and fighting is each soldier’s individually rational course of action after all, because the cost of running is sure to be at least as high as the cost of staying.)
Another classic source that invites this sequence of reasoning is found in Shakespeare’s Henry V. During the Battle of Agincourt Henry decided to slaughter his French prisoners, in full view of the enemy and to the surprise of his subordinates, who describe the action as being out of moral character. The reasons Henry gives allude to non-strategic considerations: he is afraid that the prisoners may free themselves and threaten his position. However, a game theorist might have furnished him with supplementary strategic (and similarly prudential, though perhaps not moral) justification. His own troops observe that the prisoners have been killed, and observe that the enemy has observed this. Therefore, they know what fate will await them at the enemy’s hand if they don’t win. Metaphorically, but very effectively, their boats have been burnt. The slaughter of the prisoners plausibly sent a signal to the soldiers of both sides, thereby changing their incentives in ways that favoured English prospects for victory.
These examples might seem to be relevant only for those who find themselves in sordid situations of cut-throat competition. Perhaps, one might think, it is important for generals, politicians, mafiosi, sports coaches and others whose jobs involve strategic manipulation of others, but the philosopher should only deplore its amorality. Such a conclusion would be highly premature, however. The study of the logic that governs the interrelationships amongst incentives, strategic interactions and outcomes has been fundamental in modern political philosophy, since centuries before anyone had an explicit name for this sort of logic. Philosophers share with social scientists the need to be able to represent and systematically model not only what they think people normatively ought to do, but what they often actually do in interactive situations.
Hobbes’s Leviathan is often regarded as the founding work in modern political philosophy, the text that began the continuing round of analyses of the function and justification of the state and its restrictions on individual liberties. The core of Hobbes’s reasoning can be given straightforwardly as follows. The best situation for all people is one in which each is free to do as she pleases. (One may or may not agree with this as a matter of psychology or ideology, but it is Hobbes’s assumption.) Often, such free people will wish to cooperate with one another in order to carry out projects that would be impossible for an individual acting alone. But if there are any immoral or amoral agents around, they will notice that their interests might at least sometimes be best served by getting the benefits from cooperation and not returning them. Suppose, for example, that you agree to help me build my house in return for my promise to help you build yours. After my house is finished, I can make your labour free to me simply by reneging on my promise. I then realize, however, that if this leaves you with no house, you will have an incentive to take mine. This will put me in constant fear of you, and force me to spend valuable time and resources guarding myself against you. I can best minimize these costs by striking first and killing you at the first opportunity. Of course, you can anticipate all of this reasoning by me, and so have good reason to try to beat me to the punch. Since I can anticipate this reasoning by you, my original fear of you was not paranoid; nor was yours of me.
In fact, neither of us actually needs to be immoral to get this chain of mutual reasoning going; we need only think that there is some possibility that the other might try to cheat on bargains. Once a small wedge of doubt enters any one mind, the incentive induced by fear of the consequences of being preempted—hit before hitting first—quickly becomes overwhelming on both sides. If either of us has any resources of our own that the other might want, this murderous logic can take hold long before we are so silly as to imagine that we could ever actually get as far as making deals to help one another build houses in the first place. Left to their own devices, agents who are at least sometimes narrowly self-interested can repeatedly fail to derive the benefits of cooperation, and instead be trapped in a state of ‘war of all against all’, in Hobbes’s words. In these circumstances, human life, as he vividly and famously put it, will be “solitary, poor, nasty, brutish and short.”
Hobbes’s proposed solution to this problem was tyranny. The people can hire an agent—a government—whose job is to punish anyone who breaks any promise. So long as the threatened punishment is sufficiently dire, the cost of reneging on promises will exceed the cost of keeping them. The logic here is identical to that used by an army when it threatens to shoot deserters. If all people know that these incentives hold for most others, then cooperation will not only be possible, but can be the expected norm, so that the war of all against all becomes a general peace.
Hobbes pushes the logic of this argument to a very strong conclusion, arguing that it implies not only a government with the right and the power to enforce cooperation, but an ‘undivided’ government in which the arbitrary will of a single ruler must impose absolute obligation on all. Few contemporary political theorists think that the particular steps by which Hobbes reasons his way to this conclusion are both sound and valid. Working through these issues here, however, would carry us away from our topic into details of contractarian political philosophy. What is important in the present context is that these details, as they are in fact pursued in contemporary debates, involve sophisticated interpretation of the issues using the resources of modern game theory. Furthermore, Hobbes’s most basic point, that the fundamental justification for the coercive authority and practices of governments is peoples’ own need to protect themselves from what game theorists call ‘social dilemmas’, is accepted by many, if not most, political theorists. Notice that Hobbes has not argued that tyranny is a desirable thing in itself. The structure of his argument is that the logic of strategic interaction leaves only two general political outcomes possible: tyranny and anarchy. Sensible agents then choose tyranny as the lesser of two evils.
The reasoning of the Athenian soldiers, of Cortez, and of Hobbes’s political agents has a common logic, one derived from their situations. In each case, the aspect of the environment that is most important to the agents’ achievement of their preferred outcomes is the set of expectations and possible reactions to their strategies by other agents. The distinction between acting parametrically on a passive world and acting non-parametrically on a world that tries to act in anticipation of these actions is fundamental. If you wish to kick a rock down a hill, you need only concern yourself with the rock’s mass relative to the force of your blow, the extent to which it is bonded with its supporting surface, the slope of the ground on the other side of the rock, and the expected impact of the collision on your foot. The values of all of these variables are independent of your plans and intentions, since the rock has no interests of its own and takes no actions to attempt to assist or thwart you. By contrast, if you wish to kick a person down the hill, then unless that person is unconscious, bound or otherwise incapacitated, you will likely not succeed unless you can disguise your plans until it’s too late for him to take either evasive or forestalling action. Furthermore, his probable responses should be expected to visit costs upon you, which you would be wise to consider. Finally, the relative probabilities of his responses will depend on his expectations about your probable responses to his responses. (Consider the difference it will make to both of your reasoning if one or both of you are armed, or one of you is bigger than the other, or one of you is the other’s boss.) The logical issues associated with the second sort of situation (kicking the person as opposed to the rock) are typically much more complicated, as a simple hypothetical example will illustrate.
Suppose first that you wish to cross a river that is spanned by three bridges. (Assume that swimming, wading or boating across are impossible.) The first bridge is known to be safe and free of obstacles; if you try to cross there, you will succeed. The second bridge lies beneath a cliff from which large rocks sometimes fall. The third is inhabited by deadly cobras. Now suppose you wish to rank-order the three bridges with respect to their preferability as crossing-points. Unless you get positive enjoyment from risking your life—which, as a human being, you might, a complication we’ll take up later in this article—then your decision problem here is straightforward. The first bridge is obviously best, since it is safest. To rank-order the other two bridges, you require information about their relative levels of danger. If you can study the frequency of rock-falls and the movements of the cobras for a while, you might be able to calculate that the probability of your being crushed by a rock at the second bridge is 10% and of being struck by a cobra at the third bridge is 20%. Your reasoning here is strictly parametric because neither the rocks nor the cobras are trying to influence your actions, by, for example, concealing their typical patterns of behaviour because they know you are studying them. It is obvious what you should do here: cross at the safe bridge. Now let us complicate the situation a bit. Suppose that the bridge with the rocks was immediately before you, while the safe bridge was a day’s difficult hike upstream. Your decision-making situation here is slightly more complicated, but it is still strictly parametric. You would have to decide whether the cost of the long hike was worth exchanging for the penalty of a 10% chance of being hit by a rock. However, this is all you must decide, and your probability of a successful crossing is entirely up to you; the environment is not interested in your plans.
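The parametric ranking just described can be sketched in a few lines of Python. The hazard probabilities (0%, 10%, 20%) come from the example; the dictionary and variable names are purely illustrative:

```python
# A minimal sketch of the parametric bridge-ranking problem.
# Each bridge is mapped to its probability of a fatal hazard;
# ranking by survival probability reproduces the ordering in the text.

bridges = {
    "safe": 0.00,   # free of obstacles
    "rocky": 0.10,  # 10% chance of being crushed by a falling rock
    "cobra": 0.20,  # 20% chance of being struck by a cobra
}

# Rank bridges by survival probability, highest first.
ranking = sorted(bridges, key=lambda b: 1 - bridges[b], reverse=True)
print(ranking)  # ['safe', 'rocky', 'cobra']
```

The key point is that the computation involves only fixed probabilities: nothing in the environment reacts to the chooser's plan, which is exactly what makes the problem parametric.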
However, if we now complicate the situation by adding a non-parametric element, it becomes more challenging. Suppose that you are a fugitive of some sort, and waiting on the other side of the river with a gun is your pursuer. She will catch and shoot you, let us suppose, only if she waits at the bridge you try to cross; otherwise, you will escape. As you reason through your choice of bridge, it occurs to you that she is over there trying to anticipate your reasoning. It will seem that, surely, choosing the safe bridge straight away would be a mistake, since that is just where she will expect you, and your chances of death rise to certainty. So perhaps you should risk the rocks, since these odds are much better. But wait … if you can reach this conclusion, your pursuer, who is just as rational and well-informed as you are, can anticipate that you will reach it, and will be waiting for you if you evade the rocks. So perhaps you must take your chances with the cobras; that is what she must least expect. But, then, no … if she expects that you will expect that she will least expect this, then she will most expect it. This dilemma, you realize with dread, is general: you must do what your pursuer least expects; but whatever you most expect her to least expect is automatically what she will most expect. You appear to be trapped in indecision. All that might console you a bit here is that, on the other side of the river, your pursuer is trapped in exactly the same quandary, unable to decide which bridge to wait at because as soon as she imagines committing to one, she will notice that if she can find a best reason to pick a bridge, you can anticipate that same reason and then avoid her.
We know from experience that, in situations such as this, people do not usually stand and dither in circles forever. As we’ll see later, there is a unique best solution available to each player. However, until the 1940s neither philosophers nor economists knew how to find it mathematically. As a result, economists were forced to treat non-parametric influences as if they were complications on parametric ones. This is likely to strike the reader as odd, since, as our example of the bridge-crossing problem was meant to show, non-parametric features are often fundamental features of decision-making problems. Part of the explanation for game theory’s relatively late entry into the field lies in the problems with which economists had historically been concerned. Classical economists, such as Adam Smith and David Ricardo, were mainly interested in the question of how agents in very large markets—whole nations—could interact so as to bring about maximum monetary wealth for themselves. Smith’s basic insight, that efficiency is best maximized by agents first differentiating their potential contributions and then freely seeking mutually advantageous bargains, was mathematically verified in the twentieth century. However, the demonstration of this fact applies only in conditions of ‘perfect competition,’ that is, when individuals or firms face no costs of entry or exit into markets, when there are no economies of scale, and when no agents’ actions have unintended side-effects on other agents’ well-being. Economists always recognized that this set of assumptions is purely an idealization for purposes of analysis, not a possible state of affairs anyone could try (or should want to try) to institutionally establish. But until the mathematics of game theory matured near the end of the 1970s, economists had to hope that the more closely a market approximates perfect competition, the more efficient it will be.
No such hope, however, can be mathematically or logically justified in general; indeed, as a strict generalization the assumption was shown to be false as far back as the 1950s.
This article is not about the foundations of economics, but it is important for understanding the origins and scope of game theory to know that perfectly competitive markets have built into them a feature that renders them susceptible to parametric analysis. Because agents face no entry costs to markets, they will open shop in any given market until competition drives all profits to zero. This implies that if production costs are fixed and demand is exogenous, then agents have no options about how much to produce if they are trying to maximize the differences between their costs and their revenues. These production levels can be determined separately for each agent, so none need pay attention to what the others are doing; each agent treats her counterparts as passive features of the environment. The other kind of situation to which classical economic analysis can be applied without recourse to game theory is that of a monopoly facing many customers. Here, as long as no customer has a share of demand large enough to exert strategic leverage, non-parametric considerations drop out and the firm’s task is only to identify the combination of price and production quantity at which it maximizes profit. However, both perfect and monopolistic competition are very special and unusual market arrangements. Prior to the advent of game theory, therefore, economists were severely limited in the class of circumstances to which they could straightforwardly apply their models.
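The monopolist's problem described above reduces to a purely parametric optimization, which the following minimal Python sketch illustrates. The linear demand curve and constant marginal cost are assumptions introduced for the example, not claims from the text, and the default parameter values are arbitrary:

```python
# A sketch of the monopolist's parametric problem: with no strategic
# rivals, the firm simply picks the quantity that maximizes profit.
# Demand is assumed linear, p = a - b*q, with constant marginal cost c.

def monopoly_optimum(a=100.0, b=2.0, c=20.0):
    """Profit (a - b*q)*q - c*q is maximized at q* = (a - c) / (2*b)."""
    q_star = (a - c) / (2 * b)
    p_star = a - b * q_star
    return q_star, p_star

q, p = monopoly_optimum()
print(q, p)  # 20.0 60.0
```

Because no other agent's choice appears anywhere in the calculation, the firm can treat its environment as passive, which is exactly why this case needs no game theory.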
Philosophers share with economists a professional interest in the conditions and techniques for the maximization of welfare. In addition, philosophers have a special concern with the logical justification of actions, and often actions must be justified by reference to their expected outcomes. (One tradition in moral philosophy, utilitarianism, is based on the idea that all justifiable actions must be justified in this way.) Without game theory, both of these problems resist analysis wherever non-parametric aspects are relevant. We will demonstrate this shortly by reference to the most famous (though not the most typical) game, the so-called Prisoner’s Dilemma, and to other, more typical, games. In doing this, we will need to introduce, define and illustrate the basic elements and techniques of game theory.
2. Basic Elements and Assumptions of Game Theory
2.1 Utility
An economic agent is, by definition, an entity with preferences. Game theorists, like economists and philosophers studying rational decision-making, describe these by means of an abstract concept called utility. This refers to some ranking, on some specified scale, of the subjective welfare or change in subjective welfare that an agent derives from an object or an event. By ‘welfare’ we refer to some normative index of relative alignment between states of the world and agents’ valuations of the states in question, justified by reference to some background framework. For example, we might evaluate the relative welfare of countries (which we might model as agents for some purposes) by reference to their per capita incomes, and we might evaluate the relative welfare of an animal, in the context of predicting and explaining its behavioral dispositions, by reference to its expected evolutionary fitness. In the case of people, it is most typical in economics and applications of game theory to evaluate their relative welfare by reference to their own implicit or explicit judgments of it. This is why we referred above to subjective welfare. Consider a person who adores the taste of pickles but dislikes onions. She might be said to associate higher utility with states of the world in which, all else being equal, she consumes more pickles and fewer onions than with states in which she consumes more onions and fewer pickles. Examples of this kind suggest that ‘utility’ denotes a measure of subjective psychological fulfillment, and this is indeed how the concept was originally interpreted by economists and philosophers influenced by the utilitarianism of Jeremy Bentham. However, economists in the early 20th century recognized increasingly clearly that their main interest was in the market property of decreasing marginal demand, regardless of whether that was produced by satiated individual consumers or by some other factors.
In the 1930s this motivation of economists fit comfortably with the dominance of behaviourism and radical empiricism in psychology and in the philosophy of science respectively. Behaviourists and radical empiricists objected to the theoretical use of such unobservable entities as ‘psychological fulfillment quotients.’ The intellectual climate was thus receptive to the efforts of the economist Paul Samuelson (1938) to redefine utility in such a way that it becomes a purely technical concept rather than one rooted in speculative psychology. Since Samuelson’s redefinition became standard in the 1950s, when we say that an agent acts so as to maximize her utility, we mean by ‘utility’ simply whatever it is that the agent’s behavior suggests her to consistently act so as to make more probable. If this looks circular to you, it should: theorists who follow Samuelson intend the statement ‘agents act so as to maximize their utility’ as a tautology, where an ‘(economic) agent’ is any entity that can be accurately described as acting to maximize a utility function, an ‘action’ is any utility-maximizing selection from a set of possible alternatives, and a ‘utility function’ is what an economic agent maximizes. Like other tautologies occurring in the foundations of scientific theories, this interlocking (recursive) system of definitions is useful not in itself, but because it helps to fix our contexts of inquiry.
Though the behaviourism of the 1930s has since been displaced by widespread interest in cognitive processes, many theorists continue to follow Samuelson’s way of understanding utility because they think it important that game theory apply to any kind of agent—a person, a bear, a bee, a firm or a country—and not just to agents with human minds. When such theorists say that agents act so as to maximize their utility, they want this to be part of the definition of what it is to be an agent, not an empirical claim about possible inner states and motivations. Samuelson’s conception of utility, defined by way of Revealed Preference Theory (RPT) introduced in his classic paper (Samuelson (1938)), satisfies this demand.
Economists and others who interpret game theory in terms of RPT should not think of game theory as in any way an empirical account of the motivations of some flesh-and-blood actors (such as actual people). Rather, they should regard game theory as part of the body of mathematics that is used to model those entities (which might or might not literally exist) who consistently select elements from mutually exclusive action sets, resulting in patterns of choices, which, allowing for some stochasticity and noise, can be statistically modeled as maximization of utility functions. On this interpretation, game theory could not be refuted by any empirical observations, since it is not an empirical theory in the first place. Of course, observation and experience could lead someone favoring this interpretation to conclude that game theory is of little help in describing actual human behavior.
Some other theorists understand the point of game theory differently. They view game theory as providing an explanatory account of actual human strategic reasoning processes. For this idea to be applicable, we must suppose that agents at least sometimes do what they do in non-parametric settings because game-theoretic logic recommends certain actions as the ‘rational’ ones. Such an understanding of game theory incorporates a normative aspect, since ‘rationality’ is taken to denote a property that an agent should at least generally want to have. These two very general ways of thinking about the possible uses of game theory are compatible with the tautological interpretation of utility maximization. The philosophical difference is not idle from the perspective of the working game theorist, however. As we will see in a later section, those who hope to use game theory to explain strategic reasoning, as opposed to merely strategic behavior, face some special philosophical and practical problems.
Since game theory is a technology for formal modeling, we must have a device for thinking of utility maximization in mathematical terms. Such a device is called a utility function. We will introduce the general idea of a utility function through the special case of an ordinal utility function. (Later, we will encounter utility functions that incorporate more information.) The utility-map for an agent is called a ‘function’ because it maps ordered preferences onto the real numbers. Suppose that agent x prefers bundle a to bundle b and bundle b to bundle c. We then map these onto a list of numbers, where the function maps the highest-ranked bundle onto the largest number in the list, the second-highest-ranked bundle onto the next-largest number in the list, and so on, thus:
bundle a ≫ 3
bundle b ≫ 2
bundle c ≫ 1
The only property mapped by this function is order. The magnitudes of the numbers are irrelevant; that is, it must not be inferred that x gets 3 times as much utility from bundle a as she gets from bundle c. Thus we could represent exactly the same utility function as that above by
bundle a ≫ 7,326
bundle b ≫ 12.6
bundle c ≫ −1,000,000
The numbers featuring in an ordinal utility function are thus not measuring any quantity of anything. A utility function in which magnitudes do matter is called ‘cardinal’. Whenever someone refers to a utility function without specifying which kind is meant, you should assume that it’s ordinal. These are the sorts we’ll need for the first set of games we’ll examine. Later, when we come to seeing how to solve games that involve (ex ante) uncertainty—our river-crossing game from Part 1 above, for example—we’ll need to build cardinal utility functions. The technique for doing this was given by von Neumann & Morgenstern (1944), and was an essential aspect of their invention of game theory. For the moment, however, we will need only ordinal functions.
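Since an ordinal utility function carries only order information, the claim that the two number assignments above represent the same preferences can be checked mechanically. The following minimal Python sketch does so; the bundle names come from the example, while the helper `same_ordering` is an illustrative construction, not standard terminology:

```python
# Two candidate ordinal utility functions over the same three bundles.
# They assign very different numbers, but encode the same ranking
# a > b > c, so as ordinal utility functions they are equivalent.
from itertools import combinations

u1 = {"a": 3, "b": 2, "c": 1}
u2 = {"a": 7326, "b": 12.6, "c": -1_000_000}

def same_ordering(u, v, bundles):
    """True iff u and v rank every pair of bundles the same way."""
    return all((u[x] > u[y]) == (v[x] > v[y])
               for x, y in combinations(bundles, 2))

print(same_ordering(u1, u2, ["a", "b", "c"]))  # True
```

Any strictly increasing transformation of an ordinal utility function passes this check, which is the formal sense in which the magnitudes "are not measuring any quantity of anything."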
2.2 Games and Rationality
Any situation in which at least one agent can only act to maximize his utility through anticipating (either consciously, or just implicitly in his behavior) the responses to his actions by one or more other agents is called a game. Agents involved in games are referred to as players. If all agents have optimal actions regardless of what the others do, as in purely parametric situations or conditions of monopoly or perfect competition (see Section 1 above), we can model this without appeal to game theory; otherwise, we need it.
Game theorists assume that players have sets of capacities that are typically referred to in the literature of economics as comprising ‘rationality’. Usually this is formulated by simple statements such as ‘it is assumed that players are rational’. In literature critical of economics in general, or of the importation of game theory into humanistic disciplines, this kind of rhetoric has increasingly become a magnet for attack. There is a dense and intricate web of connections associated with ‘rationality’ in the Western cultural tradition, and the word has often been used to normatively marginalize characteristics as normal and important as emotion, femininity and empathy. Game theorists’ use of the concept need not, and generally does not, implicate such ideology. For present purposes we will use ‘economic rationality’ as a strictly technical, not normative, term to refer to a narrow and specific set of restrictions on preferences that are shared by von Neumann and Morgenstern’s original version of game theory, and RPT. Economists use a second, equally important (to them) concept of rationality when they are modeling markets, which they call ‘rational expectations’. In this phrase, ‘rationality’ refers not to restrictions on preferences but to non-restrictions on information processing: rational expectations are idealized beliefs that reflect statistically accurately weighted use of all information available to an agent. The reader should note that these two uses of one word within the same discipline are technically unconnected. Furthermore, original RPT has been specified over the years by several different sets of axioms for different modeling purposes. Once we decide to treat rationality as a technical concept, each time we adjust the axioms we effectively modify the concept. Consequently, in any discussion involving economists and philosophers together, we can find ourselves in a situation where different participants use the same word to refer to something different.
For readers new to economics, game theory, decision theory and the philosophy of action, this situation naturally presents a challenge.
In this article, ‘economic rationality’ will be used in the technical sense shared within game theory, microeconomics and formal decision theory, as follows. An economically rational player is one who can (i) assess outcomes, in the sense of rank-ordering them with respect to their contributions to her welfare; (ii) calculate paths to outcomes, in the sense of recognizing which sequences of actions are probabilistically associated with which outcomes; and (iii) select actions from sets of alternatives (which we’ll describe as ‘choosing’ actions) that yield her most-preferred outcomes, given the actions of the other players. We might summarize the intuition behind all this as follows: an entity is usefully modeled as an economically rational agent to the extent that it has alternatives, and chooses from amongst these in a way that is motivated, at least more often than not, by what seems best for its purposes. (For readers who are antecedently familiar with the work of the philosopher Daniel Dennett, we could equate the idea of an economically rational agent with the kind of entity Dennett characterizes as intentional, and then say that we can usefully predict an economically rational agent’s behavior from ‘the intentional stance’.)
Economic rationality might in some cases be satisfied by internal computations performed by an agent, and she might or might not be aware of computing or having computed its conditions and implications. In other cases, economic rationality might simply be embodied in behavioral dispositions built by natural, cultural or market selection. In particular, in calling an action ‘chosen’ we imply no necessary deliberation, conscious or otherwise. We mean merely that the action was taken when an alternative action was available, in some sense of ‘available’ normally established by the context of the particular analysis. (‘Available’, as used by game theorists and economists, should never be read as if it meant merely ‘metaphysically’ or ‘logically’ available; it is almost always pragmatic, contextual and endlessly revisable by more refined modeling.)
Each player in a game faces a choice among two or more possible strategies. A strategy is a predetermined ‘programme of play’ that tells her what actions to take in response to every possible strategy other players might use. The significance of the italicized phrase here will become clear when we take up some sample games below.
A crucial aspect of the specification of a game involves the information that players have when they choose strategies. The simplest games (from the perspective of logical structure) are those in which agents have perfect information, meaning that at every point where each agent’s strategy tells her to take an action, she knows everything that has happened in the game up to that point. A board-game of sequential moves in which both players watch all the action (and know the rules in common), such as chess, is an instance of such a game. By contrast, the example of the bridge-crossing game from Section 1 above illustrates a game of imperfect information, since the fugitive must choose a bridge to cross without knowing the bridge at which the pursuer has chosen to wait, and the pursuer similarly makes her decision in ignorance of the choices of her quarry. Since game theory is about economically rational action given the strategically significant actions of others, it should not surprise you to be told that what agents in games believe, or fail to believe, about each others’ actions makes a considerable difference to the logic of our analyses, as we will see.
2.3 Trees and Matrices
The difference between games of perfect and of imperfect information is related to (though certainly not identical with!) a distinction between ways of representing games that is based on order of play. Let us begin by distinguishing between sequential-move and simultaneous-move games in terms of information. It is natural, as a first approximation, to think of sequential-move games as being ones in which players choose their strategies one after the other, and of simultaneous-move games as ones in which players choose their strategies at the same time. This isn’t quite right, however, because what is of strategic importance is not the temporal order of events per se, but whether and when players know about other players’ actions relative to having to choose their own. For example, if two competing businesses are both planning marketing campaigns, one might commit to its strategy months before the other does; but if neither knows what the other has committed to or will commit to when they make their decisions, this is a simultaneous-move game. Chess, by contrast, is normally played as a sequential-move game: you see what your opponent has done before choosing your own next action. (Chess can be turned into a simultaneous-move game if the players each call moves on a common board while isolated from one another; but this is a very different game from conventional chess.)
It was said above that the distinction between sequential-move and simultaneous-move games is not identical to the distinction between perfect-information and imperfect-information games. Explaining why this is so is a good way of establishing full understanding of both sets of concepts. As simultaneous-move games were characterized in the previous paragraph, it must be true that all simultaneous-move games are games of imperfect information. However, some games may contain mixes of sequential and simultaneous moves. For example, two firms might commit to their marketing strategies independently and in secrecy from one another, but thereafter engage in pricing competition in full view of one another. If the optimal marketing strategies were partially or wholly dependent on what was expected to happen in the subsequent pricing game, then the two stages would need to be analyzed as a single game, in which a stage of sequential play followed a stage of simultaneous play. Whole games that involve mixed stages of this sort are games of imperfect information, however temporally staged they might be. Games of perfect information (as the name implies) denote cases where no moves are simultaneous (and where no player ever forgets what has gone before).
As previously noted, games of perfect information are the (logically) simplest sorts of games. This is so because in such games (as long as the games are finite, that is, terminate after a known number of actions) players and analysts can use a straightforward procedure for predicting outcomes. A player in such a game chooses her first action by considering each series of responses and counter-responses that will result from each action open to her. She then asks herself which of the available final outcomes brings her the highest utility, and chooses the action that starts the chain leading to this outcome. This process is called backward induction (because the reasoning works backwards from eventual outcomes to present choice problems).
There will be much more to be said about backward induction and its properties in a later section (when we come to discuss equilibrium and equilibrium selection). For now, it has been described just so we can use it to introduce one of the two types of mathematical objects used to represent games: game trees. A game tree is an example of what mathematicians call a directed graph. That is, it is a set of connected nodes in which the overall graph has a direction. We can draw trees from the top of the page to the bottom, or from left to right. In the first case, nodes at the top of the page are interpreted as coming earlier in the sequence of actions. In the case of a tree drawn from left to right, leftward nodes are prior in the sequence to rightward ones. An unlabelled tree has a structure of the following sort:
The point of representing games using trees can best be grasped by visualizing the use of them in supporting backward-induction reasoning. Just imagine the player (or analyst) beginning at the end of the tree, where outcomes are displayed, and then working backwards from these, looking for sets of strategies that describe paths leading to them. Since a player’s utility function indicates which outcomes she prefers to which, we also know which paths she will prefer. Of course, not all paths will be possible because the other player has a role in selecting paths too, and won’t take actions that lead to less preferred outcomes for him. We will present some examples of this interactive path selection, and detailed techniques for reasoning through these examples, after we have described a situation we can use a tree to model.
Trees are used to represent sequential games, because they show the order in which actions are taken by the players. However, games are sometimes represented on matrices rather than trees. This is the second type of mathematical object used to represent games. Matrices, unlike trees, simply show the outcomes, represented in terms of the players’ utility functions, for every possible combination of strategies the players might use. For example, it makes sense to display the river-crossing game from Section 1 on a matrix, since in that game both the fugitive and the hunter have just one move each, and each chooses their move in ignorance of what the other has decided to do. Here, then, is part of the matrix:
Figure 2
The fugitive’s three possible strategies—cross at the safe bridge, risk the rocks, or risk the cobras—form the rows of the matrix. Similarly, the hunter’s three possible strategies—waiting at the safe bridge, waiting at the rocky bridge and waiting at the cobra bridge—form the columns of the matrix. Each cell of the matrix shows—or, rather, would show if our matrix were complete—an outcome defined in terms of the players’ payoffs. A player’s payoff is simply the number assigned by her ordinal utility function to the state of affairs corresponding to the outcome in question. For each outcome, Row’s payoff is always listed first, followed by Column’s. Thus, for example, the upper left-hand corner above shows that when the fugitive crosses at the safe bridge and the hunter is waiting there, the fugitive gets a payoff of 0 and the hunter gets a payoff of 1. We interpret these by reference to the two players’ utility functions, which in this game are very simple. If the fugitive gets safely across the river he receives a payoff of 1; if he doesn’t he gets 0. If the fugitive doesn’t make it, either because he’s shot by the hunter or hit by a rock or bitten by a cobra, then the hunter gets a payoff of 1 and the fugitive gets a payoff of 0.
We’ll briefly explain the parts of the matrix that have been filled in, and then say why we can’t yet complete the rest. Whenever the hunter waits at the bridge chosen by the fugitive, the fugitive is shot. These outcomes all deliver the payoff vector (0, 1). You can find them descending diagonally across the matrix above from the upper left-hand corner. Whenever the fugitive chooses the safe bridge but the hunter waits at another, the fugitive gets safely across, yielding the payoff vector (1, 0). These two outcomes are shown in the second two cells of the top row. All of the other cells are marked, for now, with question marks. Why? The problem here is that if the fugitive crosses at either the rocky bridge or the cobra bridge, he introduces parametric factors into the game. In these cases, he takes on some risk of getting killed, and so producing the payoff vector (0, 1), that is independent of anything the hunter does. We don’t yet have enough concepts introduced to be able to show how to represent these outcomes in terms of utility functions—but by the time we’re finished we will, and this will provide the key to solving our puzzle from Section 1.
Matrix games are referred to as ‘normal-form’ or ‘strategic-form’ games, and games as trees are referred to as ‘extensive-form’ games. The two sorts of games are not equivalent, because extensive-form games contain information—about sequences of play and players’ levels of information about the game structure—that strategic-form games do not. In general, a strategic-form game could represent any one of several extensive-form games, so a strategic-form game is best thought of as being a set of extensive-form games. When order of play is irrelevant to a game’s outcome, then you should study its strategic form, since it’s the whole set you want to know about. Where order of play is relevant, the extensive form must be specified or your conclusions will be unreliable.
2.4 The Prisoner’s Dilemma as an Example of Strategic-Form vs. Extensive-Form Representation
The distinctions described above are difficult to fully grasp if all one has to go on are abstract descriptions. They’re best illustrated by means of an example. For this purpose, we’ll use the most famous of all games: the Prisoner’s Dilemma. It in fact gives the logic of the problem faced by Cortez’s and Henry V’s soldiers (see Section 1 above), and by Hobbes’s agents before they empower the tyrant. However, for reasons which will become clear a bit later, you should not take the PD as a typical game; it isn’t. We use it as an extended example here only because it’s particularly helpful for illustrating the relationship between strategic-form and extensive-form games (and later, for illustrating the relationships between one-shot and repeated games; see Section 4 below).
The name of the Prisoner’s Dilemma game is derived from the following situation typically used to exemplify it. Suppose that the police have arrested two people whom they know have committed an armed robbery together. Unfortunately, they lack enough admissible evidence to get a jury to convict. They do, however, have enough evidence to send each prisoner away for two years for theft of the getaway car. The chief inspector now makes the following offer to each prisoner: If you will confess to the robbery, implicating your partner, and she does not also confess, then you’ll go free and she’ll get ten years. If you both confess, you’ll each get 5 years. If neither of you confess, then you’ll each get two years for the auto theft.
Our first step in modeling the two prisoners’ situation as a game is to represent it in terms of utility functions. Following the usual convention, let us name the prisoners ‘Player I’ and ‘Player II’. Both Player I’s and Player II’s ordinal utility functions are identical:
Go free ≫ 4
2 years ≫ 3
5 years ≫ 2
10 years ≫ 0
The numbers in the function above are now used to express each player’s payoffs in the various outcomes possible in the situation. We can represent the problem faced by both of them on a single matrix that captures the way in which their separate choices interact; this is the strategic form of their game:
Each cell of the matrix gives the payoffs to both players for each combination of actions. Player I’s payoff appears as the first number of each pair, Player II’s as the second. So, if both players confess then they each get a payoff of 2 (5 years in prison each). This appears in the upper-left cell. If neither of them confess, they each get a payoff of 3 (2 years in prison each). This appears as the lower-right cell. If Player I confesses and Player II doesn’t then Player I gets a payoff of 4 (going free) and Player II gets a payoff of 0 (ten years in prison). This appears in the upper-right cell. The reverse situation, in which Player II confesses and Player I refuses, appears in the lower-left cell.
Each player evaluates his or her two possible actions here by comparing their personal payoffs in each column, since this shows you which of their actions is preferable, just to themselves, for each possible action by their partner. So, observe: If Player II confesses then Player I gets a payoff of 2 by confessing and a payoff of 0 by refusing. If Player II refuses, then Player I gets a payoff of 4 by confessing and a payoff of 3 by refusing. Therefore, Player I is better off confessing regardless of what Player II does. Player II, meanwhile, evaluates her actions by comparing her payoffs down each row, and she comes to exactly the same conclusion that Player I does. Wherever one action for a player is superior to her other actions for each possible action by the opponent, we say that the first action strictly dominates the second one. In the PD, then, confessing strictly dominates refusing for both players. Both players know this about each other, thus entirely eliminating any temptation to depart from the strictly dominant path. Thus both players will confess, and both will go to prison for 5 years.
The players, and analysts, can predict this outcome using a mechanical procedure, known as iterated elimination of strictly dominated strategies. Player I can see by examining the matrix that his payoffs in each cell of the top row are higher than his payoffs in each corresponding cell of the bottom row. Therefore, it can never be utility-maximizing for him to play his bottom-row strategy, viz., refusing to confess, regardless of what Player II does. Since Player I’s bottom-row strategy will never be played, we can simply delete the bottom row from the matrix. Now it is obvious that Player II will not refuse to confess, since her payoff from confessing in the two cells that remain is higher than her payoff from refusing. So, once again, we can delete the one-cell column on the right from the game. We now have only one cell remaining, that corresponding to the outcome brought about by mutual confession. Since the reasoning that led us to delete all other possible outcomes depended at each step only on the premise that both players are economically rational—that is, will choose strategies that lead to higher payoffs over strategies that lead to lower ones—there are strong grounds for viewing joint confession as the solution to the game, the outcome on which its play must converge to the extent that economic rationality correctly models the behavior of the players. You should note that the order in which strictly dominated rows and columns are deleted doesn’t matter. Had we begun by deleting the right-hand column and then deleted the bottom row, we would have arrived at the same solution.
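The mechanical character of this procedure can be made vivid by coding it. The sketch below (an illustration, not part of the article; strategy names and data layout are our own) runs iterated elimination of strictly dominated strategies on the PD payoffs just described.

```python
# Iterated elimination of strictly dominated strategies (IESDS), sketched
# on the PD. Keys are (row strategy, column strategy); values are
# (Player I payoff, Player II payoff), as in the text's matrix.

payoffs = {
    ("confess", "confess"): (2, 2),
    ("confess", "refuse"): (4, 0),
    ("refuse", "confess"): (0, 4),
    ("refuse", "refuse"): (3, 3),
}

def strictly_dominated(player, s, rows, cols):
    """True if some other strategy of `player` beats s against every
    remaining opponent strategy."""
    own = rows if player == 0 else cols
    opp = cols if player == 0 else rows
    for t in own:
        if t == s:
            continue
        if player == 0 and all(payoffs[(t, o)][0] > payoffs[(s, o)][0] for o in opp):
            return True
        if player == 1 and all(payoffs[(o, t)][1] > payoffs[(o, s)][1] for o in opp):
            return True
    return False

rows = ["confess", "refuse"]
cols = ["confess", "refuse"]
changed = True
while changed:
    changed = False
    for player, strats in ((0, rows), (1, cols)):
        for s in list(strats):
            if len(strats) > 1 and strictly_dominated(player, s, rows, cols):
                strats.remove(s)   # delete the dominated row or column
                changed = True

print(rows, cols)  # ['confess'] ['confess']: mutual confession survives
```

As the text notes, the order of deletion makes no difference: re-running the loop starting with the column player yields the same surviving cell.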
It’s been said a couple of times that the PD is not a typical game in many respects. One of these respects is that all its rows and columns are either strictly dominated or strictly dominant. In any strategic-form game where this is true, iterated elimination of strictly dominated strategies is guaranteed to yield a unique solution. Later, however, we will see that for many games this condition does not apply, and then our analytic task is less straightforward.
The reader will probably have noticed something disturbing about the outcome of the PD. Had both players refused to confess, they’d have arrived at the lower-right outcome in which they each go to prison for only 2 years, thereby both earning higher utility than either receives when both confess. This is the most important fact about the PD, and its significance for game theory is quite general. We’ll therefore return to it below when we discuss equilibrium concepts in game theory. For now, however, let us stay with our use of this particular game to illustrate the difference between strategic and extensive forms.
When people introduce the PD into popular discussions, one will often hear them say that the police inspector must lock his prisoners into separate rooms so that they can’t communicate with one another. The reasoning behind this idea seems obvious: if the players could communicate, they’d surely see that they’re each better off if both refuse, and could make an agreement to do so, no? This, one presumes, would remove each player’s conviction that he or she must confess because they’ll otherwise be sold up the river by their partner. In fact, however, this intuition is misleading and its conclusion is false.
When we represent the PD as a strategic-form game, we implicitly assume that the prisoners can’t attempt collusive agreement since they choose their actions simultaneously. In this case, agreement before the fact can’t help. If Player I is convinced that his partner will stick to the bargain then he can seize the opportunity to go scot-free by confessing. Of course, he realizes that the same temptation will occur to Player II; but in that case he again wants to make sure he confesses, as this is his only means of avoiding his worst outcome. The prisoners’ agreement comes to naught because they have no way of enforcing it; their promises to each other constitute what game theorists call ‘cheap talk’.
But now suppose that the prisoners do not move simultaneously. That is, suppose that Player II can choose after observing Player I’s action. This is the sort of situation that people who think non-communication important must have in mind. Now Player II will be able to see that Player I has remained steadfast when it comes to her choice, and she need not be concerned about being suckered. However, this doesn’t change anything, a point that is best made by re-representing the game in extensive form. This gives us our opportunity to introduce game-trees and the method of analysis appropriate to them.
First, however, here are definitions of some concepts that will be helpful in analyzing game-trees:
Node: a point at which a player chooses an action.
Initial node: the point at which the first action in the game occurs.
Terminal node: any node which, if reached, ends the game. Each terminal node corresponds to an outcome.
Subgame: any connected set of nodes and branches descending uniquely from one node.
Payoff: an ordinal utility number assigned to a player at an outcome.
Outcome: an assignment of a set of payoffs, one to each player in the game.
Strategy: a program instructing a player which action to take at every node in the tree where she could possibly be called on to make a choice.
These quick definitions may not mean very much to you until you follow them being put to use in our analyses of trees below. It will probably be best if you scroll back and forth between them and the examples as we work through them. By the time you understand each example, you’ll find the concepts and their definitions natural and intuitive.
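The definition of a strategy as a full contingency plan has a consequence worth making concrete: a player who might be called on to move at several nodes has one strategy component per node, so her strategy set grows multiplicatively. A small illustrative sketch (our own encoding, not from the article), using the sequential PD to be analyzed below, where Player II replies separately to each of Player I's possible moves:

```python
# A strategy specifies an action at EVERY node where the player might move.
# In the sequential PD, Player I moves at one node, but Player II has a
# possible move after C and another after D, giving 2 x 2 = 4 strategies.
from itertools import product

actions = ["C", "D"]  # cooperate (refuse) or defect (confess)

player_I_strategies = list(actions)                     # one choice node
player_II_strategies = list(product(actions, actions))  # (reply to C, reply to D)

print(player_I_strategies)
print(player_II_strategies)  # four contingency plans for Player II
```

This is why, in extensive-form analysis, ‘strategy’ must not be conflated with a single action: Player II's strategy ('D', 'D'), for instance, says ‘defect whatever Player I does’.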
To make this exercise maximally instructive, let’s suppose that Players I and II have studied the matrix above and, seeing that they’re both better off in the outcome represented by the lower-right cell, have formed an agreement to cooperate. Player I is to commit to refusal first, after which Player II will reciprocate when the police ask for her choice. We will refer to a strategy of keeping the agreement as ‘cooperation’, and will denote it in the tree below with ‘C’. We will refer to a strategy of breaking the agreement as ‘defection’, and will denote it on the tree below with ‘D’. Each node is numbered 1, 2, 3, …, from top to bottom, for ease of reference in discussion. Here, then, is the tree:
Figure 4
Look first at each of the terminal nodes (those along the bottom). These represent possible outcomes. Each is identified with an assignment of payoffs, just as in the strategic-form game, with Player I’s payoff appearing first in each set and Player II’s appearing second. Each of the structures descending from the nodes 1, 2 and 3 respectively is a subgame. We begin our backward-induction analysis—using a technique called Zermelo’s algorithm—with the subgames that arise last in the sequence of play. If the subgame descending from node 3 is played, then Player II will face a choice between a payoff of 4 and a payoff of 3. (Consult the second number, representing her payoff, in each set at a terminal node descending from node 3.) II earns her higher payoff by playing D. We may therefore replace the entire subgame with an assignment of the payoff (0,4) directly to node 3, since this is the outcome that will be realized if the game reaches that node. Now consider the subgame descending from node 2. Here, II faces a choice between a payoff of 2 and one of 0. She obtains her higher payoff, 2, by playing D. We may therefore assign the payoff (2,2) directly to node 2. Now we move to the subgame descending from node 1. (This subgame is, of course, identical to the whole game; all games are subgames of themselves.) Player I now faces a choice between outcomes (2,2) and (0,4). Consulting the first numbers in each of these sets, he sees that he gets his higher payoff—2—by playing D. D is, of course, the option of confessing. So Player I confesses, and then Player II also confesses, yielding the same outcome as in the strategic-form representation.
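The subgame-replacement reasoning just traced is recursive, and can be sketched in a few lines of code. The encoding below is our own illustration (node labels and data layout assumed, not from the article): leaves carry (Player I, Player II) payoff pairs, interior nodes name the mover, and each subgame is replaced by the payoff of its optimal branch, exactly as Zermelo's algorithm prescribes.

```python
# Backward induction (Zermelo's algorithm) on the sequential PD tree.
# 0 = Player I, 1 = Player II; payoffs are (Player I, Player II).

def backward_induce(node):
    """Return the payoff pair reached under optimal play from `node`."""
    if "payoff" in node:                 # terminal node: nothing left to choose
        return node["payoff"]
    mover = node["player"]
    branches = [backward_induce(child) for child in node["children"].values()]
    return max(branches, key=lambda p: p[mover])  # mover picks her best branch

tree = {                                  # node 1: Player I chooses C or D
    "player": 0,
    "children": {
        "C": {"player": 1, "children": {  # node 3: II replies after C
            "C": {"payoff": (3, 3)},
            "D": {"payoff": (0, 4)},
        }},
        "D": {"player": 1, "children": {  # node 2: II replies after D
            "C": {"payoff": (4, 0)},
            "D": {"payoff": (2, 2)},
        }},
    },
}

print(backward_induce(tree))  # (2, 2): both defect, matching the text
```

At node 3 the recursion returns (0,4), at node 2 it returns (2,2), and at node 1 Player I compares first coordinates and plays D, reproducing the analysis above step for step.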
What has happened here intuitively is that Player I realizes that if he plays C (refuse to confess) at node 1, then Player II will be able to maximize her utility by suckering him and playing D. (On the tree, this happens at node 3.) This leaves Player I with a payoff of 0 (ten years in prison), which he can avoid only by playing D to begin with. He therefore defects from the agreement.
We have thus seen that in the case of the Prisoner’s Dilemma, the simultaneous and sequential versions yield the same outcome. This will often not be true of other games, however. Furthermore, only finite extensive-form (sequential) games of perfect information can be solved using Zermelo’s algorithm.
As noted earlier in this section, sometimes we must represent simultaneous moves within games that are otherwise sequential. (In all such cases the game as a whole will be one of imperfect information, so we won’t be able to solve it using Zermelo’s algorithm.) We represent such games using the device of information sets. Consider the following tree:
The oval drawn around nodes b and c indicates that they lie within a common information set. This means that at these nodes players cannot infer back up the path from whence they came; Player II does not know, in choosing her strategy, whether she is at b or c. (For this reason, what properly bear numbers in extensive-form games are information sets, conceived as ‘action points’, rather than nodes themselves; this is why the nodes inside the oval are labelled with letters rather than numbers.) Put another way, Player II, when choosing, does not know what Player I has done at node a. But you will recall from earlier in this section that this is just what defines two moves as simultaneous. We can thus see that the method of representing games as trees is entirely general. If no node after the initial node is alone in an information set on its tree, so that the game has only one subgame (itself), then the whole game is one of simultaneous play. If at least one node shares its information set with another, while others are alone, the game involves both simultaneous and sequential play, and so is still a game of imperfect information. Only if all information sets are inhabited by just one node do we have a game of perfect information.
2.5 Solution Concepts and Equilibria
In the Prisoner’s Dilemma, the outcome we’ve represented as (2,2), indicating mutual defection, was said to be the ‘solution’ to the game. Following the general practice in economics, game theorists refer to the solutions of games as equilibria. Philosophically minded readers will want to pose a conceptual question right here: What is ‘equilibrated’ about some game outcomes such that we are motivated to call them ‘solutions’? When we say that a physical system is in equilibrium, we mean that it is in a stable state, one in which all the causal forces internal to the system balance each other out and so leave it ‘at rest’ until and unless it is perturbed by the intervention of some exogenous (that is, ‘external’) force. This is what economists have traditionally meant in talking about ‘equilibria’; they read economic systems as being networks of mutually constraining (often causal) relations, just like physical systems, and the equilibria of such systems are then their endogenously stable states. (Note that, in both physical and economic systems, endogenously stable states might never be directly observed because the systems in question are never isolated from exogenous influences that move and destabilize them. In both classical mechanics and in economics, equilibrium concepts are tools for analysis, not predictions of what we expect to observe.) As we will see in later sections, it is possible to maintain this understanding of equilibria in the case of game theory. However, as we noted in Section 2.1, some people interpret game theory as being an explanatory theory of strategic reasoning. For them, a solution to a game must be an outcome that a rational agent would predict using the mechanisms of rational computation alone. Such theorists face some puzzles about solution concepts that are less important to the theorist who isn’t trying to use game theory to underwrite a general analysis of rationality.
The interest of philosophers in game theory is more often motivated by this ambition than is that of the economist or other scientist.
It’s useful to start the discussion here from the case of the Prisoner’s Dilemma because it’s unusually simple from the perspective of the puzzles about solution concepts. What we referred to as its ‘solution’ is the unique Nash equilibrium of the game. (The ‘Nash’ here refers to John Nash, the Nobel Laureate mathematician who in Nash (1950) did most to extend and generalize von Neumann & Morgenstern’s pioneering work.) Nash equilibrium (henceforth ‘NE’) applies (or fails to apply, as the case may be) to whole sets of strategies, one for each player in a game. A set of strategies is a NE just in case no player could improve her payoff, given the strategies of all other players in the game, by changing her strategy. Notice how closely this idea is related to the idea of strict dominance: no strategy could be a NE strategy if it is strictly dominated. Therefore, if iterative elimination of strictly dominated strategies takes us to a unique outcome, we know that the vector of strategies that leads to it is the game’s unique NE. Now, almost all theorists agree that avoidance of strictly dominated strategies is a minimum requirement of economic rationality. A player who knowingly chooses a strictly dominated strategy directly violates clause (iii) of the definition of economic agency as given in Section 2.2. This implies that if a game has an outcome that is a unique NE, as in the case of joint confession in the PD, that must be its unique solution. This is one of the most important respects in which the PD is an ‘easy’ (and atypical) game.
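The definition of NE as immunity to unilateral deviation lends itself to direct checking. The sketch below (illustrative code, not from the article) tests every strategy profile of the PD and confirms that mutual confession is the unique NE.

```python
# Check each PD strategy profile for the Nash property: no player can
# improve her payoff by deviating unilaterally. Payoffs as in the text.

payoffs = {
    ("confess", "confess"): (2, 2),
    ("confess", "refuse"): (4, 0),
    ("refuse", "confess"): (0, 4),
    ("refuse", "refuse"): (3, 3),
}
strategies = ["confess", "refuse"]

def is_nash(profile):
    r, c = profile
    # Would the row player gain by switching, holding the column fixed?
    if any(payoffs[(r2, c)][0] > payoffs[(r, c)][0] for r2 in strategies):
        return False
    # Would the column player gain by switching, holding the row fixed?
    if any(payoffs[(r, c2)][1] > payoffs[(r, c)][1] for c2 in strategies):
        return False
    return True

print([p for p in payoffs if is_nash(p)])  # [('confess', 'confess')]
```

Note that mutual refusal, (3,3), fails the test even though both players prefer it to (2,2): each would gain by deviating to confession, which is exactly the disturbing feature of the PD noted above.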
We can specify one class of games in which NE is always not only necessary but sufficient as a solution concept. These are finite perfect-information games that are also zero-sum. A zero-sum game (in the case of a game involving just two players) is one in which one player can only be made better off by making the other player worse off. (Tic-tac-toe is a simple example of such a game: any move that brings one player closer to winning brings her opponent closer to losing, and vice-versa.) We can determine whether a game is zero-sum by examining players’ utility functions: in zero-sum games these will be mirror-images of each other, with one player’s highly ranked outcomes being low-ranked for the other and vice-versa. In such a game, if I am playing a strategy such that, given your strategy, I can’t do any better, and if you are also playing such a strategy, then, since any change of strategy by me would have to make you worse off and vice-versa, it follows that our game can have no solution compatible with our mutual economic rationality other than its unique NE. We can put this another way: in a zero-sum game, my playing a strategy that maximizes my minimum payoff if you play the best you can, and your simultaneously doing the same thing, is just equivalent to our both playing our best strategies, so this pair of so-called ‘maximin’ procedures is guaranteed to find the unique solution to the game, which is its unique NE. (In tic-tac-toe, this is a draw. You can’t do any better than drawing, and neither can I, if both of us are trying to win and trying not to lose.)
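The maximin procedure can be sketched concretely for pure strategies. In the code below (an invented example, not from the text), entries are the row player’s payoffs and the column player receives their negation; the matrix is chosen to have a saddle point, so the row player’s maximin choice and the column player’s minimax choice meet at the game’s unique pure-strategy NE.

```python
# Maximin reasoning in a two-player zero-sum game (pure strategies).
# A[i][j] is the row player's payoff; the column player gets -A[i][j].
# The matrix is an invented example with a saddle point.

A = [
    [3, 1, 4],
    [1, 0, 2],
    [5, 2, 6],
]

def maximin_row(A):
    # Row maximizing the row player's worst-case payoff.
    return max(range(len(A)), key=lambda i: min(A[i]))

def minimax_col(A):
    # Column minimizing the row player's best-case payoff,
    # i.e. the column player's own maximin choice.
    cols = list(zip(*A))
    return min(range(len(cols)), key=lambda j: max(cols[j]))

i, j = maximin_row(A), minimax_col(A)
print(i, j, A[i][j])  # 2 1 2: the saddle point, the game's unique pure NE
```

When no pure-strategy saddle point exists (as in matching pennies), the same logic applies but over mixed strategies, which is beyond this sketch.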
However, most games do not have this property. It won’t be possible, in this one article, to enumerate all of the ways in which games can be problematic from the perspective of their possible solutions. (For one thing, it is highly unlikely that theorists have yet discovered all of the possible problems.) However, we can try to generalize the issues a bit.
First, there is the problem that in most non-zero-sum games, there is more than one NE, but not all NE look equally plausible as the solutions upon which strategically alert players would hit. Consider the strategic-form game below (taken from Kreps (1990), p. 403):
Figure 6
This game has two NE: s1-t1 and s2-t2. (Note that no rows or columns are strictly dominated here. But if Player I is playing s1 then Player II can do no better than t1, and vice-versa; and similarly for the s2-t2 pair.) If NE is our only solution concept, then we shall be forced to say that either of these outcomes is equally persuasive as a solution. However, if game theory is regarded as an explanatory and/or normative theory of strategic reasoning, this seems to be leaving something out: surely sensible players with perfect information would converge on s1-t1? (Note that this is not like the situation in the PD, where the socially superior situation is unachievable because it is not a NE. In the case of the game above, both players have every reason to try to converge on the NE in which they are better off.)
This illustrates the fact that NE is a relatively (logically) weak solution concept, often failing to predict intuitively sensible solutions because, if applied alone, it refuses to allow players to use principles of equilibrium selection that, if not demanded by economic rationality—or a more ambitious philosopher’s concept of rationality—at least seem both sensible and computationally accessible. Consider another example from Kreps (1990), p. 397:
Here, no strategy strictly dominates another. However, Player I’s top row, s1, weakly dominates s2, since I does at least as well using s1 as s2 for any reply by Player II, and on one reply by II (t2), I does better. So should not the players (and the analyst) delete the weakly dominated row s2? When they do so, column t1 is then strictly dominated, and the NE s1-t2 is selected as the unique solution. However, as Kreps goes on to show using this example, the idea that weakly dominated strategies should be deleted just like strict ones has odd consequences. Suppose we change the payoffs of the game just a bit, as follows:
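The two-step deletion just described can be sketched mechanically. Since Kreps’s payoff matrix is not reproduced in the text, the numbers below are invented to mirror its qualitative structure: s1 weakly dominates s2, and once s2 is deleted, t1 is strictly dominated by t2, leaving s1-t2.

```python
# Iterated elimination of dominated strategies in a bimatrix game.
# Payoff numbers are invented to match the structure described in the text.

game = {  # (row, col) -> (payoff to I, payoff to II)
    ("s1", "t1"): (5, 1), ("s1", "t2"): (5, 2),
    ("s2", "t1"): (5, 2), ("s2", "t2"): (0, 1),
}

def dominates(diffs, strict):
    # diffs: payoff advantages of the dominating strategy, reply by reply.
    if strict:
        return all(d > 0 for d in diffs)
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)

def find_dominated(game, rows, cols, strict):
    """Return a dominated strategy as ('row'|'col', name), or None."""
    for r in rows:
        for r2 in rows:
            if r2 != r and dominates(
                    [game[(r2, c)][0] - game[(r, c)][0] for c in cols], strict):
                return ("row", r)
    for c in cols:
        for c2 in cols:
            if c2 != c and dominates(
                    [game[(r, c2)][1] - game[(r, c)][1] for r in rows], strict):
                return ("col", c)
    return None

rows, cols = ["s1", "s2"], ["t1", "t2"]
_, s = find_dominated(game, rows, cols, strict=False)  # s2 is weakly dominated
rows.remove(s)
_, s = find_dominated(game, rows, cols, strict=True)   # now t1 is strictly dominated
cols.remove(s)
print(rows, cols)  # ['s1'] ['t2']
```

Note that the first deletion uses the weak criterion; as the discussion below emphasizes, whether that deletion is legitimate is exactly what is at issue.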
Figure 8
s2 is still weakly dominated as before; but of our two NE, s2-t1 is now the most attractive for both players; so why should the analyst eliminate its possibility? (Note that this game, again, does not replicate the logic of the PD. There, it makes sense to eliminate the most attractive outcome, joint refusal to confess, because both players have incentives to unilaterally deviate from it, so it is not an NE. This is not true of s2-t1 in the present game. You should be starting to clearly see why we called the PD game ‘atypical’.) The argument for eliminating weakly dominated strategies is that Player I may be nervous, fearing that Player II is not completely sure to be economically rational (or that Player II fears that Player I isn’t completely reliably economically rational, or that Player II fears that Player I fears that Player II isn’t completely reliably economically rational, and so on ad infinitum) and so might play t2 with some positive probability. If the possibility of departures from reliable economic rationality is taken seriously, then we have an argument for eliminating weakly dominated strategies: Player I thereby insures herself against her worst outcome, s2-t2. Of course, she pays a cost for this insurance, reducing her expected payoff from 10 to 5. On the other hand, we might imagine that the players could communicate before playing the game and agree to play correlated strategies so as to coordinate on s2-t1, thereby removing some, most or all of the uncertainty that encourages elimination of the weakly dominated row s2, and eliminating s1-t2 as a viable solution instead!
Any proposed principle for solving games that may have the effect of eliminating one or more NE from consideration as solutions is referred to as a refinement of NE. In the case just discussed, elimination of weakly dominated strategies is one possible refinement, since it refines away the NE s2-t1, and correlation is another, since it refines away the other NE, s1-t2, instead. So which refinement is more appropriate as a solution concept? People who think of game theory as an explanatory and/or normative theory of strategic rationality have generated a substantial literature in which the merits and drawbacks of a large number of refinements are debated. In principle, there seems to be no limit on the number of refinements that could be considered, since there may also be no limits on the set of philosophical intuitions about what principles a rational agent might or might not see fit to follow or to fear or hope that other players are following.
We now digress briefly to make a point about terminology. Theorists who adopt the revealed preference interpretation of the utility functions in game theory are sometimes referred to in the philosophy of economics literature as ‘behaviorists’. This reflects the fact that revealed preference approaches equate choices with economically consistent actions, rather than being intended to refer to mental constructs. Historically, there was a relationship of comfortable alignment, though not direct theoretical co-construction, between revealed preference in economics and the methodological and ontological behaviorism that dominated scientific psychology during the middle decades of the twentieth century. However, this usage is increasingly likely to cause confusion due to the more recent rise of behavioral game theory (Camerer 2003). This program of research aims to directly incorporate into game-theoretic models generalizations, derived mainly from experiments with people, about ways in which people differ from purer economic agents in the inferences they draw from information (‘framing’). Applications also typically incorporate special assumptions about utility functions, also derived from experiments. For example, players may be taken to be willing to make trade-offs between the magnitudes of their own payoffs and inequalities in the distribution of payoffs among the players. We will turn to some discussion of behavioral game theory in Section 8.1, Section 8.2 and Section 8.3. For the moment, note that this use of game theory crucially rests on assumptions about psychological representations of value thought to be common among people. Thus it would be misleading to refer to behavioral game theory as ‘behaviorist’. But then it would just invite confusion to continue referring to conventional economic game theory that relies on revealed preference as ‘behaviorist’ game theory. We will therefore refer to it as ‘non-psychological’ game theory.
We mean by this the kind of game theory used by most economists who are not revisionist behavioral economists. (We use the qualifier ‘revisionist’ to reflect the further complication that increasingly many economists who apply revealed preference concepts conduct experiments, and some of them call themselves ‘behavioral economists’! For a proposed new set of conventions to reduce this labeling chaos, see Ross (2014), pp. 200–201.) These ‘establishment’ economists treat game theory as the abstract mathematics of strategic interaction, rather than as an attempt to directly characterize special psychological dispositions that might be typical in humans.
Non-psychological game theorists tend to take a dim view of much of the refinement program. This is for the obvious reason that it relies on intuitions about which kinds of inferences people should find sensible. Like most scientists, non-psychological game theorists are suspicious of the force and basis of philosophical assumptions as guides to empirical and mathematical modeling.
Behavioral game theory, by contrast, can be understood as a refinement of game theory, though not necessarily of its solution concepts, in a different sense. It restricts the theory’s underlying axioms for application to a special class of agents, individual, psychologically typical humans. It motivates this restriction by reference to inferences, along with preferences, that people do find natural, regardless of whether these seem rational, which they frequently do not. Non-psychological and behavioral game theory have in common that neither is intended to be normative—though both are often used to try to describe norms that prevail in groups of players, as well as to explain why norms might persist in groups of players even when they appear to be less than fully rational to philosophical intuitions. Both see the job of applied game theory as being to predict outcomes of empirical games given some distribution of strategic dispositions, and some distribution of expectations about the strategic dispositions of others, that are shaped by dynamics in players’ environments, including institutional pressures and structures and evolutionary selection. Let us therefore group non-psychological and behavioral game theorists together, just for purposes of contrast with normative game theorists, as descriptive game theorists.
Descriptive game theorists are often inclined to doubt that the goal of seeking a general theory of rationality makes sense as a project. Institutions and evolutionary processes build many environments, and what counts as rational procedure in one environment may not be favoured in another. On the other hand, an entity that does not at least stochastically (i.e., perhaps noisily but statistically more often than not) satisfy the minimal restrictions of economic rationality cannot, except by accident, be accurately characterized as aiming to maximize a utility function. To such entities game theory has no application in the first place.
This does not imply that non-psychological game theorists abjure all principled ways of restricting sets of NE to subsets based on their relative probabilities of arising. In particular, non-psychological game theorists tend to be sympathetic to approaches that shift emphasis from rationality onto considerations of the informational dynamics of games. We should perhaps not be surprised that NE analysis alone often fails to tell us much of applied, empirical interest about strategic-form games (e.g., Figure 6 above), in which informational structure is suppressed. Equilibrium selection issues are often more fruitfully addressed in the context of extensive-form games.
2.6 Subgame Perfection
In order to deepen our understanding of extensive-form games, we need an example with more interesting structure than the PD offers.
Consider the game described by this tree:
This game is not intended to fit any preconceived situation; it is simply a mathematical object in search of an application. (L and R here just denote ‘left’ and ‘right’ respectively.)
Now consider the strategic form of this game:
Figure 10
If you are confused by this, remember that a strategy must tell a player what to do at every information set where that player has an action. Since each player chooses between two actions at each of two information sets here, each player has four strategies in total. The first letter in each strategy designation tells each player what to do if he or she reaches their first information set, the second what to do if their second information set is reached. I.e., LR for Player II tells II to play L if information set 5 is reached and R if information set 6 is reached.
If you examine the matrix in Figure 10, you will discover that (LL, RL) is among the NE. This is a bit puzzling, since if Player I reaches her second information set (7) in the extensive-form game, she would hardly wish to play L there; she earns a higher payoff by playing R at node 7. Mere NE analysis doesn’t notice this because NE is insensitive to what happens off the path of play. Player I, in choosing L at node 4, ensures that node 7 will not be reached; this is what is meant by saying that it is ‘off the path of play’. In analyzing extensive-form games, however, we should care what happens off the path of play, because consideration of this is crucial to what happens on the path. For example, it is the fact that Player I would play R if node 7 were reached that would cause Player II to play L if node 6 were reached, and this is why Player I won’t choose R at node 4. We are throwing away information relevant to game solutions if we ignore off-path outcomes, as mere NE analysis does. Notice that this reason for doubting that NE is a wholly satisfactory equilibrium concept in itself has nothing to do with intuitions about rationality, as in the case of the refinement concepts discussed in Section 2.5.
Now apply Zermelo’s algorithm to the extensive form of our current example. Begin, again, with the last subgame, that descending from node 7. This is Player I’s move, and she would choose R because she prefers her payoff of 5 to the payoff of 4 she gets by playing L. Therefore, we assign the payoff (5, −1) to node 7. Thus at node 6 II faces a choice between (−1, 0) and (5, −1). He chooses L. At node 5 II chooses R. At node 4 I is thus choosing between (0, 5) and (−1, 0), and so plays L. Note that, as in the PD, an outcome appears at a terminal node—(4, 5) from node 7—that is Pareto superior to the NE. Again, however, the dynamics of the game prevent it from being reached.
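Zermelo’s algorithm is just a recursion over subgames, and can be sketched directly on this tree. The payoffs below follow the text, with one exception: the terminal payoff on the L branch at node 5 is not stated above, so the (2, 2) used here is an assumption (any value giving Player II less than 5 yields the same solution).

```python
# Backward induction (Zermelo's algorithm) on the extensive-form game
# of this section. A node is ("leaf", (payoff_I, payoff_II)) or
# (mover_index, {"L": subtree, "R": subtree}), with Player I = 0, II = 1.

tree = (0, {          # node 4: Player I moves
    "L": (1, {        # node 5: Player II
        "L": ("leaf", (2, 2)),   # assumed payoff (not given in the text)
        "R": ("leaf", (0, 5)),
    }),
    "R": (1, {        # node 6: Player II
        "L": ("leaf", (-1, 0)),
        "R": (0, {    # node 7: Player I
            "L": ("leaf", (4, 5)),
            "R": ("leaf", (5, -1)),
        }),
    }),
})

def solve(node):
    """Return (payoff vector, path of moves) for the subgame at node."""
    tag, rest = node
    if tag == "leaf":
        return rest, []
    # The mover picks the branch maximizing her own continuation payoff.
    move, subtree = max(rest.items(), key=lambda kv: solve(kv[1])[0][tag])
    payoff, path = solve(subtree)
    return payoff, [move] + path

print(solve(tree))  # ((0, 5), ['L', 'R']): I plays L at node 4, II plays R at node 5
```

The recursion reproduces the hand computation above: R at node 7, L at node 6, R at node 5, and hence L at node 4, so the terminal (4, 5) is never reached.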
The fact that Zermelo’s algorithm picks out the strategy vector (LR, RL) as the unique solution to the game shows that it’s yielding something other than just an NE. In fact, it is generating the game’s subgame perfect equilibrium (SPE). It gives an outcome that yields a NE not just in the whole game but in every subgame as well. This is a persuasive solution concept because, again unlike the refinements of Section 2.5, it does not demand ‘extra’ rationality of agents in the sense of expecting them to have and use philosophical intuitions about ‘what makes sense’. It does, however, assume that players not only know everything strategically relevant to their situation but also use all of that information. In arguments about the foundations of economics, this is often referred to as an aspect of rationality, as in the phrase ‘rational expectations’. But, as noted earlier, it is best to be careful not to confuse the general normative idea of rationality with computational power and the possession of budgets, in time and energy, to make the most of it.
An agent playing a subgame perfect strategy simply chooses, at every node she reaches, the path that brings her the highest payoff in the subgame emanating from that node. SPE predicts a game’s outcome just in case, in solving the game, the players foresee that they will all do that.
A main value of analyzing extensive-form games for SPE is that this can help us to locate structural barriers to social optimization. In our current example, Player I would be better off, and Player II no worse off, at the left-hand node emanating from node 7 than at the SPE outcome. But Player I’s economic rationality, and Player II’s awareness of this, blocks the socially efficient outcome. If our players wish to bring about the more socially efficient outcome (4,5) here, they must do so by redesigning their institutions so as to change the structure of the game. The enterprise of changing institutional and informational structures so as to make efficient outcomes more likely in the games that agents (that is, people, corporations, governments, etc.) actually play is known as mechanism design, and is one of the leading areas of application of game theory. The main techniques are reviewed in Hurwicz and Reiter (2006), the first author of which was awarded the Nobel Prize for his pioneering work in the area.
2.7 On Interpreting Payoffs: Morality and Efficiency in Games
Many readers, but especially philosophers, might wonder why, in the case of the example taken up in the previous section, mechanism design should be necessary unless players are morbidly selfish sociopaths. Surely, the players might be able to just see that outcome (4,5) is socially and morally superior; and since the whole problem also takes for granted that they can also see the path of actions that leads to this efficient outcome, who is the game theorist to announce that, unless their game is changed, it’s unattainable? This objection, which applies the distinctive idea of rationality urged by Immanuel Kant, indicates the leading way in which many philosophers mean more by ‘rationality’ than descriptive game theorists do. This theme is explored with great liveliness and polemical force in Binmore (1994, 1998).
This weighty philosophical controversy about rationality is sometimes confused by misinterpretation of the meaning of ‘utility’ in non-psychological game theory. To root out this mistake, consider the Prisoner’s Dilemma again. We have seen that in the unique NE of the PD, both players get less utility than they could have through mutual cooperation. This may strike you, even if you are not a Kantian (as it has struck many commentators) as perverse. Surely, you may think, it simply results from a combination of selfishness and paranoia on the part of the players. To begin with they have no regard for the social good, and then they shoot themselves in the feet by being too untrustworthy to respect agreements.
This way of thinking is very common in popular discussions, and badly mixed up. To dispel its influence, let us first introduce some terminology for talking about outcomes. Welfare economists typically measure social good in terms of Pareto efficiency. A distribution of utility β is said to be Pareto superior over another distribution δ just in case from state δ there is a possible redistribution of utility to β such that at least one player is better off in β than in δ and no player is worse off. Failure to move from a Pareto-inferior to a Pareto-superior distribution is inefficient because the existence of β as a possibility, at least in principle, shows that in δ some utility is being wasted. Now, the outcome (3,3) that represents mutual cooperation in our model of the PD is clearly Pareto superior to mutual defection; at (3,3) both players are better off than at (2,2). So it is true that PDs lead to inefficient outcomes. This was true of our example in Section 2.6 as well.
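The Pareto-superiority relation just defined is simple enough to state as a two-line predicate, which may help fix the definition; this sketch is not from the text.

```python
# Pareto superiority: beta is superior to delta if no player is worse
# off in beta and at least one player is strictly better off.

def pareto_superior(beta, delta):
    return (all(b >= d for b, d in zip(beta, delta))
            and any(b > d for b, d in zip(beta, delta)))

print(pareto_superior((3, 3), (2, 2)))  # True: mutual cooperation vs. defection
print(pareto_superior((0, 4), (4, 0)))  # False: neither outcome dominates the other
```

Note that the relation is a partial order: many pairs of outcomes, like the PD’s two off-diagonal cells, are simply incomparable by the Pareto criterion.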
However, inefficiency should not be associated with immorality. A utility function for a player is supposed to represent everything that player cares about, which may be anything at all. As we have described the situation of our prisoners they do indeed care only about their own relative prison sentences, but there is nothing essential in this. What makes a game an instance of the PD is strictly and only its payoff structure. Thus we could have two Mother Theresa types here, both of whom care little for themselves and wish only to feed starving children. But suppose the original Mother Theresa wishes to feed the children of Calcutta while Mother Juanita wishes to feed the children of Bogota. And suppose that the international aid agency will maximize its donation if the two saints nominate the same city, will give the second-highest amount if they nominate each other’s cities, and the lowest amount if they each nominate their own city. Our saints are in a PD here, though hardly selfish or unconcerned with the social good.
To return to our prisoners, suppose that, contrary to our assumptions, they do value each other’s well-being as well as their own. In that case, this must be reflected in their utility functions, and hence in their payoffs. If their payoff structures are changed so that, for example, they would feel so badly about contributing to inefficiency that they’d rather spend extra years in prison than endure the shame, then they will no longer be in a PD. But all this shows is that not every possible situation is a PD; it does not show that selfishness is among the assumptions of game theory. It is the logic of the prisoners’ situation, not their psychology, that traps them in the inefficient outcome, and if that really is their situation then they are stuck in it (barring further complications to be discussed below). Agents who wish to avoid inefficient outcomes are best advised to prevent certain games from arising; the defender of the possibility of Kantian rationality is really proposing that they try to dig themselves out of such games by turning themselves into different kinds of agents.
In general, then, a game is partly defined by the payoffs assigned to the players. In any application, such assignments should be based on sound empirical evidence. If a proposed solution involves tacitly changing these payoffs, then this ‘solution’ is in fact a disguised way of changing the subject and evading the implications of best modeling practice.
2.8 Trembling Hands and Quantal Response Equilibria
Our last point above opens the way to a philosophical puzzle, one of several that still preoccupy those concerned with the logical foundations of game theory. It can be raised with respect to any number of examples, but we will borrow an elegant one from C. Bicchieri (1993). Consider the following game:
The NE outcome here is at the single leftmost node descending from node 8. To see this, backward induct again. At node 10, I would play L for a payoff of 3, giving II a payoff of 1. II can do better than this by playing L at node 9, giving I a payoff of 0. I can do better than this by playing L at node 8; so that is what I does, and the game terminates without II getting to move. A puzzle is then raised by Bicchieri (along with other authors, including Binmore (1987) and Pettit and Sugden (1989)) by way of the following reasoning. Player I plays L at node 8 because she knows that Player II is economically rational, and so would, at node 9, play L because Player II knows that Player I is economically rational and so would, at node 10, play L. But now we have the following paradox: Player I must suppose that Player II, at node 9, would predict Player I’s economically rational play at node 10 despite having arrived at a node (9) that could only be reached if Player I is not economically rational! If Player I is not economically rational then Player II is not justified in predicting that Player I will not play R at node 10, in which case it is not clear that Player II shouldn’t play R at 9; and if Player II plays R at 9, then Player I is guaranteed a better payoff than she gets if she plays L at node 8. Both players use backward induction to solve the game; backward induction requires that Player I know that Player II knows that Player I is economically rational; but Player II can solve the game only by using a backward induction argument that takes as a premise the failure of Player I to behave in accordance with economic rationality. This is the paradox of backward induction.
A standard way around this paradox in the literature is to invoke the so-called ‘trembling hand’ due to Selten (1975). The idea here is that a decision and its consequent act may ‘come apart’ with some nonzero probability, however small. That is, a player might intend to take an action but then slip up in the execution and send the game down some other path instead. If there is even a remote possibility that a player may make a mistake—that her ‘hand may tremble’—then no contradiction is introduced by a player’s using a backward induction argument that requires the hypothetical assumption that another player has taken a path that an economically rational player could not choose. In our example, Player II could reason about what to do at node 9 conditional on the assumption that Player I chose L at node 8 but then slipped.
Gintis (2009a) points out that the apparent paradox does not arise merely from our supposing that both players are economically rational. It rests crucially on the additional premise that each player must know, and reason on the basis of knowing, that the other player is economically rational. This is the premise with which each player’s conjectures about what would happen off the equilibrium path of play are inconsistent. A player has reason to consider out-of-equilibrium possibilities if she either believes that her opponent is economically rational but his hand may tremble or she attaches some nonzero probability to the possibility that he is not economically rational or she attaches some doubt to her conjecture about his utility function. As Gintis also stresses, this issue with solving extensive-form games for SPE by Zermelo’s algorithm generalizes: a player has no reason to play even a Nash equilibrium strategy unless she expects other players to also play Nash equilibrium strategies. We will return to this issue in Section 7 below.
The paradox of backward induction, like the puzzles raised by equilibrium refinement, is mainly a problem for those who view game theory as contributing to a normative theory of rationality (specifically, as contributing to that larger theory the theory of strategic rationality). The non-psychological game theorist can give a different sort of account of apparently “irrational” play and the prudence it encourages. This involves appeal to the empirical fact that actual agents, including people, must learn the equilibrium strategies of games they play, at least whenever the games are at all complicated. Research shows that even a game as simple as the Prisoner’s Dilemma requires learning by people (Ledyard 1995, Sally 1995, Camerer 2003, p. 265). What it means to say that people must learn equilibrium strategies is that we must be a bit more sophisticated than was indicated earlier in constructing utility functions from behavior in application of Revealed Preference Theory. Instead of constructing utility functions on the basis of single episodes, we must do so on the basis of observed runs of behavior once it has stabilized, signifying maturity of learning for the subjects in question and the game in question. Once again, the Prisoner’s Dilemma makes a good example. People encounter few one-shot Prisoner’s Dilemmas in everyday life, but they encounter many repeated PDs with non-strangers. As a result, when set into what is intended to be a one-shot PD in the experimental laboratory, people tend to initially play as if the game were a single round of a repeated PD. The repeated PD has many Nash equilibria that involve cooperation rather than defection. Thus experimental subjects tend to cooperate at first in these circumstances, but learn after some number of rounds to defect. The experimenter cannot infer that she has successfully induced a one-shot PD with her experimental setup until she sees this behavior stabilize.
If players of games realize that other players may need to learn game structures and equilibria from experience, this gives them reason to take account of what happens off the equilibrium paths of extensive-form games. Of course, if a player fears that other players have not learned equilibrium, this may well remove her incentive to play an equilibrium strategy herself. This raises a set of deep problems about social learning (Fudenberg and Levine 1998). How can ignorant players learn to play equilibria if sophisticated players don’t show them, because the sophisticated are not incentivized to play equilibrium strategies until the ignorant have learned? The crucial answer in the case of applications of game theory to interactions among people is that young people are socialized by growing up in networks of institutions, including cultural norms. Most complex games that people play are already in progress among people who were socialized before them—that is, have learned game structures and equilibria (Ross 2008a). Novices must then only copy those whose play appears to be expected and understood by others. Institutions and norms are rich with reminders, including homilies and easily remembered rules of thumb, to help people remember what they are doing (Clark 1997).
As noted in Section 2.7 above, when observed behavior does not stabilize around equilibria in a game, and there is no evidence that learning is still in process, the analyst should infer that she has incorrectly modeled the situation she is studying. Chances are that she has either mis-specified players’ utility functions, the strategies available to the players, or the information that is available to them. Given the complexity of many of the situations that social scientists study, we should not be surprised that mis-specification of models happens frequently. Applied game theorists must do lots of learning, just like their subjects.
The paradox of backward induction is one of a family of paradoxes that arise if one builds possession and use of literally complete information into a concept of rationality. (Consider, by analogy, the stock market paradox that arises if we suppose that economically rational investment incorporates literally rational expectations: assume that no individual investor can beat the market in the long run because the market always knows everything the investor knows; then no one has incentive to gather knowledge about asset values; then no one will ever gather any such information and so from the assumption that the market knows everything it follows that the market cannot know anything!) As we will see in detail in various discussions below, most applications of game theory explicitly incorporate uncertainty and prospects for learning by players. The extensive-form games with SPE that we looked at above are really conceptual tools to help us prepare concepts for application to situations where complete and perfect information is unusual. We cannot avoid the paradox if we think, as some philosophers and normative game theorists do, that one of the conceptual tools we want to use game theory to sharpen is a fully general idea of rationality itself. But this is not a concern entertained by economists and other scientists who put game theory to use in empirical modeling. In real cases, unless players have experienced play at equilibrium with one another in the past, even if they are all economically rational and all believe this about one another, we should predict that they will attach some positive probability to the conjecture that understanding of game structures among some players is imperfect. This then explains why people, even if they are economically rational agents, may often, or even usually, play as if they believe in trembling hands.
Learning of equilibria may take various forms for different agents and for games of differing levels of complexity and risk. Incorporating it into game-theoretic models of interactions thus introduces an extensive new set of technicalities. For the most fully developed general theory, the reader is referred to Fudenberg and Levine (1998); the same authors provide a non-technical overview of the issues in Fudenberg and Levine (2016). A first important distinction is between learning specific parameters between rounds of a repeated game (see Section 4) with common players, and learning about general strategic expectations across different games. The latter can include learning about players if the learner is updating expectations based on her models of types of players she recurrently encounters. Then we can distinguish between passive learning, in which a player merely updates her subjective priors based on her observation of moves and outcomes, and strategic choices she infers from these, and active learning, in which she probes—in technical language, screens—for information about other players’ strategies by choosing strategies that test her conjectures about what will occur off what she believes to be the game’s equilibrium path. A major difficulty for both players and modelers is that screening moves might be misinterpreted if players are also incentivized to make moves to signal information to one another (see Section 4). In other words: trying to learn about strategies can under some circumstances interfere with players’ abilities to learn equilibria. Finally, the discussion so far has assumed that all possible learning in a game is about the structure of the game itself. Wilcox (2008) shows that if players are learning new information about causal processes occurring outside a game while simultaneously trying to update expectations about other players’ strategies, the modeler can find herself reaching beyond the current limits of technical knowledge.
It was said above that people might usually play as if they believe in trembling hands. A very general reason for this is that when people interact, the world does not furnish them with cue-cards advising them about the structures of the games they’re playing. They must make and test conjectures about this from their social contexts. Sometimes, contexts are fixed by institutional rules. For example, when a person walks into a retail shop and sees a price tag on something she’d like to have, she knows without needing to conjecture or learn anything that she’s involved in a simple ‘take it or leave it’ game. In other markets, she might know she is expected to haggle, and know the rules for that too.
Given the unresolved complex relationship between learning theory and game theory, the reasoning above might seem to imply that game theory can never be applied to situations involving human players that are novel for them. Fortunately, however, we face no such impasse. In a pair of influential papers in the mid-to-late 1990s, McKelvey and Palfrey (1995, 1998) developed the solution concept of quantal response equilibrium (QRE). QRE is not a refinement of NE, in the sense of being a philosophically motivated effort to strengthen NE by reference to normative standards of rationality. It is, rather, a method for calculating the equilibrium properties of choices made by players whose conjectures about possible errors in the choices of other players are uncertain. QRE is thus standard equipment in the toolkit of experimental economists who seek to estimate the distribution of utility functions in populations of real people placed in situations modeled as games. QRE would not have been practically serviceable in this way before the development of econometrics packages such as Stata (TM) allowed computation of QRE given adequately powerful observation records from interestingly complex games. QRE is rarely utilized by behavioral economists, and is almost never used by psychologists, in analyzing laboratory data. In consequence, many studies by researchers of these types make dramatic rhetorical points by ‘discovering’ that real people often fail to converge on NE in experimental games. But NE, though it is a minimalist solution concept in one sense because it abstracts away from much informational structure, is simultaneously a demanding empirical expectation if it is imposed categorically (that is, if players are expected to play as if they are all certain that all others are playing NE strategies). Predicting play consistent with QRE is consistent with—indeed, is motivated by—the view that NE captures the core general concept of a strategic equilibrium.
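The core ingredient of QRE is a noisy best reply: actions with higher expected utility are chosen more often, but not with certainty. The following is a minimal sketch of the logit response function used in McKelvey and Palfrey's standard parametrization; the function name, the precision parameter `lam`, and all numbers are illustrative choices of mine, not notation from their papers.

```python
# Minimal sketch of a logit ("quantal") response, the building block of QRE.
# A QRE is a fixed point in which every player's mixed strategy is a logit
# response to the others'; here we only illustrate the response map itself.
import math

def logit_response(expected_utils, lam):
    """Map a list of expected utilities to choice probabilities.
    lam -> 0 gives uniform random choice; lam -> infinity approaches
    a strict best reply."""
    weights = [math.exp(lam * u) for u in expected_utils]
    total = sum(weights)
    return [w / total for w in weights]

# In matching pennies, if the opponent mixes 50/50 both actions have expected
# utility 0, so the logit response is uniform at any precision -- consistent
# with the (0.5, 0.5) NE also being the QRE of that game.
print(logit_response([0.0, 0.0], lam=4.0))  # [0.5, 0.5]
```

Estimating `lam` from observed choice frequencies is what lets experimenters measure how costly a deviation from best reply is for real subjects.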
One way of framing the philosophical relationship between NE and QRE is as follows. NE defines a logical principle that is well adapted for disciplining thought and for conceiving new strategies for generic modeling of new classes of social phenomena. For purposes of estimating real empirical data one needs to be able to define equilibrium statistically. QRE represents one way of doing this, consistently with the logic of NE. The idea is sufficiently rich that its depths remain an open domain of investigation by game theorists. The current state of understanding of QRE is comprehensively reviewed in Goeree, Holt and Palfrey (2016).
3. Uncertainty, Risk and Sequential Equilibria
The games we’ve modeled to this point have all involved players choosing from amongst pure strategies, in which each seeks a single optimal course of action at each node that constitutes a best reply to the actions of others. Often, however, a player’s utility is optimized through use of a mixed strategy, in which she flips a weighted coin amongst several possible actions. (We will see later that there is an alternative interpretation of mixing, not involving randomization at a particular information set; but we will start here from the coin-flipping interpretation and then build on it in Section 3.1.) Mixing is called for whenever no pure strategy maximizes the player’s utility against all opponent strategies. Our river-crossing game from Section 1 exemplifies this. As we saw, the puzzle in that game consists in the fact that if the fugitive’s reasoning selects a particular bridge as optimal, his pursuer must be assumed to be able to duplicate that reasoning. The fugitive can escape only if his pursuer cannot reliably predict which bridge he’ll use. Symmetry of logical reasoning power on the part of the two players ensures that the fugitive can surprise the pursuer only if it is possible for him to surprise himself.
Suppose that we ignore rocks and cobras for a moment, and imagine that the bridges are equally safe. Suppose also that the fugitive has no special knowledge about his pursuer that might lead him to venture a specially conjectured probability distribution over the pursuer’s available strategies. In this case, the fugitive’s best course is to roll a three-sided die, in which each side represents a different bridge (or, more conventionally, a six-sided die in which each bridge is represented by two sides). He must then pre-commit himself to using whichever bridge is selected by this randomizing device. This fixes the odds of his survival regardless of what the pursuer does; but since the pursuer has no reason to prefer any available pure or mixed strategy, and since in any case we are presuming her epistemic situation to be symmetrical to that of the fugitive, we may suppose that she will roll a three-sided die of her own. The fugitive now has a 2/3 probability of escaping and the pursuer a 1/3 probability of catching him. Neither the fugitive nor the pursuer can improve their chances given the other’s randomizing mix, so the two randomizing strategies are in Nash equilibrium. Note that if one player is randomizing then the other does equally well on any mix of probabilities over bridges, so there are infinitely many combinations of best replies. However, each player should worry that anything other than a random strategy might be coordinated with some factor the other player can detect and exploit. Since any non-random strategy is exploitable by another non-random strategy, in a zero-sum game such as our example, only the vector of randomized strategies is a NE.
Now let us re-introduce the parametric factors, that is, the falling rocks at bridge #2 and the cobras at bridge #3. Again, suppose that the fugitive is sure to get safely across bridge #1, has a 90% chance of crossing bridge #2, and an 80% chance of crossing bridge #3. We can solve this new game if we make certain assumptions about the two players’ utility functions. Suppose that Player 1, the fugitive, cares only about living or dying (preferring life to death) while the pursuer simply wishes to be able to report that the fugitive is dead, preferring this to having to report that he got away. (In other words, neither player cares about how the fugitive lives or dies.) Suppose also for now that neither player gets any utility or disutility from taking more or less risk. In this case, the fugitive simply takes his original randomizing formula and weights it according to the different levels of parametric danger at the three bridges. Each bridge should be thought of as a lottery over the fugitive’s possible outcomes, in which each lottery has a different expected payoff in terms of the items in his utility function.
Consider matters from the pursuer’s point of view. She will be using her NE strategy when she chooses the mix of probabilities over the three bridges that makes the fugitive indifferent among his possible pure strategies. The bridge with rocks is 1.1 times more dangerous for him than the safe bridge. Therefore, he will be indifferent between the two when the pursuer is 1.1 times more likely to be waiting at the safe bridge than the rocky bridge. The cobra bridge is 1.2 times more dangerous for the fugitive than the safe bridge. Therefore, he will be indifferent between these two bridges when the pursuer’s probability of waiting at the safe bridge is 1.2 times higher than the probability that she is at the cobra bridge. Suppose we use s1, s2 and s3 to represent the fugitive’s parametric survival rates at each bridge. Then the pursuer minimizes the net survival rate across any pair of bridges by adjusting the probabilities p1 and p2 that she will wait at them so that
s1 (1 − p1) = s2 (1 − p2)
Since p1 + p2 = 1, we can rewrite this as
s1 × p2 = s2 × p1
so
p1/s1 = p2/s2.
Thus the pursuer finds her NE strategy by solving the following simultaneous equations:
1 (1 − p1) = 0.9 (1 − p2) = 0.8 (1 − p3)
p1 + p2 + p3 = 1.
Then
p1 = 49/121
p2 = 41/121
p3 = 31/121
Now let f1, f2, f3 represent the probabilities with which the fugitive chooses each respective bridge. Then the fugitive finds his NE strategy by solving
s1 × f1 = s2 × f2 = s3 × f3
so
1 × f1 = 0.9 × f2 = 0.8 × f3
simultaneously with
f1 + f2 + f3 = 1.
Then
f1 = 36/121
f2 = 40/121
f3 = 45/121
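The two systems of equations above can be checked mechanically. The following is a minimal sketch in exact rational arithmetic; the variable names are mine, not the article's.

```python
from fractions import Fraction

# Survival rates at the safe, rocky, and cobra bridges.
s = [Fraction(1), Fraction(9, 10), Fraction(8, 10)]

# Pursuer: choose p_i so the fugitive is indifferent, i.e. s_i*(1 - p_i)
# is equal across bridges, with the p_i summing to 1.
# s_i*(1 - p_i) = k  =>  p_i = 1 - k/s_i;  sum p_i = 1 then fixes k.
k = Fraction(len(s) - 1) / sum(1 / si for si in s)
p = [1 - k / si for si in s]

# Fugitive: choose f_i so the pursuer is indifferent, i.e. s_i*f_i is equal.
# s_i*f_i = c  =>  f_i = c/s_i;  sum f_i = 1 then fixes c.
c = 1 / sum(1 / si for si in s)
f = [c / si for si in s]

print(p)  # [Fraction(49, 121), Fraction(41, 121), Fraction(31, 121)]
print(f)  # [Fraction(36, 121), Fraction(40, 121), Fraction(45, 121)]
```

Exact fractions make it obvious that the computed mixes match the 49/121, 41/121, 31/121 and 36/121, 40/121, 45/121 figures derived in the text.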
These two sets of NE probabilities tell each player how to weight his or her die before throwing it. Note the—perhaps surprising—result that the fugitive, though by hypothesis he gets no enjoyment from gambling, uses riskier bridges with higher probability. This is the only way of making the pursuer indifferent over which bridge she stakes out, which in turn is what maximizes the fugitive’s probability of survival.
We were able to solve this game straightforwardly because we set the utility functions in such a way as to make it zero-sum, or strictly competitive. That is, every gain in expected utility by one player represents a precisely symmetrical loss by the other. However, this condition may often not hold. Suppose now that the utility functions are more complicated. The pursuer most prefers an outcome in which she shoots the fugitive and so claims credit for his apprehension to one in which he dies of rockfall or snakebite; and she prefers this second outcome to his escape. The fugitive prefers a quick death by gunshot to the pain of being crushed or the terror of an encounter with a cobra. Most of all, of course, he prefers to escape. Suppose, plausibly, that the fugitive cares more strongly about surviving than he does about getting killed one way rather than another. We cannot solve this game, as before, simply on the basis of knowing the players’ ordinal utility functions, since the intensities of their respective preferences will now be relevant to their strategies.
Prior to the work of von Neumann & Morgenstern (1947), situations of this sort were inherently baffling to analysts. This is because utility does not denote a hidden psychological variable such as pleasure. As we discussed in Section 2.1, utility is merely a measure of relative behavioural dispositions given certain consistency assumptions about relations between preferences and choices. It therefore makes no sense to imagine comparing our players’ cardinal—that is, intensity-sensitive—preferences with one another’s, since there is no independent, interpersonally constant yardstick we could use. How, then, can we model games in which cardinal information is relevant? After all, modeling games requires that all players’ utilities be taken simultaneously into account, as we’ve seen.
A crucial aspect of von Neumann & Morgenstern’s (1947) work was the solution to this problem. Here, we will provide a brief outline of their ingenious technique for building cardinal utility functions out of ordinal ones. It is emphasized that what follows is merely an outline, so as to make cardinal utility non-mysterious to you as a student who is interested in knowing about the philosophical foundations of game theory, and about the range of problems to which it can be applied. Providing a manual you could follow in building your own cardinal utility functions would require many pages. Such manuals are available in many textbooks.
Suppose that we now assign the following ordinal utility function to the river-crossing fugitive:
Escape: 4
Death by shooting: 3
Death by rockfall: 2
Death by snakebite: 1
We are supposing that his preference for escape over any form of death is stronger than his preferences between causes of death. This should be reflected in his choice behaviour in the following way. In a situation such as the river-crossing game, he should be willing to run greater risks to increase the relative probability of escape over shooting than he is to increase the relative probability of shooting over snakebite. This bit of logic is the crucial insight behind von Neumann & Morgenstern’s (1947) solution to the cardinalization problem.
Suppose we asked the fugitive to pick, from the available set of outcomes, a best one and a worst one. ‘Best’ and ‘worst’ are defined in terms of expected payoffs as illustrated in our current zero-sum game example: a player maximizes his expected payoff if, when choosing among lotteries that contain only two possible prizes, he always chooses so as to maximize the probability of the best outcome—call this W—and to minimize the probability of the worst outcome—call this L. Now imagine expanding the set of possible prizes so that it includes prizes that the agent values as intermediate between W and L. We find, for a set of outcomes containing such prizes, a lottery over them such that our agent is indifferent between that lottery and a lottery including only W and L. In our example, this is a lottery that includes being shot and being crushed by rocks. Call this lottery T. We define a utility function q = u(T) from outcomes to the real (as opposed to ordinal) number line such that if q is the expected prize in T, the agent is indifferent between winning T and winning a lottery T* in which W occurs with probability u(T) and L occurs with probability 1 − u(T). Assuming that the agent’s behaviour respects the principle of reduction of compound lotteries (ROCL)—that is, he does not gain or lose utility from considering more complex lotteries rather than simple ones—the set of mappings of outcomes in T to u(T*) gives a von Neumann–Morgenstern utility function (vNMuf) with cardinal structure over all outcomes in T.
What exactly have we done here? We’ve given our agent choices over lotteries, instead of directly over resolved outcomes, and observed how much extra risk of death he’s willing to run to change the odds of getting one form of death relative to an alternative form of death. Note that this cardinalizes the agent’s preference structure only relative to agent-specific reference points W and L; the procedure reveals nothing about comparative extra-ordinal preferences between agents, which helps to make clear that constructing a vNMuf does not introduce a potentially objective psychological element. Furthermore, two agents in one game, or one agent under different sorts of circumstances, may display varying attitudes to risk. Perhaps in the river-crossing game the pursuer, whose life is not at stake, will enjoy gambling with her glory while our fugitive is cautious. In analyzing the river-crossing game, however, we don’t have to be able to compare the pursuer’s cardinal utilities with the fugitive’s. Both agents, after all, can find their NE strategies if they can estimate the probabilities each will assign to the actions of the other. This means that each must know both vNMufs; but neither need try to comparatively value the outcomes over which they’re choosing.
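The construction just outlined can be sketched numerically. The indifference probabilities below are hypothetical numbers of mine (the article elicits none); the only structural assumptions are u(W) = 1 for escape and u(L) = 0 for snakebite.

```python
# Sketch of the von Neumann-Morgenstern cardinalization described above.
# An intermediate outcome's utility is the probability q at which the agent
# is indifferent between that outcome for sure and the reference lottery
# "W with pr q, L with pr 1 - q": the lottery's expected utility is
# q*1 + (1 - q)*0 = q, so indifference pins the outcome's utility at q.
utilities = {
    "escape": 1.0,     # W, by construction
    "shot": 0.4,       # hypothetical elicited indifference probability
    "rockfall": 0.25,  # hypothetical elicited indifference probability
    "snakebite": 0.0,  # L, by construction
}

def expected_utility(lottery):
    """Expected utility of a lottery given as {outcome: probability}."""
    return sum(prob * utilities[outcome] for outcome, prob in lottery.items())

# Check the defining indifference: "shot for sure" vs the reference lottery.
reference = {"escape": 0.4, "snakebite": 0.6}
print(expected_utility(reference) == utilities["shot"])  # True
```

The resulting scale is cardinal only up to positive affine transformation: differences of utilities are meaningful within one agent, but nothing licenses comparing levels across agents, exactly as the paragraph above stresses.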
We can now fill in the rest of the matrix for the bridge-crossing game that we started to draw in Section 2. If both players are risk-neutral and their revealed preferences respect ROCL, then we have enough information to be able to assign expected utilities, expressed by multiplying the original payoffs by the relevant probabilities, as outcomes in the matrix. Suppose that the hunter waits at the cobra bridge with probability x and at the rocky bridge with probability y. Since her probabilities across the three bridges must sum to 1, this implies that she must wait at the safe bridge with probability 1 − (x + y). Then, continuing to assign the fugitive a payoff of 0 if he dies and 1 if he escapes, and the hunter the reverse payoffs, our complete matrix is as follows:
Figure 12
We can now read the following facts about the game directly from the matrix. No pair of pure strategies is a pair of mutual best replies. Therefore, the game’s only NE require at least one player to use a mixed strategy.
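The absence of a pure-strategy NE can be verified by brute force. This sketch assumes the payoff convention of the text: the fugitive's expected payoff is his survival rate at his chosen bridge if the pursuer waits elsewhere, and 0 if she waits at his bridge; the game is zero-sum, so the pursuer's payoff is the negation.

```python
# Check directly that the bridge-crossing game has no pure-strategy NE.
s = [1.0, 0.9, 0.8]  # survival rates: safe, rocks, cobras

def fugitive_payoff(i, j):
    """Fugitive crosses bridge i; pursuer waits at bridge j."""
    return s[i] if i != j else 0.0

pure_ne = []
for i in range(3):
    for j in range(3):
        # i is a best reply to j, and j is a best reply to i?
        best_i = all(fugitive_payoff(i, j) >= fugitive_payoff(k, j)
                     for k in range(3))
        best_j = all(-fugitive_payoff(i, j) >= -fugitive_payoff(i, k)
                     for k in range(3))
        if best_i and best_j:
            pure_ne.append((i, j))

print(pure_ne)  # [] -- no pair of pure strategies are mutual best replies
```

The pursuer's unique best reply to any bridge is to wait at that same bridge, while the fugitive's best reply is always to go elsewhere, so the search necessarily comes up empty.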
3.1 Beliefs and Subjective Probabilities
In all of our examples and workings to this point, we have presupposed that players’ beliefs about probabilities in lotteries match objective probabilities. But in real interactive choice situations, agents must often rely on their subjective estimations or perceptions of probabilities. In one of the greatest contributions to twentieth-century behavioral and social science, Savage (1954) showed how to incorporate subjective probabilities, and their relationships to preferences over risk, within the framework of von Neumann-Morgenstern expected utility theory. Indeed, Savage’s achievement amounts to the formal completion of EUT. Then, just over a decade later, Harsanyi (1967) showed how to solve games involving maximizers of Savage expected utility. This is often taken to have marked the true maturity of game theory as a tool for application to behavioral and social science, and was recognized as such when Harsanyi joined Nash and Selten as a recipient of the first Nobel prize awarded to game theorists in 1994.
As we observed in considering the need for people playing games to learn trembling hand equilibria and QRE, when we model the strategic interactions of people we must allow for the fact that people are typically uncertain about their models of one another. This uncertainty is reflected in their choices of strategies. Furthermore, some actions might be taken specifically for the sake of learning about the accuracy of a player’s conjectures about other players. Harsanyi’s extension of game theory incorporates these crucial elements.
Consider the three-player imperfect-information game below known as ‘Selten’s horse’ (for its inventor, Nobel Laureate Reinhard Selten, and because of the shape of its tree; taken from Kreps (1990), p. 426):
Figure 13
This game has four NE: (L, l2, l3), (L, r2, l3), (R, r2, l3) and (R, r2, r3). Consider the fourth of these NE. It arises because when Player I plays R and Player II plays r2, Player III’s entire information set is off the path of play, and it doesn’t matter to the outcome what Player III does. But Player I would not play R if Player III could tell the difference between being at node 13 and being at node 14. The structure of the game incentivizes efforts by Player I to supply Player III with information that would open up her closed information set. Player III should believe this information because the structure of the game shows that Player I has incentive to communicate it truthfully. The game’s solution would then be the SPE of the (now) perfect information game: (L, r2, l3).
Theorists who think of game theory as part of a normative theory of general rationality, for example most philosophers, and refinement program enthusiasts among economists, have pursued a strategy that would identify this solution on general principles. Notice what Player III in Selten’s Horse might wonder about as he selects his strategy. “Given that I get a move, was my action node reached from node 11 or from node 12?” What, in other words, are the conditional probabilities that Player III is at node 13 or 14 given that he has a move? Now, if conditional probabilities are what Player III wonders about, then what Players I and II might make conjectures about when they select their strategies are Player III’s beliefs about these conditional probabilities. In that case, Player I must conjecture about Player II’s beliefs about Player III’s beliefs, and Player III’s beliefs about Player II’s beliefs and so on. The relevant beliefs here are not merely strategic, as before, since they are not just about what players will do given a set of payoffs and game structures, but about what understanding of conditional probability they should expect other players to operate with.
What beliefs about conditional probability is it reasonable for players to expect from each other? If we follow Savage (1954) we would suggest as a normative principle that they should reason and expect others to reason in accordance with Bayes’s rule. This tells them how to compute the probability of an event F given information E (written ‘pr(F/E)’):
pr(F/E) = [pr(E/F) × pr(F)] / pr(E)
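As a numerical illustration of the rule: the type-updating story and all numbers below are invented for illustration, not taken from the article.

```python
# Bayes's rule as stated above: pr(F/E) = pr(E/F) * pr(F) / pr(E).
def bayes(pr_e_given_f, pr_f, pr_e):
    return pr_e_given_f * pr_f / pr_e

# Hypothetical scenario: a player thinks a rival is the "aggressive" type
# with prior 0.3; aggressive types raise with pr 0.9, other types with
# pr 0.2. Having observed a raise (event E), update the belief (event F):
pr_f = 0.3                    # prior pr(aggressive)
pr_e_given_f = 0.9            # pr(raise | aggressive)
pr_e = 0.9 * 0.3 + 0.2 * 0.7  # total pr(raise) = 0.41

posterior = bayes(pr_e_given_f, pr_f, pr_e)
print(round(posterior, 4))  # 0.6585
```

This is exactly the kind of computation a player at an information set must perform to assign probabilities over the nodes she might be at, which is what the sequential equilibrium concept below formalizes.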
If we assume that players’ beliefs are always consistent with this equality, then we may define a sequential equilibrium. A SE has two parts: (1) a strategy profile § for each player, as before, and (2) a system of beliefs μ for each player. μ assigns to each information set h a probability distribution over the nodes in h, with the interpretation that these are the beliefs of player i(h) about where in his information set he is, given that information set h has been reached. Then a sequential equilibrium is a profile of strategies § and a system of beliefs μ consistent with Bayes’s rule such that starting from every information set h in the tree player i(h) plays optimally from then on, given that what he believes to have transpired previously is given by μ(h) and what will transpire at subsequent moves is given by §.
Let us apply this solution concept to Selten’s Horse. Consider again the NE (R, r2, r3). Suppose that Player III assigns pr(1) to her belief that if she gets a move she is at node 13. Then Player I, given a consistent μ(I), must believe that Player III will play l3, in which case her only SE strategy is L. So although (R, r2, r3) is a NE, it is not a SE.
The use of the consistency requirement in this example is somewhat trivial, so consider now a second case (also taken from Kreps (1990), p. 429):
Figure 14
Suppose that Player I plays L, Player II plays l2 and Player III plays l3. Suppose also that μ(II) assigns pr(.3) to node 16. In that case, l2 is not a SE strategy for Player II, since l2 returns an expected payoff of .3(4) + .7(2) = 2.6, while r2 brings an expected payoff of 3.1. Notice that if we fiddle the strategy profile for Player III while leaving everything else fixed, l2 could become a SE strategy for Player II. If §(III) yielded a play of l3 with pr(.5) and r3 with pr(.5), then if Player II plays r2 his expected payoff would now be 2.2, so (L, l2, l3) would be a SE. Now imagine setting §(III) back as it was, but change μ(II) so that Player II thinks the conditional probability of being at node 16 is greater than .5; in that case, l2 is again not a SE strategy.
The idea of SE is hopefully now clear. We can apply it to the river-crossing game in a way that avoids the necessity for the pursuer to flip any coins if we modify the game a bit. Suppose now that the pursuer can change bridges twice during the fugitive’s passage, and will catch him just in case she meets him as he leaves the bridge. Then the pursuer’s SE strategy is to divide her time at the three bridges in accordance with the proportion given by the equation in the third paragraph of Section 3 above.
It must be noted that since Bayes’s rule cannot be applied to events with probability 0, its application to SE requires that players assign non-zero probabilities to all actions available in extensive form. This requirement is captured by supposing that all strategy profiles be strictly mixed, that is, that every action at every information set be taken with positive probability. You will see that this is just equivalent to supposing that all hands sometimes tremble, or alternatively that no expectations are quite certain. A SE is said to be trembling-hand perfect if all strategies played at equilibrium are best replies to strategies that are strictly mixed. You should also not be surprised to be told that no weakly dominated strategy can be trembling-hand perfect, since the possibility of trembling hands gives players the most persuasive reason for avoiding such strategies.
How can the non-psychological game theorist understand the concept of an NE that is an equilibrium in both actions and beliefs? Decades of experimental study have shown that when human subjects play games, especially games that ideally call for use of Bayes’s rule in making conjectures about other players’ beliefs, we should expect significant heterogeneity in strategic responses. Multiple kinds of informational channels typically link different agents with the incentive structures in their environments. Some agents may actually compute equilibria, with more or less error. Others may settle within error ranges that stochastically drift around equilibrium values through more or less myopic conditioned learning. Still others may select response patterns by copying the behavior of other agents, or by following rules of thumb that are embedded in cultural and institutional structures and represent historical collective learning. Note that the issue here is specific to game theory, rather than merely being a reiteration of a more general point, which would apply to any behavioral science, that people behave noisily from the perspective of ideal theory. In a given game, whether it would be rational for even a trained, self-aware, computationally well resourced agent to play NE would depend on the frequency with which he or she expected others to do likewise. If she expects some other players to stray from NE play, this may give her a reason to stray herself. Instead of predicting that human players will reveal strict NE strategies, the experienced experimenter or modeler anticipates that there will be a relationship between their play and the expected costs of departures from NE. Consequently, maximum likelihood estimation of observed actions typically identifies a QRE as providing a better fit than any NE.
An analyst handling empirical data in this way should not be interpreted as ‘testing the hypothesis’ that the agents under analysis are ‘rational’. Rather, she conjectures that they are agents, that is, that there is a systematic relationship between changes in statistical patterns in their behavior and some risk-weighted cardinal rankings of possible goal-states. If the agents are people or institutionally structured groups of people that monitor one another and are incentivized to attempt to act collectively, these conjectures will often be regarded as reasonable by critics, or even as pragmatically beyond question, even if always defeasible given the non-zero possibility of bizarre unknown circumstances of the kind philosophers sometimes consider (e.g., the apparent people are pre-programmed unintelligent mechanical simulacra that would be revealed as such if only the environment incentivized responses not written into their programs). The analyst might assume that all of the agents respond to incentive changes in accordance with Savage expected-utility theory, particularly if the agents are firms that have learned response contingencies under normatively demanding conditions of market competition with many players. If the analyst’s subjects are individual people, and especially if they are in a non-standard environment relative to their cultural and institutional experience, she would more wisely estimate a maximum likelihood mixture model that allows that a range of different utility structures govern different subsets of her choice data. All this is to say that use of game theory does not force a scientist to empirically apply a model that is likely to be too precise and narrow in its specifications to plausibly fit the messy complexities of real strategic interaction. A good applied game theorist should also be a well-schooled econometrician.
4. Repeated Games and Coordination
So far we’ve restricted our attention to one-shot games, that is, games in which players’ strategic concerns extend no further than the terminal nodes of their single interaction. However, games are often played with future games in mind, and this can significantly alter their outcomes and equilibrium strategies. Our topic in this section is repeated games, that is, games in which sets of players expect to face each other in similar situations on multiple occasions. We approach these first through the limited context of repeated prisoner’s dilemmas.
We’ve seen that in the one-shot PD the only NE is mutual defection. This may no longer hold, however, if the players expect to meet each other again in future PDs. Imagine that four firms, all making widgets, agree to maintain high prices by jointly restricting supply. (That is, they form a cartel.) This will only work if each firm maintains its agreed production quota. Typically, each firm can maximize its profit by departing from its quota while the others observe theirs, since it then sells more units at the higher market price brought about by the almost-intact cartel. In the one-shot case, all firms would share this incentive to defect and the cartel would immediately collapse. However, the firms expect to face each other in competition for a long period. In this case, each firm knows that if it breaks the cartel agreement, the others can punish it by underpricing it for a period long enough to more than eliminate its short-term gain. Of course, the punishing firms will take short-term losses too during their period of underpricing. But these losses may be worth taking if they serve to reestablish the cartel and bring about maximum long-term prices.
One simple, and famous (but not, contrary to widespread myth, necessarily optimal) strategy for preserving cooperation in repeated PDs is called tit-for-tat. This strategy tells each player to behave as follows:
- Always cooperate in the first round.
- Thereafter, take whatever action your opponent took in the previous round.
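The two rules above can be sketched directly. The payoff numbers are the standard illustrative PD values (mutual cooperation 3 each, mutual defection 1 each, lone defector 5 against the cooperator's 0), not figures from the article.

```python
# A minimal repeated-PD simulation with tit-for-tat players.
C, D = "C", "D"
PAYOFF = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}

def tit_for_tat(opponent_history):
    """Cooperate first; thereafter copy the opponent's previous move."""
    return C if not opponent_history else opponent_history[-1]

def play(strategy_a, strategy_b, rounds):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a, b = strategy_a(hist_b), strategy_b(hist_a)
        pa, pb = PAYOFF[(a, b)]
        hist_a.append(a); hist_b.append(b)
        score_a += pa; score_b += pb
    return score_a, score_b

# Two tit-for-tat players never see a defection:
print(play(tit_for_tat, tit_for_tat, 10))  # (30, 30)
```

Against a constant defector, tit-for-tat loses only the first round and then matches defection for defection, which is what makes it hard to exploit.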
A group of players all playing tit-for-tat will never see any defections. Since, in a population where others play tit-for-tat, tit-for-tat is the rational response for each player, everyone playing tit-for-tat is a NE. You may frequently hear people who know a little (but not enough) game theory talk as if this is the end of the story. It is not.
There are two complications. First, the players must be uncertain as to when their interaction ends. Suppose the players know when the last round comes. In that round, it will be utility-maximizing for players to defect, since no punishment will be possible. Now consider the second-last round. In this round, players also face no punishment for defection, since they expect to defect in the last round anyway. So they defect in the second-last round. But this means they face no threat of punishment in the third-last round, and defect there too. We can simply iterate this backwards through the game tree until we reach the first round. Since cooperation is not a NE strategy in that round, tit-for-tat is no longer a NE strategy in the repeated game, and we get the same outcome—mutual defection—as in the one-shot PD. Therefore, cooperation is only possible in repeated PDs where the expected number of repetitions is indeterminate. (Of course, this does apply to many real-life games.) Note that in this context any amount of uncertainty in expectations, or possibility of trembling hands, will be conducive to cooperation, at least for a while. When people in experiments play repeated PDs with known end-points, they indeed tend to cooperate for a while, but learn to defect earlier as they gain experience.
Now we introduce a second complication. Suppose that players’ ability to distinguish defection from cooperation is imperfect. Consider our case of the widget cartel. Suppose the players observe a fall in the market price of widgets. Perhaps this is because a cartel member cheated. Or perhaps it has resulted from an exogenous drop in demand. If tit-for-tat players mistake the second case for the first, they will defect, thereby setting off a chain-reaction of mutual defections from which they can never recover, since every player will reply to the first encountered defection with defection, thereby begetting further defections, and so on.
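This chain-reaction can be exhibited directly. In the sketch below (our own construction; all names are hypothetical), two tit-for-tat players cooperate until one of them misperceives a single cooperative move as a defection; mutual cooperation then never recurs, each defection echoing back and forth between the players.

```python
def simulate(rounds, error_round):
    """Two tit-for-tat players; player I misreads player II's move once."""
    a1 = a2 = 'C'                        # both open by cooperating
    outcomes = []
    for t in range(rounds):
        outcomes.append((a1, a2))
        seen_by_1 = 'D' if t == error_round else a2   # the one misperception
        seen_by_2 = a1
        a1, a2 = seen_by_1, seen_by_2    # each copies what they 'saw'
    return outcomes

history = simulate(12, 2)
# After the misperception in round 3, defections alternate forever:
print(('C', 'C') in history[3:])         # False
```

The punishment of each perceived defection is itself perceived as a fresh defection, which is exactly the recovery problem the text describes.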
If players know that such miscommunication is possible, they have incentive to resort to more sophisticated strategies. In particular, they may be prepared to sometimes risk following defections with cooperation in order to test their inferences. However, if they are too forgiving, then other players can exploit them through additional defections. In general, sophisticated strategies have a problem. Because they are more difficult for other players to infer, their use increases the probability of miscommunication. But miscommunication is what causes repeated-game cooperative equilibria to unravel in the first place. The complexities surrounding information signaling, screening and inference in repeated PDs help to intuitively explain the folk theorem, so called because no one is sure who first recognized it, that in repeated PDs, for any strategy S there exists a possible distribution of strategies among other players such that the vector of S and these other strategies is a NE. Thus there is nothing special, after all, about tit-for-tat.
Real, complex, social and political dramas are seldom straightforward instantiations of simple games such as PDs. Hardin (1995) offers an analysis of two tragically real political cases, the Yugoslavian civil war of 1991–95, and the 1994 Rwandan genocide, as PDs that were nested inside coordination games.
A coordination game occurs whenever the utility of two or more players is maximized by their doing the same thing as one another, and where such correspondence is more important to them than whatever it is, in particular, that they both do. A standard example arises with rules of the road: ‘All drive on the left’ and ‘All drive on the right’ are both outcomes that are NEs, and neither is more efficient than the other. In games of ‘pure’ coordination, it doesn’t even help to use more selective equilibrium criteria. For example, suppose that we require our players to reason in accordance with Bayes’s rule (see Section 3 above). In these circumstances, any strategy that is a best reply to any vector of mixed strategies available in NE is said to be rationalizable. That is, a player can find a set of systems of beliefs for the other players such that any history of the game along an equilibrium path is consistent with that set of systems. Pure coordination games are characterized by non-unique vectors of rationalizable strategies. The Nobel laureate Thomas Schelling (1978) conjectured, and empirically demonstrated, that in such situations, players may try to predict equilibria by searching for focal points, that is, features of some strategies that they believe will be salient to other players, and that they believe other players will believe to be salient to them. For example, if two people want to meet on a given day in a big city but can’t contact each other to arrange a specific time and place, both might sensibly go to the city’s most prominent downtown plaza at noon. In general, the better players know one another, or the more often they have been able to observe one another’s strategic behavior, the more likely they are to succeed in finding focal points on which to coordinate.
Coordination was, indeed, the first topic of game-theoretic application that came to the widespread attention of philosophers. In 1969, the philosopher David Lewis published Convention, in which the conceptual framework of game theory was applied to one of the fundamental issues of twentieth-century epistemology, the nature and extent of conventions governing semantics and their relationship to the justification of propositional beliefs. The basic insight can be captured using a simple example. The word ‘chicken’ denotes chickens and ‘ostrich’ denotes ostriches. We would not be better or worse off if ‘chicken’ denoted ostriches and ‘ostrich’ denoted chickens; however, we would be worse off if half of us used the pair of words the first way and half the second, or if all of us randomized between them to refer to flightless birds generally. This insight, of course, well preceded Lewis; but what he recognized is that this situation has the logical form of a coordination game. Thus, while particular conventions may be arbitrary, the interactive structures that stabilize and maintain them are not. Furthermore, the equilibria involved in coordinating on noun meanings appear to have an arbitrary element only because we cannot Pareto-rank them; but Millikan (1984) shows implicitly that in this respect they are atypical of linguistic coordinations. They are certainly atypical of coordinating conventions in general, a point on which Lewis was misled by over-valuing ‘semantic intuitions’ about ‘the meaning’ of ‘convention’ (Bacharach 2006, Ross 2008a).
Ross & LaCasse (1995) present the following example of a real-life coordination game in which the NE are not Pareto-indifferent, but the Pareto-inferior NE is more frequently observed. In a city, drivers must coordinate on one of two NE with respect to their behaviour at traffic lights. Either all must follow the strategy of rushing to try to race through lights that turn yellow (or amber) and pausing before proceeding when red lights shift to green, or all must follow the strategy of slowing down on yellows and jumping immediately off on shifts to green. Both patterns are NE, in that once a community has coordinated on one of them then no individual has an incentive to deviate: those who slow down on yellows while others are rushing them will get rear-ended, while those who rush yellows in the other equilibrium will risk collision with those who jump off straightaway on greens. Therefore, once a city’s traffic pattern settles on one of these equilibria it will tend to stay there. And, indeed, these are the two patterns that are observed in the world’s cities. However, the two equilibria are not Pareto-indifferent, since the second NE allows more cars to turn left on each cycle in a left-hand-drive jurisdiction, and right on each cycle in a right-hand jurisdiction, which reduces the main cause of bottlenecks in urban road networks and allows all drivers to expect greater efficiency in getting about. Unfortunately, for reasons about which we can only speculate pending further empirical work and analysis, far more cities are locked onto the Pareto-inferior NE than on the Pareto-superior one. Conditional game theory (see Section 5 below) provides promising resources for modeling cases such as this one, in which maintenance of coordination game equilibria likely must be supported by stable social norms, because players are anonymous and encounter regular opportunities to gain once-off advantages by defecting from supporting the prevailing equilibrium. This work is currently ongoing.
Conventions on standards of evidence and scientific rationality, the topics from philosophy of science that set up the context for Lewis’s analysis, are likely to be of the Pareto-rankable character. While various arrangements might be NE in the social game of science, as followers of Thomas Kuhn like to remind us, it is highly improbable that all of these lie on a single Pareto-indifference curve. These themes, strongly represented in contemporary epistemology, philosophy of science and philosophy of language, are all at least implicit applications of game theory. (The reader can find a broad sample of applications, and references to the large literature, in Nozick (1998).)
Most of the social and political coordination games played by people also have this feature. Unfortunately for us all, inefficiency traps represented by Pareto-inferior NE are extremely common in them. And sometimes dynamics of this kind give rise to the most terrible of all recurrent human collective behaviors. Hardin’s analysis of two recent genocidal episodes relies on the idea that the biologically shallow properties by which people sort themselves into racial and ethnic groups serve highly efficiently as focal points in coordination games, which in turn produce deadly PDs between them.
According to Hardin, neither the Yugoslavian nor the Rwandan disasters were PDs to begin with. That is, in neither situation, on either side, did most people begin by preferring the destruction of the other to mutual cooperation. However, the deadly logic of coordination, deliberately abetted by self-serving politicians, dynamically created PDs. Some individual Serbs (Hutus) were encouraged to perceive their individual interests as best served through identification with Serbian (Hutu) group-interests. That is, they found that some of their circumstances, such as those involving competition for jobs, had the form of coordination games. They thus acted so as to create situations in which this was true for other Serbs (Hutus) as well. Eventually, once enough Serbs (Hutus) identified self-interest with group-interest, the identification became almost universally correct, because (1) the most important goal for each Serb (Hutu) was to do roughly what every other Serb (Hutu) would, and (2) the most distinctively Serbian thing to do, the doing of which signalled coordination, was to exclude Croats (Tutsis). That is, strategies involving such exclusionary behavior were selected as a result of having efficient focal points. This situation made it the case that an individual—and individually threatened—Croat’s (Tutsi’s) self-interest was best maximized by coordinating on assertive Croat (Tutsi) group-identity, which further increased pressures on Serbs (Hutus) to coordinate, and so on. Note that it is not an aspect of this analysis to suggest that Serbs or Hutus started things; the process could have been (even if it wasn’t in fact) perfectly reciprocal. But the outcome is ghastly: Serbs and Croats (Hutus and Tutsis) seem progressively more threatening to each other as they rally together for self-defense, until both see it as imperative to preempt their rivals and strike before being struck.
If Hardin is right—and the point here is not to claim that he is, but rather to point out the worldly importance of determining which games agents are in fact playing—then the mere presence of an external enforcer (NATO?) would not have changed the game, pace the Hobbesian analysis, since the enforcer could not have threatened either side with anything worse than what each feared from the other. What was needed was recalibration of evaluations of interests, which (arguably) happened in Yugoslavia when the Croatian army began to decisively win, at which point Bosnian Serbs decided that their self/group interests were better served by the arrival of NATO peacekeepers. The Rwandan genocide likewise ended with a military solution, in this case a Tutsi victory. (But this became the seed for the most deadly international war on earth since 1945, the Congo War of 1998–2006.)
Of course, it is not the case that most repeated games lead to disasters. The biological basis of friendship in people and other animals is partly a function of the logic of repeated games. The importance of payoffs achievable through cooperation in future games leads those who expect to interact in them to be less selfish than temptation would otherwise encourage in present games. The fact that such equilibria become more stable through learning gives friends the logical character of built-up investments, which most people take great pleasure in sentimentalizing. Furthermore, cultivating shared interests and sentiments provides networks of focal points around which coordination can be increasingly facilitated.
5. Team Reasoning and Conditional Games
Following Lewis’s (1969) introduction of coordination games into the philosophical literature, the philosopher Margaret Gilbert (1989) argued, as against Lewis, that game theory is the wrong kind of analytical technology for thinking about human conventions because, among other problems, it is too ‘individualistic’, whereas conventions are essentially social phenomena. More directly, her claim was that conventions are not merely the products of decisions of many individual people, as might be suggested by a theorist who modeled a convention as an equilibrium of an n-person game in which each player was a single person. Similar concerns about allegedly individualistic foundations of game theory have been echoed by another philosopher, Martin Hollis (1998), and by the economists Robert Sugden (1993, 2000, 2003) and Michael Bacharach (2006). In particular, this concern motivated Bacharach to propose a theory of team reasoning, which was completed by Sugden, along with Nathalie Gold, after Bacharach’s death. This theory constitutes a key part of the background context for appreciating the value of a major recent extension to game theory, Wynn Stirling’s (2012) theory of conditional games.
Consider again the one-shot Prisoner’s Dilemma as discussed in Section 2.4, reproduced with an inverted matrix for ease of subsequent discussion, as follows:
|      | II: C | II: D |
|------|-------|-------|
| I: C | 2,2   | 0,3   |
| I: D | 3,0   | 1,1   |
(C denotes the strategy of cooperating with one’s opponent (i.e., refusing to confess) and D denotes the strategy of defecting on a deal with one’s opponent (i.e., confessing).) Many people find it incredible when a game theorist tells them that players designated with the honorific ‘rational’ must choose in this game in such a way as to produce the outcome (D,D). The explanation seems to require appeal to very strong forms of both descriptive and normative individualism. After all, if the players attached higher value to the social good (for their 2-person society of thieves) than to their individual welfare, they could then do better individually too; game-theoretic ‘rationality’, it is objected, yields behavior that is perverse even from the individually optimizing point of view. The players undermine their own welfare, one might argue, because they obstinately refuse to pay any attention to the social context of their choices. Sugden (1993) seems to have been the first to suggest that players who truly deserve to be called ‘rational’, including non-altruistic ones, would in the one-shot PD reason as a team, that is, would each arrive at their choices of strategies by asking ‘What is best for us?’ instead of ‘What is best for me?’.
Binmore (1994) forcefully argues that this line of criticism confuses game theory as mathematics with questions about which game theoretic models are most typically applicable to situations in which people find themselves. If players value the utility of a team they’re part of over and above their more narrowly individualistic interests, then this should be represented in the payoffs associated with a game theoretic model of their choices. In the situation modeled as a PD above, if the two players’ concern for ‘the team’ were strong enough to induce a switch in strategies from D to C, then the payoffs in the (cardinally interpreted) upper left cell would have to be raised to at least 3. (At 3, players would be indifferent between cooperating and defecting.) Then we get the following transformation of the game:
|      | II: C | II: D |
|------|-------|-------|
| I: C | 3,3   | 0,3   |
| I: D | 3,0   | 1,1   |
This is no longer a PD; it is an Assurance game, which has two NE at (C,C) and (D,D), with the former being Pareto superior to the latter. Thus if the players find this equilibrium, we should not say that they have played non-NE strategies in a PD. Rather, we should say that the PD was the wrong model of their situation.
What is at issue here is the best choice of a convention for applying mathematics to empirical description. Binmore is clearly right, and the majority of commentators have come to recognize that he is right, if we interpret the payoffs of games by reference to utility functions with unrestricted domains. This is the overwhelmingly standard practice in both economics and formal decision theory. For a number of years this issue was regarded as closed in the mainstream literature. However, Sugden (2018) argues in very recent work that there are reasons, quite independent of technical considerations about which conventions are most convenient for representing empirical interactions as games, for avoiding appeal to preferences over unrestricted domains in analyzing welfare (that is, in doing normative economics). On the basis of this argument, Sugden reverts to using game-theoretic models in which payoffs are restricted to objectively specifiable metrics, such as monetary returns. The substantive issues in welfare economics on which Sugden sheds new light are too interesting for a critic to reasonably refuse to engage with them out of mere stubbornness about adhering to convention in interpreting game representations. It is too soon to assess whether the advances in welfare analysis that Sugden seeks are sustainable under critical stress-testing. If they prove not to be, then his motivation for an alternative convention on payoff interpretation will dissolve. I think it more likely, however, that a period of intensive innovation in welfare economics lies just ahead of us, and that in the course of this economists and other analysts will grow comfortable with operating two different representational conventions depending on problem contexts.
If that is indeed our future, then we can anticipate a further stage in which, because problem contexts tend not to remain conveniently isolated from one another, new formalism is demanded to allow both conventions to be operated in a single application without confusion. But these speculations run well ahead of the current state of theory.
Let us then return to the thread of theory development that followed widespread accommodation of Binmore’s critique. Bacharach’s scientific executors, Sugden and Gold (in Bacharach (2006), pp. 171–173), unlike Hollis and Sugden (1993), use the standard convention for payoff interpretation, under which players can only be modeled as cooperating in a one-shot PD if at least one player makes an error. (For some error specifications, (C,C) could arise consistently with QRE as the solution concept.) Under this assumption, Bacharach, Sugden and Gold argue, human game players will often or usually avoid framing situations in such a way that a one-shot PD is the right model of their circumstances. A situation that ‘individualistic’ agents would frame as a PD might be framed by ‘team reasoning’ agents as the Assurance game transformation above. Note that the welfare of the team might make a difference to (cardinal) payoffs without making enough of a difference to trump the lure of unilateral defection. Suppose it bumped them up to 2.5 for each player; then the game would remain a PD. This point is important, since in experiments in which subjects play sequences of one-shot PDs (not repeated PDs, since opponents in the experiments change from round to round), majorities of subjects begin by cooperating but learn to defect as the experiments progress. On Bacharach’s account of this phenomenon, these subjects initially frame the game as team reasoners. However, a minority of subjects frame it as individualistic reasoners and defect, taking free riders’ profits. The team reasoners then re-frame the situation to defend themselves. This introduces a crucial aspect of Bacharach’s account. Individualistic reasoners and team reasoners are not claimed to be different types of people. People, Bacharach maintains, flip back and forth between individualistic agency and participation in team agency.
Now consider the following Pure Coordination game:
|      | II: L | II: R |
|------|-------|-------|
| I: U | 1,1   | 0,0   |
| I: D | 0,0   | 1,1   |
We can interpret this as representing a situation in which players are narrowly individualistic, and thus each indifferent between the two NE of (U,L) and (D,R), or are team reasoners but haven’t recognized that their team is better off if they stabilize around one of the NE rather than the other. If they do come to such recognition, perhaps by finding a focal point, then the Pure Coordination game is transformed into the following game, known as Hi-Lo:
|             | Player II: L | Player II: R |
|-------------|--------------|--------------|
| Player I: U | 2,2          | 0,0          |
| Player I: D | 0,0          | 1,1          |
Crucially, here the transformation requires more than mere team reasoning. The players also need focal points to know which of the two Pure Coordination equilibria offers the less risky prospect for social stabilization (Binmore 2008). In fact, Bacharach and his executors are interested in the relationship between Pure Coordination games and Hi-Lo games for a special reason. It does not seem to imply any criticism of NE as a solution concept that it doesn’t favor one strategy vector over another in a Pure Coordination game. However, NE also doesn’t favor the choice of (U,L) over (D,R) in the Hi-Lo game depicted, because (D,R) is also a NE. At this point Bacharach and his friends adopt the philosophical reasoning of the refinement program. Surely, they complain, ‘rationality’ recommends (U,L). Therefore, they conclude, axioms for team reasoning should be built into refined foundations of game theory.
We need not endorse the idea that game theoretic solution concepts should be refined to accommodate an intuitive general concept of rationality to motivate interest in Bacharach’s contribution. The non-psychological game theorist can propose a subtle shift of emphasis: instead of worrying about whether our models should respect a team-centred norm of rationality, we might simply point to empirical evidence that people, and perhaps other agents, seem to often make choices that reveal preferences that are conditional on the welfare of groups with which they are associated. To this extent their agency is partly or wholly—and perhaps stochastically—identified with these groups, and this will need to be reflected when we model their agency using utility functions. Then we could better describe the theory we want as a theory of team-centred choice rather than as a theory of team reasoning. Note that this philosophical interpretation is consistent with the idea that some of our evidence, perhaps even our best evidence, for the existence of team-centred choice is psychological. It is also consistent with the suggestion that the processes that flip people between individualized and team-centred agency are often not deliberative or consciously represented. The point is simply that we need not follow Bacharach in thinking of game theory as a model of reasoning or rationality in order to be persuaded that he has identified a gap we would like to have formal resources to fill.
So, do people’s choices seem to reveal team-centred preferences? Standard examples, including Bacharach’s own, are drawn from team sports. Members of such teams are under considerable social pressure to choose actions that maximize prospects for victory over actions that augment their personal statistics. The problem with these examples is that they embed difficult identification problems with respect to the estimation of utility functions; a narrowly self-interested player who wants to be popular with fans might behave identically to a team-centred player. Soldiers in battle conditions provide more persuasive examples. Though trying to convince soldiers to sacrifice their lives in the interests of their countries is often ineffective, most soldiers can be induced to take extraordinary risks in defense of their buddies, or when enemies directly menace their home towns and families. It is easy to think of other kinds of teams with which most people plausibly identify some or most of the time: project groups, small companies, political constituency committees, local labor unions, clans and households. Strongly individualistic social theory tries to construct such teams as equilibria in games amongst individual people, but no assumption built into game theory (or, for that matter, mainstream economic theory) forces this perspective (see Guala (2016) for a critical review of options). We can instead suppose that teams are often exogenously welded into being by complex interrelated psychological and institutional processes. This invites the game theorist to conceive of a mathematical mission that consists not in modeling team reasoning, but rather in modeling choice that is conditional on the existence of team dynamics.
This brings us to Stirling’s (2012) extension of game theory to cover such conditional interactions. Stirling’s aim is to formalize, and derive equilibrium conditions for, a notion of group preference that is, on the one hand, not a mere aggregation of individual preferences but also does not, on the other hand, simply assume the existence of a transcendent collective will that is imposed on individuals. The intuitive target Stirling has in mind is that of processes by which people derive their actual preferences partly on the basis of the comparative consequences for group welfare of different possible profiles of preferences that members could severally hypothetically reveal. A key constraint Stirling respects is that the theory’s solution concepts (i.e., its equilibria) must formally generalize the standard solution concepts (NE, SPE, QRE), not replace them. Conditional game theory is supposed to be ‘real’ game theory, not ‘pseudo’ game theory.
Let us develop the intuitive idea of preference conditionalization in more detail. People may often—perhaps typically—defer full resolution of their preferences until they get more information about the preferences of others who are their current or potential team-mates. Stirling himself provides a simple (arguably too simple) example from Keeney and Raiffa (1976), in which a farmer forms a clear preference among different climate conditions for a land purchase only after, and partly in light of, learning the preferences of his wife. This little thought experiment is plausible, but not ideal as an illustration because it is easily conflated with vague notions we might entertain about fusion of agency in the ideal of marriage—and it is important to distinguish the dynamics of preference conditionalization in teams of distinct agents from the simple collapse of individual agency. So let us construct a better example. Imagine a corporate Chairperson consulting her risk-averse Board about whether they should pursue a dangerous hostile takeover bid. Compare two possible procedures she might use: in process (i) she sends each Board member an individual e-mail about the idea a week prior to the meeting; in process (ii) she springs it on them collectively at the meeting. Most people will agree that the two processes might yield different outcomes, and that a main reason for this is that on process (i), but not (ii), some members might entrench personal opinions that they would not have had time to settle into if they had received information about one another’s willingness to challenge the Chair in public at the same time as they heard the proposal for the first time. In both imagined processes there are, at the point of voting, sets of individual preferences to be aggregated by the vote. But it is more likely that some preferences in the set generated by the second process were conditional on preferences of others.
A conditional preference, as Stirling defines it, is a preference that is influenced by information about the preferences of (specified) others.
A second notion formalized in Stirling’s theory is concordance. This refers to the extent of controversy or discord that a set of preferences, including a set of conditional preferences, would generate if equilibrium among them were implemented. Members or leaders of teams do not always want to maximize concordance by engineering all internal games as Assurance or Hi-Lo (though they will likely always want to eliminate PDs). For example, a manager might want to encourage a degree of competition among profit centers in a firm, while wanting the cost centers to identify completely with the team as a whole.
Stirling formally defines representation theorems for three kinds of ordered utility functions: conditional utility, concordant utility and conditional concordant utility. These may be applied recursively, i.e. to individuals, to teams and to teams of teams. Then the core of the formal development is the theory that aggregates individuals’ conditional concordant preferences to build models of team choice that are not exogenously imposed on team members, but instead derive from their several preferences. In stating Stirling’s aggregation procedure in the present context, it is useful to change his terminology, and therefore paraphrase him rather than quote directly. This is because Stirling refers to “groups” rather than to “teams”. Stirling’s initial work on CGT was entirely independent of Bacharach’s work, so was not configured within the context of team reasoning (or what we might reinterpret as team-centred choice). But Bacharach’s ideas provide a natural setting in which to frame Stirling’s technical achievement as an enrichment of the applicability of game theory in social science (see Hofmeyr and Ross (2019)). We can then paraphrase his five constraints on aggregation as follows:
(1) Conditioning: A team member’s preference ordering may be influenced by the preferences of other team members, i.e. may be conditional. (Influence may be set to zero, in which case the conditional preference ordering collapses to the categorical preference ordering of standard RPT.)
(2) Endogeny: A concordant ordering for a team must be determined by the social interactions of its sub-teams. (This condition ensures that team preferences are not simply imposed on individual preferences.)
(3) Acyclicity: Social influence relations are not reciprocal. (This will likely look at first glance to be a strange restriction: surely most social influence relationships, among people at any rate, are reciprocal. But, as noted earlier, we need to keep conditional preference distinct from agent fusion, and this condition helps to do that. More importantly, as a matter of mathematics it allows teams to be represented in directed graphs. The condition is not as restrictive, where modeling flexibility is concerned, as one might at first think, for two reasons. First, it only bars us from representing an agent j who is influenced by another agent i as directly influencing i in turn; we are free to represent j as influencing k, who in turn influences i. Second, and more importantly, in light of the exchangeability constraint below, aggregation is insensitive to the ordering of pairs of players between whom there is a social influence relationship.)
(4) Exchangeability: Concordant preference orderings are invariant under representational transformations that are equivalent with respect to information about conditional preferences.
(5) Monotonicity: If one sub-team prefers choice alternative A to B and all other sub-teams are indifferent between A and B, then the team does not prefer B to A.
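Constraint (3) is what allows a team’s influence structure to be drawn as a directed acyclic graph. The sketch below is our own illustration with a hypothetical three-member team (all names are invented): it checks acyclicity by depth-first search, and shows that a directly reciprocal influence pair is exactly what the constraint excludes.

```python
# Hypothetical influence structure: i influences j and k, and j
# influences k; edges point from influencer to influenced.
influence = {'i': ['j', 'k'], 'j': ['k'], 'k': []}

def is_acyclic(graph):
    """Depth-first check that no chain of influence loops back on itself."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}
    def visit(node):
        color[node] = GREY
        for succ in graph[node]:
            if color[succ] == GREY:        # found a cycle
                return False
            if color[succ] == WHITE and not visit(succ):
                return False
        color[node] = BLACK
        return True
    return all(visit(n) for n in graph if color[n] == WHITE)

print(is_acyclic(influence))                 # True
print(is_acyclic({'i': ['j'], 'j': ['i']}))  # False: reciprocal influence
# between i and j is what constraint (3) rules out
```

Indirect reciprocity remains representable, as the text notes: j may influence some k who in turn influences i without creating a cycle between i and j alone.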
Under these restrictions, Stirling proves an aggregation theorem which follows a general result for updating utility in light of new information that was developed by Abbas (2003, Other Internet Resources). Individual team members each calculate the team preference by aggregating conditional concordant preferences. Then the analyst applies marginalization. Let \(X^n\) be a team. Let \(X^m = \{X_{j_1},\ldots,X_{j_m}\}\) and \(X^k = \{X_{i_1},\ldots,X_{i_k}\}\) be disjoint sub-teams of \(X^n\). Then the marginal concordant utility of \(X^m\) with respect to the sub-team \(\{X^m, X^k\}\) is obtained by summing over \(\mathcal{A}^k\), yielding
\[U_{X^m}(\alpha_m) = \sum_{\alpha_k} U_{X^m X^k}(\alpha_m, \alpha_k)\]
and the marginal utility of the individual team member \(X_i\) is given by
\[U_{X_i}(\mathbf{a}_i) = \sum_{\sim \mathbf{a}_i} U_{X^n}(\mathbf{a}_1, \ldots, \mathbf{a}_n)\]
where the notation \(\sum_{\sim \mathbf{a}_i}\) means that the sum is taken over all arguments except \(\mathbf{a}_i\) (Stirling (2012), p. 62). This operation produces the non-conditional preferences of individual \(i\) ex post—that is, updated in light of her conditional concordant preferences and the information on which they are conditioned, namely, the conditional concordant preferences of the team. Once all ex post preferences of agents have been calculated, the resulting games in which they are involved can be solved by standard analysis.
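The summing-out operation in the displayed formulas is ordinary marginalization, and can be illustrated concretely. In the sketch below, the joint concordant utilities for a hypothetical three-member team are invented numbers, chosen purely for illustration; only the shape of the computation tracks the definition above.

```python
import itertools

# Invented joint concordant utilities for a hypothetical three-member
# team, each member choosing between actions 0 and 1.
actions = [0, 1]
values = [4, 1, 2, 3, 1, 2, 5, 3]
U = dict(zip(itertools.product(actions, repeat=3), values))

def marginal(U, member):
    """Sum the joint utility over all arguments except `member`'s own,
    mirroring the sum-over-everything-but-a_i in the formula above."""
    result = {a: 0 for a in actions}
    for profile, u in U.items():
        result[profile[member]] += u
    return result

# Ex post (marginal) utility of member 0 over her own two actions:
print(marginal(U, 0))   # {0: 10, 1: 11}
```

Once such marginal, ex post utilities are computed for every member, the resulting game can be handed to standard equilibrium analysis, as the text goes on to note.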
Stirling’s construction is, as he says, a true generalization of standard utility theory so as to make non-conditioned (“categorical”) utility a special case. It provides a basis for formalization of team utility, which can be compared with any of the following: the pre-conditioned categorical utility of an individual or sub-team; the conditional utility of an individual or sub-team; or the conditional concordant utility of an individual or sub-team. Once every individual’s preferences in a team choice problem have been marginalized, NE, SPE or QRE analyses can be proposed as solutions to the problem given full information about social influences. Situations of incomplete information can be solved using Bayes-Nash or sequential equilibrium.
In case the reader has struggled to follow the overall point of the technical constructions above, we can summarize the achievement of conditional game theory (CGT) in higher-level terms as follows. CGT models the propagation of influence flows by applying the formal syntax of probability theory (through the operation of marginalization) to game theory, and constructing graph theoretical representations. As social influence propagates through a group and players modulate their preferences on the basis of other players’ preferences, a group preference may emerge. Group preferences are not a direct basis for action, but encapsulate a social model incorporating the relationships and interdependencies among the agents. CGT shows us how to derive a coordination ordering for a group which combines the conditional and categorical preferences of its members, in much the same way as, in probability theory, the joint probability of an event is determined by conditional and marginal probabilities. So, just as the conventional application of the probability syntax is a means of expressing a cognizer’s epistemological uncertainty regarding belief, so extending this syntax to game theory allows us to represent an agent’s practical uncertainty regarding preference.
If this were the end of the story, then CGT would be little more than a pre-processing mechanism for identifying standard games. The real innovation lies in representing the influence of concordance considerations on equilibrium determination. The social model can be used to generate an operational definition of group preference, and to define truly coordinated choices. There is no assumption that groups necessarily optimize their preferences or that individual agents coordinate their choices. The point is merely that we can formally represent conditions under which agents in games can do what actual people often seem to: adapt and settle their individual preferences in light both of what others prefer, and of what promotes a group’s stability and efficiency. Team agency is thus incorporated into game theory instead of being left as an exogenous psychological construct that the analyst must investigate in advance of building a game-theoretic model of socially embedded agents.
In subsequent work, Stirling (2016) extends CGT to incorporate strategic choice under uncertainty. Stirling and Ross are currently engaged in a joint project to apply CGT to model the strategic stabilization and maintenance of social norms in the sense of Bicchieri (2006).

6. Commitment
In some games, a player can improve her outcome by taking an action that makes it impossible for her to take what would be her best action in the corresponding simultaneous-move game. Such actions are referred to as commitments, and they can serve as alternatives to external enforcement in games which would otherwise settle on Pareto-inefficient equilibria.
Consider the following hypothetical example (which is not a PD). Suppose you own a piece of land adjacent to mine, and I’d like to buy it so as to expand my lot. Unfortunately, you don’t want to sell at the price I’m willing to pay. If we move simultaneously—you post a selling price and I independently give my agent an asking price—there will be no sale. So I might try to change your incentives by playing an opening move in which I announce that I’ll build a putrid-smelling sewage disposal plant on my land beside yours unless you sell, thereby inducing you to lower your price. I’ve now turned this into a sequential-move game. However, this move so far changes nothing. If you refuse to sell in the face of my threat, it is then not in my interest to carry it out, because in damaging you I also damage myself. Since you know this you should ignore my threat. My threat is incredible, a case of cheap talk.
However, I could make my threat credible by committing myself. For example, I could sign a contract with some farmers promising to supply them with treated sewage (fertilizer) from my plant, but including an escape clause in the contract releasing me from my obligation only if I can double my lot size and so put it to some other use. Now my threat is credible: if you don’t sell, I’m committed to building the sewage plant. Since you know this, you now have an incentive to sell me your land in order to escape its ruination.
This sort of case exposes one of many fundamental differences between the logic of non-parametric and parametric maximization. In parametric situations, an agent can never be made worse off by having more options. (Even if a new option is worse than the options with which she began, she can just ignore it.) But where circumstances are non-parametric, one agent’s strategy can be influenced in another’s favour if options are visibly restricted. Cortez’s burning of his boats (see Section 1) is, of course, an instance of this, one which serves to make the usual metaphor literal.
Another example will illustrate this, as well as the applicability of principles across game-types. Here we will build an imaginary situation that is not a PD—since only one player has an incentive to defect—but which is a social dilemma insofar as its NE in the absence of commitment is Pareto-inferior to an outcome that is achievable with a commitment device. Suppose that two of us wish to poach a rare antelope from a national park in order to sell the trophy. One of us must flush the animal down towards the second person, who waits in a blind to shoot it and load it onto a truck. You promise, of course, to share the proceeds with me. However, your promise is not credible. Once you’ve got the buck, you have no reason not to drive it away and pocket the full value from it. After all, I can’t very well complain to the police without getting myself arrested too. But now suppose I add the following opening move to the game. Before our hunt, I rig out the truck with an alarm that can be turned off only by punching in a code. Only I know the code. If you try to drive off without me, the alarm will sound and we’ll both get caught. You, knowing this, now have an incentive to wait for me. What is crucial to notice here is that you prefer that I rig up the alarm, since this makes your promise to give me my share credible. If I don’t do this, leaving your promise incredible, we’ll be unable to agree to try the crime in the first place, and both of us will lose our shot at the profit from selling the trophy. Thus, you benefit from my preventing you from doing what’s optimal for you in a subgame.
We may now combine our analysis of PDs and commitment devices in discussion of the application that first made game theory famous outside of the academic community. The nuclear stand-off between the superpowers during the Cold War was intensively studied by the first generation of game theorists, many of whom received direct or indirect funding support from the US military. Poundstone 1992 provides the relatively ‘sanitized’ history of this involvement that has long been available to the casual historian who relies on secondary sources in addition to theorists’ public reminiscences. Recently, a more skeptically alert and professional historical study has been produced by Amadae (2016), which provides scholarly context for the still more hair-raising memoir of a pioneer of applied game theory, participant in the development of Cold War nuclear strategy, and famous leaker of the Pentagon’s secret files on the Vietnam War, Daniel Ellsberg (Ellsberg 2017). History consistent with these accounts but stimulating less pupil dilation in the reader is Erickson (2015).
In the conventional telling of the tale, the nuclear stand-off between the USA and the USSR attributes the following policy to both parties. Each threatened to answer a first strike by the other with a devastating counter-strike. This pair of reciprocal strategies, which by the late 1960s would effectively have meant blowing up the world, was known as ‘Mutually Assured Destruction’, or ‘MAD’. Game theorists at the time objected that MAD was mad, because it set up a PD as a result of the fact that the reciprocal threats were incredible. The reasoning behind this diagnosis went as follows. Suppose the USSR launches a first strike against the USA. At that point, the American President finds his country already destroyed. He doesn’t bring it back to life by now blowing up the world, so he has no incentive to carry out his original threat to retaliate, which has now manifestly failed to achieve its point. Since the Russians can anticipate this, they should ignore the threat to retaliate and strike first. Of course, the Americans are in an exactly symmetric position, so they too should strike first. Each power recognizes this incentive on the part of the other, and so anticipates an attack if they don’t rush to preempt it. What we should therefore expect, because it is the only NE of the game, is a race between the two powers to be the first to attack. The clear implication is the destruction of the world.
This game-theoretic analysis caused genuine consternation and fear on both sides during the Cold War, and is reputed to have produced some striking attempts at setting up strategic commitment devices. Some anecdotes, for example, allege that President Nixon had the CIA try to convince the Russians that he was insane or frequently drunk, so that they’d believe that he’d launch a retaliatory strike even when it was no longer in his interest to do so. Similarly, the Soviet KGB is sometimes claimed, during Brezhnev’s later years, to have fabricated medical reports exaggerating the extent of his senility with the same end in mind. Even if these stories aren’t true, their persistent circulation indicates understanding of the logic of strategic commitment. Ultimately, the strategic symmetry that concerned the Pentagon’s analysts was complicated and perhaps broken by changes in American missile deployment tactics. They equipped a worldwide fleet of submarines with enough missiles to launch a devastating counterattack by themselves. This made the reliability of the US military communications network less straightforward, and in so doing introduced an element of strategically relevant uncertainty. The President could no longer be sure of being able to reach the submarines and cancel their orders to attack if prospects of American survival had become hopeless. Of course, the value of this in breaking symmetry depended on the Russians being aware of the potential problem. In Stanley Kubrick’s classic film Dr. Strangelove, the world is destroyed by accident because the Russians build a doomsday machine that will automatically trigger a retaliatory strike regardless of their leadership’s resolve to follow through on the implicit MAD threat, but then keep it a secret.
As a result, when an unequivocally mad American colonel launches missiles at Russia of his own accord, and the American President tries to convince his Soviet counterpart that the attack was unintended, the Russian Premier sheepishly tells him about the secret doomsday machine. Now the two leaders can do nothing but watch in dismay as the world is blown up due to a game-theoretic mistake.
This example of the Cold War standoff, while famous and of considerable importance in the history of game theory and its popular reception, relied at the time on analyses that weren’t very subtle. The military game theorists were almost certainly mistaken to the extent that they modeled the Cold War as a one-shot PD in the first place. For one thing, the nuclear balancing game was enmeshed in larger global power games of great complexity. For another, it is far from clear that, for either superpower, annihilating the other while avoiding self-annihilation was in fact the highest-ranked outcome. If it wasn’t, in either or both cases, then the game wasn’t a PD. A cynic might suggest that the operations researchers on both sides were playing a cunning strategy in a game over funding, one that involved them cooperating with one another in order to convince their politicians to allocate more resources to weapons.
In more mundane circumstances, most people exploit a ubiquitous commitment device that Adam Smith long ago made the centerpiece of his theory of social order: the value to people of their own reputations. Even if I am secretly stingy, I may wish to cause others to think me generous by tipping in restaurants, including restaurants in which I never intend to eat again. The more I do this sort of thing, the more I invest in a valuable reputation which I could badly damage through a single act of obvious, and observed, meanness. Thus my hard-earned reputation for generosity functions as a commitment mechanism in specific games, itself enforcing continued re-investment. In time, my benevolence may become habitual, and consequently insensitive to circumstantial variations, to the point where an analyst has no remaining empirical justification for continuing to model me as having a preference for stinginess. There is a good deal of evidence that the hyper-sociality of humans is supported by evolved biological dispositions (found in most but not all people) to suffer emotionally from negative gossip and the fear of it. People are also naturally disposed to enjoy gossiping, which means that punishing others by spreading the news when their commitment devices fail is a form of social policing they don’t find costly and happily take up. A nice feature of this form of punishment is that it can, unlike (say) hitting people with sticks, be withdrawn without leaving long-term damage to the punishee. This is a happy property of a device that has as its point the maintenance of incentives to contribute to joint social projects; collaboration is generally more fruitful with team-mates whose bones aren’t broken.
Thus forgiveness conventions also play a strategic role in this elegant commitment mechanism that natural selection built for us. Finally, norms are culturally evolved mutual expectations in a group of people (or, perhaps, in a few other intelligent social animals) that have the further property that individuals who violate them may punish themselves by feeling guilt or shame. Thus they may often take cooperative actions against their narrow self-interest even when no one else is paying attention. Religious stories, or philosophical ones involving Kantian moral ‘rationality’, are especially likely to be told in explanation of norms because the underlying game-theoretic basis doesn’t occur to people; and the norms in question may function more effectively for that very reason.
Though the so-called ‘moral emotions’ are extremely useful for maintaining commitment, they are not necessary for it. Larger human institutions are, famously, highly morally obtuse; however, commitment is typically crucial to their functional logic. For example, a government tempted to negotiate with terrorists to secure the release of hostages on a particular occasion may commit to a ‘line in the sand’ strategy for the sake of maintaining a reputation for toughness intended to reduce terrorists’ incentives to launch future attacks. A different sort of example is provided by Qantas Airlines of Australia. Qantas has never suffered a fatal accident, and for a time (until it suffered some embarrassing non-fatal accidents to which it likely feared drawing attention) made much of this in its advertising. This means that its planes, at least during that period, probably were safer than average even if the initial advantage was merely a bit of statistical good fortune, because the value of its ability to claim a perfect record rose the longer it lasted, and so gave the airline continuous incentives to incur greater costs in safety assurance. It likely still has incentive to take extra care to prevent its record of fatalities from crossing the magic reputational line between 0 and 1.
Certain conditions must hold if reputation effects are to underwrite commitment. A person’s reputation can have a standing value across a range of games she plays, but in that case her concern for its value should be factored into payoffs in specifying each specific game into which she enters. Reputation can be built up through play of a game only in the case of a repeated game. Then the value of the reputation must be greater to its cultivator than the value to her of sacrificing it in any particular round of the repeated game. Thus players may establish commitment by reducing the value of each round so that the temptation to defect in any round never gets high enough to constitute a hard-to-resist temptation. For example, parties to a contract may exchange their obligations in small increments to reduce incentives on both sides to renege. Thus builders in construction projects may be paid in weekly or monthly installments. Similarly, the International Monetary Fund often dispenses loans to governments in small tranches, thereby reducing governments’ incentives to violate loan conditions once the money is in hand; and governments may actually prefer such arrangements in order to remove domestic political pressure for non-compliant use of the money. Of course, we are all familiar with cases in which the payoff from a defection in a current round becomes too great relative to the longer-run value of reputation to future cooperation, and we awake to find that the society treasurer has absconded overnight with the funds. Commitment through concern for reputation is the cement of society, but any such natural bonding agent will be far from perfectly effective.
7. Evolutionary Game Theory
Gintis (2009a, 2009b) feels justified in stating that “game theory is a universal language for the unification of the behavioral sciences.” There are good examples of such unifying work. Binmore (1998, 2005a) models social history as a series of convergences on increasingly efficient equilibria in commonly encountered transaction games, interrupted by episodes in which some people try to shift to new equilibria by moving off stable equilibrium paths, resulting in periodic catastrophes. (Stalin, for example, tried to shift his society to a set of equilibria in which people cared more about the future industrial, military and political power of their state than they cared about their own lives. He was not successful; however, his efforts certainly created a situation in which, for a few decades, many Soviet people attached far less importance to other people’s lives than usual.) A game-theoretic perspective indeed seems pervasively useful in understanding phenomena across the full range of social sciences. In Section 4, for example, we considered Lewis’s recognition that each human language amounts to a network of Nash equilibria in coordination games around conveyance of information.
Given his work’s vintage, Lewis restricted his attention to static game theory, in which agents are modeled as deliberately choosing strategies given exogenously fixed utility-functions. As a result of this restriction, his account invited some philosophers to pursue a misguided quest for a general analytic theory of the rationality of conventions (as noted by Bickhard 2008). Though Binmore has criticized this focus repeatedly through a career’s worth of contributions (see the references for a selection), Gintis (2009a) has recently isolated the underlying problem with particular clarity and tenacity. NE and SPE are brittle solution concepts when applied to naturally evolved computational mechanisms like animal (including human) brains. As we saw in Section 3 above, in coordination (and other) games with multiple NE, what it is economically rational for a player to do is highly sensitive to the learning states of other players. In general, when players find themselves in games where they do not have strictly dominant strategies, they only have uncomplicated incentives to play NE or SPE strategies to the extent that other players can be expected to find their NE or SPE strategies. Can a general theory of strategic rationality, of the sort that philosophers have sought, be reasonably expected to cover the resulting contingencies? Resort to Bayesian reasoning principles, as we reviewed in Section 3.1, is the standard way of trying to incorporate such uncertainty into theories of rational, strategic decision. However, as Binmore (2009) argues following the lead of Savage (1954), Bayesian principles are only plausible as principles of rationality itself in so-called ‘small worlds’, that is, environments in which distributions of risk are quantified in a set of known and enumerable parameters, as in the solution to our river crossing game from Section 3.
In large worlds, where utility functions, strategy sets and informational structure are difficult to estimate and subject to change by contingent exogenous influences, the idea that Bayes’s rule tells players how to ‘be rational’ is quite implausible. But then why should we expect players to choose NE or SPE or sequential-equilibrium strategies in wide ranges of social interactions?
As Binmore (2009) and Gintis (2009a) both stress, if game theory is to be used to model actual, natural behavior and its history, outside of the small-world settings in which microeconomists (but not macroeconomists or political scientists or sociologists or philosophers of science) mainly traffic, then we need some account of what is attractive about equilibria in games even when no analysis can identify them by taming all uncertainty in such a way that it can be represented as pure risk. To make reference again to Lewis’s topic, when human language developed there was no external referee to care about and arrange for Pareto-efficiency by providing focal points for coordination. Yet somehow people agreed, within linguistic communities, to use roughly the same words and constructions to say similar things. It seems unlikely that any explicit, deliberate strategizing on anyone’s part played a role in these processes. Nevertheless, game theory has turned out to furnish the essential concepts for understanding stabilization of languages. This is a striking point of support for Gintis’s optimism about the reach of game theory. To understand it, we must extend our attention to evolutionary games.
Game theory has been fruitfully applied in evolutionary biology, where species and/or genes are treated as players, since pioneering work by Maynard Smith (1982) and his collaborators. Evolutionary (or dynamic) game theory now constitutes a significant new mathematical extension applicable to many settings apart from the biological. Skyrms (1996) uses evolutionary game theory to try to answer questions Lewis could not even ask, about the conditions under which language, concepts of justice, the notion of private property, and other non-designed, general phenomena of interest to philosophers would be likely to arise. What is novel about evolutionary game theory is that moves are not chosen through deliberation by the individual agents. Instead, agents are typically hard-wired with particular strategies, and success for a strategy is defined in terms of the number of copies of itself that it will leave to play in the games of succeeding generations, given a population in which other strategies with which it acts are distributed at particular frequencies. In this kind of problem setting, the strategies themselves are the players, and individuals who play these strategies are their mere executors who receive the immediate-run costs and benefits associated with outcomes.
The discussion here will closely follow Skyrms’s. We begin by introducing the replicator dynamics. Consider first how natural selection works to change lineages of animals, modifying, creating and destroying species. The basic mechanism is differential reproduction. Any animal with heritable features that increase its expected number of offspring in a given environment will tend to leave more offspring than others so long as the environment remains relatively stable. These offspring will be more likely to inherit the features in question. Therefore, the proportion of these features in the population will gradually increase as generations pass. Some of these features may go to fixation, that is, eventually take over the entire population (until the environment changes).
How does game theory enter into this? Often, one of the most important aspects of an organism’s environment will be the behavioural tendencies of other organisms. We can think of each lineage as ‘trying’ to maximize its reproductive fitness (= future frequencies of its distinctive genetic structures) through finding strategies that are optimal given the strategies of other lineages. So evolutionary theory is another domain of application for non-parametric analysis.
In evolutionary game theory, we no longer think of individuals as choosing strategies as they move from one game to another. This is because our interests are different. We’re now concerned less with finding the equilibria of single games than with discovering which equilibria are stable, and how they will change over time. So we now model the strategies themselves as playing against each other. One strategy is ‘better’ than another if it is likely to leave more copies of itself in the next generation, when the game will be played again. We study the changes in distribution of strategies in the population as the sequence of games unfolds.
For evolutionary game theory, we introduce a new equilibrium concept, due to Maynard Smith (1982). A set of strategies, in some particular proportion (e.g., 1/3:2/3, 1/2:1/2, 1/9:8/9, 1/3:1/3:1/6:1/6—always summing to 1) is at an ESS (evolutionarily stable strategy) equilibrium just in case (1) no individual playing one strategy could improve its reproductive fitness by switching to one of the other strategies in the proportion, and (2) no mutant playing a different strategy altogether could establish itself (‘invade’) in the population.
The principles of evolutionary game theory are best explained through examples. Skyrms begins by investigating the conditions under which a sense of justice—understood for purposes of his specific analysis as a disposition to view equal divisions of resources as fair unless efficiency considerations suggest otherwise in special cases—might arise. He asks us to consider a population in which individuals regularly meet each other and must bargain over resources. Begin with three types of individuals:
- Fairmen always demand exactly half the resource.
- Greedies always demand more than half the resource. When a greedy encounters another greedy, they waste the resource in fighting over it.
- Modests always demand less than half the resource. When a modest encounters another modest, they take less than all of the available resource and waste some.
Each single encounter where the total demands sum to 100% is a NE of that individual game. Similarly, there can be many dynamic equilibria. Suppose that Greedies demand 2/3 of the resource and Modests demand 1/3. Then, given random pairing for interaction, the following two proportions are ESSs:
(i) Half the population is greedy and half is modest. We can calculate the average payoff here. Modest gets 1/3 of the resource in every encounter. Greedy gets 2/3 when she meets Modest, but nothing when she meets another Greedy. So her average payoff is also 1/3. This is an ESS because Fairman can’t invade. When Fairman meets Modest he gets 1/2. But when Fairman meets Greedy he gets nothing. So his average payoff is only 1/4. No Modest has an incentive to change strategies, and neither does any Greedy. A mutant Fairman arising in the population would do worst of all, and so selection will not encourage the propagation of any such mutants.
(ii) All players are Fairmen. Everyone always gets half the resource, and no one can do better by switching to another strategy. Greedies entering this population encounter Fairmen and get an average payoff of 0. Modests get 1/3 as before, but this is less than Fairman’s payoff of 1/2.
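The payoff arithmetic above can be checked mechanically. Here is a small sketch using exact fractions; the strategy labels and helper functions are our own, not Skyrms's notation:

```python
from fractions import Fraction as F

# Demands in the bargaining game: Fairman asks 1/2, Greedy asks 2/3,
# Modest asks 1/3 (labels are ours).
DEMANDS = {"fair": F(1, 2), "greedy": F(2, 3), "modest": F(1, 3)}

def payoff(mine, theirs):
    """Each player receives her demand iff the two demands fit the pie."""
    return DEMANDS[mine] if DEMANDS[mine] + DEMANDS[theirs] <= 1 else F(0)

def average_payoff(strategy, population):
    """Expected payoff against a partner drawn at random from
    `population`, a {strategy: frequency} dict with frequencies summing to 1."""
    return sum(freq * payoff(strategy, other)
               for other, freq in population.items())

# The half-greedy / half-modest polymorphism:
poly = {"greedy": F(1, 2), "modest": F(1, 2)}
print(average_payoff("greedy", poly))   # 1/3
print(average_payoff("modest", poly))   # 1/3
print(average_payoff("fair", poly))     # 1/4: a Fairman mutant does worse
```

The last line reproduces the invasion argument: a lone Fairman averages 1/4 against this population, below the incumbents' 1/3, so selection cannot favour her.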
Notice that equilibrium (i) is inefficient, since the average payoff across the whole population is smaller. However, just as inefficient outcomes can be NE of static games, so they can be ESSs of evolutionary ones.
We refer to equilibria in which more than one strategy occurs as polymorphisms. In general, in Skyrms’s game, any polymorphism in which Greedy demands x and Modest demands 1−x is an ESS. The question that interests the student of justice concerns the relative likelihood with which these different equilibria arise.
This depends on the proportions of strategies in the original population state. If the population begins with more than one Fairman, then there is some probability that Fairmen will encounter each other, and get the highest possible average payoff. Modests by themselves do not inhibit the spread of Fairmen; only Greedies do. But Greedies themselves depend on having Modests around in order to be viable. So the more Fairmen there are in the population relative to pairs of Greedies and Modests, the better Fairmen do on average. This implies a threshold effect. If the proportion of Fairmen drops below 33%, then the tendency will be for them to fall to extinction because they don’t meet each other often enough. If the population of Fairmen rises above 33%, then the tendency will be for them to rise to fixation because their extra gains when they meet each other compensate for their losses when they meet Greedies. You can see this by noticing that when Fairmen make up a third of the population, with the remainder equally divided between Greedies and Modests, Fairmen’s expected average payoff is exactly 1/3, the same as Modests’ constant payoff. Therefore, any rise above this threshold on the part of Fairmen will tend to push them towards fixation.
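The threshold can be verified by direct computation. In the sketch below (our own arithmetic, assuming the non-Fairman remainder is split evenly between Greedies and Modests), Fairman's expected payoff equals Modest's constant 1/3 exactly when Fairmen make up one third of the population:

```python
from fractions import Fraction as F

def fair_payoff(p):
    """Fairman's expected payoff when a fraction p of the population are
    Fairmen and the rest are split evenly between Greedies and Modests.
    Fairman gets 1/2 against a Fairman or a Modest, 0 against a Greedy."""
    return p * F(1, 2) + (1 - p) / 2 * F(1, 2)

# Modest always gets 1/3; Fairman matches that exactly at the 1/3 threshold.
print(fair_payoff(F(1, 3)))              # 1/3
print(fair_payoff(F(1, 4)) < F(1, 3))    # True: below threshold, Fairman loses
print(fair_payoff(F(1, 2)) > F(1, 3))    # True: above threshold, Fairman gains
```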
This result shows that and how, given certain relatively general conditions, justice as we have defined it can arise dynamically. The news for the fans of justice gets more cheerful still if we introduce correlated play.
The model we just considered assumes that strategies are not correlated, that is, that the probability with which every strategy meets every other strategy is a simple function of their relative frequencies in the population. We now examine what happens in our dynamic resource-division game when we introduce correlation. Suppose that Fairmen have a slight ability to distinguish and seek out other Fairmen as interaction partners. In that case, Fairmen on average do better, and this must have the effect of lowering their threshold for going to fixation.
An evolutionary game modeler studies the effects of correlation and other parametric constraints by means of running large computer simulations in which the strategies compete with one another, round after round, in the virtual environment. The starting proportions of strategies, and any chosen degree of correlation, can simply be set in the programme. One can then watch its dynamics unfold over time, and measure the proportion of time it stays in any one equilibrium. These proportions are represented by the relative sizes of the basins of attraction for different possible equilibria. Equilibria are attractor points in a dynamic space; a basin of attraction for each such point is then the set of points in the space from which the population will converge to the equilibrium in question.
In introducing correlation into his model, Skyrms first sets the degree of correlation at a very small .1. This causes the basin of attraction for equilibrium (i) to shrink by half. When the degree of correlation is set to .2, the polymorphic basin reduces to the point at which the population starts in the polymorphism. Thus very small increases in correlation produce large proportionate increases in the stability of the equilibrium where everyone plays Fairman. A small amount of correlation is a reasonable assumption in most populations, given that neighbours tend to interact with one another and to mimic one another (either genetically or because of tendencies to deliberately copy each other), and because genetically and culturally similar animals are more likely to live in common environments. Thus if justice can arise at all it will tend to be dominant and stable.
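A minimal version of such a simulation fits in a few lines. The sketch below is our own implementation, not Skyrms's code: payoffs follow the 2/3-versus-1/3 example above, the starting proportions are arbitrary, and correlation e is modeled as meeting one's own type with probability e and a random member otherwise. It reproduces the qualitative result: Fairmen starting at 10% die out without correlation but go to fixation when e = 0.2.

```python
# Replicator dynamics for the resource-division game with a correlation
# parameter e: with probability e an individual meets her own type,
# otherwise a randomly drawn member of the population.
DEMAND = {"fair": 0.5, "greedy": 2 / 3, "modest": 1 / 3}

def payoff(a, b):
    # Each gets her demand iff the two demands fit into the resource.
    return DEMAND[a] if DEMAND[a] + DEMAND[b] <= 1.000001 else 0.0

def step(pop, e):
    fit = {s: e * payoff(s, s)
              + (1 - e) * sum(pop[t] * payoff(s, t) for t in pop)
           for s in pop}
    mean = sum(pop[s] * fit[s] for s in pop)
    # Next-generation share of s is its current share times fit/mean.
    return {s: pop[s] * fit[s] / mean for s in pop}

def run(pop, e, generations=3000):
    for _ in range(generations):
        pop = step(pop, e)
    return pop

start = {"fair": 0.10, "greedy": 0.45, "modest": 0.45}
print(run(start, e=0.0)["fair"])   # Fairmen go extinct without correlation
print(run(start, e=0.2)["fair"])   # but take over with correlation 0.2
```

The measured sizes of basins of attraction that Skyrms reports come from sweeping many starting points through runs like these, not from any single trajectory.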
Much of political philosophy consists in attempts to produce deductive normative arguments intended to convince an unjust agent that she has reasons to act justly. Skyrms’s analysis suggests a quite different approach. Fairman will do best of all in the dynamic game if he takes active steps to preserve correlation. Therefore, there is evolutionary pressure for both moral approval of justice and just institutions to arise. Most people may think that 50–50 splits are ‘fair’, and worth maintaining by moral and institutional reward and sanction, because we are the products of a dynamic game that promoted our tendency to think this way.
The topic that has received most attention from evolutionary game theorists is altruism, defined as any behaviour by an organism that decreases its own expected fitness in a single interaction but increases that of the other interactor. It is arguably common in nature. How can it arise, however, given Darwinian competition?
Skyrms studies this question using the dynamic Prisoner’s Dilemma as his example. This is simply a series of PD games played in a population, some of whose members are defectors and some of whom are cooperators. Payoffs, as always in evolutionary games, are measured in terms of expected numbers of copies of each strategy in future generations.
Let U(A) be the average fitness of strategy A in the population. Let U be the average fitness of the whole population. Then the proportion of strategy A in the next generation is its current proportion multiplied by the ratio U(A)/U. So if A has greater fitness than the population average, A increases. If A has lower fitness than the population average, then A decreases.
In the dynamic PD where interaction is random (i.e., there’s no correlation), defectors do better than the population average as long as there are cooperators around. This follows from the fact that, as we saw in Section 2.4, defection is always the dominant strategy in a single game. 100% defection is therefore the ESS in the dynamic game without correlation, corresponding to the NE in the one-shot static PD.
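The replicator rule just stated can be applied to the PD directly. The payoff numbers below (3 for mutual cooperation, 1 for mutual defection, 5 for the defector against a cooperator, 0 for the exploited cooperator) are standard illustrative values, not from the text:

```python
def replicate(p_c, generations=200):
    """Discrete replicator dynamics for the PD with random (uncorrelated)
    pairing, starting from a cooperator share of p_c."""
    for _ in range(generations):
        f_c = p_c * 3 + (1 - p_c) * 0   # cooperator's expected fitness
        f_d = p_c * 5 + (1 - p_c) * 1   # defector's expected fitness
        avg = p_c * f_c + (1 - p_c) * f_d
        p_c = p_c * f_c / avg           # cooperators' share next generation
    return p_c
```

Starting from 90% cooperators, replicate(0.9) is driven essentially to zero: defectors out-earn the population average whenever any cooperators remain, so defection goes to fixation as described.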
However, introducing the possibility of correlation radically changes the picture. We now need to compute the average fitness of a strategy given its probability of meeting each other possible strategy. In the evolutionary PD, cooperators whose probability of meeting other cooperators is high do better than defectors whose probability of meeting other defectors is high. Correlation thus favours cooperation.
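The required computation can be written out directly. Assume, purely for illustration, the standard PD payoffs (3 for mutual cooperation, 1 for mutual defection, 5 for defecting on a cooperator, 0 for being defected on) and the simple rule that a player meets its own type with probability e and a random member of the population otherwise:

```python
def fitness(p_c, e):
    """Expected PD fitness of a cooperator and a defector when each
    meets its own type with probability e, a random member otherwise."""
    meet_c_if_c = e + (1 - e) * p_c    # P(partner cooperates | I cooperate)
    meet_c_if_d = (1 - e) * p_c        # P(partner cooperates | I defect)
    f_c = meet_c_if_c * 3 + (1 - meet_c_if_c) * 0
    f_d = meet_c_if_d * 5 + (1 - meet_c_if_d) * 1
    return f_c, f_d
```

At e = 0 defectors out-earn cooperators at every population mix; with these payoffs, once e exceeds 1/2, cooperators out-earn defectors at every mix, so sufficient correlation favours cooperation exactly as the text claims.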
In order to be able to say something more precise about this relationship between correlation and cooperation (and in order to be able to relate evolutionary game theory to issues in decision theory, a matter falling outside the scope of this article), Skyrms introduces a new technical concept. He calls a strategy adaptively ratifiable if there is a region around its fixation point in the dynamic space such that from anywhere within that region it will go to fixation. In the evolutionary PD, both defection and cooperation are adaptively ratifiable. The relative sizes of basins of attraction are highly sensitive to the particular mechanisms by which correlation is achieved. To illustrate this point, Skyrms builds several examples.
One of Skyrms’s models introduces correlation by means of a filter on pairing for interaction. Suppose that in round 1 of a dynamic PD individuals inspect each other and interact, or not, depending on what they find. In the second and subsequent rounds, all individuals who didn’t pair in round 1 are randomly paired. In this game, the basin of attraction for defection is large unless there is a high proportion of cooperators in round one. In this case, defectors fail to pair in round 1, then get paired mostly with each other in round 2 and drive each other to extinction. A model which is more interesting, because its mechanism is less artificial, does not allow individuals to choose their partners, but requires them to interact with those closest to them. Because of genetic relatedness (or cultural learning by copying) individuals are more likely to resemble their neighbours than not. If this (finite) population is arrayed along one dimension (i.e., along a line), and both cooperators and defectors are introduced into positions along it at random, then we get the following dynamics. Isolated cooperators have lower expected fitness than the surrounding defectors and are driven locally to extinction. Members of groups of two cooperators have a 50% probability of interacting with each other, and a 50% probability of each interacting with a defector. As a result, their average expected fitness remains smaller than that of their neighbouring defectors, and they too face probable extinction. Groups of three cooperators form an unstable point from which both extinction and expansion are equally likely. However, in groups of four or more cooperators, at least one encounter of a cooperator with a cooperator, sufficient to at least replace the original group, is guaranteed. Under this circumstance, the cooperators as a group do better than the surrounding defectors and increase at their expense. Eventually cooperators go almost to fixation—but not quite.
Single defectors on the periphery of the population prey on the cooperators at the ends and survive as little ‘criminal communities’. We thus see that altruism can not only be maintained by the dynamics of evolutionary games, but, with correlation, can even spread and colonize originally non-altruistic populations.
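The one-dimensional story can be sketched with a deterministic ‘imitate the best neighbour’ rule on a ring. The payoff values (temptation 5 > reward 4 > punishment 1 > sucker 0) and the update rule are assumptions for illustration, not Skyrms’s exact model; in this crude version cooperator clusters of four persist rather than expand, but isolated cooperators are eliminated just as described:

```python
# Assumed PD payoffs with temptation 5 > reward 4 > punishment 1 > sucker 0.
PAYOFF = {('C', 'C'): 4, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

def scores(pop):
    # Each agent plays one PD with each of its two ring neighbours.
    n = len(pop)
    return [PAYOFF[pop[i], pop[i - 1]] + PAYOFF[pop[i], pop[(i + 1) % n]]
            for i in range(n)]

def update(pop):
    # Imitate-the-best: adopt the strategy of the highest scorer
    # among yourself and your two neighbours.
    s = scores(pop)
    n = len(pop)
    return [pop[max((i - 1, i, (i + 1) % n), key=lambda j: s[j])]
            for i in range(n)]
```

An isolated cooperator scores 0 against two exploiting defectors and is gone after one update, while a cluster of four cooperators, whose interior members meet only each other, holds its ground indefinitely.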
Darwinian dynamics thus offers qualified good news for cooperation. Notice, however, that this holds only so long as individuals are stuck with their natural or cultural programming and can’t re-evaluate their utilities for themselves. If our agents get too smart and flexible, they may notice that they’re in PDs and would each be best off defecting. In that case, they’ll eventually drive themselves to extinction—unless they develop stable, and effective, moral norms that work to reinforce cooperation. But, of course, these are just what we would expect to evolve in populations of animals whose average fitness levels are closely linked to their capacities for successful social cooperation. Even given this, these populations will go extinct unless they care about future generations for some reason. But there’s no non-sentimental reason that doesn’t already presuppose altruistic morality as to why agents should care about future generations if each new generation wholly replaces the preceding one at each change of cohorts. For this reason, economists use ‘overlapping generations’ models when modeling intertemporal distribution games. Individuals in generation 1 who will last until generation 5 save resources for the generation 3 individuals with whom they’ll want to cooperate; and by generation 3 the new individuals care about generation 6; and so on.
Gintis (2009a) argues that when we set out to use evolutionary game theory to unify the behavioral sciences, we should begin by using it to unify game theory itself. We have pointed out at several earlier points in the present article that NE and SPE are problematic solution concepts in many applications where explicit institutional rules are missing, because agents only have incentives to play NE or SPE to the extent that they are confident that other agents will do likewise. To the extent that agents do not have such confidence — and this, by the way, is itself an insight due to game theory — what should be predicted is general disorder and social confusion. Gintis shows in detail how the key to this problem is the existence of what he calls a ‘choreographer’. By this he means some exogenous element that informs agents about which equilibrium strategies they should expect others to play. As discussed in Section 6, cultural norms are probably the most important choreographers for people. Interesting utility functions that incorporate norms of the relevant sort are extensively studied in Bicchieri (2006). In this context, Gintis demonstrates a further unifying element of great importance: if agents attach positive utility to following the choreographer’s suggestions (that is, to being strategically correlated with others for the sheer sake of it), then wherever competing potential payoffs do not overwhelm this incentive, agents can also be expected to consistently estimate Bayesian priors, and thus arrive at equilibria-in-beliefs, as discussed in Section 3.1, in games of imperfect information. Finally, as discussed in Section 5, Conditional Game Theory promises to provide the resources for modeling the endogenous emergence of the choreographer within the dynamics of games.
In light of this, when we wonder about the value of game-theoretic models in application to human behavior outside of well-structured markets, much hinges on what we take to be plausible and empirically validated sources of people’s incentives to be coordinated with one another. This has been a subject of extensive recent debate, which we will review in Section 8.3 below.
8. Game Theory and Behavioral Evidence
In earlier sections, we reviewed some problems that arise from treating classical (non-evolutionary) game theory as a normative theory that tells people what they ought to do if they wish to be rational in strategic situations. The difficulty, as we saw, is that there seems to be no one solution concept we can unequivocally recommend for all situations, particularly where agents have private information. However, in the previous section we showed how appeal to evolutionary foundations sheds light on conditions under which utility functions that have been explicitly worked out can plausibly be applied to groups of people, leading to game-theoretic models with plausible and stable solutions. So far, however, we have not reviewed any actual empirical evidence from behavioral observations or experiments. Has game theory indeed helped empirical researchers make new discoveries about behavior (human or otherwise)? If so, what in general has the content of these discoveries been?
In addressing these questions, an immediate epistemological issue confronts us. There is no way of applying game theory ‘all by itself’, independently of other modelling technologies. Using terminology standard in the philosophy of science, one can test a game-theoretic model of a phenomenon only in tandem with ‘auxiliary assumptions’ about the phenomenon in question. At least, this follows if one is strict about treating game theory purely as mathematics, with no empirical content of its own. In one sense, a theory with no empirical content is never open to testing at all; one can only worry about whether the axioms on which the theory is based are mutually consistent. A mathematical theory can nevertheless be evaluated with respect to empirical usefulness. One kind of philosophical criticism that has sometimes been made of game theory, interpreted as a mathematical tool for modelling behavioral phenomena, is that its application always or usually requires resort to false, misleading or badly simplistic assumptions about those phenomena. We would expect this criticism to have different degrees of force in different contexts of application, as the auxiliary assumptions vary.
So matters turn out. There is no interesting domain in which applications of game theory have been completely uncontroversial. However, there has been generally easier consensus on how to use game theory (both classical and evolutionary) to understand non-human animal behavior than on how to deploy it for explanation and prediction of the strategic activities of people. Let us first briefly consider philosophical and methodological issues that have arisen around application of game theory in non-human biology, before devoting fuller attention to game-theoretic social science.
The least controversial game-theoretic modelling has applied the classical form of the theory to consideration of strategies by which non-human animals seek to acquire the basic resource relevant to their evolutionary tournament: opportunities to produce offspring that are themselves likely to reproduce. In order to thereby maximize their expected fitness, animals must find optimal trade-offs among various intermediate goods, such as nutrition, security from predation and ability to out-compete rivals for mates. Efficient trade-off points among these goods can often be estimated for particular species in particular environmental circumstances, and, on the basis of these estimations, both parametric and non-parametric equilibria can be derived. Models of this sort have an impressive track record in predicting and explaining independent empirical data on such strategic phenomena as competitive foraging, mate selection, nepotism, sibling rivalry, herding, collective anti-predator vigilance and signaling, reciprocal grooming, and interspecific mutuality (symbiosis). (For examples see Krebs and Davies 1984, Bell 1991, Dugatkin and Reeve 1998, Dukas 1998, and Noe, van Hoof and Hammerstein 2001.) On the other hand, as Hammerstein (2003) observes, reciprocity, and its exploitation and metaexploitation, are much more rarely seen in social non-human animals than game-theoretic modeling would lead us to anticipate. One explanation for this, suggested by Hammerstein, is that non-human animals typically have less ability to restrict their interaction partners than do people. Our discussion in the previous section of the importance of correlation for stabilizing game solutions lends theoretical support to this suggestion.
Why has classical game theory helped to predict non-human animal behavior more straightforwardly than it has done most human behavior? The answer is presumed to lie in different levels of complication amongst the relationships between auxiliary assumptions and phenomena. Ross (2005a) offers the following account. Utility-maximization and fitness-maximization problems are the domain of economics. Economic theory identifies the maximizing units—economic agents—with unchanging preference fields. Identification of whole biological individuals with such agents is more plausible the less cognitively sophisticated the organism. Thus insects (for example) are tailor-made for easy application of Revealed Preference Theory (see Section 2.1). As nervous systems become more complex, however, we encounter animals that learn. Learning can cause a sufficient degree of permanent modification in an animal’s behavioral patterns that we can preserve the identification of the biological individual with a single agent across the modification only at the cost of explanatory emptiness (because assignments of utility functions become increasingly ad hoc). Furthermore, increasing complexity confounds simple modeling on a second dimension: cognitively sophisticated animals not only change their preferences over time, but are governed by distributed control processes that make them sites of competition among internal agents (Schelling 1980; Ainslie 1992, Ainslie 2001). Thus they are not straightforward economic agents even at a time. In setting out to model the behavior of people using any part of economic theory, including game theory, we must recognize that the relationship between any given person and an economic agent we construct for modeling purposes will always be more complicated than simple identity.
There is no sudden crossing point at which an animal becomes too cognitively sophisticated to be modeled as a single economic agent, and for all animals (including humans) there are contexts in which we can usefully ignore the synchronic dimension of complexity. However, we encounter a phase shift in modeling dynamics when we turn from asocial animals to non-eusocial social ones. (This refers to animals that are social but that don’t, like ants, bees, wasps, termites and naked mole rats, achieve cooperation thanks to fundamental changes in their population genetics that make individuals within groups into near clones. Some known instances are parrots, corvids, bats, rats, canines, hyenas, pigs, raccoons, otters, elephants, hyraxes, cetaceans, and primates.) In their cases stabilization of internal control dynamics is partly located outside the individuals, at the level of group dynamics. With these creatures, modeling an individual as an economic agent, with a single comprehensive utility function, is a drastic idealization, which can only be done with the greatest methodological caution and attention to specific contextual factors relevant to the particular modeling exercise. Applications of game theory here can only be empirically adequate to the extent that the economic modeling is empirically adequate.
H. sapiens is the extreme case in this respect. Individual humans are socially controlled to an extreme degree by comparison with most other non-eusocial species. At the same time, their great cognitive plasticity allows them to vary significantly between cultures. People are thus the least straightforward economic agents among all organisms. (It might thus be thought ironic that they were taken, originally and for many years, to be the exemplary instances of economic agency, on account of their allegedly superior ‘rationality’.) We will consider the implications of this for applications of game theory below.
First, however, comments are in order concerning the empirical adequacy of evolutionary game theory to explain and predict distributions of strategic dispositions in populations of agents. Such modeling is applied both to animals as products of natural selection (Hofbauer and Sigmund 1998), and to non-eusocial social animals (but especially humans) as products of cultural selection (Young 1998). There are two main kinds of auxiliary assumptions one must justify, relative to a particular instance at hand, in constructing such applications. First, one must have grounds for confidence that the dispositions one seeks to explain are (either biological or cultural, as the case may be) adaptations—that is, dispositions that were selected and are maintained because of the way in which they promote their own fitness or the fitness of the wider system, rather than being accidents or structurally inevitable byproducts of other adaptations. (See Dennett 1995 for a general discussion of this issue.) Second, one must be able to set the modeling enterprise in the context of a justified set of assumptions about interrelationships among nested evolutionary processes on different time scales. (For example, in the case of a species with cultural dynamics, how does slow genetic evolution constrain fast cultural evolution? How does cultural evolution feed back into genetic evolution, if it feeds back at all? For a masterful discussion of these issues, see Sterelny 2003.) Conflicting views over which such assumptions should be made about human evolution are the basis for lively current disputes in the evolutionary game-theoretic modeling of human behavioral dispositions and institutions. This is where issues in evolutionary game theory meet issues in the booming field of behavioral-experimental game theory.
We will therefore first describe the second field before giving a sense of the controversies just alluded to, which now constitute the liveliest domain of philosophical argument in the foundations of game theory and its applications.
8.1 Game Theory in the Laboratory
Economists have been testing theories by running laboratory experiments with human and other animal subjects since pioneering work by Thurstone (1931). In recent decades, the volume of such work has become positively gigantic. The vast majority of it sets subjects in microeconomic problem environments that are imperfectly competitive. Since this is precisely the condition in which microeconomics collapses into game theory, most experimental economics has been experimental game theory. It is thus difficult to distinguish between experimentally motivated questions about the empirical adequacy of microeconomic theory and questions about the empirical adequacy of game theory.
We can here give only a broad overview of an enormous and complicated literature. Readers are referred to critical surveys in Kagel and Roth (1995), Camerer (2003), Samuelson (2005), and Guala (2005). A useful high-level principle for sorting the literature indexes it to the different auxiliary assumptions with which game-theoretic axioms are applied. It is often said in popular presentations (e.g., Ormerod 1994) that the experimental data generally refute the hypothesis that people are rational economic agents. Such claims are too imprecise to be sustainable interpretations of the results. All data are consistent with the view that people are approximate economic agents, at least for stretches of time long enough to permit game-theoretic analysis of particular scenarios, in the minimal sense that their behavior can be modeled compatibly with Revealed Preference Theory (see Section 2.1). However, RPT makes so little in the way of empirical demands that this is not nearly as surprising as many non-economists suppose (Ross 2005a). What is really at issue in many of the debates around the general interpretation of experimental evidence is the extent to which people are maximizers of expected utility. As we saw in Section 3, expected utility theory (EUT) is generally applied in tandem with game theory in order to model situations involving uncertainty — which is to say, most situations of interest in behavioral science. However, a variety of alternative structural models of utility lend themselves to Von Neumann-Morgenstern cardinalization of preferences and are definable in terms of subsets of the Savage (1954) axioms of subjective utility. The empirical usefulness of game theory would be called into question only if we thought that people’s behavior is not generally describable by means of cardinal vNMufs.
What the experimental literature truly appears to show is a world of behavior that is usually noisy from the theorist’s point of view. The noise in question arises from substantial heterogeneity, both among people and among (person, situation) vectors. There is no single structural utility function such that all people act so as to maximize a function of that structure in all circumstances. Faced with well-learned problems in contexts that are not unduly demanding, or that are highly institutionally structured, people often behave like expected utility maximizers. For general reviews of theoretical issues and evidence, see Smith (2008) and Binmore (2007). For an extended sequence of examples of empirical studies, see the so-called ‘continuous double auction’ experiments discussed in Plott and Smith 1978 and Smith 1962, 1964, 1965, 1976, 1982. As a result, classical game theory can be used in such domains with high reliability to predict behavior and implement public policy, as is demonstrated by the dozens of extremely successful government auctions of utilities and other assets designed by game theorists to increase public revenue (Binmore and Klemperer 2002).
In other contexts, interpreting people’s behavior as generally expected-utility maximizing requires undue violence to the need for generality in theory construction. We get better prediction using fewer case-specific restrictions if we suppose that subjects are maximizing according to one or (typically) more of several alternatives (which will not be described here because they are not directly about game theory): cumulative prospect theory (Tversky and Kahneman 1992), or alpha-nu utility theory (Chew and MacCrimmon 1979), or rank-dependent utility theory (Quiggin 1982, Yaari 1987). (The last alternative in fact denotes a family of alternative specifications. One of these, the specification of Prelec (1998), has emerged in an accumulating mass of empirical estimations as the statistically most common human utility function.) Harrison and Rutstrom (2008) show how to design and code maximum likelihood mixture models, which allow an empirical modeler to apply a range of these decision functions to a single set of choice data. The resulting analysis identifies the proportion of the total choice set best explained by each model in the mixture. Andersen et al (2014) take this approach to the current state of the art, demonstrating the empirical value of including a model of non-maximizing psychological processes in a mixture along with maximizing economic models. This new effective flexibility with respect to the decision modeling that can be deployed in empirical applications of game theory relieves most pressure to seek adjustments in the game theoretic structures themselves. Thus it fits well with the interpretation of game theory as part of the behavioral scientist’s mathematical toolkit, rather than as a first-order empirical model of human psychology.
A more serious threat to the usefulness of game theory is evidence of systematic reversal of preferences, in both humans and other animals. This is more serious both because it extends beyond the human case, and because it challenges Revealed Preference Theory (RPT) rather than just unnecessarily rigid commitment to EUT. As explained in Section 2.1, RPT, unlike EUT, is among the axiomatic foundations of game theory interpreted non-psychologically. (Not all writers agree that apparent preference reversal phenomena threaten RPT rather than EUT; but see the discussions in Camerer (1995), pp. 660–665, and Ross (2005a), pp. 177–181.) A basis for preference reversals that seems to be common in animals with brains is hyperbolic discounting of the future (Strotz 1956, Ainslie 1992). This is the phenomenon whereby agents discount future rewards more steeply at close temporal distances from the current reference point than at more remote temporal distances. This is best understood by contrast with the idea found in most traditional economic models of exponential discounting, in which there is a linear relationship between the rate of change in the distance to a payoff and the rate at which the value of the payoff from the reference point declines. The figure below shows exponential and hyperbolic curves for the same interval from a reference point to a future payoff. The bottom one graphs the hyperbolic function; the bowed shape results from the change in the rate of discounting.
A result of this is that, as later prospects come closer to the point of possible consumption, people and other animals will sometimes spend resources undoing the consequences of previous actions that also cost them resources. For example: deciding today whether to mark a pile of undergraduate essays or watch a baseball game, I procrastinate, despite knowing that by doing so I put out of reach some even more fun possibility that might come up for tomorrow (when there’s an equally attractive ball game on if the better option doesn’t arise). So far, this can be accounted for in a way that preserves consistency of preferences: if the world might end tonight, with a tiny but nonzero probability, then there’s some level of risk aversion at which I’d rather leave the essays unmarked. The figure below compares two exponential discount curves, the lower one for the value of the game I watch before finishing my marking, and the higher one for the more valuable game I enjoy after completing the job. Both have higher value from the reference point the closer they are to it; but the curves do not cross, so my revealed preferences are consistent over time no matter how impatient I might be.
Figure 16
However, if I bind myself against procrastination by buying a ticket for tomorrow’s game, when in the absence of the awful task I wouldn’t have done so, then I’ve violated intertemporal preference consistency. More vividly, had I been in a position to choose last week whether to procrastinate today, I’d have chosen not to. In this case, my discount curve drawn from the reference point of last week crosses the curve drawn from the perspective of today, and my preferences reverse. The figure below shows this situation.
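The contrast between the two discounting schemes can be computed directly. The functional forms below, Mazur’s hyperbola V = A/(1 + kt) and exponential V = A·d^t, and the particular amounts and delays are standard illustrative assumptions, not values from the text:

```python
def hyperbolic(amount, delay, k=1.0):
    # hyperbolic discounting: V = A / (1 + k*t)
    return amount / (1 + k * delay)

def exponential(amount, delay, d=0.9):
    # exponential discounting: V = A * d**t
    return amount * d ** delay

# A smaller-sooner reward (50 units) vs. a larger-later one (100 units,
# arriving 5 periods after the smaller one).
# Viewed 10 periods in advance, a hyperbolic discounter prefers the
# larger-later reward...
assert hyperbolic(100, 15) > hyperbolic(50, 10)
# ...but at the moment of choice the preference reverses:
assert hyperbolic(50, 0) > hyperbolic(100, 5)
# An exponential discounter's ranking never flips, however impatient:
assert exponential(100, 15) > exponential(50, 10)
assert exponential(100, 5) > exponential(50, 0)
```

Under exponential discounting the ratio of the two prospects’ values is constant as time passes, which is why the curves never cross; under the hyperbola the ratio shifts in favour of the imminent reward, producing exactly the ticket-buying reversal described above.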
This phenomenon complicates applications of classical game theory to intelligent animals. However, it clearly doesn’t vitiate it altogether, since people (and other animals) often don’t reverse their preferences. (If this weren’t true, the successful auction models and other so-called ‘mechanism designs’ would be mysterious.) Interestingly, the leading theories that aim to explain why hyperbolic discounters might often behave in accordance with RPT themselves appeal to game-theoretic principles. Ainslie (1992, 2001) has produced an account of people as communities of internal bargaining interests, in which subunits based on short-term, medium-term and long-term interests face conflict that they must resolve, because if they don’t, and instead generate an internal Hobbesian breakdown (Section 1), outside agents who avoid the Hobbesian outcome can ruin them all. The device of the Hobbesian tyrant is unavailable to the brain. Therefore, its behavior (when system-level insanity is avoided) is a sequence of self-enforcing equilibria of the sort studied by the game-theoretic public choice literature on coalitional bargaining in democratic legislatures. That is, the internal politics of the brain consists in ‘logrolling’ (Stratmann 1997). These internal dynamics are then partly regulated and stabilized by the wider social games in which coalitions (people as wholes over temporal subparts of their biographies) are embedded (Ross 2005a, pp. 334–353). (For example: social expectations about someone’s role as a salesperson set behavioral equilibrium targets for the logrolling processes in their brain.) This potentially adds further relevant elements to the explanation of why and how stable institutions with relatively transparent rules are key conditions that help people more closely resemble straightforward economic agents, such that classical game theory finds reliable application to them as entire units.
One important note of caution is in order here. Much of the recent behavioral literature takes for granted that temporally inconsistent discounting is the standard or default case for people. However, Andersen et al (2008) show empirically that this arises from (i) assuming that groups of people are homogenous with respect to which functional forms best describe their discounting behavior, and (ii) failure to independently elicit and control for people’s differing levels of risk aversion in estimating their discount functions. In a range of populations that have been studied with these two considerations in mind, data suggest that temporally consistent discounting describes substantially higher proportions of choices than temporally inconsistent discounting does. Over-generalization of hyperbolic discounting models should thus be avoided.
8.2 Neuroeconomics and Game Theory
The idea that game theory can find novel application to the internal dynamics of brains, as suggested in the previous section, has been developed from independent motivations by the research program known as neuroeconomics (Montague and Berns 2002, Glimcher 2003, Ross 2005a, pp. 320–334, Camerer, Loewenstein and Prelec 2005). Thanks to new non-invasive scanning technologies, especially functional magnetic resonance imaging (fMRI), it has recently become possible to study synaptic activity in working brains while they respond to controlled cues. This has allowed a new path of access—though still a highly indirect one (Harrison and Ross 2010)—to the brain’s computation of expected values of rewards, which are (naturally) taken to play a crucial role in determining behavior. Economic theory is used to frame the derivation of the functions maximized by synaptic-level computation of these expected values; hence the name ‘neuroeconomics’.
Game theory plays a leading role in neuroeconomics at two levels. First, game theory has been used to predict the computations that individual neurons and groups of neurons serving the reward system must perform. In the best publicized example, Glimcher (2003) and colleagues have fMRI-scanned monkeys they had trained to play so-called ‘inspection games’ against computers. In an inspection game, one player faces a series of choices either to work for a reward, in which case he is sure to receive it, or to perform another, easier action (“shirking”), in which case he will receive the reward only if the other player (the “inspector”) is not monitoring him. Assume that the first player’s (the “worker’s”) behavior reveals a utility function bounded on each end as follows: he will work on every occasion if the inspector always monitors and he will shirk on every occasion if the inspector never monitors. The inspector prefers to obtain the highest possible amount of work for the lowest possible monitoring rate. In this game, the only NE for both players are in mixed strategies, since any pattern in one player’s strategy that can be detected by the other can be exploited. For any given pair of specific utility functions for the two players meeting the constraints described above, any pair of strategies in which, on each trial, either the worker is indifferent between working and shirking or the inspector is indifferent between monitoring and not monitoring, is a NE.
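To make the equilibrium concrete, here is a sketch with assumed numbers: the worker earns a wage of 1 and pays an effort cost of 0.5 when working, and forfeits the wage if caught shirking; the inspector values the work at 2, pays the wage unless shirking is caught, and pays a monitoring cost of 0.3. None of these figures come from Glimcher’s study; they simply satisfy the constraints in the text, and with them the only NE is mixed:

```python
# Rows: worker plays Work / Shirk; columns: inspector plays Monitor / Not.
# Payoffs derived from the assumed wage, effort cost, output value and
# monitoring cost stated above.
WORKER    = [[0.5,  0.5],
             [0.0,  1.0]]
INSPECTOR = [[0.7,  1.0],
             [-0.3, -1.0]]

def mixed_ne(A, B):
    """Mixed NE of a 2x2 game with no pure equilibrium: each player
    randomizes so as to make the *other* player indifferent."""
    # q = P(inspector monitors) makes the worker indifferent:
    q = (A[1][1] - A[0][1]) / (A[0][0] - A[0][1] - A[1][0] + A[1][1])
    # p = P(worker works) makes the inspector indifferent:
    p = (B[1][1] - B[1][0]) / (B[0][0] - B[0][1] - B[1][0] + B[1][1])
    return p, q
```

Here the worker works with probability 0.7 and the inspector monitors with probability 0.5. Since each player’s mixture is pinned down by the other player’s payoffs, manipulating the inspector’s utility function shifts the worker’s predicted equilibrium behavior, which is the kind of manipulation the experiments exploit.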
Applying inspection game analyses to pairs or groups of agents requires us either to have independently justified their utility functions over all variables relevant to their play, in which case we can define NE and then test to see whether they successfully maximize expected utility; or to assume that they maximize expected utility, or obey some other rule such as a matching function, and then infer their utility functions from their behavior. Either procedure can be sensible in different empirical contexts. But epistemological leverage increases greatly if the utility function of the inspector is exogenously determined, as it often is. (Police implementing random roadside inspections to catch drunk drivers, for example, typically have a maximum incidence of drunk driving assigned to them as a target by policy, and an exogenously set budget. These determine their utility function, given a distribution of preferences and attitudes to risk among the population of drivers.) In the case of Glimcher’s experiments the inspector is a computer, so its program is under experimental control and its side of the payoff matrix is known. Proxies for the subjects’ expected utility, in this case squirts of fruit juice for the monkeys, can be antecedently determined in parametric test settings. The computer is then programmed with the economic model of the monkeys, and can search the data in their behavior in game conditions for exploitable patterns, varying its strategy accordingly. With these variables fixed, expected-utility-maximizing NE behavior by the monkeys can be calculated and tested by manipulating the computer’s utility function in various runs of the game.
Monkey behavior after training tracks NE very robustly (as does the behavior of people playing similar games for monetary prizes; Glimcher 2003, pp. 307–308). Working with trained monkeys, Glimcher and colleagues could then perform the experiments of significance here. Working and shirking behaviors for the monkeys had been associated by their training with staring either to the right or to the left on a visual display. In earlier experiments, Platt and Glimcher (1999) had established that, in parametric settings, as juice rewards varied from one block of trials to another, firing rates of each parietal neuron that controls eye movements could be trained to encode the expected utility to the monkey of each possible movement relative to the expected utility of the alternative movement. Thus “movements that were worth 0.4 ml of juice were represented twice as strongly [in neural firing probabilities] as movements worth 0.2 ml of juice” (p. 314). Unsurprisingly, when amounts of juice rewarded for each movement were varied from one block of trials to another, firing rates also varied.
Against this background, Glimcher and colleagues could investigate the way in which monkeys’ brains implemented the tracking of NE. When the monkeys played the inspection game against the computer, the target associated with shirking could be set at the optimal location, given the prior training, for a specific neuron under study, while the work target would appear at a null location. This permitted Glimcher to test the answer to the following question: did the monkeys maintain NE in the game by keeping the firing rate of the neuron constant while the actual and optimal behavior of the monkey as a whole varied? The data robustly gave the answer ‘yes’. Glimcher reasonably interprets these data as suggesting that neural firing rates, at least in this cortical region for this task, encode expected utility in both parametric and nonparametric settings. Here we have an apparent vindication of the empirical applicability of classical game theory in a context independent of institutions or social conventions.
Further analysis pushed the hypothesis deeper. The computer playing Inspector was presented with the same sequence of outcomes as its monkey opponent had received on the previous day’s play, and for each move was asked to assess the relative expected values of the shirking and working actions available on the next move. Glimcher reports a positive correlation between small fluctuations around the stable NE firing rates in the individual neuron and the expected values estimated by the computer trying to track the same NE. Glimcher comments on this finding as follows:
The neurons seemed to be reflecting, on a play-by-play basis, a computation close to the one performed by our computer … [A]t a … [relatively] … microscopic scale, we were able to use game theory to begin to describe the decision-by-decision computations that the neurons in area LIP were performing. (Glimcher 2003, p. 317)
Thus we find game theory reaching beyond its traditional role as a technology for framing high-level constraints on evolutionary dynamics or on behavior by well-informed agents operating in institutional straitjackets. In Glimcher’s hands, it is used to directly model activity in a monkey’s brain. Ross (2005a) argues that groups of neurons thus modeled should not be identified with the sub-personal game-playing units found in Ainslie’s theory of intra-personal bargaining described earlier; that would involve a kind of straightforward reduction that experience in the behavioral and life sciences has taught us not to expect. This issue has since arisen in a direct dispute between neuroeconomists over rival interpretations of fMRI observations of intertemporal choice and discounting (McClure et al. 2004, Glimcher et al. 2007). The weight of evidence so far favors the view that if it is sometimes useful to analyze people’s choices as equilibria in games amongst sub-personal agents, the sub-personal agents in question should not be identified with separate brain areas. The opposite interpretation is unfortunately still most common in less specialized literature.
We have now seen the first level at which neuroeconomics applies game theory. A second level involves seeking conditioning variables in neural activity that might impact people’s choices of strategies when they play games. This has typically involved repeating protocols from the behavioral game theory literature with research subjects who are lying in fMRI scanners during play. Harrison (2008) and Ross (2008b) have argued for skepticism about the value of work of this kind, which involves various uncomfortably large leaps of inference in associating the observed behavior with specific imputed neural responses. It can also be questioned whether much generalizable new knowledge is gained to the extent that such associations can be successfully identified.
Let us provide an example of this kind of “game in a scanner” that directly involves strategic interaction. King-Casas et al. (2005) took a standard protocol from behavioral game theory, the so-called ‘trust’ game, and implemented it with subjects whose brains were jointly scanned using a technology for linking the functional maps of their respective brains, known as ‘hyperscanning’. This game involves two players. In its repeated format as used in the King-Casas et al. experiment, the first player is designated the ‘investor’ and the second the ‘trustee’. The investor begins with $20, of which she can keep any portion of her choice while investing the remainder with the trustee. In the trustee’s hands the invested amount is tripled by the experimenter. The trustee may then return as much or as little of this profit to the investor as he deems fit. The procedure is run for ten rounds, with players’ identities kept anonymous from one another.
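The round-by-round accounting of the trust game just described can be sketched in a few lines. The particular split choices shown are illustrative, not data from the experiment.

```python
# A minimal accounting sketch of one round of the trust game described above:
# $20 endowment, investment tripled by the experimenter, trustee returns
# any portion. The example splits are hypothetical.

def trust_round(endowment, invested, returned):
    """Payoffs for one round: the investor sends `invested`, the
    experimenter triples it, and the trustee sends back `returned`."""
    assert 0 <= invested <= endowment
    pot = 3 * invested
    assert 0 <= returned <= pot
    investor_payoff = (endowment - invested) + returned
    trustee_payoff = pot - returned
    return investor_payoff, trustee_payoff

# Full trust met with an even split of the tripled pot:
print(trust_round(20, 20, 30))   # (30, 30)
# No trust extended: the investor keeps the endowment, the trustee gets nothing.
print(trust_round(20, 0, 0))     # (20, 0)
```

The comparison makes vivid why the game is interesting: mutual trust Pareto-dominates the no-trust outcome, but nothing in the one-shot monetary incentives compels the trustee to return anything.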
This game has an infinite number of NE. Previous data from behavioral economics are consistent with the claim that the modal NE in human play approximates both players using ‘Tit-for-tat’ strategies (see Section 4), modified by occasional defections to probe for information, and some post-defection cooperation that manifests (limited) toleration of such probes. This is a very weak result, since it is compatible with a wide range of hypotheses on exactly which variations of Tit-for-tat are used and sustained, and thus licenses no inferences about potential dynamics under different learning conditions, institutions, or cross-cultural transfers.
When they ran this game under hyperscanning, the researchers interpreted their observations as follows. Neurons in the trustee’s caudate nucleus (generally thought to implement computations or outputs of midbrain dopaminergic systems) were thought to show strong response when investors benevolently reciprocated trust—that is, responded to defection with increased generosity. As the game progressed, these responses were believed to have shifted from being reactionary to being anticipatory. Thus reputational profiles as predicted by classical game-theoretic models were inferred to have been constructed directly by the brain. A further aspect of the findings not predictable by theoretical modeling alone, and which purely behavioral observation had not been sufficient to discriminate, was taken to be that responses by the caudate neurons to malevolent reciprocity—that is, reduced generosity in response to cooperation—were significantly smaller in amplitude. This was hypothesized to be a mechanism by which the brain implements modification of Tit-for-tat so as to prevent occasional defections for informational probing from unraveling cooperation permanently.
The advance in understanding for which practitioners of this style of neuroeconomics hope consists not in what it tells us about particular types of games, but rather in comparative inferences it facilitates about the ways in which contextual framing influences people’s conjectures about which games they’re playing. fMRI or other kinds of probes of working brains might, it is conjectured, enable us to quantitatively estimate degrees of strategic surprise. Reciprocally interacting expectations about surprise may themselves be subject to strategic manipulation, but this is an idea that has barely begun to be theoretically explored by game theorists (see Ross and Dumouchel 2004). The view of some neuroeconomists that we now have the prospect of empirically testing such new theories, as opposed to just hypothetically modeling them, has stimulated growth in this line of research.
8.3 Game Theoretic Models of Human Nature
The developments reviewed in the previous section bring us up to the moving frontier of experimental/behavioral applications of classical game theory. We can now return to the branch point left off several paragraphs back, where this stream of investigation meets that coming from evolutionary game theory. There is no serious doubt that, by comparison to other non-eusocial animals—including our nearest relatives, chimpanzees and bonobos—humans achieve prodigious feats of coordination (see Section 4) (Tomasello et al. 2004). A lively controversy, with important philosophical implications and fought on both sides with game-theoretic arguments, currently rages over the question of whether this capacity can be wholly explained by cultural adaptation, or is better explained by inference to a genetic change early in the career of H. sapiens.
Henrich et al. (2004, 2005) have run a series of experimental games with populations drawn from fifteen small-scale human societies in South America, Africa, and Asia, including three groups of foragers, six groups of slash-and-burn horticulturists, four groups of nomadic herders, and two groups of small-scale agriculturists. The games (Ultimatum, Dictator, Public Goods) they implemented all place subjects in situations broadly resembling that of the Trust game discussed in the previous section. That is, Ultimatum and Public Goods games are scenarios in which both social welfare and each individual’s welfare are optimized (Pareto efficiency achieved) if and only if at least some players use strategies that are not sub-game perfect equilibrium strategies (see Section 2.6). In Dictator games, a narrowly selfish first mover would capture all available profits. Thus in each of the three game types, SPE players who cared only about their own monetary welfare would get outcomes that would involve highly inegalitarian payoffs. In none of the societies studied by Henrich et al. (or any other society in which games of this sort have been run) are such outcomes observed. The players whose roles are such that they would take away all but epsilon of the monetary profits if they and their partners played SPE always offered the partners substantially more than epsilon, and even then partners sometimes refused such offers at the cost of receiving no money. Furthermore, unlike the traditional subjects of experimental economics—university students in industrialized countries—Henrich et al.’s subjects did not even play Nash equilibrium strategies with respect to monetary payoffs. (That is, strategically advantaged players offered larger profit splits to strategically disadvantaged ones than was necessary to induce agreement to their offers.) Henrich et al. interpret these results by suggesting that all actual people, unlike ‘rational economic man’, value egalitarian outcomes to some extent.
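The benchmark against which these behavioral results are measured, the SPE in which the advantaged player keeps all but epsilon, can be computed by backward induction in a discretized Ultimatum game. The stake and increment below are hypothetical, and the responder is assumed (as in the orthodox caricature being tested) to care only about money.

```python
# Subgame-perfect equilibrium of a discretized Ultimatum game by backward
# induction, under the assumption that both players maximize monetary payoff.
# Stake and increment are illustrative.

def ultimatum_spe(stake, step):
    """Return (offer, proposer_payoff) at the SPE.
    Backward induction: a money-maximizing responder accepts any positive
    offer (and is indifferent at zero), so the proposer's best guaranteed
    outcome is the smallest positive increment."""
    best = None
    for offer in range(0, stake + 1, step):
        accepted = offer > 0                      # responder's induced reply
        proposer_payoff = stake - offer if accepted else 0
        if best is None or proposer_payoff > best[1]:
            best = (offer, proposer_payoff)
    return best

print(ultimatum_spe(100, 1))   # (1, 99): offer epsilon, keep the rest
```

The contrast with the field data is then stark: subjects in every society studied offered far more than this epsilon, and responders frequently rejected low positive offers, which a money-maximizing responder never would.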
However, their experiments also show that this extent varies significantly with culture, and is correlated with variations in two specific cultural variables: typical payoffs to cooperation (the extent to which economic life in the society depends on cooperation with non-immediate kin) and aggregate market integration (a construct built out of independently measured degrees of social complexity, anonymity, privacy, and settlement size). As the values of these two variables increase, game behavior shifts (weakly) in the direction of Nash equilibrium play. Thus the researchers conclude that people are genetically endowed with preferences for egalitarianism, but that the relative weight of these preferences is programmable by social learning processes conditioned on local cultural cues.
In evaluating Henrich et al.’s interpretation of these data, we should first note that no axioms of RPT, or of the various models of decision mentioned in Section 8.1, which are applied jointly with game theoretic modeling to human choice data, specify or entail the property of narrow selfishness. (See Ross (2005a) ch. 4; Binmore (2005b) and (2009); and any economics or game theory text that lets the mathematics speak for itself.) Orthodox game theory thus does not predict that people will play SPE or NE strategies derived by treating monetary payoffs as equivalent to utility. Binmore (2005b) is therefore justified in criticizing Henrich et al. for rhetoric suggesting that their empirical work embarrasses orthodox theory.
This is not to suggest that the anthropological interpretation of the empirical results should be taken as uncontroversial. Binmore (1994, 1998, 2005a, 2005b) has argued for many years, based on a wide range of behavioral data, that when people play games with non-relatives they tend to learn to play Nash equilibrium with respect to utility functions that approximately correspond to income functions. As he points out in Binmore (2005b), Henrich et al.’s data do not test this hypothesis for their small-scale societies, because their subjects were not exposed to the test games for the (quite long, in the case of the Ultimatum game) learning period that theoretical and computational models suggest is required for people to converge on NE. When people play unfamiliar games, they tend to model them by reference to games they are used to in everyday experience. In particular, they tend to play one-shot laboratory games as though they were familiar repeated games, since one-shot games are rare in normal social life outside of special institutional contexts. Many of the interpretive remarks made by Henrich et al. are consistent with this hypothesis concerning their subjects, though they nevertheless explicitly reject the hypothesis itself. What is controversial here—the issues of spin around ‘orthodox’ theory aside—is less about what the particular subjects in this experiment were doing than about what their behavior should lead us to infer about human evolution.
Gintis (2004, 2009a) argues that data of the sort we have been discussing support the following conjecture about human evolution. Our ancestors approximated maximizers of individual fitness. Somewhere along the evolutionary line these ancestors arrived in circumstances where enough of them optimized their individual fitness by acting so as to optimize the welfare of their group (Sober and Wilson 1998) that a genetic modification went to fixation in the species: we developed preferences not just over our own individual welfare, but over the relative welfare of all members of our communities, indexed to social norms programmable in each individual by cultural learning. Thus the contemporary researcher applying game theory to model a social situation is advised to unearth her subjects’ utility functions by (i) finding out what community (or communities) they are members of, and then (ii) inferring the utility function(s) programmed into members of that community (or communities) by studying representatives of each relevant community in a range of games and assuming that the outcomes are coordinated equilibria. Since the utility functions are the dependent variables here, the games must be independently determined. We can typically hold at least the strategic forms of the relevant games fixed, Gintis supposes, by virtue of (a) our confidence that people prefer egalitarian outcomes, all else being equal, to inegalitarian ones within the culturally evolved ‘insider groups’ to which they perceive themselves as belonging, and (b) a requirement that game equilibria be drawn from stable attractors in plausible evolutionary game-theoretic models of the culture’s historical dynamics.
Requirement (b) as a constraint on game-theoretic modeling of general human strategic dispositions is no longer very controversial—or, at least, is no more controversial than the generic adaptationism in evolutionary anthropology of which it is one expression. However, some commentators are skeptical of Gintis’s suggestion that there was a genetic discontinuity in the evolution of human sociality. (For a cognitive-evolutionary anthropology that explicitly denies such discontinuity, see Sterelny 2003.) Based partly on such skepticism (but more directly on behavioral data), Binmore (2005a, 2005b) resists modeling people as having built-in preferences for egalitarianism. According to Binmore’s (1994, 1998, 2005a) model, the basic class of strategic problems facing non-eusocial social animals are coordination games. Human communities evolve cultural norms to select equilibria in these games, and many of these equilibria will be compatible with high levels of apparently altruistic behavior in some (but not all) games. Binmore argues that people adapt their conceptions of fairness to whatever happen to be their locally prevailing equilibrium selection rules. However, he maintains that the dynamic development of such norms must be compatible, in the long run, with bargaining equilibria among self-regarding individuals. Indeed, he argues that as societies evolve institutions that encourage what Henrich et al. call aggregate market integration (discussed above), their utility functions and social norms tend to converge on self-regarding economic rationality with respect to welfare. This does not mean that Binmore is pessimistic about the prospects for egalitarianism: he develops a model showing that societies of broadly self-interested bargainers can be pulled naturally along dynamically stable equilibrium paths towards norms of distribution corresponding to Rawlsian justice (Rawls 1971).
The principal barriers to such evolution, according to Binmore, are precisely the kinds of other-regarding preferences that conservatives valorize as a way of discouraging examination of more egalitarian bargaining equilibria that are within reach along societies’ equilibrium paths.
Resolution of this debate between Gintis and Binmore fortunately need not wait upon discoveries about the deep human evolutionary past that we may never have. The models make rival empirical predictions of some testable phenomena. If Gintis is right then there are limits, imposed by the discontinuity in hominin evolution, on the extent to which people can learn to be self-regarding. This is the main significance of the controversy discussed above over Henrich et al.’s interpretation of their field data. Binmore’s model of social equilibrium selection also depends, unlike Gintis’s, on widespread dispositions among people to inflict second-order punishment on members of society who fail to sanction violators of social norms. Gintis (2005) shows using a game theory model that this is implausible if punishment costs are significant. However, Ross (2008a) argues that the widespread assumption in the literature that punishment of norm-violation must be costly results from failure to adequately distinguish between models of the original evolution of sociality, on the one hand, and models of the maintenance and development of norms and institutions once an initial set of them has stabilized, on the other. Finally, Ross also points out that Binmore’s objectives are as much normative as descriptive: he aims to show egalitarians how to diagnose the errors in conservative rationalisations of the status quo without calling for revolutions that put equilibrium path stability (and, therefore, social welfare) at risk. It is a sound principle in constructing reform proposals that they should be ‘knave-proof’ (as Hume put it), that is, compatible with less altruism than might prevail in people. Thus, despite the fact that the majority of researchers working on game-theoretic foundations of social organization presently appear to side with Gintis and the other members of the Henrich et al. team, Binmore’s alternative model has some strong considerations in its favor.
Here, then, is another issue along the frontier of game theory application awaiting resolution in the years to come.
9. Looking Ahead: Areas of Current Innovation
In 2016 the Journal of Economic Perspectives published a symposium on “What is Happening in Game Theory?” Each of the participants noted independently that game theory has become so tightly entangled with microeconomic theory in general that the question becomes difficult to distinguish from inquiry into the moving frontier of that entire sub-discipline, which is in turn the largest part of economics as a whole. Thus the boundary between the philosophy of game theory and the philosophy of microeconomics is now similarly indistinct. Of course, as has been stressed, applications of game theory extend beyond the traditional domain of economics, into all of the behavioral and social sciences. But as the methods of game theory have fused with the methods of microeconomics, a commentator might equally view these extensions as being exported applications of microeconomics.
Following decades of development (incompletely) surveyed in the present article, the past few years have been relatively quiet ones where foundational innovations of the kind that invite contributions from philosophers are concerned. Some parts of the original foundations are being newly revisited, however.
von Neumann and Morgenstern’s (1944) introduction of game theory divided the inquiry into two parts. Noncooperative game theory analyzes cases built on the assumption that each player maximizes her own utility function while treating the expected strategic responses of other players as constraints. As discussed above, the specific game to which von Neumann and Morgenstern applied their modeling was poker, which is a zero-sum game. Most of the present article has focused on the many theoretical challenges and insights that arose from extending noncooperative game theory beyond the zero-sum domain. But this in fact develops only half of von Neumann and Morgenstern’s classic. The other half developed cooperative game theory, about which nothing has so far been said here. The reason for this silence is that for most game theorists cooperative game theory is a distraction at best, and at worst a technology that confuses the point of game theory by bypassing the aspect of games that mainly makes them potentially interesting and insightful in application, namely, the requirement that equilibria be selected endogenously under the restrictions imposed by Nash (1950a). This, after all, is what makes equilibria self-enforcing, just in the way that prices in competitive markets are, and thus renders them stable unless shocked from outside. Nash (1953) argued that solutions to cooperative games should always be verified by showing that they are also solutions to formally equivalent noncooperative games. Nash’s accomplishment in the paper was the analytical identification of the relevant equivalence. One way of interpreting this was as demonstrating the ultimate redundancy of cooperative game theory.
Cooperative game theory begins from the assumption that players have already, by some unspecified process, agreed on a vector of strategies, and thus on an outcome. Then the analyst deploys the theory to determine the minimal set of conditions under which the agreement remains stable. The idea is typically illustrated by the example of a parliamentary coalition. Suppose that there is one dominant party that must be a member of any coalition if it is to command a majority of parliamentary votes on legislation and confidence. There might then be a range of alternative possible groupings of other parties that could sustain it. Imagine, to make the example more structured and interesting, that some parties will not serve in a coalition that includes certain specific others; so the problem faced by the coalition organizers is not simply a matter of summing potential votes. The cooperative game theorist identifies the set of possible coalitions. There may be some other parties, in addition to the dominant party, that turn out to be needed in every possible coalition. Identifying these parties would, in this example, reveal the core of the game, the elements shared by all equilibria. The core is the key solution concept of cooperative game theory, for which Shapley shared the Nobel prize. (Shapley (1953) is the great paper.) Nash (1953) defined the “Nash program” as consisting of verifying a particular cooperative equilibrium by showing that noncooperative players could arrive at it through the sequential bargaining process specified in Nash (1950b), and that all outcomes of such bargaining would include the core.
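The coalition example can be worked through mechanically. The sketch below, with invented seat counts, quota, and one incompatibility, enumerates the feasible majority coalitions and intersects them to find the parties needed in every one (the veto players whose presence any core allocation must respect).

```python
# A minimal sketch of the parliamentary-coalition example from the text.
# Seat counts, the majority quota, and the incompatibility are hypothetical.

from itertools import combinations

def veto_players(seats, quota, incompatible):
    """Return the parties that appear in every feasible winning coalition."""
    def feasible(coalition):
        members = set(coalition)
        if sum(seats[p] for p in members) < quota:
            return False          # not a majority
        # Respect refusals: no coalition may contain an incompatible pair.
        return not any(set(pair) <= members for pair in incompatible)

    parties = list(seats)
    winning = [set(c) for r in range(1, len(parties) + 1)
               for c in combinations(parties, r) if feasible(c)]
    needed = set(parties)
    for c in winning:
        needed &= c               # intersect across all winning coalitions
    return needed

seats = {"A": 40, "B": 25, "C": 20, "D": 10, "E": 5}
# C refuses to serve alongside D; majority requires 51 of 100 seats.
print(sorted(veto_players(seats, quota=51, incompatible=[("C", "D")])))
# ['A']: without A, the best feasible grouping (B, C, E) falls short of 51.
```

Note how the incompatibility constraint does real work: without it, B, C, and D together would command a majority and A would not be a veto player.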
In light of the example, it is no surprise that political scientists were the primary users of cooperative theory during the years while noncooperative game theory was still being fully developed. It has also been applied usefully by labor economists studying settlement negotiations between firms and unions, and by analysts of international trade negotiations. We might illustrate the value of such application by reference to the second example. Suppose that, given the weight of domestic lobbies in South Africa, the South African government will never agree to any trade agreement that does not allow it to protect its automotive assembly sector. (This has in fact been the case so far.) Then allowance for such protection is part of the core of any trade treaty another country or bloc might conclude with South Africa. Knowing this can help the parties during negotiations avoid rhetoric or commitments to other lobbies, in any of the negotiating countries, that would put the core out of reach and thus guarantee negotiation failure. This example also helps us illustrate the limitations of cooperative game theory. South Africa will have to trade off the interests of some other lobbies to protect its automotive industry. Which others will get traded off will be a function of the extensive-form play of non-cooperative sequential proposals and counter-proposals, and the South African bargainers, if they have done their due diligence, must be attentive to which paths through the tree throw which specific domestic interests under the proverbial bus. Thus carrying out the cooperative analysis does not relieve them of the need to also conduct the noncooperative analysis. Their game theory consultants might as well simply code the non-cooperative parameters into their Gambit software, which will output the core if asked.
But cooperative game theory did not die, or become confined to political science applications. There has turned out to be a range of policy problems, involving many players whose attributes vary but whose ordinal utility functions are symmetrical, for which noncooperative modeling, while possible in principle, is absurdly cumbersome and computationally demanding, but for which cooperative modeling is beautifully suited. That we are dealing with ordinal utility functions is important, because in the relevant markets there are often no prices. The classic example (Gale and Shapley 1962) is a marriage market. Abstracting from the scale of individual romantic dramas and comedies, society features, as it were, a vast set of people who want to form into pairs, but care very much who they end up paired with. Suppose we have a finite set of such people. Imagine that the match-maker, or app, first splits the set into two proper subsets, and announces a rule that everyone in subset A will propose to someone in subset B. Each of those in B who receive a proposal knows that she is the first choice of someone in A. She selects her first choice from the proposals she has received and throws the rest back into the pool. Those in A whose initial proposals were not accepted now each propose to someone they did not propose to before, possibly including people who are holding proposals from a previous round (Nkosi knows that Barbara preferred Amalia in round 1, but Nkosi wasn’t part of that choice set and so might displace Amalia in round 2).
Provably there exists a terminal round after which no further proposals will be made, and the matchmaking app will have found the core of the cooperative game, because no person i in set B will prefer to pair with someone from set A who prefers i to whoever is holding that A-set dreamboat’s proposal. Everyone from set B will now accept the proposal they are holding, and, if the two sets had the same cardinality and everyone would rather pair with someone than pair with no one, then nobody will go off alone.
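The proposal-and-release procedure just described is the deferred-acceptance algorithm of Gale and Shapley (1962), and it can be stated compactly. The names and preference lists below are invented for illustration ("chen" is a hypothetical third party added to the text's Nkosi/Barbara/Amalia example).

```python
# A minimal sketch of Gale-Shapley deferred acceptance, as described above.
# Preference lists are hypothetical; the algorithm itself is the standard one.

def deferred_acceptance(proposer_prefs, receiver_prefs):
    """Proposers (set A) propose in preference order; each receiver (set B)
    holds her best proposal so far and releases the rest. Terminates at the
    proposer-optimal stable matching, i.e. the core of the game."""
    # Precompute each receiver's ranking of proposers (lower rank = preferred).
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}   # next name to propose to
    held = {}                                       # receiver -> proposer held
    unmatched = list(proposer_prefs)
    while unmatched:
        p = unmatched.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        if r not in held:
            held[r] = p                             # r holds her first proposal
        elif rank[r][p] < rank[r][held[r]]:
            unmatched.append(held[r])               # newcomer displaces holder
            held[r] = p
        else:
            unmatched.append(p)                     # rejected; tries again later
    return {p: r for r, p in held.items()}

# Nkosi and Amalia (set A) both prefer Barbara; Barbara prefers Amalia.
A = {"nkosi": ["barbara", "chen"], "amalia": ["barbara", "chen"]}
B = {"barbara": ["amalia", "nkosi"], "chen": ["nkosi", "amalia"]}
print(deferred_acceptance(A, B))   # {'amalia': 'barbara', 'nkosi': 'chen'}
```

The run reproduces the text's scenario: Barbara holds Amalia's round-1 proposal, Nkosi is thrown back into the pool, and the procedure terminates with a stable matching in which no pair would jointly prefer to defect.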
This is not a directly applicable model of a marriage market, so there is no money to be made in selling the simple matchmaking app described above. The problem is that we have no guarantee that, in the example, Nkosi and Amalia aren’t one another’s partners of destiny, but cannot get paired because they both began in subset A. In game theory textbooks this problem is often finessed by assuming that set A contains men and set B contains women, and that everyone is so committed to heterosexuality that they’d rather pair with anyone of the opposite sex than anyone of their own sex. On the other hand, the model provides some insight, in the way that models typically do, if we don’t insist on applying it too literally. After working through it, one sees the logic of facts about society that someone designing a real matchmaking app had better understand: that the app will have to log proposals under consideration but not yet accepted, leave people holding proposals under consideration on the market, and remember who has previously rejected whom (without creating a generalised emotional catastrophe by publicly posting this information). The real app will not be able to reliably find the core of the cooperative game unless the set of people in the market is small, restricted, and has self-sorted into subsets to at least some extent by providing such information as “X-type person seeks Y-type person” for X and Y properties that everyone prioritizes. (Are there such properties, at least as an approximation?) But the real matchmaking apps seem to work well enough to be transforming the way in which most young people now find mates in countries with generally available internet access. Relationships between theoretically idealized and real marriage markets are comprehensively reviewed in Chiappori (2017).
Cooperative game theory has become a site of renewed interest because policy problems have been encountered that, unlike the original toy illustration using the all-straights marriage market, satisfy the model’s crucial assumptions. Leading instances are matching university applicants with universities, and matching people needing organ transplants with donors (see Roth 2015). In these markets, there is no ambivalence about partitioning the sets to be matched. Ordinal preferences are the relevant ones: universities don’t auction off places to the highest bidder (or at least not in general), and organs are not for sale (or at least not legally). The models are really applied, and they have demonstrably improved efficiency and saved lives.
It is common in science for models that are practically clumsy fits to their original problems to turn out to furnish highly efficient solutions to new problems thrown up by technological change. The internet has created an environment for applications of matching algorithms — travellers and flat renters, diners and restaurants, students and tutors, and (regrettably) socially alienated people and purveyors of propaganda and fanaticism — that could have been designed by a theorist at any time since Shapley’s original innovations, but would previously have been practically impossible to implement. These applications of cooperative game theory are often combined with the noncooperative game theory of auctions (Klemperer 2004) to drive market designs for goods and services so efficient that they are annihilating the once mighty shopping mall, even in the suburban USA. Why are hotels far more profitable and easily available than was the case in all but the largest cities before about 2007? The answer is that dynamic pricing algorithms (Gershkov and Moldovanu 2014) blend matching theory and auction theory to allow hotels, combined with online travel service aggregators, to find customers willing to pay premium rates for their ideal locations and times, and then fill the remaining rooms with bargain hunters whose preferences are more flexible. Airlines operate similar technology. Game theory thus continues to be one of the 20th-century inventions that is driving social revolutions in the 21st, and Samuelson (2016) predicts a coming surge of renewed interest in the deeper mathematics of cooperative games and their relationships to noncooperative games.
A range of further applications of both classical and evolutionary game theory have been developed, but we have hopefully now provided enough to convince the reader of the tremendous, and constantly expanding, utility of this analytical tool. The reader whose appetite for more has been aroused should find that she now has sufficient grasp of fundamentals to be able to work through the large literature, of which some highlights are listed below.
Bibliography
Annotations
In the following section, books and articles which no one seriously interested in game theory can afford to miss are marked with (**).
The most accessible textbook that covers all of the main branches of game theory is Dixit, Skeath and Reiley (2014). A student entirely new to the field should work through this before moving on to anything else.
Game theory has countless applications, of which this article has been able to suggest only a few. Readers in search of more, but not wishing to immerse themselves in mathematics, can find a number of good sources. Dixit and Nalebuff (1991) and (2008) are especially strong on political and social examples. McMillan (1991) emphasizes business applications.
The great historical breakthrough that officially launched game theory is von Neumann and Morgenstern (1944), which those with scholarly interest in game theory should read with the classic papers of John Nash (1950a, 1950b, 1951). A very useful collection of key foundational papers, all classics, is Kuhn (1997). For a contemporary mathematical treatment that is unusually philosophically sophisticated, Binmore (2005c) (**) is in a class by itself. The second half of Kreps (1990) (**) is the best available starting point for a tour of the philosophical worries surrounding equilibrium selection for normativists. Koons (1992) takes these issues further. Fudenberg and Tirole (1991) remains the most thorough and complete mathematical text available. Gintis (2009b) (**) provides a text crammed with terrific problem exercises, which is also unique in that it treats evolutionary game theory as providing the foundational basis for game theory in general. Recent developments in fundamental theory are well represented in Binmore, Kirman and Tani (1993). Anyone who wants to apply game theory to real human choices, which are generally related stochastically rather than deterministically to axioms of optimization, needs to understand quantal response theory (QRE) as a solution concept. The original development of this is found in McKelvey and Palfrey (1995) and McKelvey and Palfrey (1998). Goeree, Holt, and Palfrey (2016) provide a comprehensive and up-to-date review of QRE and its leading applications.
The philosophical foundations of the basic game-theoretic concepts as economists understand them are presented in LaCasse and Ross (1994). Ross and LaCasse (1995) outline the relationships between games and the axiomatic assumptions of microeconomics and macroeconomics. Philosophical puzzles at this foundational level are critically discussed in Bicchieri (1993). Lewis (1969) puts game-theoretic equilibrium concepts to wider application in philosophy, though making some foundational assumptions that economists generally do not share. His program is carried a good deal further, and without the contested assumptions, by Skyrms (1996) (**) and (2004). (See also Nozick [1998].) Gauthier (1986) launches a literature not surveyed in this article, in which the possibility of game-theoretic foundations for contractarian ethics is investigated. This work is critically surveyed in Vallentyne (1991), and extended into a dynamic setting in Danielson (1992). Binmore (1994, 1998) (**), however, sharply criticizes this project as inconsistent with natural psychology. Philosophers will also find Hollis (1998) to be of interest.
In a class by themselves for insight, originality, readability and cross-disciplinary importance are the works of the Nobel laureate Thomas Schelling. He is the fountainhead of the huge literature that applies game theory to social and political issues of immediate relevance, and shows how lightly it is possible to wear one’s mathematics if the logic is sufficiently sure-footed. There are four volumes, all essential: Schelling (1960) (**), Schelling (1978 / 2006) (**), Schelling (1984) (**), Schelling (2006) (**).
Hardin (1995) is one of many examples of the application of game theory to problems in applied political theory. Baird, Gertner and Picker (1994) review uses of game theory in legal theory and jurisprudence. Mueller (1997) surveys applications in public choice. Ghemawat (1997) provides case studies intended to serve as a methodological template for practical application of game theory to business strategy problems. Poundstone (1992) provides a lively history of the Prisoner’s Dilemma and its use by Cold War strategists. Amadae (2016) tells the same story, based on original scholarly sleuthing, with less complacency concerning its implications. The memoir of Ellsberg (2017) largely confirms Amadae’s perspective. Durlauf and Young (2001) is a useful collection on applications to social structures and social change.
Evolutionary game theory owes its explicit genesis to Maynard Smith (1982) (**). For a text that integrates game theory directly with biology, see Hofbauer and Sigmund (1998) (**). Sigmund (1993) presents this material in a less technical and more accessible format. Some exciting applications of evolutionary game theory to a range of philosophical issues, on which this article has drawn heavily, are found in Skyrms (1996) (**). These issues and others are critically discussed from various angles in Danielson (1998). Mathematical foundations for evolutionary games are presented in Weibull (1995), and pursued further in Samuelson (1997). As noted above, Gintis (2009b) (**) now provides an introductory textbook that takes evolutionary modeling to be foundational to all of game theory. H.P. Young (1998) gives sophisticated models of the evolutionary dynamics of cultural norms through the game-theoretic interactions of agents with limited cognitive capacities but dispositions to imitate one another. Fudenberg and Levine (1998) gives the technical foundations for modeling of this kind.
Many philosophers will also be interested in Binmore (1994, 1998, 2005a) (**), which shows that application of game-theoretic analysis can underwrite a Rawlsian conception of justice that does not require recourse to Kantian presuppositions about what rational agents would desire behind a veil of ignorance concerning their identities and social roles. (In addition, Binmore offers excursions into a range of other issues both central and peripheral to both the foundations and the frontiers of game theory; these books are particularly rich on problems that interest philosophers.) Almost everyone will be interested in Frank (1988) (**), where evolutionary game theory is used to illuminate basic features of human nature and emotion; though readers of this can find criticism of Frank’s model in Ross and Dumouchel (2004).
Behavioral and experimental applications of game theory are surveyed in Kagel and Roth (1995). Camerer (2003) (**) is a comprehensive and more recent study of this literature, and cannot be missed by anyone interested in these issues. A shorter survey that emphasizes philosophical and methodological criticism is Samuelson (2005). Philosophical foundations are also carefully examined in Guala (2005).
Two volumes from leading theorists that offer comprehensive views on the philosophical foundations of game theory were published in 2009. These are Binmore (2009) (**) and Gintis (2009a) (**). Both are indispensable to philosophers who aim to participate in critical discussions of foundational issues.
A volume of interviews with nineteen leading game theorists, eliciting their views on motivations and foundational topics, is Hendricks and Hansen (2007).
A portentous recent development in the foundations of game theory is the invention of the theory of conditional games by Stirling (2012). This first volume restricts itself to the mathematics; some leading possibilities for application, along with technical extensions that provide bridges into economics, are found in the follow-up, Stirling (2016). The philosophical importance of this work is best understood in light of considerations introduced in Bacharach (2006).
Game-theoretic dynamics of the sub-person receive deep but accessible reflection in Ainslie (2001). Seminal texts in neuroeconomics, with extensive use of and implications for behavioral game theory, are Montague and Berns (2002), Glimcher (2003) (**), and Camerer, Loewenstein and Prelec (2005). Ross (2005a) studies the game-theoretic foundations of microeconomics in general, but especially behavioral economics and neuroeconomics, from the perspective of cognitive science and in close alignment with Ainslie.
The theory of cooperative games is consolidated in Chakravarty, Mitra and Sarkar (2015). An accessible and non-technical review of applications of matching theory, by the economist whose work on it earned a Nobel Prize, is Roth (2015).
References
- Ainslie, G. (1992). Picoeconomics, Cambridge: Cambridge University Press.
- ––– (2001). Breakdown of Will, Cambridge: Cambridge University Press.
- Amadae, S. (2016). Prisoners of Reason, Cambridge: Cambridge University Press.
- Andersen, S., Harrison, G., Lau, M., and Rutstrom, E. (2008). Eliciting risk and time preferences. Econometrica, 76: 583–618.
- Andersen, S., Harrison, G., Lau, M., and Rutstrom, E. (2014). Dual criteria decisions. Journal of Economic Psychology, forthcoming.
- Bacharach, M. (2006). Beyond Individual Choice: Teams and Frames in Game Theory, Princeton: Princeton University Press.
- Baird, D., Gertner, R., and Picker, R. (1994). Game Theory and the Law, Cambridge, MA: Harvard University Press.
- Bell, W. (1991). Searching Behaviour, London: Chapman and Hall.
- Bicchieri, C. (1993). Rationality and Coordination, Cambridge: Cambridge University Press.
- ––– (2006). The Grammar of Society, Cambridge: Cambridge University Press.
- Bickhard, M. (2008). Social ontology as convention. Topoi, 27: 139–149.
- Binmore, K. (1987). Modeling Rational Players I. Economics and Philosophy, 3: 179–214.
- ––– (1994). Game Theory and the Social Contract (v. 1): Playing Fair, Cambridge, MA: MIT Press.
- ––– (1998). Game Theory and the Social Contract (v. 2): Just Playing, Cambridge, MA: MIT Press.
- ––– (2005a). Natural Justice, Oxford: Oxford University Press.
- ––– (2005b). Economic Man—or Straw Man? Behavioral and Brain Sciences, 28: 817–818.
- ––– (2005c). Playing For Real, Oxford: Oxford University Press.
- ––– (2007). Does Game Theory Work? The Bargaining Challenge, Cambridge, MA: MIT Press.
- ––– (2008). Do conventions need to be common knowledge? Topoi, 27: 17–27.
- ––– (2009). Rational Decisions, Princeton: Princeton University Press.
- Binmore, K., Kirman, A., and Tani, P. (eds.) (1993). Frontiers of Game Theory, Cambridge, MA: MIT Press.
- Binmore, K., and Klemperer, P. (2002). The Biggest Auction Ever: The Sale of British 3G Telecom Licenses. Economic Journal, 112: C74–C96.
- Camerer, C. (1995). Individual Decision Making. In J. Kagel and A. Roth, eds., Handbook of Experimental Economics, 587–703. Princeton: Princeton University Press.
- ––– (2003). Behavioral Game Theory: Experiments in Strategic Interaction, Princeton: Princeton University Press.
- Camerer, C., Loewenstein, G., and Prelec, D. (2005). Neuroeconomics: How Neuroscience Can Inform Economics. Journal of Economic Literature, 40: 9–64.
- Chakravarty, S., Mitra, M., and Sarkar, P. (2015). A Course on Cooperative Game Theory, Cambridge: Cambridge University Press.
- Chew, S., and MacCrimmon, K. (1979). Alpha-nu Choice Theory: A Generalization of Expected Utility Theory. Working Paper No. 686, University of British Columbia Faculty of Commerce and Business Administration.
- Chiappori, P.-A. (2017). Matching With Transfers: The Economics of Love and Marriage, Princeton: Princeton University Press.
- Clark, A. (1997). Being There, Cambridge, MA: MIT Press.
- Danielson, P. (1992). Artificial Morality, London: Routledge.
- ––– (ed.) (1998). Modelling Rationality, Morality and Evolution, Oxford: Oxford University Press.
- Dennett, D. (1995). Darwin’s Dangerous Idea, New York: Simon and Schuster.
- Dixit, A., and Nalebuff, B. (1991). Thinking Strategically, New York: Norton.
- ––– (2008). The Art of Strategy, New York: Norton.
- Dixit, A., Skeath, S., and Reiley, D. (2014). Games of Strategy, fourth edition. New York: W. W. Norton and Company.
- Dugatkin, L., and Reeve, H., eds. (1998). Game Theory and Animal Behavior, Oxford: Oxford University Press.
- Dukas, R., ed. (1998). Cognitive Ecology, Chicago: University of Chicago Press.
- Durlauf, S., and Young, H.P., eds. (2001). Social Dynamics, Cambridge, MA: MIT Press.
- Ellsberg, D. (2017). The Doomsday Machine, New York: Bloomsbury.
- Erickson, P. (2015). The World the Game Theorists Made, Chicago: University of Chicago Press.
- Frank, R. (1988). Passions Within Reason, New York: Norton.
- Fudenberg, D., and Levine, D. (1998). The Theory of Learning in Games, Cambridge, MA: MIT Press.
- ––– (2016). Whither Game Theory? Towards a Theory of Learning in Games. Journal of Economic Perspectives, 30(4): 151–170.
- Fudenberg, D., and Tirole, J. (1991). Game Theory, Cambridge, MA: MIT Press.
- Gale, D., and Shapley, L. (1962). College Admissions and the Stability of Marriage. American Mathematical Monthly, 69: 9–15.
- Gauthier, D. (1986). Morals By Agreement, Oxford: Oxford University Press.
- Gershkov, A., and Moldovanu, B. (2014). Dynamic Allocation and Pricing: A Mechanism Design Approach, Cambridge, MA: MIT Press.
- Ghemawat, P. (1997). Games Businesses Play, Cambridge, MA: MIT Press.
- Gilbert, M. (1989). On Social Facts, Princeton: Princeton University Press.
- Gintis, H. (2004). Towards the Unity of the Human Behavioral Sciences. Philosophy, Politics and Economics, 31: 37–57.
- ––– (2005). Behavioral Ethics Meets Natural Justice. Politics, Philosophy and Economics, 5: 5–32.
- ––– (2009a). The Bounds of Reason, Princeton: Princeton University Press.
- ––– (2009b). Game Theory Evolving, second edition. Princeton: Princeton University Press.
- Glimcher, P. (2003). Decisions, Uncertainty and the Brain, Cambridge, MA: MIT Press.
- Glimcher, P., Kable, J., and Louie, K. (2007). Neuroeconomic studies of impulsivity: Now or just as soon as possible? American Economic Review (Papers and Proceedings), 97: 142–147.
- Goeree, J., Holt, C., and Palfrey, T. (2016). Quantal Response Equilibrium, Princeton: Princeton University Press.
- Guala, F. (2005). The Methodology of Experimental Economics, Cambridge: Cambridge University Press.
- ––– (2016). Understanding Institutions, Princeton: Princeton University Press.
- Hammerstein, P. (2003). Why is reciprocity so rare in social animals? A protestant appeal. In P. Hammerstein, ed., Genetic and Cultural Evolution of Cooperation, pp. 83–93. Cambridge, MA: MIT Press.
- Hardin, R. (1995). One For All, Princeton: Princeton University Press.
- Harrison, G.W. (2008). Neuroeconomics: A critical reconsideration. Economics and Philosophy, 24: 303–344.
- Harrison, G.W., and Rutstrom, E. (2008). Risk aversion in the laboratory. In J. Cox and G. Harrison, eds., Risk Aversion in Experiments, Bingley, UK: Emerald, pp. 41–196.
- Harrison, G.W., and Ross, D. (2010). The methodologies of neuroeconomics. Journal of Economic Methodology, 17: 185–196.
- Harsanyi, J. (1967). Games With Incomplete Information Played by ‘Bayesian’ Players, Parts I–III. Management Science, 14: 159–182.
- Henrich, J., Boyd, R., Bowles, S., Camerer, C., Fehr, E., and Gintis, H., eds. (2004). Foundations of Human Sociality: Economic Experiments and Ethnographic Evidence From 15 Small-Scale Societies, Oxford: Oxford University Press.
- Henrich, J., Boyd, R., Bowles, S., Camerer, C., Fehr, E., Gintis, H., McElreath, R., Alvard, M., Barr, A., Ensminger, J., Henrich, N., Hill, K., Gil-White, F., Gurven, M., Marlowe, F., Patton, J., and Tracer, D. (2005). ‘Economic Man’ in Cross-Cultural Perspective. Behavioral and Brain Sciences, 28: 795–815.
- Hendricks, V., and Hansen, P., eds. (2007). Game Theory: 5 Questions, Copenhagen: Automatic Press.
- Hofbauer, J., and Sigmund, K. (1998). Evolutionary Games and Population Dynamics, Cambridge: Cambridge University Press.
- Hofmeyr, A., and Ross, D. (2019). Team Agency and Conditional Games. In M. Nagatsu, ed., Philosophy and Social Science: An Interdisciplinary Dialogue, London: Bloomsbury.
- Hollis, M. (1998). Trust Within Reason, Cambridge: Cambridge University Press.
- Hollis, M., and Sugden, R. (1993). Rationality in Action. Mind, 102: 1–35.
- Hurwicz, L., and Reiter, S. (2006). Designing Economic Mechanisms, Cambridge: Cambridge University Press.
- Kagel, J., and Roth, A., eds. (1995). Handbook of Experimental Economics, Princeton: Princeton University Press.
- Keeney, R., and Raiffa, H. (1976). Decisions With Multiple Objectives, New York: Wiley.
- King-Casas, B., Tomlin, D., Anen, C., Camerer, C., Quartz, S., and Montague, P.R. (2005). Getting to Know You: Reputation and Trust in a Two-Person Economic Exchange. Science, 308: 78–83.
- Klemperer, P. (2004). Auctions: Theory and Practice, Princeton: Princeton University Press.
- Koons, R. (1992). Paradoxes of Belief and Strategic Rationality, Cambridge: Cambridge University Press.
- Krebs, J., and Davies, N. (1984). Behavioral Ecology: An Evolutionary Approach, second edition. Sunderland: Sinauer.
- Kreps, D. (1990). A Course in Microeconomic Theory, Princeton: Princeton University Press.
- Kuhn, H., ed. (1997). Classics in Game Theory, Princeton: Princeton University Press.
- LaCasse, C., and Ross, D. (1994). ‘The Microeconomic Interpretation of Games’. PSA 1994 (Volume 1), D. Hull, S. Forbes and R. Burien (eds.), East Lansing, MI: Philosophy of Science Association, pp. 479–387.
- Ledyard, J. (1995). Public Goods: A Survey of Experimental Research. In J. Kagel and A. Roth, eds., Handbook of Experimental Economics, Princeton: Princeton University Press.
- Lewis, D. (1969). Convention, Cambridge, MA: Harvard University Press.
- Maynard Smith, J. (1982). Evolution and the Theory of Games, Cambridge: Cambridge University Press.
- McClure, S., Laibson, D., Loewenstein, G., and Cohen, J. (2004). Separate neural systems value immediate and delayed monetary rewards. Science, 306: 503–507.
- McKelvey, R., and Palfrey, T. (1995). Quantal response equilibria for normal form games. Games and Economic Behavior, 10: 6–38.
- ––– (1998). Quantal response equilibria for extensive form games. Experimental Economics, 1: 9–41.
- McMillan, J. (1991). Games, Strategies and Managers, Oxford: Oxford University Press.
- Millikan, R. (1984). Language, Thought and Other Biological Categories, Cambridge, MA: MIT Press.
- Montague, P.R., and Berns, G. (2002). Neural Economics and the Biological Substrates of Valuation. Neuron, 36: 265–284.
- Mueller, D. (1997). Perspectives on Public Choice, Cambridge: Cambridge University Press.
- Nash, J. (1950a). ‘Equilibrium Points in n-Person Games.’ Proceedings of the National Academy of Sciences, 36: 48–49.
- ––– (1950b). ‘The Bargaining Problem.’ Econometrica, 18: 155–162.
- ––– (1951). ‘Non-cooperative Games.’ Annals of Mathematics, 54: 286–295.
- ––– (1953). Two-Person Cooperative Games. Econometrica, 21: 128–140.
- Noe, R., van Hoof, J., and Hammerstein, P., eds. (2001). Economics in Nature, Cambridge: Cambridge University Press.
- Nozick, R. (1998). Socratic Puzzles, Cambridge, MA: Harvard University Press.
- Ormerod, P. (1994). The Death of Economics, New York: Wiley.
- Pettit, P., and Sugden, R. (1989). The Backward Induction Paradox. Journal of Philosophy, 86: 169–182.
- Platt, M., and Glimcher, P. (1999). Neural Correlates of Decision Variables in Parietal Cortex. Nature, 400: 233–238.
- Plott, C., and Smith, V. (1978). An Experimental Examination of Two Exchange Institutions. Review of Economic Studies, 45: 133–153.
- Poundstone, W. (1992). Prisoner’s Dilemma, New York: Doubleday.
- Prelec, D. (1998). The Probability Weighting Function. Econometrica, 66: 497–527.
- Quiggin, J. (1982). A Theory of Anticipated Utility. Journal of Economic Behavior and Organization, 3: 323–343.
- Rawls, J. (1971). A Theory of Justice, Cambridge, MA: Harvard University Press.
- Robbins, L. (1931). An Essay on the Nature and Significance of Economic Science, London: Macmillan.
- Ross, D. (2005a). Economic Theory and Cognitive Science: Microexplanation, Cambridge, MA: MIT Press.
- ––– (2005b). Evolutionary Game Theory and the Normative Theory of Institutional Design: Binmore and Behavioral Economics. Politics, Philosophy and Economics, forthcoming.
- ––– (2008a). Classical game theory, socialization and the rationalization of conventions. Topoi, 27: 57–72.
- ––– (2008b). Two styles of neuroeconomics. Economics and Philosophy, 24: 473–483.
- ––– (2014). Philosophy of Economics, Houndmills, Basingstoke: Palgrave Macmillan.
- Ross, D., and Dumouchel, P. (2004). Emotions as Strategic Signals. Rationality and Society, 16: 251–286.
- Ross, D., and LaCasse, C. (1995). ‘Towards a New Philosophy of Positive Economics’. Dialogue, 34: 467–493.
- Roth, A. (2015). Who Gets What and Why?, New York: Houghton Mifflin Harcourt.
- Sally, J. (1995). Conversation and Cooperation in Social Dilemmas: A Meta-analysis of Experiments From 1958 to 1992. Rationality and Society, 7: 58–92.
- Samuelson, L. (1997). Evolutionary Games and Equilibrium Selection, Cambridge, MA: MIT Press.
- ––– (2005). Economic Theory and Experimental Economics. Journal of Economic Literature, 43: 65–107.
- ––– (2016). Game Theory in Economics and Beyond. Journal of Economic Perspectives, 30(4): 107–130.
- Samuelson, P. (1938). ‘A Note on the Pure Theory of Consumers’ Behaviour.’ Economica, 5: 61–71.
- Savage, L. (1954). The Foundations of Statistics, New York: Wiley.
- Schelling, T. (1960). The Strategy of Conflict, Cambridge, MA: Harvard University Press.
- ––– (1978). Micromotives and Macrobehavior, New York: Norton. Second edition 2006.
- ––– (1980). The Intimate Contest for Self-Command. Public Interest, 60: 94–118.
- ––– (1984). Choice and Consequence, Cambridge, MA: Harvard University Press.
- ––– (2006). Strategies of Commitment, Cambridge, MA: Harvard University Press.
- Selten, R. (1975). ‘Re-examination of the Perfectness Concept for Equilibrium Points in Extensive Games.’ International Journal of Game Theory, 4: 22–55.
- Sigmund, K. (1993). Games of Life, Oxford: Oxford University Press.
- Shapley, L. (1953). A Value for n-Person Games. In H. Kuhn and A. Tucker, eds., Contributions to the Theory of Games II, pp. 307–317. Princeton: Princeton University Press.
- Skyrms, B. (1996). Evolution of the Social Contract, Cambridge: Cambridge University Press.
- ––– (2004). The Stag Hunt and the Evolution of Social Structure, Cambridge: Cambridge University Press.
- Smith, V. (1962). An Experimental Study of Competitive Market Behavior. Journal of Political Economy, 70: 111–137.
- ––– (1964). Effect of Market Organization on Competitive Equilibrium. Quarterly Journal of Economics, 78: 181–201.
- ––– (1965). Experimental Auction Markets and the Walrasian Hypothesis. Journal of Political Economy, 73: 387–393.
- ––– (1976). Bidding and Auctioning Institutions: Experimental Results. In Y. Amihud, ed., Bidding and Auctioning for Procurement and Allocation, 43–64. New York: New York University Press.
- ––– (1982). Microeconomic Systems as an Experimental Science. American Economic Review, 72: 923–955.
- ––– (2008). Rationality in Economics, Cambridge: Cambridge University Press.
- Sober, E., and Wilson, D.S. (1998). Unto Others, Cambridge, MA: Harvard University Press.
- Sterelny, K. (2003). Thought in a Hostile World, Oxford: Blackwell.
- Stirling, W. (2012). Theory of Conditional Games, Cambridge: Cambridge University Press.
- ––– (2016). Theory of Social Choice on Networks, Cambridge: Cambridge University Press.
- Stratmann, T. (1997). Logrolling. In D. Mueller, ed., Perspectives on Public Choice, 322–341. Cambridge: Cambridge University Press.
- Strotz, R. (1956). Myopia and Inconsistency in Dynamic Utility Maximization. The Review of Economic Studies, 23: 165–180.
- Sugden, R. (1993). Thinking as a Team: Towards an Explanation of Nonselfish Behavior. Social Philosophy and Policy, 10: 69–89.
- ––– (2000). Team Preferences. Economics and Philosophy, 16: 175–204.
- ––– (2003). The Logic of Team Reasoning. Philosophical Explorations, 6: 165–181.
- ––– (2018). The Community of Advantage, Oxford: Oxford University Press.
- Thurstone, L. (1931). The Indifference Function. Journal of Social Psychology, 2: 139–167.
- Tomasello, M., M. Carpenter, J. Call, T. Behne and H. Moll (2004). Understanding and Sharing Intentions: The Origins of Cultural Cognition. Behavioral and Brain Sciences, 28: 675–691.
- Tversky, A., and Kahneman, D. (1992). Advances in Prospect Theory: Cumulative Representation of Uncertainty. Journal of Risk and Uncertainty, 5: 297–323.
- Vallentyne, P. (ed.) (1991). Contractarianism and Rational Choice, Cambridge: Cambridge University Press.
- von Neumann, J., and Morgenstern, O. (1944). The Theory of Games and Economic Behavior, Princeton: Princeton University Press.
- von Neumann, J., and Morgenstern, O. (1947). The Theory of Games and Economic Behavior, second edition, Princeton: Princeton University Press.
- Weibull, J. (1995). Evolutionary Game Theory, Cambridge, MA: MIT Press.
- Wilcox, N. (2008). Stochastic Models for Binary Discrete Choice Under Risk: A Critical Primer and Econometric Comparison. In J. Cox and G. Harrison, eds., Risk Aversion and Experiments, Bingley, UK: Emerald.
- Yaari, M. (1987). The Dual Theory of Choice Under Risk. Econometrica, 55: 95–115.
- Young, H.P. (1998). Individual Strategy and Social Structure, Princeton: Princeton University Press.
Academic Tools
How to cite this entry.
Preview the PDF version of this entry at the Friends of the SEP Society.
Look up this entry topic at the Indiana Philosophy Ontology Project (InPhO).
Enhanced bibliography for this entry at PhilPapers, with links to its database.
Other Internet Resources
- Abbas, A., 2003, “The Algebra of Utility Inference,” Cornell University working paper.
- A Chronology of Game Theory, Paul Walker, Economics, U. Canterbury (Christchurch, New Zealand).
- What is Game Theory?, David K. Levine, Economics, UCLA.
- Game Theory, Experimental Economics, and Market Design, page maintained by Al Roth (Economics, Stanford).
Related Entries
economics, philosophy of | game theory: and ethics | game theory: evolutionary | logic: and games | preferences | prisoner’s dilemma
Acknowledgments
I would like to thank James Joyce and Edward Zalta for their comments on various versions of this entry. I would also like to thank Sam Lazell for not only catching a nasty patch of erroneous analysis in the second version, but going to the supererogatory trouble of actually providing fully corrected reasoning. If there were many such readers, all authors in this project would become increasingly collective over time. One of my MBA students, Anthony Boting, noticed that my solution to an example I used in the second version rested on equivocating between relative-frequency and objective-chance interpretations of probability. Two readers, Brian Ballsun-Stanton and George Mucalov, spotted this too and were kind enough to write to me about it. Many thanks to them. Joel Guttman pointed out that I’d illustrated a few principles with some historical anecdotes that circulate in the game theory community, but told them in a way that was too credulous with respect to their accuracy. Michel Benaim and Mathius Grasselli noted that I’d identified the wrong Plato text as the source of Socrates’s reflections on soldiers’ incentives. Ken Binmore picked up another factual error while the third revision was in preparation, as a result of which no one else ever saw it. Not so for a mistake found by Bob Galesloot that survived in the article all the way into the third edition. (That error was corrected in July 2010.) Some other readers helpfully spotted typos: thanks to Fabian Ottjes, Brad Colbourne, Nicholas Dozet and Gustavo Narez. Nelleke Bak, my in-house graphics guru (and spouse), drew all figures except 15, 16, and 17, which were generously contributed by George Ainslie. My thanks to her and him.
Finally, thanks go to Colin Allen for technical support (in the effort to deal with bandwidth problems to South Africa) prior to publication of the second version of this entry, to Daniel McKenzie for procedural advice on preparation of the third version, and to Uri Nodelman for helping with code for math notation and formatting of figures for the fifth version, published in 2014.
Showing 1-50 of 509 books shelved as "game-theory":

The Evolution of Cooperation (Paperback): shelved 36 times; avg rating 4.24, 1,485 ratings; published 1984
The Art of Strategy: A Game Theorist's Guide to Success in Business and Life (Hardcover): shelved 36 times; avg rating 3.87, 1,919 ratings; published 1991
Thinking Strategically: The Competitive Edge in Business, Politics, and Everyday Life (Paperback): shelved 35 times; avg rating 3.98, 2,482 ratings; published 1991
Theory of Games and Economic Behavior (Paperback): shelved 32 times; avg rating 4.18, 248 ratings; published 1944
The Strategy of Conflict: With a New Preface by the Author (Paperback): shelved 31 times; avg rating 4.03, 582 ratings; published 1960
Prisoner's Dilemma: John von Neumann, Game Theory, and the Puzzle of the Bomb (Paperback): shelved 28 times; avg rating 3.91, 1,442 ratings; published 1992
Game Theory: A Very Short Introduction (Paperback): shelved 20 times; avg rating 3.26, 688 ratings; published 2007
Game Theory for Applied Economists (Paperback): shelved 19 times; avg rating 3.87, 265 ratings; published 1992
Rock, Paper, Scissors: Game Theory in Everyday Life (Paperback): shelved 19 times; avg rating 3.42, 933 ratings; published 2000
A Course in Game Theory (Paperback): shelved 17 times; avg rating 3.99, 121 ratings; published 1994
Games and Decisions: Introduction and Critical Survey (Paperback): shelved 17 times; avg rating 3.85, 114 ratings; published 1957
Game Theory: A Nontechnical Introduction (Paperback): shelved 17 times; avg rating 3.57, 442 ratings; published 1970
Co-Opetition (Paperback): shelved 16 times; avg rating 3.99, 1,105 ratings; published 1996
Game Theory: Analysis of Conflict (Paperback): shelved 15 times; avg rating 3.93, 136 ratings; published 1991
Game Theory (Hardcover): shelved 15 times; avg rating 4.03, 105 ratings; published 1991
An Introduction to Game Theory (Hardcover): shelved 15 times; avg rating 3.80, 128 ratings; published 2003
A Beautiful Mind (Paperback): shelved 13 times; avg rating 4.13, 115,342 ratings; published 1998
Behavioral Game Theory: Experiments in Strategic Interaction (Hardcover): shelved 13 times; avg rating 4.03, 91 ratings; published 2003
Finite and Infinite Games: A Vision of Life as Play and Possibility (Mass Market Paperback): shelved 13 times; avg rating 3.93, 3,010 ratings; published 1986
Game Theory: An Introduction (Hardcover): shelved 12 times; avg rating 4.02, 44 ratings; published 2012
Micromotives and Macrobehavior (Paperback): shelved 12 times; avg rating 4.03, 936 ratings; published 1978
The Joy of Game Theory: An Introduction to Strategic Thinking (Kindle Edition): shelved 11 times; avg rating 3.89, 329 ratings; published 2013
Evolution and the Theory of Games (Paperback): shelved 11 times; avg rating 3.99, 84 ratings; published 1982
Game Theory at Work: How to Use Game Theory to Outthink and Outmaneuver Your Competition (Hardcover): shelved 10 times; avg rating 3.95, 92 ratings; published 2003
Algorithmic Game Theory (Hardcover): shelved 10 times; avg rating 4.17, 52 ratings; published 2007
Games of Strategy (Hardcover): shelved 10 times; avg rating 3.95, 243 ratings; published 1999
Who Gets What — and Why: The New Economics of Matchmaking and Market Design (Hardcover): shelved 9 times; avg rating 3.87, 1,856 ratings; published 2014
The Compleat Strategyst: Being a Primer on the Theory of Games of Strategy (Paperback): shelved 9 times; avg rating 3.60, 120 ratings; published 1965
Playing for Real: A Text on Game Theory (Hardcover): shelved 9 times; avg rating 4.14, 35 ratings; published 2007
Reality is Broken: Why Games Make Us Better and How They Can Change the World (Paperback): shelved 9 times; avg rating 3.81, 6,803 ratings; published 2010
The Bounds of Reason: Game Theory and the Unification of the Behavioral Sciences (Hardcover): shelved 8 times; avg rating 3.76, 34 ratings; published 2009
Game Theory: A Critical Introduction (Paperback): shelved 8 times; avg rating 3.71, 34 ratings; published 1995
Natural Justice (Hardcover): shelved 8 times; avg rating 4.12, 50 ratings; published 2005
The Game Theorist's Guide to Parenting: How the Science of Strategic Thinking Can Help You Deal with the Toughest Negotiators You Know--Your Kids (Hardcover): shelved 7 times; avg rating 3.51, 235 ratings; published 2016
Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations (Hardcover): shelved 7 times; avg rating 3.68, 28 ratings; published 2008
The Complete Idiot's Guide to Game Theory (Paperback): shelved 7 times; avg rating 3.45, 84 ratings; published 2005
Evolutionary Game Theory (Paperback): shelved 7 times; avg rating 3.65, 17 ratings; published 1995
Introducing Game Theory: A Graphic Guide (Paperback): shelved 6 times; avg rating 3.88, 705 ratings
Auction Theory (Hardcover): shelved 6 times; avg rating 4.26, 27 ratings; published 2002
Game Theory and the Social Contract, Volume 2: Just Playing (Economic Learning and Social Evolution): shelved 6 times; avg rating 3.88, 8 ratings; published 1998
Rational Decisions (Hardcover): shelved 6 times; avg rating 3.94, 36 ratings; published 2008
Game Theory Evolving: A Problem-Centered Introduction to Modeling Strategic Interaction (Paperback): shelved 6 times; avg rating 4.13, 39 ratings; published 2000
Evolution of the Social Contract (Paperback): shelved 6 times; avg rating 3.86, 69 ratings; published 1996
The Predictioneer's Game: Using the Logic of Brazen Self-Interest to See and Shape the Future (Hardcover): shelved 6 times; avg rating 3.62, 735 ratings; published 2009
A Beautiful Math: John Nash, Game Theory, and the Modern Quest for a Code of Nature (Hardcover): shelved 5 times; avg rating 3.62, 260 ratings; published 2006
Arms and Influence (Paperback): shelved 5 times; avg rating 4.10, 594 ratings; published 1967
Gaming the Vote: Why Elections Aren't Fair (and What We Can Do About It): shelved 5 times; avg rating 3.94, 266 ratings; published 2008
Game Theory (Hardcover): shelved 5 times; avg rating 4.44, 27 ratings; published 2013
The Essential John Nash (Paperback): shelved 5 times; avg rating 3.92, 99 ratings; published 2001
Game Theory and Economic Modelling (Clarendon Lectures in Economics): shelved 5 times; avg rating 3.81, 27 ratings; published 1990
“You have not seen desperation and helplessness till you have seen a man hopeless in love. Of course, unless you have seen a gamer.”
“There is a neat economic explanation for the sexual division of labour in hunter-gatherers. In terms of nutrition, women generally collect dependable, staple carbohydrates whereas men fetch precious protein. Combine the two – predictable calories from women and occasional protein from men – and you get the best of both worlds. At the cost of some extra work, women get to eat some good protein without having to chase it; men get to know where the next meal is coming from if they fail to kill a deer. That very fact makes it easier for them to spend more time chasing deer and so makes it more likely they will catch one. Everybody gains – gains from trade. It is as if the species now has two brains and two stores of knowledge instead of one – a brain that learns about hunting and a brain that learns about gathering.”
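The gains-from-trade logic in the quote above can be put in numbers. The figures below are entirely hypothetical (calorie counts, success rates, and the `self_provision`/`specialize_and_share` names are all invented for illustration); they only mirror the quote's argument that pooling a reliable staple with a risky but rich food source leaves both parties better off in expectation.

```python
# Illustrative (made-up) expected daily payoffs, in calories, for a
# two-person household choosing between self-provisioning and
# specialization with sharing. Numbers are hypothetical.

def self_provision():
    # Each person splits the day: half gathering (reliable carbs),
    # half hunting alone (low success rate, so low expected protein).
    carbs = 0.5 * 1000          # half a day of gathering
    protein = 0.5 * 0.2 * 2000  # half a day hunting, 20% success
    return carbs + protein      # expected calories per person

def specialize_and_share():
    # One full-time gatherer, one full-time hunter (better success
    # rate with practice); they pool output and split it evenly.
    carbs = 1.0 * 1000
    protein = 1.0 * 0.4 * 2000  # full day hunting, 40% success
    return (carbs + protein) / 2  # each person's expected share

print(self_provision())        # 700.0 expected calories each
print(specialize_and_share())  # 900.0 expected calories each
```

Under these assumed numbers both parties gain from specializing, which is the "gains from trade" point: the comparison holds whenever pooled specialist output exceeds twice what a lone generalist can expect.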