  About site: http://plato.stanford.edu/entries/game-theory/

Title: Philosophy/Reference/Stanford Encyclopedia of Philosophy - Game Theory. Von Neumann and Morgenstern's mathematical theory of bargaining, introduced by Don Ross, University of Cape Town.

Game Theory

First published Sat Jan 25, 1997; substantive revision Fri Mar 10, 2006

Game theory is the study of the ways in which strategic interactions among rational players produce outcomes with respect to the preferences (or utilities) of those players, none of which might have been intended by any of them. The meaning of this statement will not be clear to the non-expert until each of the italicized words and phrases has been explained and featured in some examples. Doing this will be the main business of this article. First, however, we provide some historical and philosophical context in order to motivate the reader for all of this technical work ahead.

1. Philosophical and Historical Motivation
2. Basic Elements and Assumptions of Game Theory
  2.1 Utility
  2.2 Games and Information
  2.3 Trees and Matrices
  2.4 The Prisoner's Dilemma as an Example of Strategic-Form vs. Extensive-Form Representation
  2.5 Solution Concepts and Equilibria
  2.6 Modular Rationality and Subgame Perfection
  2.7 On Interpreting Payoffs: Morality and Efficiency in Games
  2.8 Trembling Hands
3. Uncertainty, Risk and Sequential Equilibria
  3.1 Beliefs
4. Repeated Games and Coordination
5. Commitment
6. Evolutionary Game Theory
7. Game Theory and Behavioral Evidence
  7.1 Game Theory in the Laboratory
  7.2 Neuroeconomics and Game Theory
  7.3 Game Theoretic Models of Human Nature
Bibliography
Other Internet Resources
Related Entries

1. Philosophical and Historical Motivation

The mathematical theory of games was invented by John von Neumann and Oskar Morgenstern (1944). For reasons to be discussed later, limitations in their mathematical framework initially made the theory applicable only under special and limited conditions. This situation has gradually changed, in ways we will examine as we go along, over the past six decades, as the framework was deepened and generalized. Refinements are still being made, and we will review a few outstanding philosophical problems that lie along the advancing front edge of these developments towards the end of the article. However, since at least the late 1970s it has been possible to say with confidence that game theory is the most important and useful tool in the analyst's kit whenever she confronts situations in which what counts as one agent's best action (for her) depends on expectations about what one or more other agents will do, and what counts as their best actions (for them) similarly depend on expectations about her.

Despite the fact that game theory has been rendered mathematically and logically systematic only recently, however, game-theoretic insights can be found among philosophers and political commentators going back to ancient times. For example, in two of Plato's texts, the Laches and the Symposium, Socrates recalls an episode from the Battle of Delium that involved the following situation. Consider a soldier at the front, waiting with his comrades to repulse an enemy attack. It may occur to him that if the defense is likely to be successful, then it isn't very probable that his own personal contribution will be essential. But if he stays, he runs the risk of being killed or wounded, apparently for no point. On the other hand, if the enemy is going to win the battle, then his chances of death or injury are higher still, and now quite clearly to no point, since the line will be overwhelmed anyway.
Based on this reasoning, it would appear that the soldier is better off running away regardless of who is going to win the battle. Of course, if all of the soldiers reason this way (as they all apparently should, since they're all in identical situations) then this will certainly bring about the outcome in which the battle is lost. Of course, this point, since it has occurred to us as analysts, can occur to the soldiers too. Does this give them a reason for staying at their posts? Just the contrary: the greater the soldiers' fear that the battle will be lost, the greater their incentive to get themselves out of harm's way. And the greater the soldiers' belief that the battle will be won, without the need of any particular individual's contributions, the less reason they have to stay and fight. If each soldier anticipates this sort of reasoning on the part of the others, all will quickly reason themselves into a panic, and their horrified commander will have a rout on his hands before the enemy has even fired a shot.

Long before game theory had come along to show people how to think about this sort of problem systematically, it had occurred to some actual military leaders and influenced their strategies. Thus the Spanish conqueror Cortez, when landing in Mexico with a small force who had good reason to fear their capacity to repel attack from the far more numerous Aztecs, removed the risk that his troops might think their way into a retreat by burning the ships on which they had landed. With retreat having thus been rendered physically impossible, the Spanish soldiers had no better course of action but to stand and fight and, furthermore, to fight with as much determination as they could muster. Better still, from Cortez's point of view, his action had a discouraging effect on the motivation of the Aztecs. He took care to burn his ships very visibly, so that the Aztecs would be sure to see what he had done.
They then reasoned as follows: Any commander who could be so confident as to willfully destroy his own option to be prudent if the battle went badly for him must have good reasons for such extreme optimism. It cannot be wise to attack an opponent who has a good reason (whatever, exactly, it might be) for being sure that he can't lose. The Aztecs therefore retreated into the surrounding hills, and Cortez had his victory bloodlessly.

These situations as recalled by Plato and as vividly acted upon by Cortez have a common and interesting underlying logic. Notice that the soldiers are not motivated to retreat just, or even mainly, by their rational assessment of the dangers of battle and by their self-interest. Rather, they discover a sound reason to run away by realizing that what it makes sense for them to do depends on what it will make sense for others to do, and that all of the others can notice this too. Even a quite brave soldier may prefer to run rather than heroically, but pointlessly, die trying to stem the oncoming tide all by himself. Thus we could imagine, without contradiction, a circumstance in which an army, all of whose members are brave, flees at top speed before the enemy makes a move. If the soldiers really are brave, then this surely isn't the outcome any of them wanted; each would have preferred that all stand and fight. What we have here, then, is a case in which the interaction of many individually rational decision-making processes (one process per soldier) produces an outcome intended by no one. (Most armies try to avoid this problem just as Cortez did. Since they can't usually make retreat physically impossible, they make it economically impossible: they shoot deserters. Then standing and fighting is each soldier's individually rational course of action after all, because the cost of running is sure to be at least as high as the cost of staying.)

Another classic source that invites this sequence of reasoning is found in Shakespeare's Henry V.
During the Battle of Agincourt Henry decided to slaughter his French prisoners, in full view of the enemy and to the surprise of his subordinates, who describe the action as being out of moral character. The reasons Henry gives allude to parametric considerations: he is afraid that the prisoners may free themselves and threaten his position. However, a game theorist might have furnished him with supplementary strategic (and similarly prudential, though perhaps not moral) justification. His own troops observe that the prisoners have been killed, and observe that the enemy has observed this. Therefore, they know what fate will await them at the enemy's hand if they don't win. Metaphorically, but very effectively, their boats have been burnt. The slaughter of the prisoners plausibly sent a signal to the soldiers of both sides, thereby changing their incentives in ways that favoured English prospects for victory.

These examples might seem to be relevant only for those who find themselves in sordid situations of cut-throat competition. Perhaps, one might think, it is important for generals, politicians, businesspeople and others whose jobs involve manipulation of others, but the philosopher should only deplore its horrid morality. Such a conclusion would be highly premature, however. The study of the logic that governs the interrelationships amongst incentives, strategic interactions and outcomes has been fundamental in modern political philosophy, since centuries before anyone had an explicit name for this sort of logic.

Hobbes's Leviathan is often regarded as the founding work in modern political philosophy, the text that began the continuing round of analyses of the function and justification of the state and its restrictions on individual liberties. The core of Hobbes's reasoning can be given quite straightforwardly as follows. The best situation for all people is one in which each is free to do as she pleases.
Often, such free people will wish to cooperate with one another in order to carry out projects that would be impossible for an individual acting alone. But if there are any immoral or amoral agents around, they will notice that their interests are best served by getting the benefits from cooperation and not returning them. Suppose, for example, that you agree to help me build my house in return for my promise to help you build yours. After my house is finished, I can make your labour free to me simply by reneging on my promise. I then realize, however, that if this leaves you with no house, you will have an incentive to take mine. This will put me in constant fear of you, and force me to spend valuable time and resources guarding myself against you. I can best minimize these costs by striking first and killing you at the first opportunity. Of course, you can anticipate all of this reasoning by me, and so have good reason to try to beat me to the punch. Since I can anticipate this reasoning by you, my original fear of you was not paranoid; nor was yours of me. In fact, neither of us actually needs to be immoral to get this chain of mutual reasoning going; we need only think that there is some possibility that the other might try to cheat on bargains.

Once a small wedge of doubt enters any one mind, the incentive induced by fear of the consequences of being preempted (hit before hitting first) quickly becomes overwhelming on both sides. If either of us has any resources of our own that the other might want, this murderous logic will take hold long before we are so silly as to imagine that we could ever actually get so far as making deals to help one another build houses in the first place. Left to their own devices, rational agents will never derive the benefits of cooperation, and will instead live from the outset in a state of ‘war of all against all’, in Hobbes's words.
In these circumstances, all human life, as he vividly and famously put it, will be "solitary, poor, nasty, brutish and short."

Hobbes's proposed solution to this problem was tyranny. The people can hire an agent (a government) whose job is to punish anyone who breaks any promise. So long as the threatened punishment is sufficiently dire (Hobbes thought decapitation generally appropriate), the cost of reneging on promises will exceed the cost of keeping them. The logic here is identical to that used by an army when it threatens to shoot deserters. If all people know that these incentives hold for most others, then cooperation will not only be possible, but will be the expected norm, and the war of all against all becomes a general peace.

Hobbes pushes the logic of this argument to a very strong conclusion, arguing that it implies not only a government with the right and the power to enforce cooperation, but an ‘undivided’ government in which the arbitrary will of a single ruler must impose absolute obligation on all. Few contemporary political theorists think that the particular steps by which Hobbes reasons his way to this conclusion are both sound and valid. Working through these issues here, however, would carry us away from our topic into complex details of contractarian political philosophy. What is important in the present context is that these details, as they are in fact pursued in the contemporary debates, all involve sophisticated interpretation of the issues using the resources of modern game theory. Furthermore, Hobbes's most basic point, that the fundamental justification for the coercive authority and practices of governments is people's own need to protect themselves from what game theorists call ‘social dilemmas’, is accepted by many, if not most, political theorists. Notice that Hobbes has not argued that tyranny is a desirable thing in itself.
The structure of his argument is that the logic of strategic interaction leaves only two general political outcomes possible: tyranny and anarchy. Rational agents then choose tyranny as the lesser of two evils.

The reasoning of Cortez, of Henry V and of Hobbes's political agents has a common logic, one derived from their situations. In each case, the aspect of the environment that is most important to the agents' achievement of their preferred outcomes is the set of expectations and possible reactions to their strategies by other agents. The distinction between acting parametrically on a passive world and acting non-parametrically on a world that tries to act in anticipation of these actions is fundamental. If you wish to kick a rock down a hill, you need only concern yourself with the rock's mass relative to the force of your blow, the extent to which it is bonded with its supporting surface, the slope of the ground on the other side of the rock, and the expected impact of the collision on your foot. The values of all of these variables are independent of your plans and intentions, since the rock has no interests of its own and takes no actions to attempt to assist or thwart you. By contrast, if you wish to kick a person down the hill, then unless that person is unconscious, bound or otherwise incapacitated, you will likely not succeed unless you can disguise your plans until it's too late for him to take either evasive or forestalling action. The logical issues associated with the second sort of situation are typically much more complicated, as a simple hypothetical example will illustrate.

Suppose first that you wish to cross a river that is spanned by three bridges. (Assume that swimming, wading or boating across are impossible.) The first bridge is known to be safe and free of obstacles; if you try to cross there, you will succeed. The second bridge lies beneath a cliff from which large rocks sometimes fall. The third is inhabited by deadly cobras.
Now suppose you wish to rank-order the three bridges with respect to their preferability as crossing-points. Your task here is quite straightforward. The first bridge is obviously best, since it is safest. To rank-order the other two bridges, you require information about their relative levels of danger. If you can study the frequency of rock-falls and the movements of the cobras for a while, you might be able to calculate that the probability of your being crushed by a rock at the second bridge is 10% and of being struck by a cobra at the third bridge is 20%. Your reasoning here is strictly parametric because neither the rocks nor the cobras are trying to influence your actions, by, for example, concealing their typical patterns of behaviour because they know you are studying them. It is quite obvious what you should do here: cross at the safe bridge. Now let us complicate the situation a bit. Suppose that the bridge with the rocks was immediately before you, while the safe bridge was a day's difficult hike upstream. Your decision-making situation here is slightly more complicated, but it is still strictly parametric. You would have to decide whether the cost of the long hike was worth exchanging for the penalty of a 10% chance of being hit by a rock. However, this is all you must decide, and your probability of a successful crossing is entirely up to you; the environment is not interested in your plans.

However, if we now complicate the situation in the direction of non-parametricity, it becomes much more puzzling. Suppose that you are a fugitive of some sort, and waiting on the other side of the river with a gun is your pursuer. She will catch and shoot you, let us suppose, only if she waits at the bridge you try to cross; otherwise, you will escape. As you reason through your choice of bridge, it occurs to you that she is over there trying to anticipate your reasoning.
It will seem that, surely, choosing the safe bridge straight away would be a mistake, since that is just where she will expect you, and your chances of death rise to certainty. So perhaps you should risk the rocks, since these odds are much better. But wait … if you can reach this conclusion, your pursuer, who is just as rational and well-informed as you are, can anticipate that you will reach it, and will be waiting for you if you evade the rocks. So perhaps you must take your chances with the cobras; that is what she must least expect. But, then, no … if she expects that you will expect that she will least expect this, then she will most expect it. This dilemma, you realize with dread, is general: you must do what your pursuer least expects; but whatever you most expect her to least expect is automatically what she will most expect. You appear to be trapped in indecision. All that might console you a bit here is that, on the other side of the river, your pursuer is trapped in exactly the same quandary, unable to decide which bridge to wait at because as soon as she imagines committing to one, she will notice that if she can find a best reason to pick a bridge, you can anticipate that same reason and then avoid her.

We know from experience that, in situations such as this, people do not usually stand and dither in circles forever. As we'll see later, there is a rational solution (that is, a best rational action) available to both players. However, until the 1940s neither philosophers nor economists knew how to find it mathematically. As a result, economists were forced to treat non-parametric influences as if they were complications on parametric ones. This is likely to strike the reader as odd, since, as our example of the bridge-crossing problem was meant to show, non-parametric features are often fundamental features of decision-making problems.
Part of the explanation for game theory's relatively late entry into the field lies in the problems with which economists had historically been concerned. Classical economists, such as Adam Smith and David Ricardo, were mainly interested in the question of how agents in very large markets (whole nations) could interact so as to bring about maximum monetary wealth for themselves. Smith's basic insight, that efficiency is best maximized by agents freely seeking mutually advantageous bargains, was mathematically verified in the twentieth century. However, the demonstration of this fact applies only in conditions of ‘perfect competition,’ that is, when firms face no costs of entry or exit into markets, when there are no economies of scale, and when no agents' actions have unintended side-effects on other agents' well-being. Economists always recognized that this set of assumptions is purely an idealization for purposes of analysis, not a possible state of affairs anyone could try (or should want to try) to attain. But until the mathematics of game theory matured near the end of the 1970s, economists had to hope that the more closely a market approximates perfect competition, the more efficient it will be. No such hope, however, can be mathematically or logically justified in general; indeed, as a strict generalization the assumption can be shown to be false.

This article is not about the foundations of economics, but it is important for understanding the origins and scope of game theory to know that perfectly competitive markets have built into them a feature that renders them susceptible to parametric analysis. Because agents face no entry costs to markets, they will open shop in any given market until competition drives all profits to zero. This implies that if costs and demand are fixed, then agents have no options about how much to produce if they are trying to maximize the differences between their costs and their revenues.
These production levels can be determined separately for each agent, so none need pay attention to what the others are doing; each agent treats her counterparts as passive features of the environment. The other kind of situation to which classical economic analysis can be applied without recourse to game theory is that of monopoly. Here, quite obviously, non-parametric considerations drop out, since there is only one agent under study. However, both perfect competition and monopoly are very special and unusual market arrangements. Prior to the advent of game theory, therefore, economists were severely limited in the class of circumstances to which they could neatly apply their models.

Philosophers share with economists a professional interest in the conditions and techniques for the maximization of human welfare. In addition, philosophers have a special concern with the logical justification of actions, and often actions must be justified by reference to their expected outcomes. Without game theory, both of these problems resist analysis wherever non-parametric aspects are relevant. We will demonstrate this shortly by reference to the most famous (though not the most typical) game, the so-called Prisoner's Dilemma, and to other, more typical, games. In doing this, we will need to introduce, define and illustrate the basic elements and techniques of game theory. To this job we therefore now turn.
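
As a brief illustration of the parametric form of the bridge-crossing example in this section, the choice can be written as a calculation that the chooser carries out alone, since nothing in the environment reacts to her plans. The Python sketch below is illustrative only. The 10% and 20% risks come from the example; the utility values and the hike cost are hypothetical numbers chosen for the illustration, and treating them as meaningful magnitudes anticipates the cardinal utility functions that the article introduces later.

```python
# Parametric bridge choice: each option is scored in isolation because
# neither the rocks nor the cobras respond to the chooser's reasoning.

# Probability of a fatal mishap at each bridge (figures from the example).
DEATH_RISK = {"safe": 0.0, "rocks": 0.10, "cobras": 0.20}

# Hypothetical utilities (assumptions, not from the article).
U_ALIVE = 100.0
U_DEAD = 0.0


def expected_utility(bridge, hike_cost=0.0):
    """Expected utility of a bridge; hike_cost applies only to the safe bridge."""
    p = DEATH_RISK[bridge]
    cost = hike_cost if bridge == "safe" else 0.0
    return (1 - p) * U_ALIVE + p * U_DEAD - cost


def rank_bridges(hike_cost=0.0):
    """Bridges ordered from most to least preferred."""
    return sorted(
        DEATH_RISK,
        key=lambda b: expected_utility(b, hike_cost),
        reverse=True,
    )


if __name__ == "__main__":
    print(rank_bridges())                # no hike needed
    print(rank_bridges(hike_cost=15.0))  # safe bridge is a costly hike away
```

Under these assumed numbers, the safe bridge stays on top until the hike costs more than 10 units of utility, at which point the bridge with the rocks is preferred. The pursuit version of the problem cannot be handled this way, because the pursuer's choice depends on the fugitive's reasoning.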

2. Basic Elements and Assumptions of Game Theory

2.1 Utility

An agent is, by definition, an entity with preferences. Game theorists, like economists and philosophers studying rational decision-making, describe these by means of an abstract concept called utility. This refers to the amount of ‘welfare’ an agent derives from an object or an event. By ‘welfare’ we refer to some normative index of relative well-being, justified by reference to some background framework. For example, we might evaluate the relative welfare of countries (which we might model as agents for some purposes) by reference to their per capita incomes, and we might evaluate the relative welfare of an animal, in the context of predicting and explaining its behavioral dispositions, by reference to its expected fitness. In the case of people, it is most typical in economics and applications of game theory to evaluate their relative welfare by reference to their own implicit or explicit judgments of it. Thus a person who, say, adores the taste of pickles but dislikes onions would be said to associate higher utility with states of the world in which, all else being equal, she consumes more pickles and fewer onions than with states in which she consumes more onions and fewer pickles. Examples of this kind suggest that ‘utility’ denotes a measure of subjective psychological fulfillment, and this is indeed how the concept was generally (though not always) interpreted prior to the 1930s. During that decade, however, economists and philosophers under the influence of behaviourism objected to the theoretical use of such unobservable entities as ‘psychological fulfillment quotients.’ The economist Paul Samuelson (1938) therefore set out to define utility in such a way that it becomes a purely technical concept. That is, when we say that an agent acts so as to maximize her utility, we mean by ‘utility’ simply whatever it is that the agent's behavior suggests her to consistently desire.
If this looks circular to you, it should: theorists who follow Samuelson intend the statement ‘agents act so as to maximize their utility’ as a tautology. Like other tautologies occurring in the foundations of scientific theories, it is useful not in itself, but because it helps to fix our contexts of inquiry.

Though we might no longer be moved by scruples derived from psychological behaviorism, many theorists continue to follow Samuelson's way of understanding utility because they think it important that game theory apply to any kind of agent (a person, a bear, a bee, a firm or a country) and not just to agents with human minds. When such theorists say that agents act so as to maximize their utility, they want this to be part of the definition of what it is to be an agent, not an empirical claim about possible inner states and motivations. Samuelson's conception of utility, defined by way of Revealed Preference Theory (RPT) introduced in his classic paper (Samuelson (1938)), satisfies this demand.

Some other theorists understand the point of game theory differently. They view game theory as providing an explanatory account of strategic reasoning. For this idea to be applicable, we must suppose that agents at least sometimes do what they do in non-parametric settings because game-theoretic logic recommends certain actions as the rational ones. Still other theorists interpret game theory normatively, as advising agents on what to do in strategic contexts in order to maximize their utility. Fortunately for our purposes, all of these ways of thinking about the possible uses of game theory are compatible with the tautological interpretation of utility maximization. The philosophical differences are not idle from the perspective of the working game theorist, however.
As we will see in a later section, those who hope to use game theory to explain strategic reasoning, as opposed to merely strategic behavior, face some special philosophical and practical problems.

Since game theory involves formal reasoning, we must have a device for thinking of utility maximization in mathematical terms. Such a device is called a utility function. The utility-map for an agent is called a ‘function’ because it maps ordered preferences onto the real numbers. Suppose that agent x prefers bundle a to bundle b and bundle b to bundle c. We then map these onto a list of numbers, where the function maps the highest-ranked bundle onto the largest number in the list, the second-highest-ranked bundle onto the next-largest number in the list, and so on, thus:

  bundle a >> 3
  bundle b >> 2
  bundle c >> 1

The only property mapped by this function is order. The magnitudes of the numbers are irrelevant; that is, it must not be inferred that x gets 3 times as much utility from bundle a as she gets from bundle c. Thus we could represent exactly the same utility function as that above by

  bundle a >> 7,326
  bundle b >> 12.6
  bundle c >> −1,000,000

The numbers featuring in an ordinal utility function are thus not measuring any quantity of anything. A utility-function in which magnitudes do matter is called ‘cardinal’. Whenever someone refers to a utility function without specifying which kind is meant, you should assume that it's ordinal. These are the sorts we'll need for the first set of games we'll examine. Later, when we come to seeing how to solve games that involve randomization (our river-crossing game from Section 1 above, for example) we'll need to build cardinal utility functions. The technique for doing this was given by von Neumann & Morgenstern (1947), and was an essential aspect of their invention of game theory.
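
The claim that the two number lists above represent the same ordinal utility function can be checked mechanically. The following Python sketch is an illustration, not part of the original article: the dictionary layout and the helper names `ranking` and `same_ordinal_function` are hypothetical, and the check assumes there are no ties between bundles.

```python
# Two numerical assignments for agent x, taken from the lists above.
u_first = {"a": 3, "b": 2, "c": 1}
u_second = {"a": 7326, "b": 12.6, "c": -1_000_000}


def ranking(utility):
    """Bundles ordered from most to least preferred (assumes no ties)."""
    return sorted(utility, key=utility.get, reverse=True)


def same_ordinal_function(u, v):
    """True when u and v order the bundles identically.

    Only the order is compared; the magnitudes play no role, which is
    exactly what makes the representation ordinal.
    """
    return ranking(u) == ranking(v)


if __name__ == "__main__":
    print(ranking(u_first))
    print(same_ordinal_function(u_first, u_second))
```

An assignment that reversed any pair, for example giving bundle b a larger number than bundle a, would represent a different ordinal utility function even if the numbers involved were close to the originals.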
For the moment, however, we will need only ordinal functions.

2.2 Games and Information

Any situation in which at least one agent can only act to maximize his utility through anticipating (either consciously, or just implicitly in his behavior) the responses to his actions by one or more other agents is called a game. Agents involved in games are referred to as players. If all agents have optimal actions regardless of what the others do, as in purely parametric situations or conditions of monopoly or perfect competition (see Section 1 above), we can model this without appeal to game theory; otherwise, we need it.

We assume that players are economically rational. That is, a player can (i) assess outcomes; (ii) calculate paths to outcomes; and (iii) choose actions that yield their most-preferred outcomes, given the actions of the other players. This rationality might in some cases be internally computed by the agent. In other cases, it might simply be embodied in behavioral dispositions built by natural, cultural or economic selection. In particular, in calling an action ‘chosen’ we imply no necessary deliberation, conscious or otherwise. We mean merely that the action was taken when an alternative action was available, in some sense of ‘available’ normally established by the context of the particular analysis.

Each player in a game faces a choice among two or more possible strategies. A strategy is a predetermined ‘programme of play’ that tells her what actions to take in response to every possible strategy other players might use. The significance of the phrase ‘every possible strategy other players might use’ will become clear when we take up some sample games below.

A crucial aspect of the specification of a game involves the information that players have when they choose strategies.
The simplest games (from the perspective of logical structure) are those in which agents have perfect information, meaning that at every point where each agent's strategy tells her to take an action, she knows everything that has happened in the game up to that point. A board-game of sequential moves in which both players watch all the action (and know the rules in common), such as chess, is an instance of such a game. By contrast, the example of the bridge-crossing game from Section 1 above illustrates a game of imperfect information, since the fugitive must choose a bridge to cross without knowing the bridge at which the pursuer has chosen to wait, and the pursuer similarly makes her decision in ignorance of the actions of her quarry. Since game theory is about rational action given the strategically significant actions of others, it should not surprise you to be told that what agents in games know, or fail to know, about each other's actions makes a considerable difference to the logic of our analyses, as we will see.

2.3 Trees and Matrices

The difference between games of perfect and of imperfect information is closely related to (though certainly not identical with!) a distinction between ways of representing games that is based on order of play. Let us begin by distinguishing between sequential-move and simultaneous-move games in terms of information. It is natural, as a first approximation, to think of sequential-move games as being ones in which players choose their strategies one after the other, and of simultaneous-move games as ones in which players choose their strategies at the same time. This isn't quite right, however, because what is of strategic importance is not the temporal order of events per se, but whether and when players know about other players' actions relative to having to choose their own.
For example, if two competing businesses are both planning marketing campaigns, one might commit to its strategy months before the other does; but if neither knows what the other has committed to or will commit to when they make their decisions, this is a simultaneous-move game. Chess, by contrast, is normally played as a sequential-move game: you see what your opponent has done before choosing your own next action. (Chess can be turned into a simultaneous-move game if the players each call moves while isolated from the common board; but this is a very different game from conventional chess.)

It was said above that the distinction between sequential-move and simultaneous-move games is not identical to the distinction between perfect-information and imperfect-information games. Explaining why this is so is a good way of establishing full understanding of both sets of concepts. As simultaneous-move games were characterized in the previous paragraph, it must be true that all simultaneous-move games are games of imperfect information. However, some games may contain mixes of sequential and simultaneous moves. For example, two firms might commit to their marketing strategies independently and in secrecy from one another, but thereafter engage in pricing competition in full view of one another. If the optimal marketing strategies were partially or wholly dependent on what was expected to happen in the subsequent pricing game, then the two stages would need to be analyzed as a single game, in which a stage of sequential play followed a stage of simultaneous play. Whole games that involve mixed stages of this sort are games of imperfect information, however temporally staged they might be. Games of perfect information (as the name implies) denote cases where no moves are simultaneous (and where no player ever forgets what has gone before).

It was said above that games of perfect information are the (logically) simplest sorts of games.
This is so because in such games (as long as the games are finite, that is, terminate after a known number of actions) players and analysts can use a straightforward procedure for predicting outcomes. A rational player in such a game chooses her first action by considering each series of responses and counter-responses that will result from each action open to her. She then asks herself which of the available final outcomes brings her the highest utility, and chooses the action that starts the chain leading to this outcome. This process is called backward induction (because the reasoning works backwards from eventual outcomes to present decision problems).

We will have much more to say about backward induction and its properties in a later section (when we come to discuss equilibrium and equilibrium selection). For now, we have described it just in order to use it to introduce one of the two types of mathematical objects used to represent games: game-trees. A game-tree is an example of what mathematicians call a directed graph. That is, it is a set of connected nodes in which the overall graph has a direction. We can draw trees from the top of the page to the bottom, or from left to right. In the first case, nodes at the top of the page are interpreted as coming earlier in the sequence of actions. In the case of a tree drawn from left to right, leftward nodes are prior in the sequence to rightward ones. An unlabelled tree has a structure of the following sort:

Figure 1

The point of representing games using trees can best be grasped by visualizing the use of them in supporting backward-induction reasoning. Just imagine the player (or analyst) beginning at the end of the tree, where outcomes are displayed, and then working backwards from these, looking for sets of strategies that describe paths leading to them. Since a player's utility function indicates which outcomes she prefers to which, we also know which paths she will prefer.
Of course, not all paths will be possible because the other player has a role in selecting paths too, and won't take actions that lead to less preferred outcomes for him. We will present some examples of this interactive path-selection, and detailed techniques for reasoning through them, after we have described a situation we can use a tree to depict.

Trees are used to represent sequential games, because they show the order in which actions are taken by the players. However, games are sometimes represented on matrices rather than trees. This is the second type of mathematical object used to represent games. Matrices, unlike trees, simply show the outcomes, represented in terms of the players' utility functions, for every possible combination of strategies the players might use. For example, it makes sense to display the river-crossing game from Section 1 on a matrix, since in that game both the fugitive and the hunter have just one move each, and each chooses their move in ignorance of what the other has decided to do. Here, then, is part of the matrix:

Figure 2

The fugitive's three possible strategies (cross at the safe bridge, risk the rocks and risk the cobras) form the rows of the matrix. Similarly, the hunter's three possible strategies (waiting at the safe bridge, waiting at the rocky bridge and waiting at the cobra bridge) form the columns of the matrix. Each cell of the matrix shows (or, rather, would show if our matrix was complete) an outcome defined in terms of the players' payoffs. A player's payoff is simply the number assigned by her ordinal utility function to the state of affairs corresponding to the outcome in question. For each outcome, Row's payoff is always listed first, followed by Column's. Thus, for example, the upper left-hand corner above shows that when the fugitive crosses at the safe bridge and the hunter is waiting there, the fugitive gets a payoff of 0 and the hunter gets a payoff of 1.
We interpret these by reference to their utility functions, which in this game are very simple. If the fugitive gets safely across the river he receives a payoff of 1; if he doesn't he gets 0. If the fugitive doesn't make it, either because he's shot by the hunter or hit by a rock or struck by a cobra, then the hunter gets a payoff of 1 and the fugitive gets a payoff of 0.

We'll briefly explain the parts of the matrix that have been filled in, and then say why we can't yet complete the rest. Whenever the hunter waits at the bridge chosen by the fugitive, the fugitive is shot. These outcomes all deliver the payoff vector (0, 1). You can find them descending diagonally across the matrix above from the upper left-hand corner. Whenever the fugitive chooses the safe bridge but the hunter waits at another, the fugitive gets safely across, yielding the payoff vector (1, 0). These two outcomes are shown in the second two cells of the top row. All of the other cells are marked, for now, with question marks. Why? The problem here is that if the fugitive crosses at either the rocky bridge or the cobra bridge, he introduces parametric factors into the game. In these cases, he takes on some risk of getting killed, and so producing the payoff vector (0, 1), that is independent of anything the hunter does. We don't yet have enough concepts introduced to be able to show how to represent these outcomes in terms of utility functions, but by the time we're finished we will, and this will provide the key to solving our puzzle from Section 1.

Matrix games are referred to as ‘normal-form’ or ‘strategic-form’ games, and games as trees are referred to as ‘extensive-form’ games. The two sorts of games are not equivalent, because extensive-form games contain information (about sequences of play and players' levels of information about the game structure) that strategic-form games do not.
In general, a strategic-form game could represent any one of several extensive-form games, so a strategic-form game is best thought of as being a set of extensive-form games. When order of play is irrelevant to a game's outcome, then you should study its strategic form, since it's the whole set you want to know about. Where order of play is relevant, the extensive form must be specified or your conclusions will be unreliable.

2.4 The Prisoner's Dilemma as an Example of Strategic-Form vs. Extensive-Form Representation

The distinctions described above are difficult to fully grasp if all one has to go on are abstract descriptions. They're best illustrated by means of an example. For this purpose, we'll use the most famous game: the Prisoner's Dilemma. It in fact gives the logic of the problem faced by Cortez's and Henry V's soldiers (see Section 1 above), and by Hobbes's agents before they empower the tyrant. However, for reasons which will become clear a bit later, you should not take the PD as a typical game; it isn't. We use it as an extended example here only because it's particularly helpful for illustrating the relationship between strategic-form and extensive-form games (and later, for illustrating the relationships between one-shot and repeated games; see Section 4 below).

The name of the Prisoner's Dilemma game is derived from the following situation typically used to exemplify it. Suppose that the police have arrested two people who they know have committed an armed robbery together. Unfortunately, they lack enough admissible evidence to get a jury to convict. They do, however, have enough evidence to send each prisoner away for two years for theft of the getaway car. The chief inspector now makes the following offer to each prisoner: If you will confess to the robbery, implicating your partner, and she does not also confess, then you'll go free and she'll get ten years. If you both confess, you'll each get 5 years.
If neither of you confesses, then you'll each get two years for the auto theft.

Our first step in modeling your situation as a game is to represent it in terms of utility functions. Your and your partner's utility functions are identical:

Go free >> 4
2 years >> 3
5 years >> 2
10 years >> 0

The numbers in the function above are now used to express your and your partner's payoffs in the various outcomes possible in your situation. We will refer to you as ‘Player I’ and to your partner as ‘Player II’. Now we can represent your entire situation on a matrix; this is the strategic form of your game:

Figure 3

Each cell of the matrix gives the payoffs to both players for each combination of actions. Player I's payoff appears as the first number of each pair, Player II's as the second. So, if both of you confess you each get a payoff of 2 (5 years in prison each). This appears in the upper-left cell. If neither of you confesses, you each get a payoff of 3 (2 years in prison each). This appears as the lower-right cell. If you confess and your partner doesn't, you get a payoff of 4 (going free) and she gets a payoff of 0 (ten years in prison). This appears in the upper-right cell. The reverse situation, in which she confesses and you refuse, appears in the lower-left cell.

You evaluate your two possible actions here by comparing your payoffs in each column, since this shows you which of your actions is preferable for each possible action by your partner. So, observe: If your partner confesses then you get a payoff of 2 by confessing and a payoff of 0 by refusing. If your partner refuses, you get a payoff of 4 by confessing and a payoff of 3 by refusing. Therefore, you're better off confessing regardless of what she does. Your partner, meanwhile, evaluates her actions by comparing her payoffs down each row, and she comes to exactly the same conclusion that you do.
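The column-by-column and row-by-row comparisons just described can be written out as a short check. This is an illustrative sketch only: the dictionary layout and the function names `better_for_row_player` and `better_for_col_player` are assumptions made for the example, while the payoffs are the ones stated in the text (Player I's payoff first, Player II's second).

```python
ACTIONS = ["confess", "refuse"]

# PD[(Player I's action, Player II's action)] = (payoff to I, payoff to II)
PD = {
    ("confess", "confess"): (2, 2),  # upper-left cell
    ("confess", "refuse"): (4, 0),   # upper-right cell
    ("refuse", "confess"): (0, 4),   # lower-left cell
    ("refuse", "refuse"): (3, 3),    # lower-right cell
}


def better_for_row_player(game, a, b):
    """True if Player I does strictly better with a than with b
    against every action Player II might take (compare within each column)."""
    return all(game[(a, c)][0] > game[(b, c)][0] for c in ACTIONS)


def better_for_col_player(game, a, b):
    """True if Player II does strictly better with a than with b
    against every action Player I might take (compare within each row)."""
    return all(game[(r, a)][1] > game[(r, b)][1] for r in ACTIONS)


print(better_for_row_player(PD, "confess", "refuse"))
print(better_for_col_player(PD, "confess", "refuse"))
```

Both checks succeed for confessing over refusing, and neither succeeds in the opposite direction, which is the conclusion each prisoner reaches by inspecting the matrix.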
Wherever one action for a player is superior to her other actions for each possible action by the opponent, we say that the first action strictly dominates the second one. In the PD, then, confessing strictly dominates refusing for both players. Both players know this about each other, thus entirely eliminating any temptation to depart from the strictly dominant path. Thus both players will confess, and both will go to prison for 5 years.

The players, and analysts, can predict this outcome using a mechanical procedure, known as iterated elimination of strictly dominated strategies. You, as Player I, can see by examining the matrix that your payoffs in each cell of the top row are higher than your payoffs in each corresponding cell of the bottom row. Therefore, it can never be rational for you to play your bottom-row strategy, viz., refusing to confess, regardless of what your opponent does. Since your bottom-row strategy will never be played, we can simply delete the bottom row from the matrix. Now it is obvious that Player II will not refuse to confess, since her payoff from confessing in the two cells that remain is higher than her payoff from refusing. So, once again, we can delete the one-cell column on the right from the game. We now have only one cell remaining, that corresponding to the outcome brought about by mutual confession. Since the reasoning that led us to delete all other possible outcomes depended at each step only on the premise that both players are economically rational (that is, prefer higher payoffs to lower ones), there are very strong grounds for viewing joint confession as the solution to the game, the outcome on which its play must converge. You should note that the order in which strictly dominated rows and columns are deleted doesn't matter. Had we begun by deleting the right-hand column and then deleted the bottom row, we would have arrived at the same solution.

It's been said a couple of times that the PD is not a typical game in many respects.
One of these respects is that all its rows and columns are either strictly dominated or strictly dominant. In any strategic-form game where this is true, iterated elimination of strictly dominated strategies is guaranteed to yield a unique solution. Later, however, we will see that for many games this condition does not apply, and then our analytic task is less straightforward.

You will probably have noticed something disturbing about the outcome of the PD. Had you both refused to confess, you'd have arrived at the lower-right outcome in which you each go to prison for only 2 years, thereby both earning higher utility than you receive when you confess. This is the most important fact about the PD, and its significance for game theory is quite general. We'll therefore return to it below when we discuss equilibrium concepts in game theory. For now, however, let us stay with our use of this particular game to illustrate the difference between strategic and extensive forms.

When people introduce the PD into popular discussions, you will sometimes hear them say that the police inspector must lock his prisoners into separate rooms so that they can't communicate with one another. The reasoning behind this idea seems obvious: if you could communicate, you'd surely see that you're both better off refusing, and could make an agreement to do so, no? This, one presumes, would remove your conviction that you must confess because you'll otherwise be sold up the river by your partner. In fact, however, this intuition is misleading and its conclusion is false.

When we represent the PD as a strategic-form game, we implicitly assume that the prisoners can't attempt collusive agreement since they choose their actions simultaneously. In this case, agreement before the fact can't help. If you are convinced that your partner will stick to the bargain then you can seize the opportunity to go scot-free by confessing.
Of course, you realize that the same temptation will occur to her; but in that case you again want to make sure you confess, as this is your only means of avoiding your worst outcome. Your agreement comes to naught because you have no way of enforcing it; it constitutes what game theorists call ‘cheap talk’.

But now suppose that you do not move simultaneously. That is, suppose that one of you can choose after observing the other's action. This is the sort of situation that people who think non-communication important must have in mind. Now you can see that your partner has remained steadfast when it comes to your choice, and you need not be concerned about being suckered. However, this doesn't change anything, a point that is best made by re-representing the game in extensive form. This gives us our opportunity to introduce game-trees and the method of analysis appropriate to them.

First, however, here are definitions of some concepts that will be helpful in analyzing game-trees:

Node: A point at which a player takes an action.
Initial node: The point at which the first action in the game occurs.
Terminal node: Any node which, if reached, ends the game. Each terminal node corresponds to an outcome.
Subgame: Any set of nodes and branches descending uniquely from one node.
Payoff: an ordinal utility number assigned to a player at an outcome.
Outcome: an assignment of a set of payoffs, one to each player in the game.
Strategy: a program instructing a player which action to take at every node in the tree where she could possibly be called on to make a choice.

These quick definitions may not mean very much to you until you follow them being put to use in our analyses of trees below. It will probably be best if you scroll back and forth between them and the examples as we work through them. By the time you understand each example, you'll find the concepts and their definitions quite natural and intuitive.
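For readers who find code easier to follow than definitions, here is a rough sketch of how the concepts above, together with the backward-induction procedure described in Section 2.3, might be represented in Python. The `Node` class, the `leaf` helper and the `backward_induction` function are hypothetical names chosen for this illustration. The tree assumes, for the sake of the example, that Player I moves first and that Player II observes that move; the terminal payoffs are taken from the strategic-form PD matrix.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    # Index of the player who acts here (0 = Player I, 1 = Player II);
    # None at a terminal node.
    player: Optional[int] = None
    # Outcome at a terminal node: one payoff per player.
    payoffs: Optional[tuple] = None
    # Maps each available action to the node it leads to.
    children: dict = field(default_factory=dict)


def leaf(*payoffs):
    """Build a terminal node carrying the given outcome."""
    return Node(payoffs=payoffs)


def backward_induction(node):
    """Return (outcome, list of actions) reached when every player,
    working back from the terminal nodes, picks her best action.
    Ties are broken in favour of the action listed first (an assumption)."""
    if not node.children:
        return node.payoffs, []
    best = None
    for action, child in node.children.items():
        outcome, path = backward_induction(child)
        if best is None or outcome[node.player] > best[0][node.player]:
            best = (outcome, [action] + path)
    return best


# Sequential PD: Player I chooses first, then Player II chooses.
tree = Node(player=0, children={
    "refuse": Node(player=1, children={
        "refuse": leaf(3, 3),
        "confess": leaf(0, 4),
    }),
    "confess": Node(player=1, children={
        "refuse": leaf(4, 0),
        "confess": leaf(2, 2),
    }),
})

outcome, path = backward_induction(tree)
print(outcome, path)
```

Each recursive call handles one subgame: the function first settles what would happen below a node and only then lets the player at that node choose. Under the assumptions stated, the procedure selects mutual confession, with payoffs (2, 2).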
To make this exercise maximally instructive, let's suppose that you and your partner have studied the matrix above and, seeing that you're both better off in the outcome represented by the lower-right cell, have formed an agreement to cooperate. You are to commit to refusal first, at which point she will reciprocate. We will refer to a strategy of keeping the agreement as ‘cooperation’, and will denote it in the tree below with ‘C’. We will refer to a strategy of breaking the agreement as ‘defection’, and will denote it on the tree below with ‘D’. As before, you are I and your partner is II. Each node is numbered 1, 2, 3, … , from top to bottom, for ease of reference in discussion. Here, then, is the tree:

Figure 4

Look first at each of the terminal nodes (those along the bottom). These represent possible outcomes. Each is identified with an assignment of payoffs, just as in the strategic-form game, with I's payoff appearing first in each set and II's appearing second. Each of the structures descending from the nodes 1, 2 and 3 respectively is a subgame. We begin our backward-induction analysis, using a technique called Zermelo's algorithm, with the subgames that arise last in the sequence of play. If the subgame descending from node 3 is played, then Player II will face a choice between a payoff of 4 and a payoff of 3. (Consult the second number, representing her payoff, in each set at a terminal node descending from node 3.) II earns her higher payoff by playing D. We may therefore replace the entire subgame with an assignment of the payoff (0,4) directly to node 3, since this is the outcome that will be realized if the game reaches that node. Now consider the subgame descending from node 2. Here, II faces a choice between a payoff of 2 and one of 0. She obtains her higher one, 2, by playing D. We may therefore assign the payoff (2,2) directly to node 2. Now we move to the subgame descending from node 1.
(This subgame is, of course, identical to the whole game; all games are subgames of themselves.) You (Player I) now face a choice between outcomes (2,2) and (0,4). Consulting the first numbers in each of these sets, you see that you get your higher payoff, 2, by playing D. D is, of course, the option of confessing. So you confess, and then your partner also confesses, yielding the same outcome as in the strategic-form representation.

What has happened here is that you realize that if you play C (refuse to confess) at node 1, then your partner will be able to maximize her utility by suckering you and playing D. (On the tree, this happens at node 3.) This leaves you with a payoff of 0 (ten years in prison), which you can avoid only by playing D to begin with. You therefore defect from the agreement.

We have thus seen that in the case of the Prisoner's Dilemma, the simultaneous and sequential versions yield the same outcome. This will often not be true, however. In particular, only finite extensive-form (sequential) games of perfect information can be solved using Zermelo's algorithm.

As noted earlier in this section, sometimes we must represent simultaneous moves within games that are otherwise sequential. (As we said above, in all such cases the game as a whole will be one of imperfect information, so we won't be able to solve it using Zermelo's algorithm.) We represent such games using the device of information sets. Consider the following tree:

Figure 5

The oval drawn around nodes b and c indicates that they lie within a common information set. This means that at these nodes players cannot infer back up the path from whence they came; II does not know, in choosing her strategy, whether she is at b or c. (For this reason, what properly bear numbers in extensive-form games are information sets, conceived as ‘action points’, rather than nodes themselves; this is why the nodes inside the oval are labelled with letters rather than numbers.)
Put another way, II, when choosing, does not know what I has done at node a. But you will recall from earlier in this section that this is just what defines two moves as simultaneous. We can thus see that the method of representing games as trees is entirely general. If no node after the initial node is alone in an information set on its tree, so that the game has only one subgame (itself), then the whole game is one of simultaneous play. If at least one node shares its information set with another, while others are alone, the game involves both simultaneous and sequential play, and so is still a game of imperfect information. Only if all information sets are inhabited by just one node do we have a game of perfect information.

2.5 Solution Concepts and Equilibria

In the Prisoner's Dilemma, the outcome we've represented as (2,2), indicating mutual defection, was said to be the ‘solution’ to the game. Following the general practice in economics, game theorists refer to the solutions of games as equilibria. Philosophically minded readers will want to pose a conceptual question right here: What is ‘equilibrated’ about some game outcomes such that we are motivated to call them ‘solutions’? When we say that a physical system is in equilibrium, we mean that it is in a stable state, one in which all the causal forces internal to the system balance each other out and so leave it ‘at rest’ until and unless it is perturbed by the intervention of some exogenous (that is, ‘external’) force. This is what economists have traditionally meant in talking about ‘equilibria’; they read economic systems as being networks of causal relations, just like physical systems, and the equilibria of such systems are then their endogenously stable states. As we will see after discussing evolutionary game theory in a later section, it is possible to maintain this understanding of equilibria in the case of game theory.
However, as we noted in Section 2.1, some people interpret game theory as being an explanatory theory of strategic reasoning. For them, a solution to a game must be an outcome that a rational agent would predict using the mechanisms of rational computation alone. Such theorists face some puzzles about solution concepts that aren't so important for the behaviorist. We will be visiting such puzzles and their possible solutions throughout the rest of this article.

It's useful to start the discussion here from the case of the Prisoner's Dilemma because it's unusually simple from the perspective of these puzzles. What we referred to as its ‘solution’ is the unique Nash equilibrium of the game. (The ‘Nash’ here refers to John Nash, the Nobel Prize-winning mathematician who in Nash (1950) did most to extend and generalize von Neumann & Morgenstern's pioneering work.) Nash equilibrium (henceforth ‘NE’) applies (or fails to apply, as the case may be) to whole sets of strategies, one for each player in a game. A set of strategies is a NE just in case no player could improve her payoff, given the strategies of all other players in the game, by changing her strategy. Notice how closely this idea is related to the idea of strict dominance: no strategy could be a NE strategy if it is strictly dominated. Therefore, if iterative elimination of strictly dominated strategies takes us to a unique outcome, we know we have found the game's unique NE. Now, almost all theorists agree that avoidance of strictly dominated strategies is a minimum requirement of rationality. This implies that if a game has an outcome that is a unique NE, as in the case of joint confession in the PD, that must be its unique solution. This is one of the most important respects in which the PD is an ‘easy’ (and atypical) game.

We can specify one class of games in which NE is always not only necessary but sufficient as a solution concept. These are finite perfect-information games that are also zero-sum.
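The definition of NE given above (no player could improve her payoff by changing her strategy, holding the others' strategies fixed) can be tested cell by cell on the strategic-form PD. The sketch below is illustrative only and is restricted to the pure strategies of a two-player matrix game; the function name `pure_nash_equilibria` and the dictionary layout are assumptions made for the example, while the payoffs are those of the PD as given in Section 2.4.

```python
from itertools import product

ACTIONS = ["confess", "refuse"]

# PD[(Player I's action, Player II's action)] = (payoff to I, payoff to II)
PD = {
    ("confess", "confess"): (2, 2),
    ("confess", "refuse"): (4, 0),
    ("refuse", "confess"): (0, 4),
    ("refuse", "refuse"): (3, 3),
}


def pure_nash_equilibria(game, row_actions, col_actions):
    """List every cell in which neither player can gain by
    unilaterally switching to another of her own actions."""
    equilibria = []
    for r, c in product(row_actions, col_actions):
        u_row, u_col = game[(r, c)]
        # Player I keeps the column fixed and considers other rows.
        row_cannot_gain = all(game[(r2, c)][0] <= u_row for r2 in row_actions)
        # Player II keeps the row fixed and considers other columns.
        col_cannot_gain = all(game[(r, c2)][1] <= u_col for c2 in col_actions)
        if row_cannot_gain and col_cannot_gain:
            equilibria.append((r, c))
    return equilibria


print(pure_nash_equilibria(PD, ACTIONS, ACTIONS))
```

For the PD the only cell that passes both tests is joint confession, which agrees with the result of eliminating strictly dominated strategies.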
A zero-sum game (in the case of a game involving just two players) is one in which one player can only be made better off by making the other player worse off. (Tic-tac-toe is a simple example of such a game: any move that brings me closer to winning brings you closer to losing, and vice-versa.) We can determine whether a game is zero-sum by examining players' utility functions: in zero-sum games these will be mirror-images of each other, with one player's highly ranked outcomes being low-ranked for the other and vice-versa. In such a game, if I am playing a strategy such that, given your strategy, I can't do any better, and if you are also playing such a strategy, then, since any change of strategy by me would have to make you worse off and vice-versa, it follows that our game can have no solution compatible with our mutual rationality other than its unique NE. We can put this another way: in a zero-sum game, my playing a strategy that maximizes my minimum payoff if you play the best you can, and your simultaneously doing the same thing, is just equivalent to our both playing our best strategies, so this pair of so-called ‘maximin’ procedures is guaranteed to find the unique solution to the game, which is its unique NE. (In tic-tac-toe, this is a draw. You can't do any better than drawing, and neither can I, if both of us are trying to win and trying not to lose.)

However, most games do not have this property. It won't be possible, in this one article, to enumerate all of the ways in which games can be problematic from the perspective of their possible solutions. (For one thing, it is highly unlikely that theorists have yet discovered all of the possible problems!) However, we can try to generalize the issues a bit.

First, there is the problem that in most non-zero-sum games, there is more than one NE, but not all NE look equally plausible as the solutions upon which strategically rational players would hit. Consider the strategic-form game below (taken from Kreps (1990), p.
403):

Figure 6

This game has two NE: s1-t1 and s2-t2. (Note that no rows or columns are strictly dominated here. But if Player I is playing s1 then Player II can do no better than t1, and vice-versa; and similarly for the s2-t2 pair.) If NE is our only solution concept, then we shall be forced to say that either of these outcomes is equally persuasive as a solution. However, if game theory is regarded as an explanatory and/or normative theory of strategic reasoning, this seems to be leaving something out: surely rational players with perfect information would converge on s1-t1? (Note that this is not like the situation in the PD, where the socially superior situation is unachievable because it is not a NE. In the case of the game above, both players have every reason to try to converge on the NE in which they are better off.)

This illustrates the fact that NE is a relatively (logically) weak solution concept, often failing to predict intuitively sensible solutions because, if applied alone, it refuses to allow players to use principles of equilibrium selection that, if not demanded by rationality, are at least not irrational.

Consider another example from Kreps (1990), p. 397:

Figure 7

Here, no strategy strictly dominates another. However, Player I's top row, s1, weakly dominates s2, since I does at least as well using s1 as s2 for any reply by Player II, and on one reply by II (t2), I does better. So should not the players (and the analyst) delete the weakly dominated row s2? When they do so, column t1 is then strictly dominated, and the NE s1-t2 is selected as the unique solution. However, as Kreps goes on to show using this example, the idea that weakly dominated strategies should be deleted just like strict ones has odd consequences. Suppose we change the payoffs of the game just a bit, as follows:

Figure 8

s2 is still weakly dominated as before; but of our two NE, s2-t1 is now the most attractive for both players; so why should the analyst eliminate its possibility?
(Note that this game, again, does not replicate the logic of the PD. There, it makes sense to eliminate the most attractive outcome, joint refusal to confess, because both players h…

Smith, V. (1962). An Experimental Study of Competitive Market Behavior. Journal of Political Economy 70:111-137.

Smith, V. (1964). Effect of Market Organization on Competitive Equilibrium. Quarterly Journal of Economics 78:181-201.

Smith, V. (1965). Experimental Auction Markets and the Walrasian Hypothesis. Journal of Political Economy 73:387-393.

Smith, V. (1976). Bidding and Auctioning Institutions: Experimental Results. In Y. Amihud, ed., Bidding and Auctioning for Procurement and Allocation, 43-64. New York: New York University Press.

Smith, V. (1982). Microeconomic Systems as an Experimental Science. American Economic Review 72:923-955.

Sober, E., and Wilson, D.S. (1998). Unto Others. Cambridge, MA: Harvard University Press.

Sterelny, K. (2004). Thought in a Hostile World. Oxford: Blackwell.

Stratmann, T. (1997). Logrolling. In D. Mueller, ed., Perspectives on Public Choice, 322-341. Cambridge: Cambridge University Press.

Strotz, R. (1956). Myopia and Inconsistency in Dynamic Utility Maximization. The Review of Economic Studies 23:165–180.

Thurstone, L. (1931). The Indifference Function. Journal of Social Psychology 2:139-167.

Tomasello, M., M. Carpenter, J. Call, T. Behne and H. Moll (2004). Understanding and Sharing Intentions: The Origins of Cultural Cognition. Behavioral and Brain Sciences, forthcoming.

Vallentyne, P. (ed.). (1991). Contractarianism and Rational Choice. Cambridge: Cambridge University Press.

von Neumann, J., and Morgenstern, O. (1947). The Theory of Games and Economic Behavior. Princeton: Princeton University Press, 2nd edition.

Weibull, J. (1995). Evolutionary Game Theory. Cambridge, MA: MIT Press.

Yaari, M. (1987). The Dual Theory of Choice Under Risk. Econometrica 55:95-115.

Young, H.P. (1998). Individual Strategy and Social Structure. Princeton: Princeton University Press.


…the number of refinements that could be considered, since there may also be no limits on the set of philosophical intuitions about what principles a rational agent might or might not see fit to follow or to fear or hope that other players are following.

Behaviorists take a dim view of much of this activity. They see the job of game theory as being to predict outcomes given some distribution of strategic dispositions, and some distribution of expectations about the strategic dispositions of others, that have been shaped by institutional processes and/or evolutionary selection. (See Section 7 for further discussion.) On this view, which NE are viable in a game is determined by the underlying dynamics that equipped players with dispositions prior to the commencement of a game. The strategic natures of players are thereby treated as a set of exogenous inputs to the game, just as utility functions are. Behaviorists are thus less inclined to seek general refinements of the equilibrium concept itself, at least insofar as these involve the modeling of more sophisticated expressions of rationality over and above merely consistent maximization of utility. Behaviorists are often inclined to doubt that the goal of seeking a general theory of rationality makes sense as a project. Institutions and evolutionary processes build many environments, and what counts as rational procedure in one environment may not be favoured in another. Economic rationality requires only that agents have consistent preferences, that is, that they not prefer a to b and b to c and c to a. A great many arrangements of strategic dispositions are compatible with this minimal requirement, and evolutionary or institutional processes might generate games in any of them.
On this view, NE is a robust equilibrium concept because if players evolve their strategic dispositions in settings that are competitive, those who don't do what's optimal given the strategies of others in that specific environment will be outcompeted, and so selection will either eliminate them or encourage the learning of new dispositions. There is no more ‘refined’ concept of rationality of which this can be argued to be true in general; and so, according to behaviorists, refinements of NE based on refinements of rationality are likely to be of merely occasional interest.

This does not imply that behaviorists abjure all ways of restricting sets of NE to plausible subsets. In particular, they tend to be sympathetic to approaches that shift emphasis from rationality itself onto considerations of the informational dynamics of games. We should perhaps not be surprised that NE analysis alone often fails to tell us much of interest about strategic-form games (e.g., Figure 6 above), in which informational structure is suppressed. Equilibrium selection issues are often more fruitfully addressed in the context of extensive-form games.

2.6 Modular Rationality and Subgame Perfection

In order to deepen our understanding of extensive-form games, we need an example with more interesting structure than the PD offers. Consider the game described by this tree:

Figure 9

This game is not intended to fit any preconceived situation; it is simply a mathematical object in search of an application. (L and R here just denote ‘left’ and ‘right’ respectively.) Now consider the strategic form of this game:

Figure 10

(If you are confused by this, remember that a strategy must tell a player what to do at every information set where that player has an action. Since each player chooses between two actions at each of two information sets here, each player has four strategies in total.
The first letter in each strategy designation tells each player what to do if he or she reaches their first information set, the second what to do if their second information set is reached. I.e., LR for Player II tells II to play L if information set 5 is reached and R if information set 6 is reached.) If you examine this matrix, you will discover that (LL, RL) is among the NE. This is a bit puzzling, since if Player I reaches her second information set (7) in the extensive-form game, I would hardly wish to play L there; she earns a higher payoff by playing R at node 7. Mere NE analysis doesn't notice this because NE is insensitive to what happens off the path of play. Player I, in choosing L at node 4, ensures that node 7 will not be reached; this is what is meant by saying that it is ‘off the path of play’. In analyzing extensive-form games, however, we should care what happens off the path of play, because consideration of this is crucial to what happens on the path. For example, it is the fact that Player I would play R if node 7 were reached that would cause Player II to play L if node 6 were reached, and this is why Player I won't choose R at node 4. We are throwing away information relevant to game solutions if we ignore off-path outcomes, as mere NE analysis does. Notice that this reason for doubting that NE is a wholly satisfactory equilibrium concept in itself has nothing to do with intuitions about rationality, as in the case of the refinement concepts discussed in Section 2.5.

Now apply Zermelo's algorithm to the extensive form of our current example. Begin, again, with the last subgame, that descending from node 7. This is Player I's move, and she would choose R because she prefers her payoff of 5 to the payoff of 4 she gets by playing L. Therefore, we assign the payoff (5, -1) to node 7. Thus at node 6 II faces a choice between (-1, 0) and (5, -1). He chooses L. At node 5 II chooses R.
At node 4 I is thus choosing between (0, 5) and (-1, 0), and so plays L. Note that, as in the PD, an outcome appears at a terminal node, namely (4,5) from node 7, that is Pareto superior to the NE. Again, however, the dynamics of the game prevent it from being reached.

The fact that Zermelo's algorithm picks out the strategy vector (LR, RL) as the unique solution to the game shows that it's yielding something other than just an NE. In fact, it is generating the game's subgame perfect equilibrium (SPE). It gives an outcome that yields a NE not just in the whole game but in every subgame as well. This is a persuasive solution concept because, again unlike the refinements of Section 2.5, it does not demand ‘more’ rationality of agents, but less. (It does, however, assume that players not only know everything strategically relevant to their situation but also use all of that information; we must be careful not to confuse rationality with computational power.) The agents, at every node, simply choose the path that brings them the highest payoff in the subgame emanating from that node; and, then, in solving the game, they foresee that they will all do that. Agents who proceed in this way are said to be modular rational, that is, short-run rational at each step. They do not imagine themselves, by some fancy processes of hyper-rationality, acting against their local preferences for the sake of some wider goal. Note that, as in the PD, this can lead to outcomes which might be regretted from the social point of view. In our current example, Player I would be better off, and Player II no worse off, at the left-hand node emanating from node 7 than at the SPE outcome. But Player I's very modular rationality, and Player II's awareness of this, blocks the socially efficient outcome. If our players wish to bring about the more equitable outcome (4,5) here, they must do so by redesigning their institutions so as to change the structures of the games they play.
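The backward-induction reasoning used in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tree is reconstructed from the payoffs quoted in the text, since the figure itself is not reproduced here, and the payoff that follows II's playing L at node 5 is not given in the text, so the value (2, 2) below is a hypothetical placeholder. It affects nothing so long as II prefers the payoff of 5 from R. The names TREE and solve are likewise illustrative.

```python
# Payoffs are written (Player I, Player II). Each decision node maps to
# (index of the player who moves, {action: child}); a child is either
# another node number or a terminal payoff pair.
TREE = {
    4: (0, {"L": 5, "R": 6}),
    5: (1, {"L": (2, 2), "R": (0, 5)}),   # (2, 2) is an ASSUMED placeholder
    6: (1, {"L": (-1, 0), "R": 7}),
    7: (0, {"L": (4, 5), "R": (5, -1)}),
}


def solve(node, plan):
    """Return the payoff reached from `node` when every mover picks the
    action that is best for her in the subgame starting there, and record
    each choice in `plan` (node number -> action)."""
    if isinstance(node, tuple):          # terminal payoff pair
        return node
    player, actions = TREE[node]
    # Solve every subgame below this node first (this is the "backward" part).
    values = {action: solve(child, plan) for action, child in actions.items()}
    best = max(values, key=lambda action: values[action][player])
    plan[node] = best
    return values[best]


plan = {}
outcome = solve(4, plan)
print(sorted(plan.items()), outcome)
```

Reading the recorded plan as strategies, Player I plays L at node 4 and R at node 7, and Player II plays R at node 5 and L at node 6, which is the vector (LR, RL) with outcome (0, 5) described above.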
Merely wishing that they could be hyper-rational in some way does not seem altogether coherent as an approach.

2.7 On Interpreting Payoffs: Morality and Efficiency in Games

Many readers might suppose that the conclusion of the previous section has been asserted on the basis of no adequate defense. Surely, the players might be able to just see that outcome (4,5) is socially and morally superior; and since we know they can also see the path of actions that leads to it, who is the game theorist to announce that, within the game they're playing, it's unattainable? In fact, to suggest that hyper-rationality is a will o’ the wisp is philosophically tendentious, though it is indeed what behaviorists about game theory believe. The reader who seeks a thorough justification for this belief is referred to Binmore (1994, 1998). However, before we just leave matters at a stand-off (here), we must be careful not to confuse what is controversial with the consequences of a simple technical mistake. Consider the Prisoner's Dilemma again. We have seen that in the unique NE of the PD, both players get less utility than they could have through mutual cooperation. This may strike you (as it has struck many commentators) as perverse. Surely, you may think, it simply results from a combination of selfishness and paranoia on the part of the players. To begin with they have no regard for the social good, and then they shoot themselves in the feet by being too untrustworthy to respect agreements.

This way of thinking leads to serious misunderstandings of game theory, and so must be dispelled. Let us first introduce some terminology for talking about outcomes. Welfare economists typically measure social good in terms of Pareto efficiency. A distribution of utility β is said to be Pareto dominant over another distribution δ just in case from state δ there is a possible redistribution of utility to β such that at least one player is better off in β than in δ and no player is worse off.
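The definition just given is easy to state as a check on payoff vectors. The following Python sketch is only an illustration; the function name pareto_dominates is an assumption of this example, and the payoff pairs are the ones this article uses for the PD, (3,3) for mutual cooperation and (2,2) for mutual defection, and for the tree of Section 2.6.

```python
def pareto_dominates(beta, delta):
    """True if no player is worse off in beta than in delta and at least
    one player is strictly better off."""
    pairs = list(zip(beta, delta))
    nobody_worse = all(b >= d for b, d in pairs)
    someone_better = any(b > d for b, d in pairs)
    return nobody_worse and someone_better


# PD: mutual cooperation compared with mutual defection.
print(pareto_dominates((3, 3), (2, 2)))
# Section 2.6: the unreached outcome (4, 5) compared with the SPE outcome (0, 5).
print(pareto_dominates((4, 5), (0, 5)))
# Neither of these two dominates the other: each player prefers a different one.
print(pareto_dominates((5, -1), (0, 5)), pareto_dominates((0, 5), (5, -1)))
```

Note that Pareto dominance is only a partial ordering: many pairs of outcomes, like the last pair above, are simply not comparable by it.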
Failure to move to a Pareto-dominant redistribution is inefficient because the existence of β as a logical possibility shows that in δ some utility is being wasted. Now, the outcome (3,3) that represents mutual cooperation in our model of the PD is clearly Pareto dominant over mutual defection; at (3,3) both players are better off than at (2,2). So it is true that PDs lead to inefficient outcomes. This was true of our example in Section 2.6 as well.

However, inefficiency should not be associated with immorality. A utility function for a player is supposed to represent everything that player cares about, which may be anything at all. As we have described the situation of our prisoners they do indeed care only about their own relative prison sentences, but there is nothing essential in this. What makes a game an instance of the PD is strictly and only its payoff structure. Thus we could have two Mother Teresa types here, both of whom care little for themselves and wish only to feed starving children. But suppose the original Mother Teresa wishes to feed the children of Calcutta while Mother Juanita wishes to feed the children of Bogota. And suppose that the international aid agency will maximize its donation if the two saints nominate the same city, will give the second-highest amount if they nominate each other's cities, and the lowest amount if they each nominate their own city. Our saints are in a PD here, though hardly selfish or unconcerned with the social good.

To return to our prisoners, suppose that, contrary to our assumptions, they do value each other's well-being as well as their own. In that case, this must be reflected in their utility functions, and hence in their payoffs. If their payoff structures are changed, they will no longer be in a PD. But all this shows is that not every possible situation is a PD; it does not show that the threat of inefficient outcomes is a special artifact of selfishness.
It is the logic of the prisoners' situation, not their psychology, that traps them in the inefficient outcome, and if that really is their situation then they are stuck in it (barring further complications to be discussed below). Agents who wish to avoid inefficient outcomes are best advised to prevent certain games from arising; the defender of the possibility of hyper-rationality is really proposing that they try to dig themselves out of such games by turning themselves into different kinds of agents.

In general, then, a game is partly defined by the payoffs assigned to the players. If a proposed solution involves tacitly changing these payoffs, then this ‘solution’ is in fact a disguised way of changing the subject.

2.8 Trembling Hands

Our last point above opens the way to a philosophical puzzle, one of several that still preoccupy those concerned with the logical foundations of game theory. It can be raised with respect to any number of examples, but we will borrow an elegant one from C. Bicchieri (1993), who also provides the most extensive treatment of the problem found in the literature. Consider the following game:

Figure 11

The NE outcome here is at the single leftmost node descending from node 8. To see this, backward induct again. At node 10, I would play L for a payoff of 3, giving II a payoff of 1. II can do better than this by playing L at node 9, giving I a payoff of 0. I can do better than this by playing L at node 8; so that is what I does, and the game terminates without II getting to move. But, now, notice the reasoning required to support this prediction. I plays L at node 8 because she knows that II is rational, and so would, at node 9, play L because II knows that I is rational and so would, at node 10, play L. But now we have the following paradox: I must suppose that II, at node 9, would predict I's rational play at node 10 despite having arrived at a node (9) that could only be reached if I is not rational!
If I is not rational then II is not justified in predicting that I will not play R at node 10, in which case it is not clear that II shouldn't play R at 9; and if II plays R at 9, then I is guaranteed of a better payoff than she gets if she plays L at node 8. Both players must use backward induction to solve the game; backward induction requires that I know that II knows that I is rational; but II can solve the game only by using a backward induction argument that takes as a premise the irrationality of I. This is the paradox of backward induction.

A standard way around this paradox in the literature is to invoke the so-called ‘trembling hand’ due to Selten (1975). The idea here is that a decision and its consequent act may ‘come apart’ with some nonzero probability, however small. That is, a player might intend to take an action but then slip up in the execution and send the game down some other path instead. If there is even a remote possibility that a player may make a mistake (that her ‘hand may tremble’) then no contradiction is introduced by a player's using a backward induction argument that requires the hypothetical assumption that another player has taken a path that a rational player could not choose. In our example, II could reason about what to do at node 9 conditional on the assumption that I rationally chose L at node 8 but then slipped.

There is a substantial technical literature on this backward-induction paradox, of which Bicchieri (1993) is the most comprehensive source. (Bicchieri, it should be noted, does not endorse an appeal to trembling hands as the appropriate solution. Discussing her particular proposal here would, however, take us too far afield into technicalities. The interested reader should study her book.) The puzzle has been introduced here just in order to point out that refinements of the type discussed in Section 2.6 can be encouraged by more than mere intuitions about the concept of rationality.
For if hands may tremble then merely economically rational players will be motivated to worry about the probabilities with which apparent departures from rational play will be observed. For example, if my opponent's hand may tremble, then this gives me good reason to avoid the weakly dominated strategy s2 in the third example from Section 2.5. After all, my opponent might promise to play t1 in that game, and I may believe his promise; but if his hand then trembles and a play of t2 results, I get my worst payoff. If I'm risk-averse, then in such situations it would seem that I should stick to weakly dominant strategies.

The paradox of backward induction, like the puzzles raised by equilibrium refinement, is mainly a problem for those who view game theory as contributing to a normative theory of rationality (specifically, as contributing to that larger theory the theory of strategic rationality). The behaviorist can give a different sort of account of apparently irrational play and the prudence it encourages. This involves appeal to the empirical fact that actual agents, including people, must learn the equilibrium strategies of games they play, at least whenever the games are at all complicated. Research shows that even a game as simple as the Prisoner's Dilemma requires learning by people (Ledyard 1995, Sally 1995, Camerer 2003, p. 265). What it means to say that people must learn equilibrium strategies is that we must be a bit more sophisticated than was indicated earlier in constructing utility functions from behavior in application of Revealed Preference Theory. Instead of constructing utility functions on the basis of single episodes, we must do so on the basis of observed runs of behavior once it has stabilized, signifying maturity of learning for the subjects in question and the game in question. Once again, the Prisoner's Dilemma makes a good example. People encounter few one-shot Prisoner's Dilemmas in everyday life, but they encounter many repeated PDs with non-strangers.
As a result, when set into what is intended to be a one-shot PD in the experimental laboratory, people tend to initially play as if the game were a single round of a repeated PD. The repeated PD has many Nash equilibria that involve cooperation rather than defection. Thus experimental subjects tend to cooperate at first in these circumstances, but learn after some number of rounds to defect. The experimenter cannot infer that she has successfully induced a one-shot PD with her experimental setup until she sees this behavior stabilize. (As noted in Section 2.7 above, if it does not so stabilize, she must infer that she has failed to induce a one-shot PD and that her subjects are playing some other game.)

The paradox of backward induction now dissolves. Unless players have experienced play at equilibrium with one another in the past, even if they are all rational, and all believe this about one another, we should predict that they will attach some positive probability to the conjecture that interaction partners have not yet learned all equilibria. This then explains why rational agents, unless they enjoy risk, may play as if they believe in trembling hands.

Learning of equilibria by rational agents may take various forms for different agents and for games of differing levels of complexity and risk. Incorporating it into game-theoretic models of interactions thus introduces an extensive new set of technicalities. For the most fully developed general theory, the reader is referred to Fudenberg and Levine (1998).

3. Uncertainty, Risk and Sequential Equilibria

The games we've modeled to this point have all involved players choosing from amongst pure strategies, in which each seeks a single optimal course of action at each node that constitutes a best reply to the actions of others. Often, however, a player's utility is optimized through use of a mixed strategy, in which she flips a weighted coin amongst several possible actions. (We will see later that there is an alternative interpretation of mixing, not involving randomization at a particular information set; but we will start here from the coin-flipping interpretation and then build on it in Section 3.1.) Mixing is necessary whenever no pure strategy maximizes the player's utility against all opponent strategies. Our river-crossing game from Section 1 exemplifies this. As we saw, the puzzle in that game consists in the fact that if the fugitive's reasoning selects a particular bridge as optimal, his pursuer must be assumed to be able to duplicate that reasoning. Thus the fugitive can escape only if his pursuer cannot reliably predict which bridge he'll use. Symmetry of logical reasoning power on the part of the two players ensures that the fugitive can surprise the pursuer only if it is possible for him to surprise himself.

Suppose that we ignore rocks and cobras for a moment, and imagine that the bridges are equally safe. Suppose also that the fugitive has no special knowledge about his pursuer that might lead him to venture a specially conjectured probability distribution over the pursuer's available strategies. In this case, the fugitive's best course is to roll a three-sided die, in which each side represents a different bridge (or, more conventionally, a six-sided die in which each bridge is represented by two sides). He must then pre-commit himself to using whichever bridge is selected by this randomizing device.
This fixes the odds of his survival regardless of what the pursuer does; but since the pursuer has no reason to prefer any available pure or mixed strategy, and since in any case we are presuming her epistemic situation to be symmetrical to that of the fugitive, we may suppose that she will roll a three-sided die of her own. The fugitive now has a 2/3 probability of escaping and the pursuer a 1/3 probability of catching him. The fugitive cannot improve on these odds if the pursuer is rational, so the two randomizing strategies are in Nash equilibrium.

Now let us re-introduce the parametric factors, that is, the falling rocks at bridge #2 and the cobras at bridge #3. Again, suppose that the fugitive is sure to get safely across bridge #1, has a 90% chance of crossing bridge #2, and an 80% chance of crossing bridge #3. We can solve this new game if we make certain assumptions about the two players' utility functions. Suppose that Player 1, the fugitive, cares only about living or dying (preferring life to death) while the pursuer simply wishes to be able to report that the fugitive is dead, preferring this to having to report that he got away. (In other words, neither player cares about how the fugitive lives or dies.) In this case, the fugitive simply takes his original randomizing formula and weights it according to the different levels of parametric danger at the three bridges. Each bridge should be thought of as a lottery over the fugitive's possible outcomes, in which each lottery has a different expected payoff in terms of the items in his utility function.

Consider matters from the pursuer's point of view. She will be using her NE strategy when she chooses the mix of probabilities over the three bridges that makes the fugitive indifferent among his possible pure strategies. The bridge with rocks is 1.1 times more dangerous for him than the safe bridge.
Therefore, he will be indifferent between the two when the pursuer is 1.1 times more likely to be waiting at the safe bridge than the rocky bridge. The cobra bridge is 1.2 times more dangerous for the fugitive than the safe bridge. Therefore, he will be indifferent between these two bridges when the pursuer's probability of waiting at the safe bridge is 1.2 times higher than the probability that she is at the cobra bridge. Suppose we use s1, s2 and s3 to represent the fugitive's parametric survival rates at each bridge. Then the pursuer minimizes the net survival rate across any pair of bridges by adjusting the probabilities p1 and p2 that she will wait at them so that

s1 (1 − p1) = s2 (1 − p2)

Since p1 + p2 = 1, we can rewrite this as

s1 × p2 = s2 × p1

so

p1/s1 = p2/s2.

Thus the pursuer finds her NE strategy by solving the following simultaneous equations:

1 (1 − p1) = 0.9 (1 − p2) = 0.8 (1 − p3)
p1 + p2 + p3 = 1.

Then

p1 = 49/121
p2 = 41/121
p3 = 31/121

Now let f1, f2, f3 represent the probabilities with which the fugitive chooses each respective bridge. Then the fugitive finds his NE strategy by solving

s1 × f1 = s2 × f2 = s3 × f3

so

1 × f1 = 0.9 × f2 = 0.8 × f3

simultaneously with

f1 + f2 + f3 = 1.

Then

f1 = 36/121
f2 = 40/121
f3 = 45/121

These two sets of NE probabilities tell each player how to weight his or her die before throwing it. Note the (perhaps surprising) result that the fugitive uses riskier bridges with higher probability. This is the only way of making the pursuer indifferent over which bridge she stakes out, which in turn is what maximizes the fugitive's probability of survival.

We were able to solve this game straightforwardly because we set the utility functions in such a way as to make it zero-sum, or strictly competitive. That is, every gain in expected utility by one player represents a precisely symmetrical loss by the other. However, this condition may often not hold. Suppose now that the utility functions are more complicated.
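Before turning to that more complicated case, the zero-sum solution above can be checked with exact arithmetic. The Python sketch below is only an illustration: the function name equilibrium_mixes is an assumption of this example, and the closed form it uses presupposes that every bridge gets positive probability from both players, which holds for the survival rates 1, 0.9 and 0.8 used here. It solves the pursuer's condition, that s_i (1 − p_i) takes a common value k at every bridge, and the fugitive's condition, that s_i × f_i is equal across bridges.

```python
from fractions import Fraction


def equilibrium_mixes(survival):
    """Return (pursuer_mix, fugitive_mix) for a list of per-bridge survival
    rates, assuming an interior equilibrium (all probabilities positive)."""
    inverse = [1 / s for s in survival]
    total = sum(inverse)
    n = len(survival)
    # Fugitive: s_i * f_i is the same at every bridge, so f_i is
    # proportional to 1 / s_i.
    fugitive = [x / total for x in inverse]
    # Pursuer: s_i * (1 - p_i) = k at every bridge; summing p_i = 1 - k / s_i
    # over the n bridges and setting the sum to 1 gives k = (n - 1) / total.
    k = (n - 1) / total
    pursuer = [1 - k / s for s in survival]
    return pursuer, fugitive


rates = [Fraction(1), Fraction(9, 10), Fraction(8, 10)]
pursuer, fugitive = equilibrium_mixes(rates)
print("pursuer: ", [str(p) for p in pursuer])
print("fugitive:", [str(f) for f in fugitive])
```

With these rates the common value k works out to 72/121, which is also the fugitive's overall probability of escape when both players use their equilibrium mixes.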
The pursuer most prefers an outcome in which she shoots the fugitive and so claims credit for his apprehension to one in which he dies of rockfall or snakebite; and she prefers this second outcome to his escape. The fugitive prefers a quick death by gunshot to the pain of being crushed or the terror of an encounter with a cobra. Most of all, of course, he prefers to escape. We cannot solve this game, as before, simply on the basis of knowing the players' ordinal utility functions, since the intensities of their respective preferences will now be relevant to their strategies.

Prior to the work of von Neumann & Morgenstern (1947), situations of this sort were inherently baffling to analysts. This is because utility does not denote a hidden psychological variable such as pleasure. As we discussed in Section 2.1, utility is merely a measure of relative behavioural dispositions given certain consistency assumptions about relations between preferences and choices. It therefore makes no sense to imagine comparing our players' cardinal (that is, intensity-sensitive) preferences with one another's, since there is no independent, interpersonally constant yardstick we could use. How, then, can we model games in which cardinal information is relevant? After all, modeling games requires that all players' utilities be taken simultaneously into account, as we've seen.

A crucial aspect of von Neumann & Morgenstern's (1947) work was the solution to this problem. Here, we will provide a brief outline of their ingenious technique for building cardinal utility functions out of ordinal ones. It is emphasized that what follows is merely an outline, so as to make cardinal utility non-mysterious to you as a student who is interested in knowing about the philosophical foundations of game theory, and about the range of problems to which it can be applied. Providing a manual you could follow in building your own cardinal utility functions would require many pages. Fortunately, such manuals are available in many textbooks.
In any case, if you are a philosophy student you may not wish to attempt this until you've taken a course in probability theory.

Suppose we have an agent whose ordinal utility function is known. Indeed, suppose that it's our river-crossing fugitive. Let's assign him the following ordinal utility function:

Escape >> 4
Death by shooting >> 3
Death by rockfall >> 2
Death by snakebite >> 1

Now, we know that his preference for escape over any form of death is likely to be stronger than his preference for, say, shooting over snakebite. This should be reflected in his choice behaviour in the following way. In a situation such as the river-crossing game, he should be willing to run greater risks to increase the relative probability of escape over shooting than he is to increase the relative probability of shooting over snakebite. This bit of logic is the crucial insight behind von Neumann & Morgenstern's (1947) solution to the cardinalization problem.

Begin by asking our agent to pick, from the available set of outcomes, a best one and a worst one. ‘Best’ and ‘worst’ are defined in terms of rational choice: a rational agent always chooses so as to maximize the probability of the best outcome (call this W) and to minimize the probability of the worst outcome (call this L). Now consider prizes intermediate between W and L. We find, for a set of outcomes containing such prizes, a lottery over them such that our agent is indifferent between that lottery and a lottery including only W and L. In our example, this would be a lottery having shooting and rockfall as its possible outcomes. Call this lottery T. We define a utility function q = u(T) such that if q is the expected prize in T, the agent is indifferent between winning T and winning a lottery in which W occurs with probability u(T) and L occurs with probability 1 − u(T).

We now construct a compound lottery T* over the outcome set {W, L} such that the agent is indifferent between T and T*.
A compound lottery is one in which the prize in the lottery is another lottery. This makes sense because, after all, it is still W and L that are at stake for our agent in both cases; so we can then analyze T* into a simple lottery over W and L. Call this lottery r. It follows from transitivity that T is equivalent to r. (Note that this presupposes that our agent does not gain utility from the complexity of her gambles.) The rational agent will now choose the action that maximizes the probability of winning W. The mapping from the set of outcomes to u(r) is a von Neumann-Morgenstern utility function (VNMuf).

What exactly have we done here? We've simply given our agent choices over lotteries, instead of over prizes directly, and observed how much extra risk he's willing to run to increase the chances of winning escape over snakebite relative to getting shot or clobbered with a rock. A VNMuf yields a cardinal, rather than an ordinal, measure of utility. Our choice of endpoint-values, W and L, is arbitrary, as before; but once these are fixed the values of the intermediate points are determined. Therefore, the VNMuf does measure the relative preference intensities of a single agent. However, since our assignment of utility values to W and L is arbitrary, we can't use VNMufs to compare the cardinal preferences of one agent with those of another. Furthermore, since we are using a risk-metric as our measuring instrument, the construction of the new utility function depends on assuming that our agent's attitude to risk itself stays constant from one comparison of lotteries to another. This seems reasonable for a single agent in a single game-situation. However, two agents in one game, or one agent under different sorts of circumstances, may display very different attitudes to risk. Perhaps in the river-crossing game the pursuer, whose life is not at stake, will enjoy gambling with her glory while our fugitive is cautious.
In general, a risk-averse agent prefers a guaranteed prize to its equivalent expected value in a lottery. A risk-loving agent has the reverse preference. A risk-neutral agent is indifferent between these options. In analyzing the river-crossing game, however, we don't have to be able to compare the pursuer's cardinal utilities with the fugitive's. Both agents, after all, can find their NE strategies if they can estimate the probabilities each will assign to the actions of the other. This means that each must know both VNMufs; but neither need try to comparatively value the outcomes over which they're gambling.

We can now fill in the rest of the matrix for the bridge-crossing game that we started to draw in Section 2. If all that the fugitive cares about is life and death, but not the manner of death, and if all the hunter cares about is preventing the fugitive from escaping, then we can now interpret both utility functions cardinally. This permits us to assign expected utilities, expressed by multiplying the original payoffs by the relevant probabilities, as outcomes in the matrix. Suppose that the hunter waits at the cobra bridge with probability x and at the rocky bridge with probability y. Since her probabilities across the three bridges must sum to 1, this implies that she must wait at the safe bridge with probability 1 − (x + y). Then, continuing to assign the fugitive a payoff of 0 if he dies and 1 if he escapes, and the hunter the reverse payoffs, our complete matrix is as follows:

Figure 12

We can now read the following facts about the game directly from the matrix. No rows or columns strictly or weakly dominate any others. Therefore, the game's NE must be in mixed strategies.

3.1 Beliefs

How should we interpret the processes being modeled by computations of NE strategy mixes in games like the river-crossing one? One possible kind of interpretation is an evolutionary one.
If the hunter and the fugitive have regularly played games that structurally resemble this river-crossing game, then selection pressures will have encouraged habits in them that lead them both to play its NE strategies and to sincerely rationalize doing so by means of some satisfying story or other. If neither party has ever been in a situation like this, and if their biological and/or cultural ancestors haven't either, and if neither is concerned with revealing information to opponents in expected future situations of this sort (because they don't expect them to arise again), and if both parties aren't trained game theorists, then their behavior should be predicted not by a game theorist but by friends of theirs who are familiar with their personal idiosyncrasies. Behaviorists are happy to recognize that game theory isn't useful for modelling every possible empirical circumstance that comes along.

However, the philosopher who wants game theory to serve as a descriptive and/or normative theory of strategic rationality cannot rest content with this answer. He must find a satisfying line of advice for the players even when their game is alone in the universe of strategic problems. No such advice can be given that is uncontroversially satisfactory (behaviorists, after all, are often behaviorists because they aren't satisfied by any available approach here), but there is a way of handling the matter that many game theorists have found worthy of detailed pursuit. This involves the computation of equilibria in beliefs.

In fact, the behaviorist needs the concept of equilibrium in beliefs too, but for different purposes. As we've seen, the concept of NE sometimes doesn't go deep enough as an analytical instrument to tell us all that we think might be important in a game.
Thus even behaviorists who aren't impressed with the project of refinements might make use of the concept of subgame-perfect equilibrium (SPE), as discussed in Section 2.6, if they think they're dealing with agents who are very well informed (say, because they're in a familiar institutional setting). But now consider the three-player imperfect-information game below known as ‘Selten's Horse’ (for its inventor, Nobel Prize winner Reinhard Selten, and because of the shape of its tree; taken from Kreps (1990), p. 426):

Figure 13

One of the NE of this game is Lr2l3. This is because if Player I plays L, then Player II playing r2 has no incentive to change strategies because her only node of action, 12, is off the path of play. But this NE seems to be purely technical; it makes little sense as a solution. This reveals itself in the fact that if the game beginning at node 14 could be treated as a subgame, Lr2l3 would not be an SPE. Whenever she does get a move, Player II should play l2. But if Player II is playing l2 then Player I should switch to R. In that case Player III should switch to r3, sending Player II back to r2. And here's a new, ‘sensible’, NE: Rr2r3. I and II in effect play ‘keepaway’ from III.

This NE is ‘sensible’ in just the same way that a SPE outcome in a perfect-information game is more sensible than other non-SPE NE. However, we can't select it by applying Zermelo's algorithm. Because nodes 13 and 14 fall inside a common information set, Selten's Horse has only one subgame (namely, the whole game). We need a ‘cousin’ concept to SPE that we can apply in cases of imperfect information, and we need a new solution procedure to replace Zermelo's algorithm for such games.

Notice what Player III in Selten's Horse is wondering about as he selects his strategy. "Given that I get a move," he asks himself, "was my action node reached from node 11 or from node 12?" What, in other words, are the conditional probabilities that III is at node 13 or 14 given that he has a move?
Now, if conditional probabilities are what III wonders about, then what Players I and II must make conjectures about when they select their strategies are III's beliefs about these conditional probabilities. In that case, Player I must conjecture about II's beliefs about III's beliefs, and III's beliefs about II's beliefs, and so on. The relevant beliefs here are not merely strategic, as before, since they are not just about what players will do given a set of payoffs and game structures, but about what they think makes sense given some understanding or other of conditional probability.

What beliefs about conditional probability is it reasonable for players to expect from each other? The normative theorist might insist on whatever the best mathematicians have discovered about the subject. Clearly, however, if this is applied then a theory of games that incorporated it would not be descriptively true of most people. The behaviorist will insist on imposing only behavioral habits that a process of natural selection might build into its products. Perhaps some actual or possible creatures might observe habits that respect Bayes's rule, which is the minimal true generalization about conditional probability that an agent could know if it knows any such generalizations at all. Adding more sophisticated knowledge about conditional probability amounts to refining the concept of equilibrium-in-belief, just as some game theorists like to refine NE. You can imagine what behaviorists think of that project!

Here, we will restrict our attention to the least refined equilibrium-in-belief concept, that obtained when we require players to reason in accordance with Bayes's rule. Bayes's rule tells us how to compute the probability of an event F given information E (written ‘pr(F/E)’):

pr(F/E) = [pr(E/F) × pr(F)] / pr(E)

We will henceforth assume that players do not hold beliefs inconsistent with this equality.

We may now define a sequential equilibrium.
A SE has two parts: (1) a strategy profile § for each player, as before, and (2) a system of beliefs μ for each player. μ assigns to each information set h a probability distribution over the nodes x in h, with the interpretation that these are the beliefs of player i(h) about where in his information set he is, given that information set h has been reached. Then a sequential equilibrium is a profile of strategies § and a system of beliefs μ consistent with Bayes's rule such that starting from every information set h in the tree player i(h) plays optimally from then on, given that what he believes to have transpired previously is given by μ(h) and what will transpire at subsequent moves is given by §.

We now demonstrate the concept by application to Selten's Horse. Consider again the uninteresting NE Lr2l3. Suppose that Player III assigns pr(1) to her belief that if she gets a move she is at node 13. Then Player II, given a consistent μ(II), must believe that III will play l3, in which case her only SE strategy is l2. So although Lr2l3 is a NE, it is not a SE. This is of course what we want.

The use of the consistency requirement in this example is somewhat trivial, so consider now a second case (also taken from Kreps (1990), p. 429):

Figure 14

Suppose that I plays L, II plays l2 and III plays l3. Suppose also that μ(II) assigns pr(.3) to node 16. In that case, l2 is not a SE strategy for II, since l2 returns an expected payoff of .3(4) + .7(2) = 2.6, while r2 brings an expected payoff of 3.1. Notice that if we fiddle the strategy profile for player III while leaving everything else fixed, l2 could become a SE strategy for II.
If §(III) yielded a play of l3 with pr(.5) and r3 with pr(.5), then if II plays r2 his expected payoff would now be 2.2, so Ll2l3 would be a SE. Now imagine setting μ(III) back as it was, but change μ(II) so that II thinks the conditional probability of being at node 16 is greater than .5; in that case, l2 is again not a SE strategy.

The idea of SE is hopefully now clear. We can apply it to the river-crossing game in a way that avoids the necessity for the hunter to flip any coins if we modify the game a bit. Suppose now that the hunter can change bridges twice during the fugitive's passage, and will catch him just in case she meets him as he leaves the bridge. Then the hunter's SE strategy is to divide her time at the three bridges in accordance with the proportion given by the equation in the third paragraph of Section 3 above.

It must be noted that since Bayes's rule cannot be applied to events with probability 0, its application to SE requires that players assign non-zero probabilities to all actions available in trees. This requirement is captured by supposing that all strategy profiles be strictly mixed, that is, that every action at every information set be taken with positive probability. You will see that this is just equivalent to supposing that all hands sometimes tremble. A SE is said to be trembling-hand perfect if all strategies played at equilibrium are best replies to strategies that are strictly mixed. You should also not be surprised to be told that no weakly dominated strategy can be trembling-hand perfect, since the possibility of trembling hands gives players the most persuasive reason for avoiding such strategies.
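
The belief-and-best-reply reasoning used in this section can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original analysis: the function names, the node labels, and the reach probabilities in the last example are hypothetical, and since the figures are not reproduced here, the payoffs 3.1 and 2.2 for r2 are simply the values quoted in the text rather than computed from the tree.

```python
def bayes(pr_e_given_f, pr_f, pr_e):
    """Bayes's rule as stated above: pr(F/E) = [pr(E/F) * pr(F)] / pr(E)."""
    if pr_e <= 0:
        # The rule is undefined for zero-probability events, which is why
        # SE is defined over strictly mixed strategy profiles.
        raise ValueError("pr(E) must be positive")
    return pr_e_given_f * pr_f / pr_e


def beliefs_at_information_set(reach_probs):
    """Beliefs over the nodes of one information set, given it is reached.

    reach_probs maps each node to the probability that play reaches it.
    Reaching a node implies reaching its information set, so pr(set/node) = 1.
    """
    pr_set = sum(reach_probs.values())
    return {node: bayes(1.0, p, pr_set) for node, p in reach_probs.items()}


def expected_payoff(beliefs, payoffs):
    """Expected payoff of an action, weighting node payoffs by beliefs."""
    return sum(beliefs[node] * payoffs[node] for node in beliefs)


def is_best_reply(own_payoff, alternative_payoffs):
    """True if no alternative action yields a strictly higher payoff."""
    return all(own_payoff >= alt for alt in alternative_payoffs)


if __name__ == "__main__":
    # Second example from the text: mu(II) puts .3 on node 16.
    # "other_node" is a placeholder label for the second node in II's set.
    mu_II = {"node_16": 0.3, "other_node": 0.7}
    l2 = expected_payoff(mu_II, {"node_16": 4, "other_node": 2})
    print(f"{l2:.1f}")  # prints 2.6, matching .3(4) + .7(2)

    # r2 is quoted at 3.1, so l2 is not a best reply for II.
    print(is_best_reply(l2, [3.1]))  # prints False
    # With III mixing 50/50, r2 is quoted at 2.2, so l2 becomes a best reply.
    print(is_best_reply(l2, [2.2]))  # prints True

    # Hypothetical reach probabilities under small "trembles".
    print(beliefs_at_information_set({"node_a": 0.02, "node_b": 0.06}))
```

Note that `beliefs_at_information_set` raises an error when the information set is reached with probability 0, which mirrors the point above that Bayes's rule gives no guidance off the path of play unless strategies are strictly mixed.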

4. Repeated Games and Coordination

So far we've restricted our attention to one-shot games, that is, games in which players' strategic concerns extend no further than the terminal nodes of their single interaction. However, games are often played with future games in mind, and this can significantly alter their outcomes and equilibrium strategies. Our topic in this section is repeated games, that is, games in which sets of players expect to face each other in similar situations on multiple occasions. We approach these first through the limited context of repeated prisoner's dilemmas.

We've seen that in the one-shot PD the only NE is mutual defection. This may no longer hold, however, if the players expect to meet each other again in future PDs. Imagine that four firms, all making widgets, agree to maintain high prices by jointly restricting supply. (That is, they form a cartel.) This will only work if each firm maintains its agreed production quota. Typically, each firm can maximize its profit by departing from its quota while the others observe theirs, since it then sells more units at the higher market price brought about by the almost-intact cartel. In the one-shot case, all firms would share this incentive to defect and the cartel would immediately collapse. However, the firms expect to face each other in competition for a long period. In this case, each firm knows that if it breaks the cartel agreement, the others can punish it by underpricing it for a period long enough to more than eliminate its short-term gain. Of course, the punishing firms will take short-term losses too during their period of underpricing. But these losses may be worth taking if they serve to reestablish the cartel and bring about maximum long-term prices.

One simple, and famous (but not, contrary to widespread myth, necessarily optimal) strategy for preserving cooperation in repeated PDs is called tit-for-tat.
This strategy tells each player to behave as follows:

Always cooperate in the first round.
Thereafter, take whatever action your opponent took in the previous round.

A group of players all playing tit-for-tat will never see any defections. Since, in a population where others play tit-for-tat, tit-for-tat is the rational response for each player, everyone playing tit-for-tat is a NE. You may frequently hear people who know a little (but not enough) game theory talk as if this is the end of the story. It is not.

There are two complications. First, the players must be uncertain as to when their interaction ends. Suppose the players know when the last round comes. In that round, it will be rational for players to defect, since no punishment will be possible. Now consider the second-last round. In this round, players also face no punishment for defection, since they know they will defect in the last round anyway. So they defect in the second-last round. But this means they face no threat of punishment in the third-last round, and defect there too. We can simply iterate this backwards through the game tree until we reach the first round. Since cooperation is not rational in that round, tit-for-tat is no longer a rational strategy, and we get the same outcome, mutual defection, as in the one-shot PD. Therefore, cooperation is only possible in repeated PDs where the expected number of repetitions is indeterminate. (Of course, this does apply to many real-life games.)

But now we introduce a second complication. Suppose that players' ability to distinguish defection from cooperation is imperfect. Consider our case of the widget cartel. Suppose the players observe a fall in the market price of widgets. Perhaps this is because a cartel member cheated. Or perhaps it has resulted from an exogenous drop in demand.
If tit-for-tat players mistake the second case for the first, they will defect, thereby setting off a chain-reaction of mutual defections from which they can never recover, since every player will reply to the first encountered defection with defection, thereby begetting further defections, and so on.

If players know that such miscommunication is possible, they must resort to more sophisticated strategies. In particular, they must be prepared to sometimes risk following defections with cooperation in order to test their inferences. However, they mustn't be too forgiving, lest other players find it rationally optimal to exploit them through deliberate defections. In general, sophisticated strategies have a problem. Because they are more difficult for other players to infer, their use increases the probability of miscommunication. But miscommunication is what causes repeated-game cooperative equilibria to unravel in the first place! The moral of this is that PDs, even repeated ones, are very difficult to escape from. Rational players do best trying to avoid situations that are PDs, rather than relying on cunning stratagems for trying to get out of them.

Real, complex, social and political dramas are seldom straightforward instantiations of simple games such as PDs. Hardin (1995) offers an analysis of two recent, very real (and very tragic) political cases, the Yugoslavian civil war of 1991-95, and the 1994 Rwandan genocide, as PDs that were nested inside coordination games.

A coordination game occurs whenever the utility of two or more players is maximized by their doing the same thing, and where such correspondence is more important to them than what, in particular, they both do. A standard example arises with rules of the road: ‘All drive on the left’ and ‘All drive on the right’ are both outcomes that are NEs, and neither is more efficient than the other. In games of ‘pure’ coordination, it doesn't even help to use more selective equilibrium criteria.
For example, suppose that we require our players to reason in accordance with Bayes's rule (see Section 3 above). In these circumstances, any strategy that is a best reply to any vector of mixed strategies available in NE is said to be rationalizable. That is, a player can find a set of systems of beliefs for the other players such that any history of the game along an equilibrium path is consistent with that set of systems. Pure coordination games are characterized by non-unique vectors of rationalizable strategies. In such situations, players may try to predict equilibria by searching for focal points, that is, features of some strategies that they believe will be salient to other players, and that they believe other players will believe to be salient to them. (For example, if two people want to meet on a given day in a big city but can't contact each other to arrange a specific time and place, both might sensibly go to the city's most prominent downtown plaza at noon.) Unfortunately, in many of the social and political games played by people (and some other animals), the biologically shallow properties by which people sort themselves into racial and ethnic groups serve highly efficiently as such features. Hardin's analysis of recent genocides relies on this fact.

According to Hardin, neither the Yugoslavian nor the Rwandan disasters were PDs to begin with. That is, in neither situation, on either side, did most people begin by preferring the destruction of the other to mutual cooperation. However, the deadly logic of coordination, deliberately abetted by self-serving politicians, dynamically created PDs. Some individual Serbs (Hutus) were encouraged to perceive their individual interests as best served through identification with Serbian (Hutu) group-interests. That is, they found that some of their circumstances, such as those involving competition for jobs, had the form of coordination games. They thus acted so as to create situations in which this was true for other Serbs (Hutus) as well.
Eventually, once enough Serbs (Hutus) identified self-interest with group-interest, the identification became almost universally correct, because (1) the most important goal for each Serb (Hutu) was to do roughly what every other Serb (Hutu) would, and (2) the most distinctively Serbian thing to do, the doing of which permitted coordination, was to exclude Croats (Tutsi). That is, strategies involving such exclusionary behavior were selected as a result of having efficient focal points. This situation made it the case that an individual (and individually threatened) Croat's (Tutsi's) self-interest was best maximized by coordinating on assertive Croat (Tutsi) group-identity, which further increased pressures on Serbs (Hutus) to coordinate, and so on. Note that it is not an aspect of this analysis to suggest that Serbs or Hutus started things; the process could have been (even if it wasn't in fact) perfectly reciprocal. But the outcome is ghastly: Serbs and Croats (Hutus and Tutsis) seem progressively more threatening to each