Scientific Explanation (Stanford Encyclopedia of Philosophy)

First published Fri May 9, 2003; substantive revision Mon May 12, 2003

Issues concerning scientific explanation have been a focus of philosophical attention from Pre-Socratic times through the modern period. However, recent discussion really begins with the development of the Deductive-Nomological (DN) model. This model has had many advocates (including Popper 1935, 1959, Braithwaite 1953, Gardiner 1959, Nagel 1961) but unquestionably the most detailed and influential statement is due to Carl Hempel (Hempel, 1942, 1965, Hempel and Oppenheim, 1948). These papers and the reaction to them have structured subsequent discussion concerning scientific explanation to an extraordinary degree. After some general remarks by way of background and orientation (Section 1), this entry describes the DN model and its extensions, and then turns to some well-known objections (Section 2). It next describes a variety of subsequent attempts to develop alternative models of explanation, including Wesley Salmon's Statistical Relevance (Section 3) and Causal Mechanical (Section 4) models and the Unificationist models due to Michael Friedman and Philip Kitcher (Section 5). Section 6 provides a summary and discusses directions for future work.

1. Background and Introduction
2. The DN Model
   2.1 The Basic Idea
   2.2 The Role of Laws in the DN Model
   2.3 Inductive Statistical Explanation
   2.4 Motivation for the DN Model: Nomic Expectability and a Regularity Account of Causation
   2.5 Counterexamples to Sufficiency
   2.6 Counterexamples to Necessity and the Hidden Structure Strategy
3. The SR Model
   3.1 The Basic Idea
   3.2 The SR Model and Low Probability Events
   3.3 What Do Statistical Theories Explain?
   3.4 Causation and Statistical Relevance Relationships
4. The Causal Mechanical Model
   4.1 The Basic Idea
   4.2 The CM Model and Explanatory Relevance
   4.3 The CM Model and Complex Systems
   4.4 More Recent Developments
5. A Unificationist Account of Explanation
   5.1 The Basic Idea
   5.2 Illustrations of the Unificationist Model
   5.3 The Illustrations Criticized
   5.4 The Heterogeneity of Unification
   5.5 The Winner-Take-All Conception of Explanatory Unification
   5.6 The Epistemology of Unification
6. Conclusions
Bibliography
Other Internet Resources
Related Entries

1. Background and Introduction

As will become apparent, “scientific explanation” is a topic that raises a number of interrelated issues. Some background orientation will be useful before turning to the details of competing models. A presupposition of most recent discussion has been that science sometimes provides explanations (rather than something that falls short of explanation, e.g., “mere description”) and that the task of a “theory” or “model” of scientific explanation is to characterize the structure of such explanations. It is thus assumed that there is (at some suitably abstract and general level of description) a single kind or form of explanation that is “scientific”. In fact, the notion of “scientific explanation” suggests at least two contrasts: first, a contrast between those “explanations” that are characteristic of “science” and those explanations that are not, and, second, a contrast between “explanation” and something else.
However, with respect to the first contrast, the tendency in much of the recent philosophical literature has been to assume that there is a substantial continuity between the sorts of explanations found in science and at least some forms of explanation found in more ordinary non-scientific contexts, with the latter embodying in a more or less inchoate way features that are present in a more detailed, precise, rigorous etc. form in the former. It is further assumed that it is the task of a theory of explanation to capture what is common to both scientific and at least some more ordinary forms of explanation. These assumptions help to explain (what may otherwise strike the reader as curious) why, as this entry will illustrate, discussions of scientific explanation so often move back and forth between examples drawn from bona-fide science (e.g., explanations of the trajectories of the planets that appeal to Newtonian mechanics) and more homey examples involving the tipping over of inkwells.

With respect to the second contrast, most models of explanation assume that it is possible for a set of claims to be true, accurate, supported by evidence, and so on and yet unexplanatory (at least of anything that the typical explanation-seeker is likely to want explained). For example, all of the accounts of scientific explanation described below would agree that an account of the appearance of a particular species of bird of the sort found in a bird guidebook is, however accurate, not an explanation of anything of interest to biologists (e.g., the development, characteristic features, or behaviour of that species). Instead, such an account is “merely descriptive”. However, different models of explanation provide different accounts of what the contrast between the explanatory and merely descriptive consists in.

A related point is that while most theorists of scientific explanation have proposed models that are intended to cover at least some cases of explanation that we would not think of as part of science, they have nonetheless assumed some implicit restriction on the kinds of explanation they have sought to reconstruct. It has often been noted that the word “explanation” is used in a wide variety of ways in ordinary English: we speak of explaining the meaning of a word, explaining the background to philosophical theories of explanation, explaining how to bake a pie, explaining why one made a certain decision (where this is to offer a justification) and so on. Although the various models discussed below have sometimes been criticized for their failure to capture all of these forms of “explanation” (see, e.g., Scriven, 1959), it is clear that they were never intended to do this. Instead, their intended explicandum is, very roughly, explanations of why things happen, where the “things” in question can be either particular events or something more general, e.g., regularities or repeatable patterns in nature. Paradigms of this sort of explanation include the explanation for the advance in the perihelion of Mercury provided by General Relativity, the explanation of the extinction of the dinosaurs in terms of the impact of a large asteroid at the end of the Cretaceous period, the explanation provided by the police for why a traffic accident occurred (the driver was drinking and there was ice on the road), and the standard explanation provided in economics textbooks for why monopolies will, in comparison with firms in perfectly competitive markets, raise prices and reduce output.

Finally, a few words about the broader epistemological/methodological background to the models described below. Many philosophers think of concepts like “explanation”, “law”, “cause”, and “support for counterfactuals” as part of an interrelated family or circle of concepts that are “modal” in character.
For familiar “empiricist” reasons, Hempel and many other early defenders of the DN model regarded these concepts as not well understood, at least prior to analysis. It was assumed that it would be “circular” to explain one concept from this family in terms of others from the same family and that they must instead be explicated in terms of other concepts from outside the modal family, concepts that more obviously satisfied (what were taken to be) empiricist standards of intelligibility and testability. For example, in Hempel's version of the DN model, the notion of a “law” plays a key role in explicating the concept of “explanation”, the assumption being that laws are just regularities that meet certain further conditions that are also acceptable to empiricists. As we shall see, these empiricist standards (and an accompanying unwillingness to employ modal concepts as primitives) have continued to play a central role in the models of explanation developed subsequent to the DN model.

Suggested Readings: Salmon (1989) is a superb critical survey of all the models of scientific explanation discussed in this entry. Pitt (1988) and Ruben (1993) are anthologies that contain a number of influential articles.

2. The DN Model

2.1 The Basic Idea

According to the Deductive-Nomological Model, a scientific explanation consists of two major “constituents”: an explanandum, a sentence “describing the phenomenon to be explained” and an explanans, “the class of those sentences which are adduced to account for the phenomenon” (Hempel and Oppenheim, 1948, reprinted in Hempel, 1965, p. 247). For the explanans to successfully explain the explanandum several conditions must be met. First, “the explanandum must be a logical consequence of the explanans” and “the sentences constituting the explanans must be true” (Hempel, 1965, p. 248). That is, the explanation should take the form of a sound deductive argument in which the explanandum follows as a conclusion from the premises in the explanans.
This is the “deductive” component of the model. Second, the explanans must contain at least one “law of nature” and this must be an essential premise in the derivation in the sense that the derivation of the explanandum would not be valid if this premise were removed. This is the “nomological” component of the model, “nomological” being a philosophical term of art which, suppressing some niceties, means (roughly) “lawful”.

In its most general formulation, the DN model is meant to apply both to the explanation of “general regularities” or “laws” such as (to use Hempel and Oppenheim's examples) why light conforms to the law of refraction and also to the explanation of particular events, conceived as occurring at a particular time and place, such as the bent appearance of the partially submerged oars of a rowboat on a particular occasion of viewing. As an additional illustration of a DN explanation of a particular event, consider a derivation of the position of Mars at some future time from Newton's laws of motion, the Newtonian inverse square law governing gravity, and information about the mass of the sun, the mass of Mars and the present position and velocity of each. In this derivation the various Newtonian laws figure as essential premises and they are used, in conjunction with appropriate information about initial conditions (the masses of Mars and the sun and so on), to derive the explanandum (the future position of Mars) via a deductively valid argument. The DN criteria are thus satisfied.

2.2 The Role of Laws in the DN Model

The notion of a sound deductive argument is (arguably) relatively clear (or at least something that can be regarded as antecedently understood from the point of view of characterizing scientific explanation). But what about the other major component of the DN model, that of a law of nature?
The basic intuition that guides the DN model goes something like this: Within the class of true generalizations, we may distinguish between those that are only “accidentally true” and those that are “laws”. To use Hempel's examples, the generalization (2.2.1) “All members of the Greensbury School Board for 1964 are bald” is, if true, only accidentally so. In contrast, (2.2.2) “All gases expand when heated under constant pressure” is a law. Thus, according to the DN model, the latter generalization can be used, in conjunction with information that some particular sample of gas has been heated under constant pressure, to explain why it has expanded. By contrast, the former generalization (2.2.1), in conjunction with the information that a particular person n is a member of the 1964 Greensbury School Board, cannot be used to explain why n is bald.

While this example may seem clear enough, what exactly is it that distinguishes true accidental generalizations from laws? This has been the subject of a great deal of philosophical discussion, most of which must be beyond the scope of this entry.[1] For reasons explained in Section 1, Hempel assumes that an adequate account must explain the notion of law in terms of notions that lie outside the modal family.[2] In his (1965) he considers a number of familiar proposals having this character[3] and finds them all wanting, remarking that the problem of characterizing the notion of law has proved “highly recalcitrant” (1965, p. 338). It seems fair to say, however, that his underlying assumption is that, at bottom, laws are just exceptionless generalizations describing regularities that meet certain additional distinguishing conditions that he is not at present able to formulate.

In subsequent decades, there have been a number of other proposed criteria for lawhood. Although each proposal has its adherents, none has won general acceptance.[4] What implications does this have for the DN model?
One possible assessment is that all the DN model really requires is that there be agreement in a substantial range of particular cases about which generalizations are laws. If such agreement exists, it matters little for the DN model if we are unable to formulate completely general criteria that distinguish between laws and accidentally true generalizations in all possible cases. For example, even without an adequate account of lawhood, we can surely agree that (2.2.2) is a law and (2.2.1) is not and this is all we need to conclude that (2.2.2) can figure in DN explanations while (2.2.1) cannot.

Unfortunately, however, matters are not always so straightforward. One important issue raised by the DN model concerns the explanatory status of the so-called special sciences: biology, psychology, economics and so on. These sciences are full of generalizations that appear to play an explanatory role and yet fail to satisfy many of the standard criteria for lawfulness. For example, although Mendel's law of segregation (M) (which states that in sexually reproducing organisms each of the two alternative forms (alleles) of a gene specifying a trait at a locus in a given organism has 0.5 probability of ending up in a gamete) is widely used in models in evolutionary biology, it has a number of exceptions, such as meiotic drive. A similar point holds for the principles of rational choice theory (such as the generalization that preferences are transitive) which figure centrally in economics. Other widely used generalizations in the special sciences have very narrow scope in comparison with paradigmatic laws, hold only over restricted spatio-temporal regions, and lack explicit theoretical integration.

There is considerable disagreement over whether such generalizations are laws.
Some philosophers (e.g., Woodward, 2000) suggest that such generalizations satisfy too few of the standard criteria to count as laws but can nevertheless figure in explanations; if so, it apparently follows that we must abandon the DN requirement that all explanations must appeal to laws. Others (e.g., Mitchell, 1997), emphasizing different criteria for lawfulness, conclude instead that generalizations like (M) are laws and hence no threat to the requirement that explanations must invoke laws. In the absence of a more principled account of laws, it is hard to evaluate these competing claims and hence hard to assess the implications of the DN model for the special sciences. More generally, in the absence of a generally accepted account of lawhood, the rationale for the fundamental contrast between laws and non-laws which is at the heart of what the DN model requires is unclear: it is hard to assess the claim that all explanations must cite laws, without a clear account of what a law is and what it contributes to successful explanation. At the very least, providing such an account is an important item of unfinished business for advocates of the DN model.

2.3 Inductive Statistical Explanation

The DN model is meant to capture explanation via deduction from deterministic laws and this raises the obvious question of the explanatory status of statistical laws. Do such laws explain at all and if so, what do they explain, and under what conditions? In his (1965) Hempel distinguishes two varieties of statistical explanation. The first of these, deductive-statistical (DS) explanation, involves the deduction of “a narrower statistical uniformity” from a more general set of premises, at least one of which involves a more general statistical law. Since DS explanation involves deduction of the explanandum from a law, it conforms to the same general pattern as the DN explanation of regularities.
However, in addition to DS explanation, Hempel also recognizes a distinctive sort of statistical explanation, which he calls inductive-statistical or IS explanation, involving the subsumption of individual events (like the recovery of a particular person from streptococcus infection) under (what he regards as) statistical laws (such as a law specifying the probability of recovery, given that penicillin has been taken). While the explanandum of a DN or DS explanation can be deduced from the explanans, one cannot deduce that some particular individual, John Jones, has recovered from the above statistical law and the information that he has taken penicillin. At most what can be deduced from this information is that recovery is more or less probable. In IS explanation, the relation between explanans and explanandum is, in Hempel's words, “inductive,” rather than deductive, hence the name inductive-statistical explanation. The details of Hempel's account are complex, but the underlying idea is roughly this: an IS explanation will be good or successful to the extent that its explanans confers high probability on its explanandum outcome.

2.4 Motivation for the DN Model: Nomic Expectability and a Regularity Account of Causation

Why suppose that all (or even some) explanations have a DN or IS structure? There are two ideas which play a central motivating role in Hempel's (1965) discussion. The first connects the information provided by a DN argument with a certain conception of what it is to achieve understanding of why something happens: it appeals to an idea about the object or point of giving an explanation. Hempel writes:

… a DN explanation answers the question “Why did the explanandum-phenomenon occur?” by showing that the phenomenon resulted from certain particular circumstances, specified in C_1, C_2, …, C_k, in accordance with the laws L_1, L_2, …, L_r. By pointing this out, the argument shows that, given the particular circumstances and the laws in question, the occurrence of the phenomenon was to be expected; and it is in this sense that the explanation enables us to understand why the phenomenon occurred. (1965, p. 337, italics in original)

One can think of IS explanation as involving a natural generalization of this idea. While an IS explanation does not show that the explanandum-phenomenon was to be expected with certainty, it does the next best thing: it shows that the explanandum-phenomenon is at least to be expected with high probability and in this way provides understanding. Stated more generally, both the DN and IS models share the common idea that, as Salmon (1989) puts it, “the essence of scientific explanation can be described as nomic expectability, that is expectability on the basis of lawful connections” (1989, p. 57).

The second main motivation for the DN/IS model has to do with the role of causal claims in scientific explanation. There is considerable disagreement among philosophers about whether all explanations in science and in ordinary life are causal and also disagreement about what the distinction (if any) between causal and non-causal explanations consists in.[5] Nonetheless, virtually everyone, including Hempel, agrees that many scientific explanations cite information about causes. However, Hempel, along with most other early advocates of the DN model, is unwilling to take the notion of causation as primitive in the theory of explanation, that is, he is unwilling to simply say that X figures in an explanation of Y if and only if X causes Y. Instead, adherents of the DN model have generally looked for an account of causation that satisfies the empiricist requirements described in Section 1.
In particular, advocates of the DN model have generally accepted a broadly Humean or regularity theory of causation, according to which (very roughly) all causal claims imply the existence of some corresponding regularity (a “law”) linking cause to effect. This is then taken to show that all causal explanations “imply,” perhaps only “implicitly,” that such a law/regularity exists and hence that laws are “involved” in all such explanations, just as the DN model claims.

To illustrate this line of argument, consider

(2.4.1) The impact of my knee on the desk caused the tipping over of the inkwell.

(2.4.1) is a so-called singular causal explanation, advanced by Michael Scriven (1962) as a counterexample to the claim that the DN model describes necessary conditions for successful explanation. According to Scriven, (2.4.1) explains the tipping over of the inkwell even though no law or generalization figures explicitly in (2.4.1) and (2.4.1) appears to consist of a single sentence, rather than a deductive argument. Hempel's response (1965, 360ff) is that the occurrence of “caused” in (2.4.1) should not be left unanalyzed or taken as explanatory just as it stands. Instead (2.4.1) should be understood as “implicitly” or “tacitly” claiming there is a “law” or regularity linking knee impacts to tipping over of inkwells. According to Hempel, it is the implicit claim that some such law holds that “distinguishes” (2.4.1) from “a mere sequential narrative” in which the spilling is said to follow the impact but without any claim of causal connection, a narrative that (Hempel thinks) would clearly not be explanatory. This linking law is the nomological premise in the DN argument that, according to Hempel, is “implicitly” asserted by (2.4.1).

There are two related but distinct ways of understanding this argument, both of which are suggested by portions of Hempel's discussion.
According to the first, Hempel's claim is that the real underlying structure of (2.4.1) is something like:

(2.4.2)
(L) Whenever knees impact tables on which an inkwell sits and further conditions K are met (where K specifies that the impact is sufficiently forceful, etc.), the inkwell will tip over. (Reference to K is necessary since the impact of knees on tables with inkwells does not always result in tipping.)
(I) My knee impacted a table on which an inkwell sits and further conditions K are met.
(E) The inkwell tips over.

Hence, to the extent that it is explanatory, (2.4.1) “implicitly” satisfies the DN/IS requirements after all: it is a DN/IS argument (namely (2.4.2)) in disguise.

There is a second interpretation of Hempel's argument that, unlike the first interpretation, does not require that we think of the full content of (2.4.2) as somehow already implicit in (2.4.1). Instead, (2.4.2) plays the role of an ideal against which (2.4.1) should be measured. (2.4.2) spells out what information a complete, fully adequate explanation for E would need to contain, information that is present in (2.4.1) only in a partial or incomplete way. On this view of the matter, we think of (2.4.1) as an explanation-sketch (cf. Hempel, 1965b, 423ff) which conveys some of the information conveyed by (2.4.2) or points in the direction of the more complete explanation (2.4.2). Ideally, singular causal explanations like (2.4.1) should be replaced by explicit DN explanations like (2.4.2).

On either interpretation, however, the basic idea is that a proper explication of the role of causal claims in explanation leads, via a Humean or regularity theory of causation, to the conclusion that, at least ideally, explanations should satisfy the DN/IS model. Let us call this line of argument the “hidden structure” argument in recognition of the role it assigns to a hidden (or at least non-explicit) DN structure that is claimed to be associated with (2.4.1).

This strategy will be examined in Section 2.6, but let me first comment on a feature of the discussion so far that may seem puzzling. The boundaries of the category “scientific explanation” are far from clear, but while (2.4.1) is arguably an explanation, it is not what one usually thinks of as “science”; instead it is a claim from “ordinary life” or “common sense”. This raises the question of why adherents of the DN/IS model don't simply respond to the alleged counterexample (2.4.1) by denying that it is an instance of the category “scientific explanation”, that is, by claiming that the DN/IS model is not an attempt to reconstruct the structure of explanations like (2.4.1) but is rather only meant to apply to explanations that are properly regarded as “scientific”. The fact that this response is not often adopted by advocates of the DN model is an indication of the extent to which, as noted in Section 1, it is implicitly assumed in most discussions of scientific explanation that there are important similarities or continuities in structure between explanations like (2.4.1) and explanations that are more obviously scientific and that these similarities should be captured by some common account that applies to both. Indeed, it is a striking feature not just of Hempel (1965) but of many other treatments of scientific explanation that much of the discussion in fact focuses on “ordinary life” singular causal explanations similar to (2.4.1), the tacit assumption being that conclusions about the structure of such explanations have fairly direct implications for understanding explanation in science.

2.5 Explanatory Understanding and Nomic Expectability: Counterexamples to Sufficiency

As explained above, examples like (2.4.1) are potential counterexamples to the claim that the DN model provides necessary conditions for explanation. There are also a number of well-known counterexamples to the claim that the DN model provides sufficient conditions for successful scientific explanation. Here are two illustrations.

Explanatory Asymmetries.
There are many cases in which a derivation of an explanandum E from a law L and initial conditions I seems explanatory but a “backward” derivation of I from E and the same law L does not seem explanatory, even though the latter, like the former, appears to meet the criteria for successful DN explanation. For example, one can derive the length s of the shadow cast by a flagpole from the height h of the pole and the angle θ of the sun above the horizon and laws about the rectilinear propagation of light. This derivation meets the DN criteria and seems explanatory. On the other hand, a derivation (2.5.1) of h from s and θ and the same laws also meets the DN criteria but does not seem explanatory. Examples like this suggest that at least some explanations possess directional or asymmetric features to which the DN model is insensitive.

Explanatory Irrelevancies. A derivation can satisfy the DN criteria and yet be a defective explanation because it contains irrelevancies besides those associated with the directional features of explanation. Consider an example due to Wesley Salmon (Salmon, 1971, p. 34):

(2.5.2)
(L) All males who take birth control pills regularly fail to get pregnant.
(K) John Jones is a male who has been taking birth control pills regularly.
(E) John Jones fails to get pregnant.

It is arguable that (L) meets the criteria for lawfulness imposed by Hempel and many other writers. (If one wants to deny that L is a law one needs some principled, generally accepted basis for this judgment and, as explained above, it is unclear what this basis is.) Moreover, (2.5.2) is certainly a sound deductive argument in which L occurs as an essential premise. Nonetheless, most people judge that (L) and (K) are no explanation of E. There are many other similar illustrations.
For example (Kyburg 1965), it is presumably a law (or at least an exceptionless, counterfactual supporting generalization) that all samples of table salt that have been hexed by being touched with the wand of a witch dissolve when placed in water. One may use this generalization as a premise in a DN derivation which has as its conclusion that some particular hexed sample of salt has dissolved in water. But again the hexing is irrelevant to the dissolving and such a derivation is no explanation.

One obvious diagnosis of the difficulties posed by examples like (2.5.1) and (2.5.2) focuses on the role of causation in explanation. According to this analysis, to explain an outcome we must cite its causes and (2.5.1) and (2.5.2) fail to do this. As Salmon (1989, p. 47) puts it, “a flagpole of a certain height causes a shadow of a given length and thereby explains the length of the shadow”. By contrast, “the shadow does not cause the flagpole and consequently cannot explain its height”. Similarly, taking birth control pills does not cause Jones' failure to get pregnant and this is why (2.5.2) fails to be an acceptable explanation. On this analysis, what (2.5.1) and (2.5.2) show is that a derivation can satisfy the DN criteria and yet fail to identify the causes of an explanandum; when this happens the derivation will fail to be explanatory.

As explained above, advocates of the DN model would not regard this diagnosis as very illuminating, unless accompanied by some account of causation that does not simply take this notion as primitive. (Salmon in fact provides such an account, which we will consider in Section 4.)
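
The flagpole asymmetry can be made concrete with a small numerical sketch. The code below is an illustration only and is not drawn from Hempel or Salmon: the function names, the use of degrees, and the sample values are assumptions introduced here. The only "law" it encodes is the rectilinear propagation of light, which on level ground gives s = h / tan θ.

```python
import math


def shadow_length(height: float, sun_angle_deg: float) -> float:
    """Forward derivation: shadow length s from pole height h and sun angle theta.

    Uses s = h / tan(theta), which follows from light travelling in straight lines.
    """
    return height / math.tan(math.radians(sun_angle_deg))


def pole_height(shadow: float, sun_angle_deg: float) -> float:
    """Backward derivation (2.5.1): pole height h from shadow length s and theta.

    Uses the same law rearranged: h = s * tan(theta).
    """
    return shadow * math.tan(math.radians(sun_angle_deg))


if __name__ == "__main__":
    h, theta = 10.0, 30.0           # hypothetical pole height and sun angle
    s = shadow_length(h, theta)     # derive the shadow from the pole
    h_again = pole_height(s, theta)  # derive the pole from the shadow
    print(f"s = {s:.3f}, recovered h = {h_again:.3f}")
```

Both functions are equally valid as deductions from the same law and initial conditions, and nothing in the arithmetic marks the first as explanatory and the second as not. That is the sense in which the DN criteria are insensitive to the direction of explanation.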
We should note, however, that an apparent lesson of (2.5.1) and (2.5.2) is that the regularity account of causation favored by DN theorists is at best incomplete: the occurrence of c, e and the existence of some regularity or law linking them (or x's having property P and x's having property Q and some law linking these) is not a sufficient condition for the truth of the claim that c caused e or x's having P is causally or explanatorily relevant to x's having Q.

More generally, if the counterexamples (2.5.1) and (2.5.2) are accepted, it follows that the DN model fails to state sufficient conditions for explanation. Explaining an outcome isn't just a matter of showing that it is nomically expectable. There are two possible reactions one might have to this observation. One is that the idea that explanation is a matter of nomic expectability is correct as far as it goes, but that something more is required as well. According to this assessment, the DN/IS model does state a necessary condition for successful explanation and, moreover, a condition that is a non-redundant part of a set of conditions that are jointly sufficient for explanation. However, some other, independent feature, X (which will account for the directional features of explanation and ensure the kind of explanatory relevance that is apparently missing in the birth control example) must be added to the DN model to achieve a successful account of explanation. The idea is thus that Nomic Expectability + X = Explanation. Something like this idea is endorsed by the unificationist models of explanation developed by Friedman (1974) and Kitcher (1989), which are discussed in Section 5 below.

A second, more radical possible conclusion is that the DN account of the goal or rationale of explanation is mistaken in some much more fundamental way and that the DN model does not even state necessary conditions for successful explanation.
As noted above, unless the hidden structure argument is accepted, this conclusion is strongly suggested by examples like (2.4.1) (“The impact of my knee caused the tipping over of the inkwell”) which appear to involve explanation without the explicit citing of a law or a deductive structure. To assess whether the DN/IS model provides necessary conditions for explanation, we thus must consider the hidden structure strategy in more detail.

2.6 The Hidden Structure Strategy

It might seem that the contention of the hidden structure strategy, that singular causal explanations like (2.4.1) are implicit DN/IS explanations or sketches of such explanations, is at best relevant to the question of whether the DN/IS model provides an adequate reconstruction of this particular sort of explanation. In fact, however, Hempel's strategy of treating explanations as devices for conveying information, but in a “partial” or “incomplete” way, about underlying “ideal” explanations of a prima-facie quite different form that are at least partly epistemically hidden from those who use the original, non-ideal explanation has continued to be very popular in recent theorizing about scientific explanation. This strategy forms the basis, for example, for Peter Railton's (1978, 1981) contrast between an “ideal explanatory text” which contains all of the causal and nomological information relevant to some outcome of interest and the “non-ideal” explanations like (2.4.1) that we actually give. According to Railton, the latter provide “explanatory information” in virtue of conveying information about some limited portion or aspect of the ideal text and are explanatory in virtue of doing so. The hidden structure strategy also plays an important role in the unificationist account of explanation developed by Philip Kitcher (1989) who likewise insists we must “distinguish between what is said on an occasion in which explanatory information is given and the ideal underlying explanation” (Kitcher, 1989, p. 414).
Indeed, any account of explanation that, like Kitcher's unificationist model, insists that laws (or generalizations of considerable generality) and deductive structure are necessary conditions for successful explanation will need to appeal to something like the hidden structure strategy, since it is generally accepted that there are many apparent explanations that do not conform to such conditions in their overt structure.

Although the hidden structure strategy deserves more attention than it can receive here, several points seem clear. First, the notion of one explanation “conveying information about” another “underlying” explanation requires considerable spelling out. Depending on what “underlying” is understood to mean, it is arguable that there are many explanations underlying (2.4.1): (i) the explanation (2.4.2), assuming that condition K can be specified in a non-trivial way, (ii) an explanation at the level of classical physics that makes reference to laws governing inelastic collisions, the behavior of liquids when not confined to containers, and so on, and (iii) an explanation in which the behavior of the whole system is characterized in terms of some more fundamental physical theory (quantum mechanics, superstring theory, etc.). Are all of these explanations implicit in (2.4.1) or does (2.4.1) convey partial information about all of them? In what sense of “implicit” or “conveys information about” could this possibly be true?

Railton (1981) suggests that an explanatory claim provides information about an underlying ideal text if the former reduces uncertainty about some of the properties of the text, in the sense of ruling in or out various possibilities concerning its structure. As Railton recognizes, this proposal has many counterintuitive consequences. To use Railton's own example, “the relevant ideal text contains more than 102 words in English”, if true, counts as an explanation for an episode of radioactive decay (1981, p. 246).
Similarly, the claim that X and Y are correlated will count as a partial explanation of X and Y on the plausible assumption that this claim conveys the information that one of three possibilities is likely to be true (either X causes Y, or Y causes X, or they have a common cause) and thus reduces uncertainty about the contents of the ideal underlying text. This contrasts with the widespread judgment that correlations in themselves are not explanatory. Indeed, on a view like Railton's, even the claim that some outcome has no causes or is governed by no laws counts as an “explanation” of that outcome, supposing that claim is true. In fact, such a claim is apparently maximally explanatory, since it conveys everything that there is to be said about the ideal explanatory text associated with that event. Examples like these suggest that not every claim that reduces uncertainty about the contents of an ideal explanatory text should be regarded as itself explanatory; such a view allows too much to count as an explanation.

Is it plausible to regard the text that contains all of the full details of causal and nomological information relevant to some outcome as at least an “ideal” against which various candidate explanations of that outcome are to be judged? Suppose we are presented with an explanation from economics or psychology that does not appeal to any generalization that we are prepared to count as a law but that underlying this “non-ideal” explanation is some incredibly complex set of facts described in terms of classical mechanics and electromagnetism, along with the relevant laws of these theories. If, as almost certainly will be the case, this underlying “explanation” is computationally intractable, and full of irrelevant detail (see section 4 below for more on what this might mean), one might wonder in what sense it is an ideal against which the original explanation should be measured.
Will the economics explanation really be better according to whether it conveys as much information as possible about these underlying details?

Finally, consider the connection between explanation and understanding. One ordinarily thinks of an explanation as something that provides understanding. Relatedly, part of the task of a theory of explanation is to identify those structural features of explanations (or the information they convey) in virtue of which they provide understanding. For example, as noted above, the DN model connects understanding with the provision of information about nomic expectability: the idea is that understanding why an outcome occurs is a matter of seeing that it was to be expected on the basis of a law. The problem this raises for the hidden structure strategy is that the information associated with the hidden structure alleged to underlie “non-ideal” explanations like (2.4.1) is typically unknown or epistemically inaccessible to those who use the explanation. It is hard to see how this structure or information can contribute to understanding if it is epistemically hidden in this way. For example, it seems plausible that many (if not almost all) users of (2.4.1) (both those who might offer it as an explanation and those recipients who take it to provide understanding) are unaware of the DN structure that underlies it. Indeed, it is plausible that many users lack the notion of a law of nature and of a deductively valid argument and hence any notion that there is any (unknown) DN argument underlying (2.4.1). If this is the case, how can the mere obtaining of this DN structure, independently of anyone's awareness of its existence, function so as to provide understanding when (2.4.1) is used? Instead, it seems that the features of (2.4.1) that endow it with explanatory import, that make it an explanation, must be features that can be known or grasped or recognized by those who use the explanation.
A similar point will hold for many other candidate explanations that fail to conform to the DN requirements, such as explanations from sciences like economics and psychology that seem to lack laws.

What can we conclude from this discussion of the hidden structure strategy? If the strategy fails, there will be a large number of apparent explanations that fail to satisfy the necessary conditions for explanation imposed by the DN/IS model. On the other hand, it is possible that there are ways of developing the hidden structure strategy that respond adequately to the difficulties described above. If so, the idea that the DN/IS requirements are at least necessary conditions for ideal explanation may be defensible after all, although the counterexamples to the sufficiency of the model noted in Section 2.5 will remain.

Suggested Readings. The most authoritative and comprehensive statement of the DN and IS models is probably Hempel 1965b. This is reprinted in Hempel, 1965a, along with a number of other papers that touch on various aspects of the problem of scientific explanation. In addition to the references cited in this section, Salmon, 1989, pp. 46ff describes a number of well-known counterexamples to the DN/IS models and discusses their significance.

3. The SR Model

3.1. The Basic Idea

Much of the subsequent literature on explanation has been motivated by attempts to capture the features of causal or explanatory relevance that appear to be left out of examples like (2.5.1) and (2.5.2), typically within the empiricist constraints described above. Wesley Salmon's statistical relevance (or SR) model (Salmon, 1971) is a very influential attempt to capture these features in terms of the notion of statistical relevance or conditional dependence relationships.
Given some class or population A, an attribute C will be statistically relevant to another attribute B if and only if P(B|A.C) ≠ P(B|A), that is, if and only if the probability of B conditional on A and C is different from the probability of B conditional on A alone. The intuition underlying the SR model is that statistically relevant properties (or information about statistically relevant relationships) are explanatory and statistically irrelevant properties are not. In other words, the notion of a property making a difference for an explanandum is unpacked in terms of statistical relevance relationships.

To illustrate this idea, suppose that in the birth control pills example (2.5.2) the original population T includes both genders. Then P(Pregnancy|T.Male.Takes birth control pills) = P(Pregnancy|T.Male) = 0, while P(Pregnancy|T.Female.Takes birth control pills) ≠ P(Pregnancy|T.Female), assuming that not all women in the population take birth control pills. In other words, if you are a male in this population, taking birth control pills is statistically irrelevant to whether you become pregnant, while if you are a female it is relevant. In this way we can capture the idea that taking birth control pills is explanatorily irrelevant to pregnancy among males but not among females.

To characterize the SR model more precisely we need the notion of a homogeneous partition. A homogeneous partition of A is a set of subclasses or cells C_i of A that are mutually exclusive and exhaustive, where P(B|A.C_i) ≠ P(B|A.C_j) for all C_i ≠ C_j and where no further statistically relevant partition of any of the cells A.C_i can be made with respect to B, that is, there are no additional attributes D_k in A such that P(B|A.C_i) ≠ P(B|A.C_i.D_k). On the SR model, an explanation of why some member x of the class characterized by attribute A has attribute B consists of the following information:

(i) The prior probability of B within A: P(B|A) = p.
(ii) A homogeneous partition of A with respect to B, (A.C_1, …, A.C_n), together with the probability of B within each cell of the partition: P(B|A.C_i) = p_i, and
(iii) The cell of the partition to which x belongs.

To employ one of Salmon's examples, suppose we want to construct an SR explanation of why x, who has a strep infection (S), recovers quickly (Q). Let T (-T) indicate that x is (is not) treated with penicillin, and R (-R) that the subject has (does not have) a penicillin-resistant strain. Assume for the sake of argument that no other factors are relevant to quick recovery. There are four possible combinations of these properties: T.R, -T.R, T.-R, -T.-R, but let us assume that P(Q|S.T.R) = P(Q|S.-T.R) = P(Q|S.-T.-R) ≠ P(Q|S.T.-R). That is, the probability of quick recovery, given that one has strep, is the same for those who have the resistant strain regardless of whether or not they are treated and also the same for those who have not been treated. By contrast, the probability of recovery is different (presumably greater) among those with strep who have been treated and do not have the resistant strain.

In this case [S.(T.R v -T.R v -T.-R)], [S.T.-R] is a homogeneous partition of S with respect to Q. The SR explanation of x's recovery will consist of a statement of the probability of quick recovery among all those with strep (this is (i) above), a statement of the probability of recovery in each of the two cells of the above partition ((ii) above), and the cell to which x belongs, which is S.T.-R ((iii) above). Intuitively, the idea is that this information tells us about the relevance of each of the possible combinations of the properties T and R to quick recovery among those with strep and is explanatory for just this reason.

3.2 The SR Model and Low Probability Events

The SR model has a number of distinctive features that have generated substantial discussion. First, note that according to the
SR model, and in contrast to the DN/IS model, an explanation is not an argument, either in the sense of a deductively valid argument in which the explanandum follows as a conclusion from the explanans or in the sense of an inductive argument in which the explanandum follows with high probability from the explanans, as in the case of IS explanation. Instead, an explanation is an assembly of information that is statistically relevant to an explanandum. Salmon argues (and takes the birth control example (2.5.2) to illustrate) that the criteria that a good argument must satisfy (e.g., criteria that ensure deductive soundness) are simply different from those a good explanation must satisfy. Among other things, as Salmon puts it, “irrelevancies [are] harmless in arguments but fatal in explanations” (1989, p. 102). As explained above, in associating successful explanation with the provision of information about statistical relevance relationships, the SR model attempts to accommodate this observation.

A second, closely related point is that the SR model departs from the IS model in abandoning the idea that a statistical explanation of an outcome must provide information from which it follows that the outcome occurred with high probability. As the reader may check, the statement of the SR model above imposes no such high probability requirement; instead, even very unlikely outcomes will be explained as long as the criteria for SR explanation are met. Suppose that, in the above example, the probability of quick recovery from strep, given treatment and the presence of a non-resistant strain, is rather low (e.g., 0.2). Nonetheless, if the criteria (i) through (iii) above (a homogeneous partition with correct probability values for each cell in the partition) are satisfied, we may use this information to explain why x, who had a non-resistant strain of strep and was treated, recovered quickly.
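As a rough illustration of the three ingredients (i) through (iii), the following Python sketch assembles an SR explanation for the strep example. The population shares and probability values are invented for illustration and are not Salmon's, and the function names are hypothetical. The sketch also simply takes for granted, as the text does for the sake of argument, that T and R are the only relevant factors: it merges combinations with equal probability into cells but cannot itself verify objective homogeneity.

```python
from collections import defaultdict

# Hypothetical figures for people with strep (S), keyed by (treated, resistant).
# Each value is (share of the strep population, P(Q | cell)).
CELLS = {
    (True, True): (0.10, 0.1),    # S.T.R
    (False, True): (0.10, 0.1),   # S.-T.R
    (False, False): (0.30, 0.1),  # S.-T.-R
    (True, False): (0.50, 0.2),   # S.T.-R
}


def prior(cells):
    """(i) P(Q|S): the population-weighted average over all combinations."""
    return sum(share * p for share, p in cells.values())


def partition(cells):
    """(ii) Merge combinations with equal P(Q | cell) into one partition cell."""
    merged = defaultdict(list)
    for combo, (_, p) in cells.items():
        merged[p].append(combo)
    return dict(merged)


def sr_explanation(cells, member):
    """Collect (i), (ii) and (iii) for an individual in combination `member`."""
    return {
        "prior": prior(cells),
        "partition": partition(cells),
        "member_cell_probability": cells[member][1],  # (iii)
    }


# x was treated and has a non-resistant strain: S.T.-R
explanation = sr_explanation(CELLS, (True, False))
print(explanation)
```

With these made-up numbers the partition has two cells, and the probability attached to x's cell is only 0.2. Nothing in the assembled information requires the explanandum to be probable, which is the point of contrast with the IS model.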
Indeed, according to the SR model, we may explain why some x which is A is B, even if the conditional probability of B given A and the cell C_i to which x belongs (p_i = P(B|A.C_i)) is less than the prior probability (p = P(B|A)) of B in A. For example, if the prior probability of quick recovery among all those with any form of strep is 0.5 and the probability of quick recovery of those with a resistant strain who are untreated is 0.1, we may nonetheless explain why y, who meets these last conditions (-T.R), recovered quickly (assuming he did) by citing the cell to which he belongs (the fact that he had the resistant strain and was untreated), the probability of recovery given that he falls in this cell, and the other sort of information described above. More generally, what matters on the SR model is not whether the value of the probability of the explanandum-outcome is high or low (or even high or low in comparison with its prior probability) but rather whether the putative explanans cites all and only statistically relevant factors and whether the probabilities it invokes are correct. One consequence of this, which Salmon endorses while acknowledging that many will regard it as unintuitive, is that on the SR model, the same explanans E may explain both an explanandum M and explananda that are inconsistent with M, such as -M. For example, the same explanans will explain both why a subject with strep and certain other properties (e.g., T and -R) recovers quickly, if he does, and also why he does not recover if he does not. By contrast, on the DN or IS models, if E explains M, E cannot also explain -M.

The intuition that, contrary to the IS model, the probability value that a candidate explanans assigns to an explanandum-outcome should not matter for the goodness of the explanation it provides can be motivated in the following way. Consider a genuinely indeterministic coin which is biased strongly (p = 0.9) toward heads when tossed.
Suppose that if it is not tossed the coin has probability of 0.5 of being in either the heads or tails position and that whether or not the coin is tossed is the only factor that is statistically relevant to whether it is heads or tails. According to the IS model, if the coin is tossed and comes up heads, we can explain this outcome by appealing to the fact that the coin was tossed (since under this condition the probability of heads is high) but if the coin is tossed and comes up tails we cannot explain this outcome, since its probability is low. The contrary intuition underlying the SR model is that we understand both outcomes equally well. The fact that the coin has been tossed is the only factor relevant to either outcome and that factor is common to both outcomes. Once we have cited the toss (and specified the probability values for heads and tails on tossing), we have left nothing out that influences the outcome. There is, to be sure, the brute fact that heads is much more probable than tails on tossing but this is not a factor in addition to tossing that is relevant to or influences the outcome; ex hypothesi, there is no such additional factor. Similarly, Salmon argues, if it is really true that the partition in the example involving quick recovery from strep is objectively homogeneous (if there are no other factors that are statistically relevant to quick recovery besides whether the subject has been treated and has a resistant strain), then once we have specified the probability of quick recovery under all combinations of these factors, and the combination of factors possessed by the subject whose recovery (or not, as the case may be) we want to explain, we have specified all information relevant to recovery and in this sense fully explained the outcome for the subject.[6]

3.3. What Do Statistical Theories Explain?
In assessing these claims, it will be useful to take a step back and ask just what it is that these competing models of statistical explanation (Hempel's IS model and Salmon's SR model) are intended to be reconstructions of. In the literature on this topic two classes of examples or applications figure prominently. First, there are examples drawn from quantum mechanics (QM). Suppose, for example, a particle has a probability p that is strictly between 0 and 1 of penetrating a potential barrier. Models of statistical explanation assume that if the particle does penetrate the barrier, QM explains this outcome; the IS and SR models are intended to capture the structure of such explanations. Second, there are examples drawn from biomedical (or epidemiological) and social scientific applications: recovery from strep or, to cite one of Salmon's extended illustrations (Salmon, 1971), the factors relevant to juvenile delinquency in teen-age boys.

This is, to say the least, a heterogeneous class of examples. In the case of QM, the usual understanding is that the various no-hidden-variable results establish that any empirically adequate theory of quantum mechanical phenomena must be irreducibly indeterministic. It is thus plausible that when we use the Schrödinger equation to derive the probability that a particle with a certain kinetic energy will tunnel through a potential barrier of a certain shape, this representation satisfies the SR model's “objective homogeneity” condition: there are no additional omitted variables that would affect the probability of barrier penetration. By contrast, it seems quite unlikely that this homogeneity condition will be satisfied in most (indeed, in any) of the biomedical and sociological illustrations that have figured in the literature on statistical explanation.
In the case of recovery from strep, for example, it is very plausible that there are many other factors besides the two mentioned above that affect the probability of recovery. These additional factors will include the state of the subject's immune system, various features of the subject's general level of health, the precise character of the strain of disease to which the subject is exposed (resistant versus non-resistant is almost certainly too coarse-grained a dichotomy) and so on. Similarly for episodes of juvenile delinquency. In these cases, in contrast to the cases from quantum mechanics, we lack a theory or body of results that delimits the factors that are potentially relevant to the probability of the outcome that interests us. Thus, in realistic examples of assemblages of statistically relevant factors from biomedicine and social science, the objective homogeneity condition is unlikely to be satisfied or, in any practical sense, satisfiable.

A related difference concerns the way in which statistical evidence figures in these two sorts of applications. Some quantum mechanical phenomena such as radioactive decay are irreducibly indeterministic. By contrast, in the biomedical and social scientific applications, while the relevant evidence is “statistical”, there is typically no corresponding assumption that the phenomena of interest are irreducibly indeterministic. This is particularly clear in connection with the social scientific examples (such as risk factors for juvenile delinquency) that Salmon discusses. Here the relevant methodology involves so-called causal modeling or structural equation techniques.
At least on the most straightforward way of applying such procedures, the equations that govern whether a particular individual becomes a juvenile delinquent are (if interpreted literally) deterministic. According to such approaches, the phenomena being modeled look as though they are indeterministic because some of the variables which are relevant to their behavior, the influence of which is summarized by a so-called error term, are unknown or unmeasured. Statistical information about the incidence of juvenile delinquency among individuals in various conditions plays the role of evidence that is used to estimate parameters (the coefficients) in the deterministic equations that are taken to describe the processes governing the onset of delinquency. A similar point holds for at least many biomedical examples.[7]

Several preliminary conclusions are suggested by these observations. First, it is far from obvious that we should try to construct a single, unified model of statistical explanation that applies to both quantum mechanics and macroscopic phenomena like delinquency or recovery from infection. Second, and relatedly, while explanation in QM satisfies the objective homogeneity condition, it is dubious that the sorts of “statistical explanations” found in the social and biomedical sciences do so. In other words, if an objective homogeneity condition is imposed on statistical explanation, it is not clear that there will be any examples of successful statistical explanation outside of quantum mechanics.

With these observations in mind, let us revisit the question of what is explained by statistical theories, whether quantum mechanical or macroscopic. As we have seen, both Hempel and Salmon, as well as most subsequent contributors to the literature on statistical explanation, have tended to assume that statistical theories that assign a probability to some outcome strictly between 0 and 1 should nonetheless be interpreted as explaining that outcome.
Given this common starting point, Salmon is quite persuasive in arguing that it is arbitrary to hold, as Hempel does, that only individual outcomes with high probability can be explained. But why should we accept the starting point? Why not take Salmon's argument instead to be a reason for rejecting the idea that statistical theories explain individual outcomes, whether of high or low probability? If we take this view, we need not conclude that a theory like QM is unexplanatory. Instead, we may take the explananda of QM to be facts about the probabilities or expectation values of outcomes rather than individual outcomes themselves. On this view, the explananda that are explained by QM are a (proper) subset of those that can be derived from it; at least in this respect, the explanations provided by QM are like DN explanations in structure. Woodward (1989) argues that this construal allows us to say all that we might legitimately wish to say about the explanatory virtues of QM. If this is correct, there is no obvious need for a separate theory of statistical explanation of individual outcomes of the sort that Hempel and Salmon sought to devise (but see footnote 7).

In the case of juvenile delinquency and causal modeling techniques it is, if anything, even more intuitive that what is being explained is not, e.g., why some particular boy, Albert, became a juvenile delinquent, but rather something more general, e.g., why the expected incidence of delinquency is higher among certain subgroups than others. Again such explananda are deducible from the system of equations used to model juvenile delinquency. Taking this view of what is explained by statistical theories allows us to avoid various unintuitive consequences of Hempel's model (e.g., that high probability but not low probability outcomes are explained) and of Salmon's model (e.g., the same explanans E explains both M and -M).
At the very least, those who have sought to construct models of statistical explanation of individual outcomes need to provide a more detailed elucidation of why such models are needed and of the features of scientific theorizing they are designed to capture.[8]

3.4 Causation and Statistical Relevance Relationships

As we have just seen, the SR model raises a number of interesting questions about the statistical explanation of individual outcomes, questions that are important independently of the details of the SR model itself. This section will abstract away from such questions and focus instead on the root motivation for the SR model. We may take this to consist of two ideas: (i) explanations must cite causal relationships and (ii) causal relationships are captured by statistical relevance relationships. Even if (i) is accepted, a fundamental problem with the SR model is that (ii) is false: as a substantial body of work[9] has made clear, causal relationships are greatly underdetermined by statistical relevance relationships. Consider another example from Salmon (1971): a system in which atmospheric pressure A is a common cause of the occurrence of a storm S and the reading of a barometer B with no causal relationship between B and S. Salmon claims that in such a system B and S will be correlated but that B is statistically irrelevant to S given A, i.e., P(S|A.B) = P(S|A). By contrast, (Salmon claims) A remains relevant to S given B, i.e., P(S|A.B) ≠ P(S|B). Similarly, S is irrelevant to B given A but A remains relevant to B given S. In this way, Salmon's SR model attempts to capture the idea that A is explanatorily (and causally) relevant to S while B is not and that A is explanatorily and causally relevant to B while S is not.
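The screening-off pattern in the barometer example can be checked on a small joint distribution. The Python sketch below is only an illustration: the probability values are invented (Salmon gives none), A, B and S are treated as simple true/false events, and the helper names are hypothetical. It builds the joint distribution from a common-cause structure and then computes the conditional probabilities discussed above using exact fractions.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical numbers: A = low pressure, B = barometer drops, S = storm.
P_A = F(3, 10)
P_B_GIVEN_A = {True: F(9, 10), False: F(1, 10)}
P_S_GIVEN_A = {True: F(8, 10), False: F(2, 10)}


def joint():
    """Joint distribution over (a, b, s) in which A is a common cause of B and S."""
    table = {}
    for a, b, s in product([True, False], repeat=3):
        pa = P_A if a else 1 - P_A
        pb = P_B_GIVEN_A[a] if b else 1 - P_B_GIVEN_A[a]
        ps = P_S_GIVEN_A[a] if s else 1 - P_S_GIVEN_A[a]
        table[(a, b, s)] = pa * pb * ps
    return table


def prob(table, event, given=lambda a, b, s: True):
    """Conditional probability P(event | given) computed by enumeration."""
    numerator = sum(p for k, p in table.items() if event(*k) and given(*k))
    denominator = sum(p for k, p in table.items() if given(*k))
    return numerator / denominator


table = joint()
storm = lambda a, b, s: s
p_s = prob(table, storm)
p_s_given_a = prob(table, storm, lambda a, b, s: a)
p_s_given_b = prob(table, storm, lambda a, b, s: b)
p_s_given_ab = prob(table, storm, lambda a, b, s: a and b)

print("P(S) =", p_s, " P(S|B) =", p_s_given_b)          # B and S are correlated
print("P(S|A) =", p_s_given_a, " P(S|A.B) =", p_s_given_ab)  # A screens off B
```

In this toy distribution P(S|B) differs from P(S), so the barometer reading and the storm are correlated, while P(S|A.B) equals P(S|A) and differs from P(S|B), which is the asymmetry Salmon relies on.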
These contentions about the connection between causal claims and statistical relevance relations are consequences of a more general principle called the Causal Markov condition which has been extensively discussed in the recent literature on causation.[10] A set of variables standing in a causal relationship and an associated probability distribution over those variables satisfy the Causal Markov condition if and only if conditional on its direct causes every variable is independent of every other variable except possibly for its effects. Two relevant points have emerged from discussion of this condition. The first, which was in effect noted by Salmon himself in work subsequent to his (1971), is that there are circumstances in which the Causal Markov condition fails and hence in which causal claims do not imply the screening off relationships described above. This can happen, for example, if the variables to which the condition is applied are characterized in an insufficiently fine-grained way.[11]

The second and more fundamental observation is that, depending on the details of the case, many different sets of causal relationships may be compatible with the same statistical relevance relationships. For example, a structure in which B causes A which in turn causes S will, if we assume the Causal Markov condition (that is, make assumptions like Salmon's connecting causation and statistical relevance relationships), lead to exactly the same statistical relevance relationships as in the example in which A is a common cause of B and S. Similarly if S causes A which in turn causes B. In structures with more variables, this underdetermination of causal relationships by statistical relevance relationships may be far more extreme. Thus a list of statistical relevance relationships, which is what the SR model provides, need not tell us which causal relationships are operative.
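The underdetermination point can be made concrete with a small numerical check. The following Python sketch is an illustration under stated assumptions, not a result from the literature cited: the probabilities are invented, A, B and S are binary, and the helper names are hypothetical. It starts from a joint distribution generated by the common-cause structure and shows that the very same distribution can be written in the form a chain B → A → S, or a chain S → A → B, would require, so the statistical relevance relationships alone cannot distinguish the three structures.

```python
from fractions import Fraction as F
from itertools import product

A, B, S = 0, 1, 2  # positions of the variables in each outcome tuple


def common_cause_joint():
    """P(a, b, s) = P(a) * P(b|a) * P(s|a), with hypothetical numbers."""
    p_a = F(3, 10)
    p_b = {True: F(9, 10), False: F(1, 10)}
    p_s = {True: F(4, 5), False: F(1, 5)}
    table = {}
    for a, b, s in product([True, False], repeat=3):
        pa = p_a if a else 1 - p_a
        pb = p_b[a] if b else 1 - p_b[a]
        ps = p_s[a] if s else 1 - p_s[a]
        table[(a, b, s)] = pa * pb * ps
    return table


def prob(table, fixed):
    """Probability that the variables at the given positions take the given values."""
    return sum(p for k, p in table.items()
               if all(k[i] == v for i, v in fixed.items()))


def chain_joint(table, first, middle, last):
    """Rebuild the joint as P(first) * P(middle|first) * P(last|middle)."""
    rebuilt = {}
    for k in table:
        p_first = prob(table, {first: k[first]})
        p_middle = prob(table, {first: k[first], middle: k[middle]}) / p_first
        p_last = (prob(table, {middle: k[middle], last: k[last]})
                  / prob(table, {middle: k[middle]}))
        rebuilt[k] = p_first * p_middle * p_last
    return rebuilt


original = common_cause_joint()
chain_b_a_s = chain_joint(original, B, A, S)  # B causes A causes S
chain_s_a_b = chain_joint(original, S, A, B)  # S causes A causes B

print(chain_b_a_s == original, chain_s_a_b == original)
```

Both chain factorizations reproduce the original distribution exactly because, in each of the three structures, B and S are independent conditional on A.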
To the extent that explanation has to do with the identification of the causal relationships on which an explanandum-outcome depends, the SR model fails to fully capture these.

Selected Readings. Salmon, 1971a provides a detailed statement and defense of the SR model. This essay, as well as papers by Jeffrey (1969) and Greeno (1970) which defend views broadly similar to the SR model, are collected in Salmon, 1971b. Additional discussion of the model as well as a more recent characterization of “objective homogeneity” can be found in Salmon, 1984. Cartwright, 1979 contains some influential criticisms of the SR model. Theorems specifying the precise extent of the underdetermination of causal claims by evidence about statistical relevance relationships can be found in Spirtes, Glymour and Scheines, 1993, 2000, chapter 4.

4. The Causal Mechanical Model

4.1 The Basic Idea

In more recent work (especially, Salmon, 1984) Salmon abandoned the attempt to characterize explanation or causal relationships in purely statistical terms. Instead, he developed a new account which he called the Causal Mechanical (CM) model of explanation, an account which is similar in both content and spirit to so-called process theories of causation of the sort defended by philosophers like Philip Dowe (Dowe, 2000). We may think of the CM model as an attempt to capture the “something more” involved in causal and explanatory relationships over and above facts about statistical relevance, again while attempting to remain within a broadly Humean framework.

The CM model employs several central ideas. A causal process is a physical process, like the movement of a baseball through space, that is characterized by the ability to transmit a mark in a continuous way. (“Continuous” generally, although perhaps not always, means “spatio-temporally continuous”.) Intuitively, a mark is some local modification to the structure of a process, for example, a scuff on the surface of a baseball or a dent in an automobile fender.
A process is capable of transmitting a mark if, once the mark is introduced at one spatio-temporal location, it will persist to other spatio-temporal locations even in the absence of any further interaction. In this sense the baseball will transmit the scuff mark from one location to another. Similarly, a moving automobile is a causal process because a mark in the form of a dent in a fender will be transmitted by this process from one spatio-temporal location to another. Causal processes contrast with pseudo-processes which lack the ability to transmit marks. An example is the shadow of a moving physical object. The intuitive idea is that, if we try to mark the shadow by modifying its shape at one point (for example, by altering a light source or introducing a second occluding object), this modification will not persist unless we continually intervene to maintain it as the shadow occupies successive spatio-temporal positions. In other words, the modification will not be transmitted by the structure of the shadow itself, as it would in the case of a genuine causal process.

We should note for future reference that, as characterized by Salmon, the ability to transmit a mark is clearly a counterfactual notion, in several senses. To begin with, a process may be a causal process even if it does not in fact transmit any mark, as long as it is true that if it were appropriately marked, it would transmit the mark. Moreover, the notion of marking itself involves a counterfactual contrast: a contrast between how a process behaves when marked and how it would behave if left unmarked. Although Salmon, like Hempel, has always been suspicious of counterfactuals, his view at the time that he first introduced the CM model was that the counterfactuals involved in the characterization of mark transmission were relatively unproblematic, in part because they seemed experimentally testable in a fairly direct way.
Nonetheless the reliance of the CM model, as originally formulated, on counterfactuals shows that it does not completely satisfy the Humean strictures described above. In subsequent work, described in Section 4.4 below, Salmon attempted to construct a version of the CM model that completely avoids reliance on counterfactuals.

The other major element in Salmon's model is the notion of a causal interaction. A causal interaction involves a spatio-temporal intersection between two causal processes which modifies the structure of both: each process comes to have features it would not have had in the absence of the interaction. A collision between two cars that dents both is a paradigmatic causal interaction.

According to the CM model, an explanation of some event E will trace the causal processes and interactions leading up to E (Salmon calls this the etiological aspect of the explanation), or at least some portion of these, as well as describing the processes and interactions that make up the event itself (the constitutive aspect of explanation). In this way, the explanation shows how E “fit[s] into a causal nexus” (1984, p. 9).

The suggestion that explanation involves “fitting” an explanandum into a causal nexus does not give us any very precise characterization of what the relationship between E and other causal processes and interactions must be if information about the latter is to explain E. Nonetheless, it seems clear enough how the intuitive idea is meant to apply to specific examples. Suppose that a cue ball, set in motion by the impact of a cue stick, strikes a stationary eight ball with the result that the eight ball is put in motion and the cue ball changes direction. The impact of the stick also transmits some blue chalk to the cue ball which is then transferred to the eight ball on impact.
The cue stick, the cue ball, and the eight ball are causal processes, as is shown by the transmission of the chalk mark, and the collision of the cue stick with the cue ball and the collision of the cue and eight balls are causal interactions. Salmon's idea is that citing such facts about processes and interactions explains the motion of the balls after the collision; by contrast, if one of these balls casts a shadow that moves across the other, this will be causally and explanatorily irrelevant to its subsequent motion since the shadow is a pseudo-process.

4.2 The CM Model and Explanatory Relevance

As the cue ball example illustrates, the CM model takes as its paradigms of causal interaction examples such as collisions in which there is “action by contact” and no spatio-temporal gaps in the transmission of causal influence. There is little doubt that explanations in which there are no such gaps (no “action at a distance”) often strike us as particularly satisfying.[12]

However, as Christopher Hitchcock shows in an illuminating paper (Hitchcock, 1995), even here the CM model leaves out something important. Consider the usual elementary textbook “scientific explanation” of the motion of the balls in the above example following their collision. This explanation proceeds by deriving that motion from information about their masses and velocities before the collision, the assumption that the collision is perfectly elastic, and the law of the conservation of linear momentum. We usually think of the information conveyed by this derivation as showing that it is the mass and velocity of the balls, rather than, say, their color or the presence of the blue chalk mark, that is explanatorily relevant to their subsequent motion. However, it is hard to see what in the CM model allows us to pick out the linear momentum of the balls, as opposed to these other features, as explanatorily relevant.
Part of the difficulty is that to express such relatively fine-grained judgments of explanatory relevance (that it is linear momentum rather than chalk marks that matters) we need to talk about relationships between properties or magnitudes and it is not clear how to express such judgments in terms of facts about causal processes and interactions. Both the linear momentum and the chalk mark communicated to the cue ball by the cue stick are marks transmitted by the spatio-temporally continuous causal process consisting of the motion of the cue ball. Both marks are then transmitted via an interaction to the eight ball. There appears to be nothing in Salmon's notion of mark transmission or the notion of a causal process that allows one to distinguish between the explanatorily relevant momentum and the explanatorily irrelevant blue chalk mark.

Ironically, as Hitchcock goes on to note, a similar observation may be made about the birth control pills example (2.5.2) originally devised by Salmon to illustrate the failure of the DN model to capture the notion of explanatory relevance. Spatio-temporally continuous causal processes that transmit marks as well as causal interactions are at work when male Mr. Jones ingests birth control pills — the pills dissolve, components enter his bloodstream, are metabolized or processed in some way, and so on. Similarly, spatio-temporally continuous causal processes (albeit different processes) are at work when female Ms. Jones takes birth control pills. However, the pills are irrelevant to Mr. Jones' non-pregnancy, and relevant to Ms. Jones' non-pregnancy. Again, it looks as though the relevance or irrelevance of the birth control pills to Mr. or Ms. Jones' failure to become pregnant cannot be captured just by asking whether the processes leading up to these outcomes are causal processes in Salmon's sense.
A similar point holds for the hexed salt example (2.6.3) — there are spatio-temporally continuous causal processes running from the witch's wand that touches the salt sample to the individual Na and Cl ions formed when the salt dissolves but this is not sufficient for the hexing to be causally (or explanatorily) relevant to the dissolving.

A more general way of putting the problem revealed by these examples is that those features of a process P in virtue of which it qualifies as a causal process (ability to transmit mark M) may not be the features of P that are causally or explanatorily relevant to the outcome E that we want to explain (M may be irrelevant to E with some other property R of P being the property which is causally relevant to E). So while mark transmission may well be a criterion that correctly distinguishes between causal processes and pseudo-processes, it does not, as it stands, provide the resources for distinguishing those features or properties of a causal process that are causally or explanatorily relevant to an outcome and those features that are irrelevant.

4.3 The CM Model and Complex Systems

A second set of worries has to do with the application of the CM model to systems which depart in various respects from simple physical paradigms such as the collision described above. There are a number of examples of such systems. First, there are theories like Newtonian gravitational theory which involve “action at a distance” in a physically interesting sense. Second, there are a number of examples from the literature on causation that do not involve physically interesting forms of action at a distance but which arguably involve causal interactions without intervening spatio-temporally continuous processes or transfer of energy and momentum from cause to effect.
These include cases of causation by omission and causation by “double prevention” or “disconnection.”[13] In all these cases, a literal application of the CM model seems to yield the judgment that no explanation has been provided — that Newtonian gravitational theory is unexplanatory and so on. Many philosophers have been reluctant to accept this assessment.

Yet another class of examples that raise problems for the CM model involves putative explanations of the behavior of complex or “higher level” systems — explanations that do not explicitly cite spatio-temporally continuous causal processes involving transfer of energy and momentum, even though we may think that such processes are at work at a more “underlying” level. Most explanations in disciplines like biology, psychology and economics fall under this description, as do a number of straightforwardly physical explanations.

As an illustration, suppose that a mole of gas is confined to a container of volume V1, at pressure P1, and temperature T1. The gas is then allowed to expand isothermally into a larger container of volume V2. One standard way of explaining the behavior of the gas — its rate of diffusion and its subsequent equilibrium pressure P2 — appeals to the generalizations of phenomenological thermodynamics — e.g., the ideal gas law, Graham's law of diffusion, and so on. Salmon appears to regard putative explanations based on at least the first of these generalizations as not explanatory because they do not trace continuous causal processes — he thinks of the individual molecules as causal processes but not the gas as a whole.[14] However, it is plainly impossible to trace the causal processes and interactions represented by each of the 6 × 10^23 molecules making up the gas and the successive interactions (collisions) it undergoes with every other molecule. The usual statistical mechanical treatment, which Salmon presumably would regard as explanatory, does not attempt to do this.
Instead, it makes certain general assumptions about the distribution of molecular velocities and the forces involved in molecular collisions and then uses these, in conjunction with the laws of mechanics, to derive and solve a differential equation (the Boltzmann transport equation) describing the overall behavior of the gas. This treatment abstracts radically from the details of the causal processes involving particular individual molecules and instead focuses on identifying higher level variables that aggregate over many individual causal processes and that figure in general patterns that govern the behavior of the gas.

This example raises a number of questions. Just what does the CM model require in the case of complex systems in which we cannot trace individual causal processes, at least at a fine-grained level? How exactly does the causal mechanical model avoid the (disastrous) conclusion that any successful explanation of the behavior of the gas must trace the trajectories of individual molecules? Does the statistical mechanical explanation described above successfully trace causal processes and interactions or specify a causal mechanism in the sense demanded by the CM model, and if so, what exactly does tracing causal processes and interactions involve or amount to in connection with such a system? As matters now stand both the CM model and the process theories of causation that are its more recent descendants are incomplete.

There is another aspect of this example that is worthy of comment. Even if, per impossibile, an account that traced individual molecular trajectories were to be produced, there are important respects in which it would not provide the sort of explanation of the macroscopic behavior of the gas that we are likely to be looking for — and not just because such an account would be far too complex to be followed by a human mind.
There are a very large number of different possible trajectories of the individual molecules in addition to the trajectories actually taken that would produce the macroscopic outcome — the final pressure P2 — that we want to explain. This information is certainly explanatorily relevant to the macroscopic behavior of the gas and we would like our account of explanation to accommodate this fact. Very roughly, given the laws governing molecular collisions, one can show that almost all (i.e., all except a set of measure zero) of the possible initial positions and momenta consistent with the initial macroscopic state of the gas, as characterized by P1, T1, and V1, will lead to molecular trajectories such that the gas will evolve to the macroscopic outcome in which the gas diffuses to an equilibrium state of uniform density through the chamber at pressure P2. Similarly, there is a large range of different microstates of the gas compatible with each of the various other possible values for the temperature of the gas and each of these states will lead to a different final pressure P2*. If we just trace the causal processes (in the form of actual molecular trajectories) that lead to P2, as the CM model requires, we will fail to represent or capture this information about the full range of conditions under which P2 and alternatives to it will occur.

A similar point holds for explanations of the behavior of other sorts of complex systems, such as those studied in biology and economics. Consider the standard explanation, in terms of an upward shift of the supply curve, with an unchanged demand curve, for the increase in the price of oranges following a freeze.
Underlying the behavior of this market are individual spatio-temporally continuous causal processes and interactions in Salmon's sense — there are a myriad of individual transactions in which money in some form is exchanged for physical goods, all of which involve transfers of matter or energy, there is exchange of information about intentions or commitments to buy or sell at various prices, all of which must take place in some physical medium and involve transfers of energy, and so on. However, it also seems plain that producing a full description of these processes (supposing for the sake of argument that it was possible to do this) will produce little or no insight into why these systems behave as they do. Again, this is not just because any such “explanation” will overwhelm our information processing abilities. It is also the case that a great deal of the information contained in such a description will be irrelevant to the behavior we are trying to explain, for the same reason that a detailed description of the individual molecular trajectories will contain information that is irrelevant to the behavior of the gas.
For example, while the detailed description of the individual causal processes involved in the operation of the market for oranges presumably will describe whether individual consumers purchase oranges by cash, check, or credit card, whether information about the freeze is communicated by telephone or email, and so on, all of this is to a first approximation irrelevant to the equilibrium price — given the supply and demand curves, the equilibrium price will be the same as long as there is a market in which consumers are able to purchase oranges by some means, information about the freeze and about prices is available to buyers and sellers in some form, and so on.[15] Moreover, those factors that are explanatorily relevant to the equilibrium price, such as the shape of the demand and supply curves, are not in any obvious sense themselves connected by spatio-temporally continuous processes to the price (it is unclear what this claim even means), although as emphasized above, the unknown processes underlying the attainment of equilibrium are presumably spatio-temporally continuous.

Again the issue is how an account like Salmon's can capture this feature of successful explanation of the behavior of complex systems — how the account guides us to find the “right” level of description of the phenomena we are trying to explain. In fact, as the above examples illustrate, the requirements that Salmon imposes on causal processes, and in particular the requirement of spatio-temporal continuity, often seem to lead us away from the right level of description.
The level at which the spatio-temporal continuity constraint is most obviously respected (the level at which, e.g., we describe a particular consumer as exchanging cash for oranges or a grower as making an agreement via telephone with a retailer to sell at a certain price) seems to be the wrong level for achieving understanding.

4.4 More Recent Developments

In more recent work (e.g., Salmon, 1994), prompted in part by a desire to avoid certain counterexamples advanced by Philip Kitcher (Kitcher, 1989) to his characterization of mark transmission, Salmon attempted to fashion a theory of causal explanation that completely avoids any appeal to counterfactuals. In this new theory, which is influenced by the conserved process theory of causation of Dowe (Dowe, 2000), Salmon defined a causal process as a process that transmits a non-zero amount of a conserved quantity at each moment in its history. Conserved quantities are quantities so characterized in physics — linear momentum, angular momentum, charge, and so on. A causal interaction is an intersection of world lines associated with causal processes involving exchange of a conserved quantity. Finally, a process transmits a conserved quantity from A to B if it possesses that quantity at every stage without any interactions that involve an exchange of that quantity in the half-open interval (A, B].

One may doubt that this new theory really avoids reliance on counterfactuals, but an even more fundamental difficulty is that it still does not adequately deal with the problem of causal or explanatory relevance described above. That is, we still face the problem that the feature that makes a process causal (transmission of some conserved quantity or other) may tell us little about which features of the process are causally or explanatorily relevant to the outcome we want to explain. For example, a moving billiard ball will transmit many conserved quantities (linear momentum, angular momentum, charge etc.)
and many of these may be exchanged during a collision with another ball. What is it that entitles us to single out the linear momentum of the balls, rather than these other conserved quantities, as the property that is causally relevant to their subsequent motion? In cases in which there appear to be no conservation laws governing the explanatorily relevant property (i.e., cases in which the explanatorily relevant variables are not conserved quantities) this difficulty seems even more acute. Properties like “having ingested birth control pills,” “being pregnant”, or “being a sample of hexed salt” do not themselves figure in conservation laws. While one may say that both birth control pills and hexed salt are causal processes because both consist, at some underlying level, of processes that unambiguously involve the transmission of conserved quantities like mass and charge, this observation does not by itself tell us what, if anything, about these underlying processes is relevant to pregnancy or dissolution in water.

In a still more recent paper (Salmon, 1997), Salmon conceded this point. He agreed that the notion of a causal process cannot by itself capture the notion of causal and explanatory relevance. He suggested, however, that this notion can be adequately captured by appealing to the notion of a causal process and information about statistical relevance relationships (that is, information about conditional and unconditional (in)dependence relationships), with the latter capturing the element of causal or explanatory dependence that was missing from his previous account:

I would now say that (1) statistical relevance relations, in the absence of information about connecting causal processes, lack explanatory import and that (2) connecting causal processes, in the absence of statistical relevance relations, also lack explanatory import. (1997, p. 476)

This suggestion is not developed in any detail in Salmon's paper, and it is not easy to see how it can be made to work.
We noted above that statistical relevance relationships often greatly underdetermine the causal relationships among a set of variables. What reason is there to suppose that appealing to the notion of a causal process, in Salmon's sense, will always or even usually remove this indeterminacy? We also noted that the notion of a causal process cannot capture fine-grained notions of relevance between properties, that there can be causal relevance between properties instances of which (at least at the level of description at which they are characterized) are not linked by spatio-temporally continuous processes or transference of conserved quantities, and that properties can be so linked without being causally relevant (recall the chalk mark that is transmitted from one billiard ball to another). As long as it is possible (and why should it not be?) for different causal claims to imply the same facts about statistical relevance relationships and for these claims to differ in ways that cannot be fully cashed out in terms of Salmon's notions of causal processes and interactions, this new proposal will fail as well.

Selected Readings: Salmon, 1984 provides a detailed statement of the Causal Mechanical model, as originally formulated. Salmon, 1994 and 1997 provide a restatement of the model and respond to criticisms. For discussion and criticism of the CM model, see Kitcher, 1989, especially pp. 461ff, Woodward, 1989 and Hitchcock, 1995.

5. A Unificationist Account of Explanation.

5.1 The Basic Idea

The basic idea of the unificationist account is that scientific explanation is a matter of providing a unified account of a range of different phenomena. This idea is unquestionably intuitively appealing. Successful unification may exhibit connections or relationships between phenomena previously thought to be unrelated and this seems to be something that we expect good explanations to do. Moreover, theory unification has clearly played an important role in science.
Paradigmatic examples include Newton's unification of terrestrial and celestial theories of motion and Maxwell's unification of electricity and magnetism. The key question, however, is whether our intuitive notion (or notions) of unification can be made more precise in a way that allows us to recover the features that we think that good explanations should possess.

Michael Friedman (1974) is an important early attempt to do this. Friedman's formulation of the unificationist idea was subsequently shown to suffer from various technical problems (Kitcher, 1976) and subsequent development of the unificationist treatment of explanation has been most closely associated with Philip Kitcher (especially Kitcher, 1989).

Let us begin by introducing some of Kitcher's technical vocabulary. A schematic sentence is a sentence in which some of the nonlogical vocabulary has been replaced by dummy letters. To use Kitcher's examples, the sentence “Organisms homozygous for the sickling allele develop sickle cell anemia” is associated with a number of schematic sentences including “Organisms homozygous for A develop P” and “For all X if X is O and A then X is P”. Filling instructions are directions that specify how to fill in the dummy letters in schematic sentences. For example, filling instructions might tell us to replace A with the name of an allele and P with the name of a phenotypic trait in the first of the above schematic sentences. Schematic arguments are sequences of schematic sentences. Classifications describe which sentences in schematic arguments are premises and conclusions and what rules of inference are used. An argument pattern is an ordered triple consisting of a schematic argument, a set of sets of filling instructions, one for each term of the schematic argument, and a classification of the schematic argument. The more restrictions an argument pattern imposes on the arguments that instantiate it, the more stringent it is said to be.
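The vocabulary just introduced can be made concrete with a small sketch. What follows is only an illustration under stated assumptions, not Kitcher's own formalism: the class name ArgumentPattern, its field names, and the three-sentence toy argument built around the sickle cell schematic sentence are all hypothetical, and the "set of sets of filling instructions" is simplified to a single mapping from dummy letters to the kind of term allowed.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ArgumentPattern:
    """Hypothetical rendering of an argument pattern as an ordered triple."""
    # Schematic sentences; dummy letters are written in braces, e.g. {A}.
    schematic_argument: Tuple[str, ...]
    # Simplified filling instructions: dummy letter -> kind of term allowed.
    filling_instructions: Dict[str, str]
    # Classification: the role of each schematic sentence, in order.
    classification: Tuple[str, ...]

    def instantiate(self, fillers: Dict[str, str]) -> Tuple[str, ...]:
        """Replace every dummy letter to obtain a concrete derivation."""
        missing = set(self.filling_instructions) - set(fillers)
        if missing:
            raise ValueError(f"no filler supplied for: {sorted(missing)}")
        return tuple(s.format(**fillers) for s in self.schematic_argument)


# Toy pattern (an assumption for illustration only).
pattern = ArgumentPattern(
    schematic_argument=(
        "{X} is homozygous for {A}",
        "Organisms homozygous for {A} develop {P}",
        "{X} develops {P}",
    ),
    filling_instructions={
        "X": "name of an organism",
        "A": "name of an allele",
        "P": "name of a phenotypic trait",
    },
    classification=("premise", "premise", "conclusion from 1 and 2"),
)

derivation = pattern.instantiate(
    {"X": "This organism", "A": "the sickling allele", "P": "sickle cell anemia"}
)
for role, sentence in zip(pattern.classification, derivation):
    print(f"{role}: {sentence}")
```

On this rendering, a more stringent pattern would be one whose filling instructions or classification admit fewer instantiations. The sketch does not try to check that a filler really is of the required kind, which is where much of the stringency would lie.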
Roughly speaking, Kitcher's guiding idea is that explanation is a matter of deriving descriptions of many different phenomena by using as few and as stringent argument patterns as possible over and over again: the fewer the patterns used, the more stringent they are, and the greater the range of different conclusions derived, the more unified our explanations. Kitcher summarizes this view as follows:

Science advances our understanding of nature by showing us how to derive descriptions of many phenomena, using the same pattern of derivation again and again, and in demonstrating this, it teaches us how to reduce the number of facts we have to accept as ultimate. (p. 423)

Kitcher does not propose a completely general theory of how the various considerations he describes — number of conclusions, number of patterns and stringency of patterns — are to be traded off against one another, but does suggest that it often will be clear enough what these considerations imply about the evaluation of particular candidate explanations. His basic strategy is to attempt to show that the derivations we regard as good or acceptable explanations are instances of patterns that taken together score better according to the criteria just described than the patterns instantiated by the derivations we regard as defective explanations. Following Kitcher, let us define the explanatory store E(K) as the set of argument patterns that maximally unifies K, the set of beliefs accepted at a particular time in science. Showing that a particular derivation is a good or acceptable explanation is then a matter of showing that it belongs to the explanatory store.

5.2 Illustrations of the Unificationist Model

As an illustration, consider Kitcher's treatment of the problem of explanatory asymmetries (recall Section 2.5). Our present explanatory practices — call these P — are committed to the idea that derivations of a flagpole's height from the length of its shadow are not explanatory.
Kitcher compares P with an alternative systemization in which such derivations are regarded as explanatory. According to Kitcher, P includes the use of a single “origin and development” (OD) pattern of explanation, according to which the dimensions of objects (artifacts, mountains, stars, organisms, etc.) are traced to “the conditions under which the object originated and the modifications it has subsequently undergone” (1989, p. 485). Now consider the consequences of adding to P an additional pattern S (the shadow pattern) which permits the derivation of the dimensions of objects from facts about their shadows. Since the OD pattern already permits the derivation of all facts about the dimensions of objects, the addition of the shadow pattern S to P will increase the number of argument patterns in P and will not allow us to derive any new conclusions. On the other hand, if we were to drop OD from P and replace it with the shadow pattern, we would have no net change in the number of patterns in P, but would be able to derive far fewer conclusions than we would with OD, since many objects do not have shadows (or enough shadows) from which to derive all of their dimensions. Thus OD belongs to the explanatory store, and the shadow pattern does not.

Kitcher's treatment of other familiar problem cases is similar. For example, he notes that we believe that an explanation of why some sample of salt dissolves in water that appeals to the fact that the salt is hexed and the generalization (H) that all hexed salt dissolves in water is defective, at least in comparison with the standard explanation that appeals just to the generalization that (D) all salt dissolves in water. He suggests that the “basis for this belief” is that the derivation that appeals to (H) instantiates an argument pattern that belongs to a totality of patterns that is less unifying than the totality containing the derivation that appeals to (D).
In particular, an explanatory store containing (H) but not (D) will have a more restricted consequence set than a store containing (D) but not (H), since the latter but not the former allows for the derivation of facts about the dissolving of unhexed salt in water. And the addition of (H) to an explanatory store containing (D) will increase the number of patterns without any compensating gain in what can be derived.

Kitcher acknowledges that there is nothing in the unificationist account per se that requires that all explanation be deductive: “there is no bar in principle to the use of non-deductive arguments in the systemization of our beliefs”. Nonetheless, “the task of comparing the unifying power of different systemizations looks even more formidable if nondeductive arguments are considered” and in part for this reason Kitcher endorses the view that “in a certain sense, all explanation is deductive” (p. 448).

What is the role of causation on this account? Kitcher claims that “the ‘because’ of causation is always derivative from the ‘because’ of explanation.” (1989, p. 477). That is, our causal judgments simply reflect the explanatory relationships that fall out of our (or our intellectual ancestors') attempts to construct unified theories of nature. There is no independent causal order over and above this which our explanations must capture. Like many other philosophers, Kitcher takes very seriously, even if in the end he perhaps does not fully endorse, standard empiricist or Humean worries about the epistemic accessibility and intelligibility of causal claims. Taking causal, counterfactual or other notions belonging to the same family as primitive in the theory of explanation is problematic. Kitcher believes that it is a virtue of his theory that it does not do this.
Instead, Kitcher proposes to begin with the notion of explanatory unification, characterized in terms of constraints on deductive systemizations, where these constraints can be specified in a quite general way that is independent of causal or counterfactual notions, and then show how the causal claims we accept derive from our efforts at unification.

5.3 The Illustrations Criticized

As remarked at the beginning of this section, the idea that explanation is connected in some way to unification is intuitively appealing. Nonetheless Kitcher's particular way of cashing out this connection seems problematic. Consider Kitcher's treatment of the flagpole example. This depends heavily on the contingent truth that some objects do not cast enough shadows to recover all of their dimensions. But it seems to be part not just of common sense, but of currently accepted physical theory that it would be inappropriate to appeal to facts about the shadows cast by objects to explain their dimensions even in a world in which all objects cast enough shadows that all their dimensions could be recovered. It is unclear how Kitcher's account can recover this judgment.

The matter becomes clearer if we turn our attention to a variant example in which, unlike the shadow example, there are clearly just as many backwards derivations from effects to causes as there are derivations from causes to effects. Consider, following Barnes (1992), a time-symmetric theory like Newtonian mechanics, applied to a closed system like the solar system. Call derivations of the state of motion of planets at some future time t from information about their present positions (at time t0), masses, and velocities, the forces incident on them at t0, and the laws of mechanics predictive. Now contrast such derivations with retrodictive derivations in which the present motions of the planets are derived from information about their future velocities and positions at t, the forces operative at t, and so on.
It looks as though there will be just as many retrodictive derivations as predictive derivations, and each will require premises of exactly the same general sort — information about positions, velocities, masses etc. and the same laws. Thus the pattern or patterns instantiated by the retrodictive derivations look(s) exactly as unified as the pattern or patterns associated with the predictive derivations. However, we ordinarily think of the predictive derivations and not the retrodictive derivations as explanatory and the present state of the planets as the cause of their future state and not vice-versa. It is again far from obvious how considerations having to do with unification could generate such an explanatory asymmetry.

One possible response to this second example is to bite the bullet and to argue that from the point of view of fundamental physics, there really is no difference in the explanatory import of the retrodictive and predictive derivations, and that it is a virtue, not a defect, of the unificationist approach that it reproduces this judgment. Whatever might be said in favor of this response, it is not Kitcher's. His claim is that our ordinary judgments about causal asymmetries can be derived from the unificationist account. The example just described casts doubt on this claim.
More generally, it casts doubt on Kitcher's contention that one can begin with the notion of explanatory unification, understood in a way that does not presuppose causal notions, and use it to derive the content of causal judgments.

5.4 The Heterogeneity of Unification

This conclusion is reinforced by a more general consideration: unification, as it figures in science, is a quite heterogeneous notion, covering many different sorts of achievements.[16] Some kinds of unification consist in the creation of a common classificatory scheme or descriptive vocabulary where no satisfactory scheme previously existed, as when early investigators like Linnaeus constructed comprehensive and principled systems of biological classification. Another kind of unification involves the creation of a common mathematical framework or formalism which can be applied to many different sorts of phenomena, as when the systems of equations devised by Lagrange and Hamilton were first developed in connection with mechanics and then applied to domains like electromagnetism and thermodynamics. Still other cases involve what might be described as genuine physical unification, where phenomena previously regarded as having quite different causes or explanations are shown to be the result of a common set of mechanisms or causal relationships. Newton's demonstration that the orbits of the planets and the behavior of terrestrial objects falling freely near the surface of the earth are due to the same force of gravity and conform to the same laws of motion was a physical unification in this sense.

Of these three kinds of activities only the third — physical unification — seems to have much intuitively to do with explanation, at least if we think of explanation as involving the citing of causal relationships. In particular, depending on the details of the case, the kind of unification associated with adoption of a classificatory scheme may tell us little about causal relationships.
Moreover, as historical studies have made clear, a similar point holds for formal or mathematical unification: the fact that we can construct a common mathematical framework for dealing with a range of different phenomena does not by any means automatically ensure that we have identified some set of common causal factors responsible for those phenomena, i.e., that we have produced a unified physical explanation of them. For example, the mere fact that we can describe both the behavior of a system of gravitating masses and the operation of an electric circuit by means of Lagrange's equations does not mean that we have achieved a common explanation of the behavior of both or that we have “unified” gravitation and electricity in any physically interesting sense.

These considerations raise the following question: Is Kitcher's account of unification sufficiently discriminating or nuanced to distinguish those unifications having to do with explanation from other sorts of unification? The worry is that it is not. The conception of unification underlying Kitcher's account seems to be at bottom one of descriptive economy or information compression: deriving as much from as few patterns of inference as possible. Many cases of classificatory and purely formal unification involving a common mathematical framework seem to fit this characterization. Consider schemes for biological classification and schemes for the classification of geological and astronomical objects like rocks and stars. If I know that individuals belong to a certain classificatory category (e.g., Xs are mammals or polar bears), I can use this information to derive a great many of their other properties (Xs have backbones, hearts, their young are born alive, etc.) and this is a pattern of inference that can be used repeatedly for many different sorts of Xs.
But despite the willingness of some philosophers to regard such derivations as explanatory, it is common scientific practice to regard such schemes as “merely descriptive” and as telling us little or nothing about the causes or mechanisms that explain why Xs have backbones or hearts.[17]

Another illustration of the same general point is provided by the numerous statistical procedures (factor analysis, cluster analysis, multidimensional scaling techniques) that allow one to summarize or represent large bodies of statistical information in an economical, unified way and to derive more specific statistical facts from a much smaller set of assumptions by repeated use of the same pattern of argument. For example, knowing the “loading” of each of n intelligence tests on a single common factor g, one can derive a much larger number (n(n-1)/2) of conclusions about pairwise correlations among these tests. Again, however, it is doubtful that by itself this “unification” tells us anything about the causes of performance on these tests.

5.5 The Winner-Take-All Conception of Explanatory Unification

Another fundamental difficulty with the unificationist account derives from its reliance on what might be called a “winner take all” conception of unification. On the one hand, it seems that any plausible version of that account must yield the conclusion that generalizations and theories can sometimes be explanatory with respect to some set of phenomena even though more unifying explanations of those phenomena are known[18].
For example, Galileo's law can be used to explain facts about the behavior of falling bodies even though it furnishes a less unifying explanation than the laws of Newtonian mechanics and gravitational theory; the latter are in turn explanatory even though the explanations they provide are less unified than those provided by General Relativity; the theories of Coulomb and Ampere are explanatory even though the explanations they provide are less unified than the explanations provided by Maxwell's theory; and so on. If we reject this idea, we must adopt the conclusion that in any domain only the most unified theory that is known is explanatory at all; everything else is non-explanatory. Call this the winner-take-all conception of explanatory unification.

The winner-take-all conception gives up on the apparently very natural idea, which one would think that the unificationist would wish to endorse, that an explanation can provide less unification than some alternative, and hence be less deep or less good, but still qualify as somewhat explanatory. However, Kitcher's treatment of the problems of explanatory irrelevance and explanatory asymmetry seems to require just this conception. Why is it that we cannot appeal to the fact that this particular sample of salt has been hexed to explain why it dissolves? According to Kitcher, any explanatory store containing a generalization about the dissolving of hexed salt will be “less unified” than a competing explanatory store according to which the dissolving of the salt is explained by appeal to the generalization that all salt dissolves in water. Similarly, the reason why we cannot explain the height of a flagpole in terms of the length of its shadow is that explanations of lengths of objects in terms of facts about shadows do not belong to the “set of explanations” which “collectively provides the best systemization of our beliefs” (1989, p. 430).
This analysis clearly requires the winner-take-all idea that an explanation T1 that is less satisfactory from the point of view of unification than some competing alternative T2 is unexplanatory, rather than merely less explanatory than T2. If Kitcher were to reject the winner-take-all idea and hold instead that even if T2 is more unified than T1, it does not automatically follow that T1 is unexplanatory, then his solution to the problems of explanatory irrelevance and asymmetry would no longer be available: his conclusion should be that an “explanation” of Mr. Jones' failure to get pregnant in terms of his ingestion of birth control pills is genuinely explanatory, although less so than the alternative explanation that invokes his gender, and similarly for a derivation of the height of a flagpole from the length of its shadow.

Intuitively, the problem is that we need a theory of explanation that captures several different possibilities. On the one hand, there are generalizations and associated putative explanations (like the generalization relating barometric pressure to the occurrence of storms and the generalization relating the hexing of salt to its dissolution in water) that are not explanatory at all; they fall below the threshold of explanatoriness. On the other hand, above this threshold there is something more like a continuum: a generalization can be explanatory but provide less deep or good explanations than some alternative. What we have just seen is that the unificationist account has difficulty simultaneously capturing both of these possibilities. Either there is no threshold (every derivation is explanatory to some extent and it is just that some derivations belong to systemizations that are less unifying and hence less explanatory than others) or else there is no continuum (only the most unifying systemizations are explanatory).

5.6 The Epistemology of Unification

Recall that, according to Kitcher, causal knowledge derives from our efforts at unification.
However, as Kitcher also recognizes, it is highly implausible that most individuals deliberately and self-consciously go through the process of comparing competing deductive systemizations with respect to number and stringency of patterns and number of conclusions in order to determine which is most unifying. His response to this observation is to hold that most people acquire causal knowledge by absorbing the “lore” of their communities, where this lore does reflect previous systematic efforts at unification. He writes that “our everyday causal knowledge is based on our early absorption of the theoretical picture of the world bequeathed to us by our scientific tradition” (1989, p. 469).

How exactly is this suggestion supposed to work? While it is surely true that individual human beings acquire a substantial amount of causal knowledge by cultural transmission, it is also obvious that not all causal knowledge is acquired in this way. Some causal knowledge that individuals acquire involves learning from experience. Moreover, unless we are willing to make extremely implausible assumptions about the innateness of a large number of specific causal beliefs, the stock of socially transmitted causal knowledge must itself have been initially acquired in a way in which learning from experience played an important role. The question that then arises is how this process of learning from experience is supposed to work on a view like Kitcher's about the source of our causal knowledge. If, as Kitcher claims, “the idea that any one individual justifies the causal judgments that he/she makes by recognizing the patterns of argument that best unify his/her beliefs is clearly absurd” (1989, p. 436), just what is it that is going on at the individual level when people learn from experience?
One possibility is that although individuals do not knowingly go through the process of comparing the degree of unification achieved by alternative systemizations when they acquire new causal knowledge by learning from experience, they go through this process tacitly or unconsciously, perhaps because of some general disposition of the mind to seek unification. However, Kitcher does not seem to endorse this idea and it does not fit very well with his emphasis on the social transmission of causal information. Moreover, it looks as though even unconscious unification requires very sophisticated cognitive abilities (construction and comparison of different deductive systemizations etc.) that it is implausible to attribute to many causal learners, such as small children.

One natural interpretation of the passages quoted above and others in Kitcher (1989) is this: a social process of comparing alternative systemizations of beliefs and drawing out their deductive consequences occurs at the community level, with groups of people making arguments to one another about which overall deductive systemizations best unify the beliefs of the community as a whole. Particular causal beliefs are justified at the community level by being shown to be part of the best overall systemization of the beliefs of the community, and are then passed on from the common community stock to individuals via a process of social transmission.

An obvious problem with this picture is that the community-wide process of justification must still be carried out in some fashion by individual actors.
If, as appears to be the case, there are many societies which possess a substantial amount of causal and explanatory knowledge but in which no one possesses an explicit or clearly articulated concept of a deductively valid argument or is very skilled at drawing out the deductive consequences of beliefs or possesses explicit versions of Kitcher's concepts of number and stringency of argument patterns, how exactly are community beliefs that reflect the operation of these notions supposed to form? If, as Kitcher concedes, it is psychologically unrealistic to assume that individual human beings deliberately and self-consciously go through the process of comparing alternative systemizations when they acquire causal beliefs through experience, why is it any more realistic to suppose that this process somehow occurs through the interactions of individual actors at the community level?[19]

There is a second, related difficulty. Assume, for the sake of argument, that it is desirable to have a unified belief system in Kitcher's sense, whether because unification is connected to explanation and the latter is intrinsically valuable or because unification is connected to other goals (e.g., confirmation) that are desirable. It is still not obvious why it would be valuable to have a set of beliefs that are a smallish proper subset of the beliefs that comprise such a unified system, which is what most people seem to have, given Kitcher's views about the transmission of causal knowledge. Recall Kitcher's basic picture: when I acquire the belief that, say, whether salt is hexed is causally irrelevant to whether it dissolves and that whether it is placed in water is causally relevant, I acquire a fragment of the community's overa | |