Natural Selection and the Criteria by Which a Theory Is Judged

Ronald H. Brady

Abstract

When recent literature on the falsifiability of natural selection is examined, critics and defenders seem to communicate with each other very poorly. An examination of the structure of tautology and that of causal explanation provides criteria by which to examine the claims of both critics and defenders. Natural selection is free of tautology in any formulation that recognizes the causal interaction between the organism and its environment, but most recent critics have already understood this and are actually arguing that the theory is not falsifiable in its operational form. Under examination, the operational forms of the concepts of adaptation and fitness turn out to be too indeterminate to be seriously tested, for they are protected by ad hoc additions drawn from an indeterminate realm. Future knowledge may reduce the organism to a determinate system, but until such time too little is known to investigate organism–environment relations. Researchers should consider whether natural selection is necessary to empiric investigation in their area, and whether it can serve the purpose for which it is applied.

The principle of natural selection has come under increasing attack, in recent years, for any number of assumed faults. In general, the conclusion of the majority of critics has been that the principle is unnecessary and unfruitful. The defenders have, of course, rejected both claims. The critics usually bring an impressive collection of criteria for a “good scientific theory” to bear on the problem, and the defenders reply in kind. The problems are quite difficult — a number of pitfalls lurk for both sides — and an atmosphere of miscommunication seems to pervade the whole discussion. The ground is obviously not familiar terrain for either side.

A discussion of this sort is, I think, a sign of health, particularly if it serves to make us more self-conscious of the manner in which our thinking lays hold of empiric evidence. A clarification of this activity could profit any field of science, for there are questions here which simply have not been settled as yet. Much of the argumentation in this paper, therefore, could be applied to other notions of evolutionary theory, or to theories from other sciences. I do not mean to single out natural selection as the only theoretical area which might benefit from scrutiny. But scrutiny is already under way in this area, and, to my understanding, the critics have uncovered a problem worth looking at, if we can get it into focus.

The specific complaints most often put forward with regard to natural selection are that it is tautological, or untestable, or both. I propose to make an examination of these criticisms, and the counterarguments, using the intent of the theorist as a criterion of judgment. This is a standard to which every researcher should readily assent. The fruitfulness of a theory must be, for the man selecting it, the fruitfulness for his intent, for the purpose to which the theory is to be applied. If a theory, or a research activity, can be shown to fall short of its intended purpose, this would be a useful criticism, and one which the researcher could understand. I suspect that matters will be somewhat clarified by approaching them in this manner. If nothing else, the argument of my own discussion should be easier to follow. I shall begin by examining the troublesome concept of tautology with regard to causal explanation in general, since causal explanation is the intent of causal theory. The accusation of tautological reasoning has been repeated by various critics (Manser, 1965; Macbeth, 1971; Grene, 1974; Bethell, 1976; Peters, 1976) and will be a recurring problem for this discussion. It will be necessary, therefore, to formulate the concept in such a way that its effect on the theory will become apparent.

Tautology and Causal Explanation

The etymology of the term would suggest that the Greek idea was “to speak the same” — some form of repetition. The Oxford English Dictionary agrees that this is the underlying notion in the various usages listed, and it is certainly thematic to the usage made of the term in modern logic. We will not require any more rigorous definition than that provided by the Greek etymology, however, for the argument will be quite clear on this basis. It is obvious, for example, that it makes no sense to criticize Gertrude Stein’s repetition — “a rose is a rose is a rose” — by an accusation of tautology, since in this case, the mere repetition is the whole point, whereas the advertisement for the bingo game that invites “all husbands and married men” has misused the tongue. In the former case the repetition serves the author’s intention; in the latter it does not.

Tautology in logic becomes a pejorative term by specifying a useless repetition, and this is indeed what is wrong with the second example above. But we need not build the pejorative sense into the term. The basic notion is one of repetition! Once a repetition is identified, it becomes a simple task to discover whether the repeat is contrary to the speaker’s intention, and therefore useless or unfruitful for his purpose.

The usual purpose, in empirical science, of theoretical propositions is some sort of causal explanation: this because that. Such a proposition is necessarily synthetic, that is, the second half adds something new, something not already contained in the first half. The alternative is the analytic statement — “husbands are married men” — which repeats the first part in the second. Analytic statements are quite useful for purposes of definition — “a deafness is an impairment of the hearing” — but only a synthetic construction can serve as a causal explanation. When the definition strategy is used with causal intent, language breaks down. “Your deafness is caused by an impairment of your hearing” means only that your deafness is caused by your deafness — the intent to add something more than the fact of deafness is not carried through by the formulation. A scientific theory brings distinct elements into a dependent relation: the thunder is caused by the lightning; your deafness is caused by a torn eardrum. As cause and effect are not the same, the two sides of a causal proposition cannot be identical.

A theory becomes empirical by reference to observation, and may be said to gain empirical content, or empirical determination, to the degree that this reference determines either what is stated or whether that statement is accepted. That is, to the degree that such empirical information is allowed to affect our position. Beliefs about the world that are not informed by experience remain metaphysical in nature. They may indeed be true, but they are without evidence. Since the purpose of our reference to experience is clearly the addition of new information, something we do not already know, the intention is again synthetic. We want to discover, for example, whether the proposed relations hold for the observed world, and the generation of testable predictions allows our observations to bear upon the question.

Of course, this requirement can be frustrated. It is not difficult to formulate statements about the world which cannot be altered by reference to that world. If our theory had the form, for instance, of the German saying: “If the cock crows from the manure pile it will rain — or it won’t,” empirical test would be precluded. Observation could add no new information to this form of statement, and the synthetic intention of observation would be thereby denied. The form of the statement left no room for a determination through experience.

An experiment, or observational test, is projected from the theorized dependent relation between cause and effect. The causal statement predicts if this then that. In order to test this empirically we require an identification of at least two observations, or two groups of observations, corresponding to the supposed causal parameters and the situation resulting (respectively, the this and that above). Any failure to distinguish the observation or set of observations belonging to the first part from that belonging to the second — any failure, that is, to provide observations which may be compared — precludes the possibility of testing by precluding the possibility of questioning. The point of the satirical folk-saying above is that since any and all empirical conditions conform to it, no empirical result could ever question it. Because the conditions specified under the that clause do not forbid anything, the statement fails to identify a set of observations distinct from other possibilities. It cannot be empirically determined because it was not theoretically determinate.

If either side of the dependent relation fails to identify a distinct set of observations, the statement is not testable. Thus the result above could also arise from an indistinct this clause. The obvious remark that “if the right conditions occur, we will have a very good harvest,” affirms only that there is a dependent relation, but does not specify what the consequent (the harvest) is dependent on. A theory can gain empirical determination only through a formulation that is able to specify the observations to be compared.
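The requirement that a testable statement must forbid something can be made concrete in a minimal sketch. The names `OUTCOMES`, `forbidden`, `folk_saying`, and `causal_claim` are hypothetical and introduced only for illustration; the sketch assumes that each observation can be reduced to a pair of yes-or-no facts (the cock crows, it rains), and it lists the observations that would contradict each statement.

```python
from itertools import product

# Every possible observation, as a pair (cock_crows, it_rains).
OUTCOMES = list(product([True, False], repeat=2))

def forbidden(statement):
    """Return the observations that would contradict the statement."""
    return [o for o in OUTCOMES if not statement(*o)]

def folk_saying(crows, rains):
    # "If the cock crows, it will rain, or it won't."
    return (not crows) or (rains or not rains)

def causal_claim(crows, rains):
    # "If the cock crows, it will rain."
    return (not crows) or rains

print(forbidden(folk_saying))   # []
print(forbidden(causal_claim))  # [(True, False)]
```

Under these assumptions the folk saying excludes no observation at all, so no observation can count against it, while the plain conditional excludes exactly one case (crowing without rain) and can therefore be questioned by experience.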

The several frustrations of intention covered here all stem from a failure to add, to our starting point, information not already contained within it. The notion of explanation suggests that we must add something to the facts explained, unless we are to suppose that the facts explain themselves. Empirical explanation suggests further that some element of the addition will be derived from experience. Thus, a theory fails its intention if it does not add information to the events explained, and a research program — which arises through our attempt to treat the theory empirically — falls short of its goal if it does not provide an addition that is empirically derived. If my assumptions about scientific intentions are shared by biologists, then I take it that for them, as well as myself, a successful scientific endeavor must meet the requirement of form outlined here, or it cannot carry through its purpose of explanation. Since it is my impression that these intentions are shared by the biological community, I will continue by examining the various formulations of natural selection in order to discover how well they meet these requirements.

The Purpose of the Theory

The following discussion will assume that the principle of natural selection is at least intended to provide a causal explanation of the origins of morphological diversity, adaptation, and, when the principle is extended as far as Darwin proposed, speciation. It can therefore be equated with the Spencerian phrase “the survival of the fittest,” as Darwin recognized in his later editions: “I have called this principle ... by the term natural selection. But the expression often used by Mr. Herbert Spencer, of the Survival of the Fittest, is more accurate, and is sometimes equally convenient” (Darwin, 1876). The discussion will be limited to this use of the principle, since I believe it to be the most ubiquitous, and will examine how well or how badly it fulfills this purpose.

Differential Reproduction

When one observes a group of animals over time, it becomes obvious that reproductive differentials exist. From this information alone one may theorize that allele frequencies within a gene pool can shift. But this is not yet an adequate hypothesis for the origin of diversity, since changes in allele frequencies take place within stable populations. If we suppose that, either accidentally or through the agency of some causal power, the changes cumulate in a direction, then we have a hypothesis. Even Darwin did not care to suggest that such a direction could arise without causal control, and so he provided a form of control — i.e., selection. That is, he added the notion of selection to that of differential reproduction in order to generate an explanatory hypothesis. Of course, were selection to be subtracted again, we would be left with the fact of differential reproduction. No explanation of this effect can be derived from the effect itself, and thus no way to explain, much less predict, a direction from the differential.

Since tautology is fatal for any sort of causal explanation, it is somewhat mysterious to find a number of authors advancing an admittedly tautologous formulation of natural selection. Waddington, for example, published the following passage in 1960:

Natural selection, which was at first considered as though it were a hypothesis that was in need of experimental or observational confirmation, turns out on closer inspection to be a tautology, a statement of an inevitable although previously unrecognized relation. It states that the fittest individuals in a population (defined as those which leave the most offspring) will leave the most offspring. Once the statement is made, its truth is apparent. This fact in no way reduces the magnitude of Darwin’s achievement; only after it was clearly formulated, could biologists realize the enormous power of the principle as a weapon of explanation.

Macbeth (1971) found this passage “staggering.” It is even more astonishing to reflect that Macbeth’s reaction was not a common one.

The passage above defines fitness as leaving the most offspring. It then states that the fittest individuals will leave the most offspring, and it calls this statement “a weapon of explanation.” It is, to begin with, extremely difficult to believe Waddington’s claim that the idea that individuals who leave the most offspring do in fact leave the most offspring was at any time an “unrecognized relation.” But it is very clear that biologists might be slow in recognizing the explanatory power of this formulation. If the fact of deafness is to be explained, how does it help to suggest an impairment of hearing? If differential reproduction is to be explained, what light is shed by pointing to differential reproduction? We already know that some individuals leave more offspring. What we want to know is why. In particular, we should like to discover some trait or traits, independent of leaving more offspring, that explain why one individual is more “fit,” i.e., more likely than others to leave a greater number of offspring. Since the tautologous formulation does not produce any criterion of fitness other than the fact to be explained (differential reproduction), it amounts to the claim that facts explain themselves — individuals leave more offspring because they leave more offspring. Contrary to Waddington’s claim, the tautologous repetition is utterly useless for explanatory purposes.

All very obvious, or so one would expect. But J. B. S. Haldane did not think so:

the phrase “survival of the fittest” is something of a tautology. So are most mathematical theorems. There is no harm in stating the same truth in two different ways (Haldane, 1935).

I single out Waddington and Haldane from the crowd of biologists who say or imply the same thing, only because they put the fallacy in such exemplary terms. Haldane here puts forward the most popular defense of the tautology — after all, mathematics is tautological. Well yes, it is. But mathematical theorems follow from axioms, by definition. Mathematical truths are analytic in nature. Causal explanation is synthetic. How Haldane missed the difference is hard to understand, but he was clearly oblivious of it when he wrote the passage above.

I have been belaboring a simple point because I want to be well rid of it. Differential reproduction, or even differential mortality (which is closer to the original idea), has no explanatory power, but remains a datum to be explained. If natural selection means anything in a causal framework, it means that a causal factor exists, independent of differential reproduction (or mortality), the discovery of which could explain the differential. It is to this formulation, and its research program, that we must now pass.

The Hand of Nature

We cannot suppose that all the breeds were suddenly produced as perfect and as useful as we now see them: indeed, in several cases we know that this has not been their history. The key is man’s power of accumulative selection: nature gives successive variations; man adds them up in certain directions useful to him. In this sense he may be said to make for himself useful breeds.

Owing to this struggle for life, any variation, however slight and from whatever cause proceeding, if it be in any degree profitable to an individual of any species in its infinitely complex relations to other organic beings and to external nature, will tend to the preservation of that individual, and will generally be inherited by its offspring. The offspring, also, will thus have a better chance of surviving, for, of the many individuals of any species which are periodically born, but a small number can survive. I have called this principle, by which each slight variation, if useful, is preserved, by the term of Natural Selection, in order to mark its relation to man’s power of selection (Darwin, 1859).

Darwin’s original intent was clearly to designate a causal agency that was analogous to the hand of man. The natural environment was a master breeder, far more attentive and meticulous than a human breeder could ever hope to be. We see the results of this process, Darwin claimed, in the close adaptation between modern species and their habitats. Species are fitted to their particular niches as if they were bred to them, as, in a sense, they were.

In this formulation, natural selection hypothesizes a determinate relation between specific traits and the environment. Such a hypothesis could give rise to useful research, for obviously these relations remain to be demonstrated. Why then, did anyone think of turning away from the search for such relations to the empty formula of differential reproduction? Because, it would seem, the search for determinate causal relation is fraught with extreme difficulty. Darwin had already hinted at the problem. How can we possibly investigate “infinitely complex relations” (which presumably exist between an organism and its environment)? Darwin repeats the adjective many times in many contexts. The locution cannot be written off as a habit of speech rather than thought, because Darwin clearly sees the concept of infinite steps and complexity as strengthening his argument — the greater the complexity of relationship, the greater the power of natural selection, which acts through all relations at once. But a summing-up of an unknown number of parameters can be very difficult to investigate.

A perusal of the Origin does not raise hopeful prospects of research. Does Darwin suppose that we can perceive the causal connections he hypothesizes? Only with great difficulty:

Man can act only on external and visible characters: nature cares nothing for appearances, except in so far as they may be useful to any being. She can act on every internal organ, on every shade of constitutional difference, on the whole machinery of life ... Under nature, the slightest difference in structure or constitution may well turn the nicely-balanced scale in the struggle for life, and so be preserved ... Can we wonder, then, that nature’s productions should be far “truer” in character than man’s productions; that they should be infinitely better adapted to the most complex conditions of life, and should plainly bear the stamp of far higher workmanship? (Darwin, 1859).

He supposed, at least, that there were far too many parameters for him to pin anything down in his own text. The research program would follow.

Has it followed? It is obvious that much effort has gone into the task, but that is not what I mean. Darwin succeeded in proposing a theory that had, on logical grounds, the form of a causal theory. It was synthetic; it attached cause to effect. It postulated a determinate relation between animal traits and environment, and it initiated a search for these relations. But a successful theory must generate a research program that is possible to carry out. The Darwinian program must somehow unravel a causal nexus of unknown (“infinite”) parameters and re-sum them conceptually. The attempts to perform this task do not always evidence a full awareness of the difficulties involved.

Miscommunications on Testability

Although I have argued that the original form of natural selection was free of tautology, I am well aware that critics have continued to voice the accusation. But most of these criticisms are really concerned with the manner in which the researcher operationalizes the Darwinian concepts. After all, when we attempt to investigate the process by which selective pressures lead to increased fitness, we need some way of identifying, in the organisms, traits that represent fitness. Darwin’s own language indicates some difficulty on this point, and the situation may not have improved with time. Recent statements by prominent biologists, for example, suggest that no clear criterion of fitness has yet been found. Simpson (1953) remarked that “The fallibility of personal judgment as to the adaptive value of particular characters, most especially when these occur in animals quite unlike any now living, is notorious.” Dobzhansky (1975) was willing to go even further, concluding that no biologist “can judge reliably which ‘characters’ are neutral, useful, or harmful in a given species.” With such comments in the literature, it should surprise no one that a few individuals begin to wonder whether the process by which fitness is supposed to be optimalized can be investigated at all.

In his Harper’s article Bethell (1976), a journalist, argued that researchers were compelled to identify fitness with survival since the research program had no way of identifying, in the total animal, what traits conferred what advantages. Darwin, he pointed out, had developed the theory from the analogy of human breeders, who have the “desirable” traits clearly before them because they themselves have defined what is desirable. But as Darwin admitted, we do not readily perceive the relations that confer actual survival on the animal. If fitness could not be detected independently of survival, said Bethell, then any attempt to operationalize the concept with regard to actual observations would wind up equating it with survival, making the argument tautologous (we are really testing the survival of the survivors).

This argument was answered by Gould (1977), who claimed that a criterion of fitness independent of survival was indeed available:

Now, the key point: certain morphological, physiological, and behavioral traits should be superior a priori as designs for living in these environments. These traits confer fitness by an engineer’s criterion of good design, not by the empirical fact of their survival and spread. It got colder before the woolly mammoth evolved its shaggy coat.

Gould is speaking not merely of single traits, but of their integration in a good overall design. He is quite correct in his interpretation of the intent of Darwin’s hypothesis, and his logic is beyond question. If an animal were well designed for the environment, it would do well in that environment, since that is obviously what we mean by “well designed.” And the sum of the right traits would equal such design, which is what the theory says. But of course, the sequence is not definite. Gould supposes that the cold came first and the wool after because that is what his theory postulates. His remark is, therefore, merely a hypothetical example of the proposed relation.

Gould’s answer received accolades from such diverse quarters as Carl Sagan (1977), who thought that viewing natural selection as a tautology was “quaint,” and Peter Medawar (1978), who found Gould’s reply “a pretty accomplished hatchet job on the unlucky Mr. Tom Bethell.” Yet while Gould clearly demonstrates that Darwin did not formulate natural selection as a tautology, he never gets around to discussing what Bethell actually claimed, i.e., that the research program inevitably reduced the principle to tautology in practice. Bethell (1976) made the argument as follows:

A mutation that allows a wolf to run faster than the pack only allows the wolf to survive better if it does, in fact, survive better. But such a mutation could also result in the wolf outrunning the pack a couple of times and getting first crack at the food, then abruptly dropping dead of a heart attack because the extra power in its legs placed an extra strain upon its heart. Fitness must be identified with survival, because it is the overall animal that survives, or does not survive, not the individual parts of it.

The individual trait must be summed in the whole before we know how useful it actually is. Since the summing is beyond the knowledge of the investigator, he does not derive survival from his knowledge of engineering; he observes the fact of survival and then attempts to explain this by reference to design. How do we know that an animal is optimally designed for an environment? It survives in that environment. Thus, no matter how we explain good design after the fact, the criterion used for the detection of good design is always survival.

The problem and the response have been adequately described by David Hull (1974), who saw this dynamic very well:

If one only knew enough about the genetic makeup, the embryological development, and the physiology of the organisms concerned, as well as the vagaries of the environment, one could assign a certain degree of fitness to each of these organisms and hence be able to make reasonable predictions about their chances of survival. With this information, one could in turn predict subsequent changes in the population.

We do not have this information, but even without it, Hull concludes, we are compelled to explain:

The evolutionary development of a particular species or population as such cannot be predicted with any reasonable degree of certainty. Predictions are possible only to the extent that a population or species happens to fit one of the patterns of evolution which have currently been discovered.

One is tempted to rush by all such pragmatic consideration. Perhaps biologists do not know all the relevant variables and could not combine them meaningfully if they did, but surely nature does the summing for us. In principle, every organism that dies without leaving issue has a coefficient of fitness of zero. No matter that two individuals are identical twins with the same genotype — one could have an extremely high coefficient of fitness and the other a very low one, depending on how many offspring each leaves. The appeal to this retrospective deterministic bias is difficult to resist. If one individual dies without reproducing itself and another succeeds in leaving numerous offspring, something must have been responsible for the difference.

Something presumably did make the difference and, as Gould suggests, we may theorize that those designs which may be termed “superior a priori” are the answer. But, if we allow nature to do the summing for us and assume that the process has culled out the highest sums, or best designs, we are reduced to concluding that these survivors must be, by definition, the best designs.

It is this twist that catches Bethell’s attention. If the researcher detects fitness by survival, however that individual reasons about which traits confer which advantages after this fact, it is clear that no test will be forthcoming. We may indeed hypothesize that the organism survives because certain traits, capable of selection, confer fitness upon it, but operationally we have allowed survival to stand for fitness — thus identifying the two — and we cannot meet the requirement that two independent observations be compared. Our claim that fitness causes survival seems to be reduced, in our operational procedure, to the claim that survival causes survival, which explains nothing.
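The operational difficulty can be sketched in a few lines, offered only as an illustration under stated assumptions. The functions `circular_test` and `independent_test`, the `design_score` values, and the `threshold` are all hypothetical; no such scoring procedure is claimed to exist in actual research. The sketch contrasts a procedure that reads fitness off from survival with one that assigns fitness beforehand from a separately measured quantity.

```python
import random

def circular_test(survived):
    """Fitness is read off from survival, then 'tested' against survival."""
    fitness = list(survived)  # operational definition: fit = survived
    return all(f == s for f, s in zip(fitness, survived))

def independent_test(design_score, survived, threshold=0.5):
    """Fitness is assigned beforehand from a separately measured score."""
    fitness = [score > threshold for score in design_score]
    return all(f == s for f, s in zip(fitness, survived))

random.seed(0)
for _ in range(1000):
    survived = [random.random() < 0.5 for _ in range(10)]
    assert circular_test(survived)  # passes on any data whatsoever

# The independent criterion can fail: a high-scoring design that died.
print(independent_test([0.9, 0.2], [False, True]))  # False
```

The first procedure is confirmed by every possible record of survival, which is the sense in which only one observation is being compared with itself. The second can be contradicted, but only because it presupposes a criterion of design that is measured independently of survival.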

It is perhaps unfortunate that Bethell talked of tautology here rather than lack of testability, which is his real point, for the charge of tautology always leads defenders back to the theory itself, which is not tautologous. Even so it led Gould to explain that Darwin presumed a causal relation between the design of the animal and the structure of its environment. But Gould’s response is equally unfortunate, since it seems less than candid. One would suppose that a reader as sophisticated as Gould could have seen that Bethell understood Darwin’s assumption, and was merely questioning whether it could be made operational in actual research, yet Gould does not rise to the argument on the detection of fitness. From Bethell’s point of view Gould’s response must have seemed more playful than serious. Those designs that are “superior a priori” must be superior in some specific way — in this case, they are superior strategies for survival. From this it follows by definition that they will be the survivors, nature having done its sums properly. It also follows that the survivors must be, a priori, the superior designs — i.e., fittest. Best of all, since all this follows by definition, we need no research program to investigate it. It is, of course, necessary to determine empirically that there are survivors, but since this singular fact was among the data we started out to explain, we already know it. Indeed, we also predict it, for since the theory states that the superior designs should survive, we may look about us and find that, by Jove, they have! — understanding, of course, that any survivors are already known to be the superior designs. It is truly marvelous to see what one can determine prior to the inception of actual research. What on earth was bothering that chap Bethell?

The miscommunication between Bethell and Gould is typical. Most of the present critics of natural selection have difficulty with the research program that could operationalize it, but have not clearly distinguished the operational formulation of the concepts from the formulation of the theory per se. Most defenders have risen to the charge of tautology, explaining once more with feeling, that the theory itself is not tautological, but have not gotten around to the problem of operationalizing the concept of fitness. (Of course, the fact that some versions of the theory — i.e., differential reproduction — are tautological adds to the confusion.) But if the critics, particularly outsiders like Bethell and Macbeth, fail to make enough distinctions through lack of familiarity with the material, the same may not be said of the defenders, who are usually experts in the subject. The critic has the easier job, and it shows. Defenders, even the most knowledgeable, will often be forced to delve rather deeply into philosophic considerations to answer a complaint intuitively advanced by the critic.

J. Maynard Smith’s (1969) defense of neo-Darwinism is an interesting example of the complexity of the defending position. Smith gave the paper at a symposium attended by Marjorie Grene and David Bohm, both sometime critics of the neo-Darwinian position (Grene, 1969, 1974; Bohm, 1969). He was responding to a charge of tautology — he does not cite the source — and decided to defend by showing the neo-Darwinian position to be non-tautologous because falsifiable. Obviously, Smith noted, any position that can be refuted by empirical evidence must be more than a mere repetition of the starting point, since it admits a new element — i.e., the factual data. Although Smith is absolutely correct in this claim, his strategy is not well suited to his purpose.

To begin with, by choosing to defend the basic theory against the charge of tautology by reference to empiric test, Smith has failed to distinguish the formulation of theory per se from the problems of its application to observation, the very thing that led to difficulty with regard to the arguments of critics. The theory of natural selection, the heart of Smith’s defense, is simply not a tautology in the formulation that recognizes a causal interaction between the organism and its environment, and this point is easily made. (It would be a great relief to get the whole problem of tautologous formulation behind us once and for all, but I am afraid that neither critics nor defenders have been very efficient in bringing about this goal.) The moment we turn to empiric observation and falsifiability, we must deal with the problem of operationalizing our concepts in a testable way. If one reads the critics carefully, this is obviously the crucial problem, but Smith cannot deal with it directly since he confuses it with proving that the theory, rather than the research program, has the proper form.

Smith’s discussion produces three rather interesting difficulties which are hardly unique to his paper: his definitions are not operational; he offers a test of a relatively strong part of the theory as if it corroborated a weaker part as well; and he mistakes the canonical value of the theory (the general agreement with known facts) for a basis of testing. Judging from the much greater sophistication of his more recent treatment of this theme (1978), I would suppose that he did not, in the 1969 paper, concern himself with the requirements of a serious program of testing. Yet since he has announced that his subject is falsification, his argument must be judged on this basis, and since the article was recommended by other defenders (Ferguson, 1976), its problems are well worth analysis.

Smith (1969) offers, as an explanation of “fitness,” the following paragraph:

If variation is to lead to evolution, then some variations must alter ‘fitness,’ and at least some of these must increase fitness. By fitness is simply meant the probability of survival and reproduction. A melanistic moth is by definition fitter if it is more likely to survive, and a myopic man may be fitter if his myopia enables him to escape the draft. (Much confusion has arisen because ‘fit’ is not used in this sense in the phrase ‘survival of the fittest.’ If it were so used, the phrase would indeed be tautological. A more precise though less elegant — and hence less ‘fit’ — phrase would be ‘the survival of the adaptively complex,’ i.e. organisms are adaptively complex or, as Bohm might say, ‘harmonious,’ because such organisms survive better than less harmonious ones.) It follows from this definition that fitness can only be compared in a specific environment or range of environments.

In commenting on this passage, neither Bohm (1969) nor Grene (1969) could find much meaning in “the probability of survival and reproduction,” presumably because both were thinking in terms of how this could be operationalized with regard to actual organisms. What we really want to know, if we are concerned with a test of the theory, is how such probability is calculated — how can we tell which changes are adaptive? Smith does not go into this because he is explaining the intent of the theory rather than its application.

But notice that the long parenthetical clarification above runs into the very problems Smith is supposed to be solving. If we substitute “adaptive complexity” for “fitness” we are indeed in danger of tautology, since the term “adaptive,” by presuming some sort of positive adaptation to the environment, already means “fit,” and thus it has no explanatory value here. And if we look further for clarification, we find that organisms are adaptively complex or “harmonious” because they survive better than less harmonious ones. The “because” here can only explain our reasons for speaking in this fashion — that is, Smith uses the words in this manner because he means them to be synonymous; “fit” is “adapted” or “harmonious” — since it cannot contribute to any causal speculation. After all, to tell us that an organism is fit because it survives better than less fit organisms is only to tell us how one will use the words. There is still no hint of what constitutes fitness other than survival.

Smith now moves on to discuss possible falsifications, remarking that the theory has done well in the laboratory, particularly in the prediction of rates of variation. Smith could certainly claim a like success in the industrial melanism experiments, and since I think it may well be on the mind of the reader, I will add the topic here. Successful prediction of rates of variation is a victory for genetic theory, and the shift in allele frequency of the moth population towards the melanistic variant is one for selection theory. Smith mentions that the critics have admitted the strength of the lab work; I have not heard anyone deny the industrial melanism experiments. But these things do not deal with the part of the theory most seriously under attack. The very fact that the critics do not attack here should have alerted Smith to the real focus of recent criticisms, namely, how can the concept of fitness be tested as well as these other points have been? (In case further clarification is needed on the distinction, made above, between the part of selection theory tested by industrial melanism and the part that remains to be tested, let me point out that simple selective pressure has never been seriously in question. That certain conditions can cause selective mortality means only that some alleles can be weeded out, not that this action can combine with variation in order to optimalize adaptation. If we did not believe in selective pressure we would not use the metaphor of weeding, which is drawn from the selective predator pressure we place on the dandelions in the front lawn. But one may accept this relation without supposing that everything else must follow, and it is the optimalization of adaptation that the critics usually worry about.)

Finally, Smith moves to the canonical value of the theory. He drops the example of the laboratory and, turning to nature at large, remarks that when one looks at the end-products of natural selection — i.e., the organisms — it becomes obvious that the theory forbids any number of conceivable arrangements. If these counter-arrangements were actually to turn up, the theory would be refuted:

If one invents counter-examples they seem absurd. Thus, if someone discovers a deep-sea fish with varying numbers of luminous dots on its tail, the number at any one time having the property of a prime number, I should regard this as rather strong evidence against neo-Darwinism. And if the dots took up in turn the exact configuration of the heavenly constellations, I should regard it as an adequate disproof. The apparent absurdity of these examples shows that what we know about existing organisms is consistent with neo-Darwinism. It is, of course, true that there are complex organs whose function is not known. But if it were not the case that most organs can readily be understood as contributing to survival and reproduction, Darwinism would never have been accepted by biologists in the first place.

Perhaps the last sentence gives the game away. There is nothing in these expectations with regard to organisms that was not part of pre-Darwinian biology. When Darwin formed his theory, he drew on the same expectations. How, then, could we suspect that contradiction would come from such material? Such evidence was part of the factual condition Darwin set out to explain. He had to make his theory fit this factual condition, to make it canonical for the known facts, and had he not succeeded in this attempt his theory could never have been given serious consideration at all. But what has this got to do with testing the theory?

The “apparent absurdity” of Smith’s examples is twofold: they run counter to everything we know about organisms and thus are fully counter to any expectation, but the suggestion that the theory is tested by such considerations is also absurd. Newton worked out a theory which gave an account of gravity. When he set to work, the propensity of objects to fall downward was part of the known world, to be related by the theory to other parts. Newton implied that the theory could be tested through its predictions of things we did not yet know. But no one suggested that it could be tested by seeing if objects fell up. To be fair to Smith, it is clear that at this point he is not thinking of actual testing, but merely of empiric content — he is trying to show that the theory has empiric content, has incorporated factual observation, and is not a mere tautology. But that was not the problem, and falsification becomes a serious concern only by the grace of a serious test.

At this point, the general pattern of the discussion to date is visible. Critics have worried about the formulation of tests; defenders, about the formulation of the theory. And where the former group found no empiric content, by which they meant no good test, the latter found empiric riches, by which they meant canonical agreement. (This miscommunication is particularly problematic with Peters [1976], who seems to be pointing to the absence of any operational formulation of the concepts that could provide a good test, but insists on couching his complaint in terms of a supposed lack of empiric content. Yet Peters is hardly more extreme, toward his pole, than Smith is toward his. Each will probably find, in the other, just the mistakes he has set out to correct.)

How important is the demand for a test? After all, the theory can be applied without going so far. Must we test? A better question for this paper might be: do we want to test? Let us look at the requirements.

The Problem of Testing

The real import of a demand for a test is a demand for a formulation of the theory determinate enough to be tested. The reasoning of the critics is based on their notion of science. They are of the opinion that a good scientific theory receives direct support from empiric evidence, and they do not count canonical agreement as support. There are certain philosophic considerations behind this attitude.

One may reflect, for example, that what empiric evidence cannot question it cannot support. The notions of arithmetic, for instance, are not questioned by our experience of multiplicity. No experience can ever make us believe that two plus two does not equal four. If the actual enumeration of objects contradicts our calculations, we assume an error in counting or in calculation, but not in the principles of arithmetic. And since this is so, it cannot be experience that makes us believe that four is the proper answer. When students point to the successes of experience in support of mathematics, I can only ask them what sort of experience would make them abandon it. I have never heard a reply to that question.

We are led therefore to ask: what sort of experience would make biologists abandon natural selection? Well, here we have an empirical theory, the truth of which does not rest on the accuracy of the logical operations by which the theorems are deduced from axioms. One may answer the question with relative ease. All sorts of imaginable occurrences could falsify the principle. But these occurrences fall into two very different categories: we could find that the theory simply does not fit the known facts, or we could generate predictions of new data, on the basis of our theory, and find that the predictions fail. The former possibility is limited, of course, by the growth of factual knowledge after the formulation of the theory. Presumably the person who formulated the theory took into account the known facts at that time. It was this data, in fact, that the theory was meant to explain. The theory fits that part of the world known to the theorist at its inception because it was designed to fit it. Only the accumulation of new data after the theory is formulated could give it much trouble, and then only those facts which do not fit the previously known patterns. This sort of testing, if one may call it that, is a desultory business at best. The predictions generated by a theory, however, represent a very different possibility.

Any interesting hypothesis is probably interesting because, as Goethe once remarked, it “makes us believe in the connection of phenomena.” But once we have a notion of how things are connected, we may project the pattern of connection further in anticipation of observations not yet actual. When such observations are made, they either coincide with the predicted pattern and thus extend it, or they do not, and thereby create a problem for the theory. The activity of making and checking predictions tells us something about how well the theory fits its subject matter, something that its original agreement with the known world — its canonical value — cannot tell us, and does so much more efficiently and directly than is possible through the general growth of knowledge apart from such testing.

It would be naive, however, to suppose that testing by generating and checking predictions could either prove or disprove the theory. When a hypothesis predicts that, given specified conditions, a specified consequent will result, the fact that the prediction holds does indeed extend the congruence between the anticipated pattern and the observed pattern, but it does not tell us whether this congruence is coincidental or necessary. Our predictions may hold true, after all, even if our hypothesis is not true. Likewise, if we find that, although the conditions specified by the hypothesis were met, the predicted consequent did not result, we have certainly refuted the prediction, but we have not falsified the hypothesis, for our prediction may fail even though our hypothesis is correct.

When we go about the business of testing, we cannot assume a simple relation between two poles of observation — i.e., conditions and consequences. If I say that the acceleration of gravity is a constant, I am not refuted by showing that a feather falls slower than a parachutist, or a stone faster. I shall reply that my statement, being a general law, did not take air resistance into account. Upon closer inspection, my statement did not predict, in itself, what would happen if actual objects were dropped from a height. It specified only the contribution of gravity to the event. Were I to make all this explicit, I should say something like: if gravity were the only causal parameter involved, all such objects would exhibit the same constant acceleration. If I use the available jargon for the same point, I will say that the acceleration of gravity is constant, assuming ceteris paribus — assuming that all other things are equal (are without an effect on the consequent). Of course, in this case all other things are not equal and air resistance makes a big difference.
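The gravity example can be made concrete with a small numerical sketch. It is only an illustration under stated assumptions: air resistance is modeled as linear drag (a simplification; real drag on a stone is closer to quadratic), and the terminal speeds assigned to the feather and the stone are invented round figures, not measurements. The point it shows is the one made above: the law fixes gravity's contribution, and the observed fall time depends on what hides under the ceteris paribus clause.

```python
import math

G = 9.81  # gravitational acceleration near the Earth's surface, m/s^2


def fall_time_vacuum(height_m):
    """Fall time if gravity were the only causal parameter (ceteris paribus)."""
    return math.sqrt(2.0 * height_m / G)


def distance_with_linear_drag(t, terminal_speed):
    """Distance fallen after t seconds under gravity plus linear drag.

    Linear drag is an assumed, simplified model; terminal_speed (m/s)
    summarizes how strongly the air interferes with the object.
    """
    tau = terminal_speed / G  # characteristic time of the drag
    # y(t) = v_t * (t - tau * (1 - exp(-t / tau))), written with expm1 for accuracy
    return terminal_speed * (t + tau * math.expm1(-t / tau))


def fall_time_with_drag(height_m, terminal_speed):
    """Solve distance_with_linear_drag(t) = height_m for t by bisection."""
    lo, hi = 0.0, 1.0
    while distance_with_linear_drag(hi, terminal_speed) < height_m:
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if distance_with_linear_drag(mid, terminal_speed) < height_m:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


HEIGHT_M = 10.0
FEATHER_TERMINAL = 0.5   # m/s, hypothetical value
STONE_TERMINAL = 50.0    # m/s, hypothetical value

t_law = fall_time_vacuum(HEIGHT_M)
t_feather = fall_time_with_drag(HEIGHT_M, FEATHER_TERMINAL)
t_stone = fall_time_with_drag(HEIGHT_M, STONE_TERMINAL)
```

Under these assumptions the feather takes far longer than the stone, and both take longer than the law alone predicts, yet the same constant `G` enters every calculation. As the assumed terminal speed grows, the drag result approaches the vacuum result.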

General principles do not predict specific events in themselves, but only specify a contribution to those events. When a particular application of a principle is made, it must always be accompanied by a ceteris paribus assumption, since in any concrete situation other causal parameters could interfere with the result. But when we see that in any particular test the congruence or incongruence of the anticipation with the result may be due to parameters external to our theory, it becomes clear that tests neither prove nor disprove theories. The distance between the general principle and the particular application is too great for that. In the case of a confirmed prediction, we record the result and continue to investigate the implications of the theory, developing further predictions as we go along. If all goes well, the pattern of anticipation becomes the pattern of observations, and we may be said to investigate nature methodically. When things do go awry, however, the response will be more complicated.

Faced with the reality of a false prediction, the researcher must suppose either: (1) that the basic theory has been falsified, or (2) that the ceteris paribus clause attendant to the application has been falsified (that the assumption that other parameters would not interfere with our predicted result was not true). Since it would be precipitous to discard the theory in order to save a local ceteris paribus assumption, it is this assumption that will be challenged by the failure of prediction. The researcher will make an ad hoc addition to his hypothesis, or will reformulate it more radically, attempting to identify the extra parameters and take them into account. Of course, since the ceteris paribus clause of any application is potentially inexhaustible — one could never know how many parameters are actually hiding under its blanket inclusion — if the new prediction also fails we are still not finished with our investigation. We may continue to adjust our hypothesis as long as possible causal parameters occur to us.

Smith (1978) has described the problem as it arises in current practice. Replying to a criticism of ad hoc hypothesizing by Lewontin (1977), Smith defends the use of ad hoc expansions with reference to his own work:

What these examples, and many others, have in common is that a model gives predictions that are in part confirmed by observation but that are contradicted in some important respect. I agree with Lewontin that such discrepancies are inevitable if a simple model is used, particularly a model that assumes each organ or behavior to serve only one function. I also agree that if the investigator adds assumptions to his model to meet each discrepancy, there is no way in which the hypothesis of adaptation can be refuted. But the hypothesis of adaptation is not under test. What is under test is the specific set of hypotheses in the particular model.

It is the specific hypothesis which includes, of course, the ceteris paribus clause, and it is therefore at this level that we make our ad hoc foray into the possibility of other parameters. The theoretical background is left intact by this procedure, but the local application shifts:

Tests of the quantitative predictions of optimalization models in particular populations are beginning to be made. It is commonly found that a model correctly predicts qualitative features of the observations, but is contradicted in detail. In such cases, the Popperian view would be that the original model has been falsified. This is correct, but it does not follow that the model should be abandoned. In the analysis of complex systems it is most unlikely that any simple model, taking into account only a few factors, can give quantitatively exact predictions. Given that a simple model has been falsified by observations, the choice lies between abandoning it or modifying it, usually by adding hypotheses.

The “original model” which is falsified may be considered to be the formulation of the general theoretical background needed to make the specific application. Since that formulation will always consist of the general principle plus a structure of assumptions regarding additional parameters, its falsification necessitates the alteration of only one of these elements.

The result of Smith’s discussion is: “the hypothesis of adaptation is not under test.” His treatment is correct, I think, and he shows a clear grasp of ceteris paribus assumptions. Compared to most defending articles, this one is exemplary. I only wish that Smith had continued his examination. (After all, if the hypothesis of adaptation is not under test in the examples discussed, when is it under test? Most critics, including Lewontin, would probably like to have that question answered.) As it is, Smith admits only one serious concern: “There is a real danger that the search for functional explanations in biology will degenerate into a test of ingenuity.” The meaning of his remark will be clear if the literature is examined.

Consider an example recently discussed by Lewontin (1978). A researcher studies the behavior of foraging birds. Since these birds carry the food back to the nest once they pick it up, were they to take the first item that they came upon, the piece of food might be too small to make up for the energy lost in the round trip back to the nest and out again. Thus the researcher proposes that the choice of food particle, with regard to size, will not be random. Instead, the choice will represent an adaptation, and it will optimize the net energy gain of feeding. By surveying the actual distribution of food particles in the foraging area the researcher calculates the size that represents the optimal solution. It is not, of course, restricted to the largest food items, for these are too poorly distributed to be worth the energy needed to search them out. All this done, the researcher compares his figures with the birds’ behavior, and it turns out that the birds are biased in the direction of large particles, but not to the optimum particle size. The example is not far from Smith’s discussion. We get a general hit — a bias toward the larger particles — and a miss on the details — they do not stick to the optimum solution.

Smith would expect an ad hoc addition to the original formulation, and this is just what the researcher does next. The miss can be explained if we assume that another parameter is at work. The behavior exhibited by the birds represents a compromise between the demands of energy efficiency and those of predator pressure — the birds cannot stay away from the nest long enough to conduct a proper search for the optimum particle size, since while they are gone their young are exposed to predators. Now the researcher publishes this account.
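The structure of this episode can be sketched as a toy optimization. Everything in the block is an assumption made for illustration: the functional forms, the parameter values, and the names (`net_energy_rate`, `max_absence_s`, and so on) are hypothetical and are not taken from the study Lewontin describes. The sketch shows a simple model that yields an interior optimum particle size, and then the same model with one added parameter, a cap on time away from the nest, which pulls the predicted size downward.

```python
import math

# All values are invented for illustration only.
ENERGY_PER_MM = 10.0     # energy gained per mm of particle size
TRIP_COST = 5.0          # energy spent on one round trip to the nest
TRIP_TIME = 20.0         # seconds spent on the round trip
BASE_SEARCH_TIME = 2.0   # seconds to find the smallest kind of particle
RARITY_SCALE_MM = 2.0    # larger particles are rarer, so search time grows with size
SEARCH_COST_RATE = 0.1   # energy spent per second of searching

SIZES_MM = [1.0 + 0.5 * i for i in range(23)]  # candidate sizes, 1.0 to 12.0 mm


def search_time(size_mm):
    """Assumed time needed to find a particle of the given size."""
    return BASE_SEARCH_TIME * math.exp(size_mm / RARITY_SCALE_MM)


def net_energy_rate(size_mm):
    """Net energy gained per second of a complete foraging cycle."""
    t = search_time(size_mm)
    gained = ENERGY_PER_MM * size_mm - TRIP_COST - SEARCH_COST_RATE * t
    return gained / (TRIP_TIME + t)


def predicted_size(max_absence_s=None):
    """Size that maximizes net energy rate.

    max_absence_s is the ad hoc addition: an assumed limit on time away
    from the nest, standing in for predator pressure on the young.
    """
    candidates = [
        s for s in SIZES_MM
        if max_absence_s is None or TRIP_TIME + search_time(s) <= max_absence_s
    ]
    return max(candidates, key=net_energy_rate)


simple_prediction = predicted_size()                     # the original model
amended_prediction = predicted_size(max_absence_s=30.0)  # the model plus one new parameter
```

With these made-up numbers the amended prediction is larger than the smallest particle but smaller than the simple optimum, which is the general hit and detailed miss described above. Note that `max_absence_s` is a free parameter: some value of it can usually be found that reproduces whatever size the birds actually choose, so agreement obtained this way says nothing about whether the trait is an optimalized adaptation.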

Well, the new hypothesis could not be tested, so the researcher does not suggest that it is confirmed. He merely gives the history of his investigation and his final position. What is the import of his data? Only, it would seem, that the simple model which equated adaptation directly with optimum particle size was too simple. Both Smith and Lewontin recognize the described procedure as a common one. But Lewontin is uneasy about the general thesis — in this case the optimalization of adaptation. He wonders how this sort of thing can be said to test the thesis of adaptation. Smith dismisses the difficulty with the remark that it was never the intent of the researcher to test the theoretical background. Only the local application is under test — i.e., the simple model which failed. Their difference outlines an aspect of the problem that has been of crucial importance to the critics.

Smith had warned about the possibility that research could become a test of the researcher’s ingenuity rather than of the hypothesis. But when can we say that this has indeed happened? Where would Smith draw the line? This question seems to be the real import of the exchange between Lewontin and Smith, but as far as I can see it was not answered. Smith never really rises to the problem of testing the hypothesis of adaptation, and Lewontin (1977) seems convinced that this hypothesis cannot be tested — for the reason given above by Bethell:

The example of central-place foraging illustrates a basic assumption of all such engineering analyses, that of ceteris paribus, or all other things being equal. In order to make an argument that a trait is an optimal solution to a particular problem, it must be possible to view the trait and the problem in isolation, all other things being equal. If all other things are not equal, if a change in a trait as a solution to one problem changes the organism’s relation to other problems of the environment, it becomes impossible to carry out the analysis part by part, and we are left in the hopeless position of seeing the whole organism as being adapted to the whole environment.

Bethell’s criticism, when formulated according to my analysis above, is sustained by Lewontin, who evidently understands the engineering criterion suggested by Gould as a purely theoretical notion which is difficult if not impossible to operationalize. But Smith responds to all this with a judgment on what the researcher should or should not do. At no point does he undertake to examine what difficulties might be inherent in the structure of the theory itself, rather than in the practice of the researcher. Lewontin, like most critics, is worried about both the research practices and the nature of the hypothesis being researched.

If the purpose of testing looked to be a clear and straightforward business at the beginning of this section, the practice has turned out to be rather murky. It begins to seem questionable that a procedure of testing could ever be adequate to the purpose offered above. Of course, it may well be the case that things become indeterminate when we think about natural selection and adaptation because the theory is itself indeterminate, and a theory of a different form would facilitate testing in a way that natural selection does not. Or the fault may lie with the procedures of the researcher. Or perhaps the whole notion of testing was impossible from the beginning. These three possibilities must be sorted out.

Determinate Form

By now it has become obvious that theories are not easily challenged by tests. No theory is proven or disproven by such a procedure, so what else is left? Fortunately, there are intermediate possibilities. If a test may not decide whether a theory is true or false, it can, in some cases, decide whether a theory is superior or inferior to another contender. This is perhaps its strongest form, but there are still other applications. Even when there are no alternative theories to dispute the field, successful predictions extend the known pattern of relations, discovering novel facts and perhaps even shifting the interpretation of the theory and the universe to which it applies. Unsuccessful predictions can generate expectations concerning the parameters supposed responsible for the failure, thus investigating a closely related area. And all testing can serve to refine the conceptual apparatus by which the theory is operationalized, clarifying the relation between the language of the theory and the actual field of observation.

That tests cannot prove or disprove theories, and may even fail to offer any serious challenge, is sometimes known as the Duhem-Quine thesis (Duhem, 1906; Quine, 1953), although the point was also recognized in the early work of Popper (1934). The argument is simply that all actual applications of a general theory entail elements additional to the theory: at least a ceteris paribus clause, and probably several other theories which are used to understand those parameters which are involved but not covered by the general theory. (When attempting to estimate the impact of a particular trait on the organism’s relation to its environment, we must take into account all that is known or supposed about the way that environment works, and this will involve many other theories.) A failure of prediction could therefore be due to any one of three areas: the theory under examination, the theoretical background of the test, or the ceteris paribus assumption. Faced with such a failure we must ask, before discarding the theory under examination, whether we are sure of the theoretical background of the test (does the environment really work in the manner we have supposed?), and whether we are sure that our ceteris paribus assumption is correct (could unknown parameters be interfering?). Only testing could investigate such questions, but testing, as we have just admitted, is never completely certain, since all tests involve assumptions which could be questioned.
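The logical skeleton of this argument can be checked by brute force. The sketch below is a minimal propositional rendering supplied here for illustration, not a formalization given by Duhem, Quine, or Popper: the prediction is taken to follow only from the conjunction of the theory, its theoretical background, and the ceteris paribus clause, and the code simply lists which combinations of truth values remain possible after a prediction fails or succeeds.

```python
from itertools import product


def premise_holds(theory, background, ceteris_paribus, prediction):
    """(theory AND background AND ceteris_paribus) implies prediction."""
    return (not (theory and background and ceteris_paribus)) or prediction


def compatible_assignments(prediction_observed):
    """All (theory, background, ceteris_paribus) triples consistent with the outcome."""
    return [
        (t, b, c)
        for t, b, c in product([True, False], repeat=3)
        if premise_holds(t, b, c, prediction_observed)
    ]


after_failure = compatible_assignments(prediction_observed=False)
after_success = compatible_assignments(prediction_observed=True)

# Cases in which the theory survives a failed prediction:
theory_true_after_failure = [a for a in after_failure if a[0]]
# Cases in which the theory is false although the prediction succeeded:
theory_false_after_success = [a for a in after_success if not a[0]]
```

A failed prediction rules out only the single case in which all three elements are true, so the theory can still be true whenever the background or the ceteris paribus clause is false. A successful prediction rules out nothing at all, which is why confirmation cannot prove the theory either.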

Given the possibilities of the scenario that now presents itself, Maynard Smith’s remark on the danger of a degeneration to a test of ingenuity becomes very pointed. Since a test cannot disprove a theory, can it question it? What is to prevent the researcher, for example, from supposing that some unknown parameter is interfering whenever any prediction fails, and thereby insulating the basic theory from being inconvenienced by facts? Such excuses could be generated ad infinitum, if we were willing to do so, and eventually the only thing being tested would be our ingenuity and stamina. Of course, the important phrase is “if we were willing to do so.” If we are not willing, we shall embark on one of the paths mentioned above, all of which lie between proof and disproof on one hand and mere ingenuity on the other.

The strongest form that a test may take is that of arbiter between two competing theories. The orbit of Mercury was an unexplained anomaly for Newtonian theory for eighty-five years, but although the ceteris paribus assumption had been fairly well investigated and no other parameter had been found to explain the unexpected path, the absence of interference could not be proven and the theory simply lived with its anomalies. Things changed when Einstein’s theory accounted for the orbit, as well as for other problems. Of course, Einstein’s theory had its own anomalies, but it could account for more than the Newtonian approach. Thus, the orbit of Mercury became one of the deciding factors in the replacement of Newton’s theory by Einstein’s. In such a situation, given that there is a real imbalance between the success of one theory compared to that of another, the result of prediction can cast a crucial vote. (Notice that this does not mean that the superior theory is true, or even that it is truly superior. After all, perhaps Newton is correct but the interfering parameters have led affairs to a coincidence with Einstein’s predictions. The point is that given present knowledge, Einstein’s theory accounts for more, and until such hidden parameters are found, we should adopt it.)
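For readers who want the arithmetic behind the Mercury case, the following sketch evaluates the standard leading-order general-relativistic formula for perihelion advance, 6πGM / (c²a(1 − e²)) radians per orbit. The formula and the rounded orbital values are supplied here as reference figures and do not appear in the discussion above; the sketch is meant only to show what it means for the rival theory to account for the orbit.

```python
import math

# Rounded reference values, supplied for illustration.
GM_SUN = 1.32712440018e20      # gravitational parameter of the Sun, m^3/s^2
SPEED_OF_LIGHT = 299_792_458   # m/s
SEMI_MAJOR_AXIS = 5.790905e10  # Mercury's semi-major axis, m
ECCENTRICITY = 0.205630        # Mercury's orbital eccentricity
PERIOD_DAYS = 87.9691          # Mercury's orbital period, days
DAYS_PER_CENTURY = 36525.0     # Julian century

ARCSEC_PER_RADIAN = math.degrees(1.0) * 3600.0


def perihelion_advance_per_orbit():
    """Leading-order relativistic perihelion advance, in radians per orbit."""
    semi_latus_rectum = SEMI_MAJOR_AXIS * (1.0 - ECCENTRICITY ** 2)
    return 6.0 * math.pi * GM_SUN / (SPEED_OF_LIGHT ** 2 * semi_latus_rectum)


def perihelion_advance_per_century_arcsec():
    """The same advance accumulated over a century, in arcseconds."""
    orbits = DAYS_PER_CENTURY / PERIOD_DAYS
    return perihelion_advance_per_orbit() * orbits * ARCSEC_PER_RADIAN


advance = perihelion_advance_per_century_arcsec()
```

The result comes to roughly 43 arcseconds per century, which matches the residual that Newtonian calculations, after allowing for the other planets, had left unexplained. As the parenthetical remark above cautions, this agreement casts a vote between the two theories without proving the winner true.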

Since natural selection does not seem to have serious competition at the moment, we could move on to the next case. Successful prediction is of value whether or not we must decide between theories. Such success extends the known pattern of relations in a systematic way, even as success in the research program mounted on the basis of the theory of relativity has done, and even if the theory behind the test were wrong, the new factual material so ordered would be of great value, perhaps as the basis of a better theory when one is needed.

A failure of prediction, on the other hand, calls into question not only the basic theory but also the theoretical background and the ceteris paribus assumption. The researcher must decide which of these is to bear the responsibility for the failure and investigate accordingly. It is possible, after all, to submit the ceteris paribus assumption to scrutiny. If a planetary orbit is found to disagree with the predicted path, we may speculate that some other body is affecting it, and conduct a search for this hypothetical entity. If subatomic events take place for no apparent reason, we may propose that they herald the emergence of a new particle, and search for it. In such cases our proposal will be limited by our calculations, for the whole sphere of operation is determined by quantitative laws, and if we cannot exhaust the possibilities of new parameters, we may make a determinate search for those which our technology puts in our reach.

While doing this, of course, we shall refine our operational language in order to clarify how its terms translate into aspects of observation. As an area of investigation becomes familiar, it should be possible to specify with increasing precision what shall constitute a condition and what a consequent, and to distinguish observations which shall be part of one group from observations that shall be part of the other. Thus, even if our predictions continually fail, some gain is still possible, for the significance of these failures is cumulative when they are so systematically investigated, and we can learn from them.

Well, but it is not always so neat! No, it is not, for the scenario I have sketched is merely an imagined one, based on an enumeration of possibilities. As far as I can tell, the criticisms of natural selection are based upon the assumption that the causal relations postulated by a theory should lend themselves to empiric examination, which means, since proof and disproof seem to be mythical, and ingenious ad hoc adjustments are often a point of complaint, that the critics must actually ask for something between these two poles. I have offered my sketch of what lies between the poles in order to pin down the critics. What do they say is missing?

All the critics I have mentioned seem to agree on two difficulties: the theory is not empirically determinate and the defenders are not critical enough. The disagreement between Lewontin and Maynard Smith seems to have this form, for Lewontin asks how the basic theory could be tested and Smith replies that it is not under test. Obviously, for Lewontin, Smith is neither critical enough nor possessed of a testable theory, and where Smith sees good work, Lewontin may see only ingenuity.

As I mentioned above, there is basic agreement between Lewontin and Bethell, for both find the theory too indeterminate. The level at which this indeterminacy appears, however, is now specifiable. Both are really worried about the aftermath of failed predictions. Their disagreement is with the process at that point, and the implications of that process for the state of the theory. Well, it is at that point that the ceteris paribus assumption is called into question, but unlike the case of the aberrant orbit or sub-atomic event, the interaction between the organism and its environment does not lend itself to quantitative investigation. We have no way of summing the contributions of various structures to the survival of the whole organism. All ceteris paribus clauses are inexhaustible, but at least some are theoretically determinate, to be investigated on the basis of the same laws that are under question (the possible parameters influencing the orbit of Mercury would be calculated according to the same theory by which the orbit was itself calculated). The clause that attends our investigations into optimalization of adaptation is not determinate.

Neither Darwinism nor, in its present form, neo-Darwinism, contains a theoretical reduction of the organism to a determinate system, and thus neither contains a way of determining the contributions of various parameters, or even the number of parameters. In practice, of course, predictions are made on the basis of individual traits, but whenever anything goes wrong the resulting foray into ad hoc speculations invokes the notion that the summation of effects is really indeterminable — any parameter could be interfering in who knows how many ways. No reasonable program for investigating the ceteris paribus assumption has been forthcoming for over a hundred years, and none is in sight now. Critics like Bethell find this an impossible situation, for it suggests that the failure of prediction can have no bearing on our theory, which runs counter to Bethell’s notion of science.

The indeterminacy of the theory at this level is, I think, well documented. And just because we understand so little about the contribution of a particular trait toward fitness, it is difficult to inconvenience the theory by its failures. We cannot specify the conditions and the consequent, the this and that of our causal statement, well enough to know whether anything is seriously wrong. Because the theory is not theoretically determinate it cannot be made empirically so.

If other parts of neo-Darwinism have been well scrutinized, natural selection has escaped any serious testing and will continue to do so until we know a great deal more about organism-environment interactions. We are so far from a science that can reduce these interactions to a determinate system that some critics have wondered whether it makes any sense to attempt to operationalize the principle at all. If experience cannot question a theory, how can it support it? Would correct predictions mean very much if we knew that incorrect ones would not mean anything at all, at least for our allegiance to the theory? Questions like this are a source of frustration to the critics, even more so when they are not recognized by the defenders. It is difficult to understand, for example, why Maynard Smith, having admitted that the hypothesis of adaptation by natural selection was not under test in the programs that investigate it, went on as if this were no problem at all. After all, for Lewontin it was a problem, and Smith was answering Lewontin. It may be, of course, that test or no test, the principle of natural selection is considered so fruitful that these worries pale by comparison. Since this is, in fact, a strong possibility, it would be worthwhile to explore the concept.

Fruitfulness

A theory may be judged by its fruitfulness, but only if one is clear about how this is to be measured. The term, when applied to the work of a historical period, usually means something different from a usage describing present work. Phlogiston chemistry might be judged a very fruitful episode in eighteenth century chemistry, depending upon what standards we use to estimate its worth, but one standard we shall not use is correctness. Phlogiston chemistry was also, by modern standards, incorrect. That conclusion does not prevent historians from crediting Priestley and his co-workers with various discoveries, advances in technique and empirical knowledge, etc., but no one will suggest that it would be fruitful, now, to return to an incorrect view. Fruitfulness of this type is limited to a time period — which is past.

It is obvious that the sort of fruitfulness I have just described is possessed in full by the theory of natural selection. The power of this argument to convince its contemporary audience, when first put forward, is largely responsible for placing biology on an evolutionary footing. But evolutionary biology is now in place, and will remain so with or without the aid of natural selection. The present fruitfulness of the principle must be estimated on standards which are independent of past revolutions.

The several ways by which estimations of present fruitfulness might be made divide into two major categories: those methods that attempt to establish some line of demarcation between the scientific and the non-scientific (such as falsifiability), and those which simply estimate the amount, rather than the quality, of work generated. Let us take the second case first.

A theory which is popular enough to be applied often can be credited with a power to generate activity, but this is desirable only if some profit is derived from it. Mere activity, in itself, seems to have no intrinsic value other than paying the bills. Activity within the context of a community given to reflection and self-criticism, however, might be thought a different matter. The lay public often equates simple activity within that community with progress, but this is only because the same public assumes that the scientist is privy to some criterion by which the quality of activity may be judged — i.e., a line of demarcation. Within the ranks of scientists, such a notion becomes circular.

Scientists are well aware that their profession embraces certain critical standards. But were a scientist to defend a theory on the grounds that it had achieved a consensus in the community, he would neglect the very standards which are premised as the basis of his argument — the standards by which the community judged the value of the theory in the first place. Unless we assume scientists to be infallible, the fact that a theory has been adopted does not justify that adoption.

It is still possible to argue, on the other hand, that the continuous progression of theories through competition insures the quality of the present survivors to be the “best in the field.” As Macbeth (1971) has argued, the best in the field has meaning only in relation to the competing theories. If they are not of high quality, their successors cannot boast of any great achievement. And if we possess a criterion, independent of the results of competition, by which to judge the quality of theories, why refer to the competitive aspect at all? The “best in the field fallacy,” as Macbeth terms it, is only another way of asserting the authority of the community without examining the standards from which that authority is derived.

The authority of the scientific community is, for any member of that community, a compromised witness. It fosters an indefinite faith in serendipity which suggests that any theory that generates activity must advance science. After all, activity implies change, and all change, given the inevitable advance of science, cumulates in progress. The mysticism of this view looks somewhat more recognizable if recast in the form of a sociological description. It is probably true, that is, that any community of co-workers which spends its energy carrying out practices dictated by communal belief and shared technique, will believe that such energy is fruitfully spent. “Business as usual” is profitable business. But while this may be an accurate description, it is self-defeating as an argument. By the assumption that any theory that has been adopted is good enough we relinquish, contrary to intention, the attempt to judge the value of a particular theory. 

The amount of work credited to a theory is finally only a measure of how many people believe it. This standard works fine for historians who are estimating the historical importance of this or that theory, but it has no bearing on the problem at hand. Consensus is a witness of convincing power but not, unless we owe the flat-earthers an apology, of truth.

The alternative seems to be a line of demarcation between good scientific theories and other speculations, but the attempt to use testability in this way with regard to natural selection came to grief in the preceding section. Yet whether natural selection is being tested or not, it certainly is being applied, and it is to these applications that its proponents will point when defending it. If a demarcation is to be found, it must be sought here.

Two possibilities suggest themselves: the generation of new information and the extension of the theory’s representative powers. The first has no value for our intentions because it applies to all theories which generate research. The second is more interesting. The representative power of a theory is its ability to represent, or explain, the world in its terms — to form a picture of what the world would be like, if the theory were true. A theory that cannot represent all of the relevant information after its own model is a nonstarter, for one could not, if the theory failed to do this, even imagine it to be true. It would seem, therefore, that the ability of the theory to explain the known world, and to extend such explanation to new information as it is discovered, should provide scientific credentials for the theory.

The problem at hand, however, concerns theories which are already in place, and therefore already credentialed in this sense. Nor is it very conceivable that a generally accepted theory could exhibit any serious weakness in this area, were that theory unfalsifiable. Could evolutionary biologists, for example, ever lack for “just so” stories — narrative scenarios which explain how it all happened after the fact? No mythology, belief system, etc., has ever been found wanting on this requirement! What is the import, then, for the purposes of estimating theories, of the extension of representation — i.e., new applications? When this activity is scrutinized, it seems to be nothing more than the discovery of what was already implicit in the theory itself. Such work investigates the theory rather than the world.

Let me illustrate the point above. Maynard Smith points out, in a passage quoted above, that “most organs can readily be understood as contributing to survival and reproduction,” and reminds us that were this not the case Darwinism would never have been accepted in the first place. The fact that these organs can be “understood” — represented — in this way is important, but it guarantees nothing more than the possibility of truth. As Smith implies when he mentions this aspect, the ability to represent the world is a necessary minimum for any theory, and thus we would not be considering the theory if it could not do so. Why Smith offers it as evidence in defense of a theory under attack I do not know. Yet the attitude that the representative powers of the theory may be used to form a judgment not merely of the possibility, but of the actuality, of the theory’s coincidence with reality is widely shared. A recent textbook (Dobzhansky et al., 1977 [the passage below is by Ayala]) offers the following in a discussion of the “operational ways of measuring adaptation”:

Williams (1966) has proposed that a useful criterion for identifying individual adaptations is whether an analogy can be established between some human artifact and the feature presumed to be an adaptation. A mammalian oviduct may be seen as a mechanism for conveying the early embryo to the uterus; the uterus may be seen as designed for the protection and nourishment of the embryo. Ayala (1968b, 1970) has suggested utility as a criterion for identifying adaptations. A feature of an organism is regarded as an adaptation if it has utility for the organism and if such utility explains the presence of the feature.

Since the author is trying to show, at this point in the text, that natural selection is neither circular nor tautologous, the passage seems oddly misdirected.

By the turn of the nineteenth century it was quite clear that organisms seemed teleologically constructed — i.e., their features looked goal-directed or designed for specific purposes, one of which was always the continuity of the organism. This structure of appearances was appealed to as evidence of intelligent creation, and was one of the problem elements that would have to be explained by any theory of origins which did not invoke intelligent purpose. Darwin met the challenge through an analogy of intelligence — the action of natural selection approximates the action of an intelligent breeder. Given this history, however, the programs for “identifying” adaptations above seem little more than exercises in definition. Consider, for example, that the following points are part of Darwin’s argument:

  1. organisms exhibit goal-directed features

  2. the action of natural selection, seen as analogous to that of a breeder, can account for this.

    If we add to these the following definition:

  3. features which are supposed to have been evolved through this action are termed adaptations

    we must then agree that

  4. the goal-directed features of organisms, recognized by analogy to a human artifact or by the impression of utility, are adaptations.

But since this follows by definition, being simply the new name given to goal-directed appearances by our theory, we have made no empirical advance over pre-Darwinian biology through this extension of theory. The only empirical observations made were practically identical with the pre-Darwinian ones used to identify the appearance of goal-directedness. Williams and Ayala are exploring what their theory means. Their mode of operationalizing their concepts represents the world according to their model but produces no test of the model. They have assumed that goal-directed appearances are Darwinian adaptations, but the argument offers no evidence for the truth of this assumption.

I said at the beginning of this paper that fruitfulness could only be measured in terms of the user’s intent. A theory that is too indeterminate to be seriously tested, however, can only serve to picture the world rather than to investigate it. The picture is, predictably, generally applicable, for the very lack of determination that prevents serious testing also assures that the picture can be stretched to fit all circumstances. And the picture is compelling — it has become a ubiquitous belief in wider circles than the scientific community. It is only human to wish to extend belief, and by so doing to make the world more understandable. Peoples of all ages have done as much. But is there no distinction to be made between scientific investigation and the extension of belief? Lewontin (1972) clearly thought there was:

For what good is a theory that is guaranteed by its internal logical structure to agree with all conceivable observations, irrespective of the real structure of the world? If scientists are going to use logically unbeatable theories about the world, they might as well give up natural science and take up religion.

Of course, to make sense of this remark one must agree that there is a difference between those two institutions.

In what may be an uncharacteristic slip, Lewontin (1978) has provided me with a good example of the manner in which the Darwinian view exerts a metaphysical compulsion upon our thought. Concluding a discussion of the apparent untestability of adaptation, he writes:

Adaptation is a real phenomenon ... The problem of locomotion in an aquatic environment is a real problem that has been solved by many totally unrelated evolutionary lines in much the same way. Therefore it must be feasible to make adaptive arguments about swimming appendages. And this in turn means that in nature the ceteris paribus assumption must be workable.

The evidence cited, in this passage, consists of observations made before the theory was formulated. Lewontin seems to be saying that when the theory was offered to explain such evidence it was already known to be true, for he seems to assume that these observations can be explained in no other way. Well, if natural selection is the only thing that could possibly be behind the known structure of organisms, we do not need any further test. We need not, in fact, have any concern with testing at all, for we already know the answers, the truth being manifest from the beginning. Peters (1976) concludes that we are “in danger from an evolutionary metaphysic in which the logical Darwinian devices have become great ‘empirical’ truths.” Perhaps this is why the defenders of natural selection have so often missed the point of the criticism. The defenders already know that organisms evolved by natural selection, and they have difficulty realizing that the critics might not know.

The User’s Intent

I have emphasized, in the preceding sections, the case for the prosecution because I wanted to make it clear that there was a case. The indeterminacy that clouds organism-environment relations is a crucial ignorance which effectively prevents the researcher from advancing further than the stage of representation. The concepts cannot be operationalized in a fashion determinate enough to afford a test. Of course, all this could change with future discoveries, but for the moment the limitation is clear. The critics are right about that.

On the other hand, this is not to say that it must follow that natural selection is a bad or unscientific theory. Such a decision rests, as always, with the intent of the person who uses the theory. While writing this paper, I was reminded that the criticism offered here would fit a great deal of science, not all of it biological. I must agree, so it would. But is this a definitive consideration? I take it that the critics of natural selection who got around to studying other theories would advance the same critique there if it applied. As Macbeth’s argument on the “best in the field fallacy” points out, the fact that a theory proves to be as good as or even better than other theories only means that we have a basis for choosing between these theories, but it says nothing about whether any of them are any good. Natural selection is less determinate than some theories, but certainly as applicable as many others. I never meant to argue that there was a consensus in the scientific community that indeterminacy was a disqualifying criterion. Rather, I wanted to suggest that the mere fact of consensus was not enough to base our judgment upon.

So we return to the purpose of the researcher. What does he or she want from the theory? The theory does provide a rational explanation of appearances in terms of understandable mechanisms. If this is all that is desired, natural selection can meet the demand. On the other hand, the simple models of adaptation have run into trouble with almost predictable regularity (I mean models of optimalization, not mere selective pressure), as Smith admits. If the researcher wants to investigate the process by which fitness is optimalized, I am afraid the picture is rather unpromising. Good empirical investigation, for many researchers, and certainly for the critics I have been reading, demands some sort of testing. The theory manifests an inadequacy here. It is not testable, and should not be defended as such.

Since the theory has difficulties perhaps it should not receive such emphasis. There are alternative questions open to the researcher. The actual discovery of the patterns of nature may not necessitate a theory of their mechanism. The discovery of the natural system in taxonomy did not depend upon a speculation on the mechanism by which the system came about, but Darwin’s speculation on mechanism drew heavily on what was already known of that system. Darwin’s thesis would have been impossible without the advances made by pre-Darwinian taxonomy, particularly with regard to the concept of common plan. A branching diagram of taxonomic information may display a pattern of relations that owes nothing to the speculative mechanisms of natural selection, but which could prove to be of crucial value in investigating the world. Such patterns are themselves theories of a sort; they make predictions, they are testable, they are highly determinable by observation. They do not, of course, tell us about mechanism, at least, not immediately. But this represents an advantage, for we are not so likely to generate a pattern from our theoretical mechanism, rather than from our observations, if we have no theoretical mechanism. Perhaps many researchers simply do not need natural selection to investigate their part of the world. It would be interesting to find out whether this were so. Candid readers must decide for themselves.

Summary and Conclusion

Natural selection has demonstrated its ability to represent the world in its terms and to compel belief. It has been of crucial importance historically, in that it was instrumental in the conversion of the biological community to evolutionary thought. But until the organism can be theoretically reduced to determinate laws, the theory is too incomplete to be testable. Applications of the theory, therefore, tell us more about the theory than the world, and do not serve the purpose of efficiently investigating the latter. When the confusion between test and application is eliminated, the present dialogue on the merit of the theory may emerge from the miasma of miscommunication, and the theory’s value for various intentions be clearly estimated. Given that the stated purpose of most research is the investigation of the world, it may be that, until such time as a complete theory is put forward, the effort expended on applications of natural selection would be best spent elsewhere — either in other research, or in the attempt to formulate a more workable theory to replace Darwin’s speculation.


Ronald H. Brady taught in the school of American Studies at Ramapo College in Mahwah, New Jersey. This article was originally published in 1979 as “Natural Selection and the Criteria by Which a Theory Is Judged”, Systematic Biology vol. 28, pp. 600-21.

Acknowledgments

I am grateful to Donn Rosen for suggesting this paper and to Norman Platnick and David Hull for their careful reading and critical comments. My thanks also to Norman Macbeth for bringing my attention to the problem in the first place, and for many conversations on the subject.

References

Bethell, T. 1976. Darwin’s mistake. Harper’s Magazine 252:70-75.

Bohm, D. 1969. Some comments on Maynard Smith’s contributions. In Waddington, C. H. (ed.), Towards a theoretical biology, vol. 2, pp. 98-105.

Darwin, C. 1859. On the origin of species. First ed. London.

Darwin, C. 1876. On the origin of species. London.

Dobzhansky, T., F. Ayala, G. L. Stebbins, and J. Valentine. 1977. Evolution. W. H. Freeman, San Francisco.

Dobzhansky, T. 1975. Book review. Evolution 29:376-378.

Duhem, P. M. M. 1906. La Théorie Physique, Son Objet et sa Structure. English trans. The aim and structure of physical theory. Princeton University Press. 1954.

Ferguson, A. J. 1976. Can evolutionary theory predict? Amer. Natur. 110:1101-1104.

Gould, S. J. 1976. Darwin’s untimely burial. Nat. Hist. 85:24-30.

Grene, M. 1969. Notes on Maynard Smith’s “Status of neo-Darwinism.” In Waddington, C. H. (ed.), Towards a theoretical biology, vol. 2, pp. 97-98.

Grene, M. 1974. The understanding of nature. Essays in the philosophy of biology. Reidel, Boston.

Haldane, J. B. S. 1935. Darwinism under revision. Rationalist Ann. 19-29.

Hull, D. 1974. Philosophy of biological science. Prentice-Hall, Englewood Cliffs, New Jersey.

Lewontin, R. C. 1972. Testing the theory of natural selection. Nature 236:181-182.

Lewontin, R. C. 1977. Adaptation. In The Encyclopedia Einaudi. Torino Giulio Einaudi Edition.

Lewontin, R. C. 1978. Adaptation. Sci. Amer. 239-3:212-230.

Macbeth, N. 1971. Darwin retried. Gambit, Boston.

Manser, A. R. 1965. The concept of evolution. Philosophy 40:18-34.

Medawar, P. 1978. Book review. The Sciences 18-5:24-27.

Peters, R. H. 1976. Tautology in evolution and ecology. Amer. Natur. 110:1-12.

Popper, K. 1934. Logik der Forschung. English trans. Logic of Scientific Discovery. Hutchinson, London. 1959.

Quine, W. 1953. From a logical point of view. Harvard University Press, Cambridge. Rev. ed. 1961.

Sagan, C. 1977. The Dragons of Eden. Random, New York.

Simpson, G. G. 1967. The meaning of evolution. Yale University Press, New Haven.

Smith, J. M. 1969. The status of neo-Darwinism. In Waddington, C. H. (ed.), Towards a theoretical biology, vol. 2, Aldine, New York.

Smith, J. M. 1978. Optimization theory in evolution. Ann. Rev. Ecol. Syst. 9:31-56.

Waddington, C. H. 1960. Evolutionary adaptations. In Tax, S. (ed.), The evolution of life. University of Chicago Press, p. 385.


Manuscript received July 1979
Revised October 1979

 