Scientific method

The scientific method is a logical method of expanding knowledge. It depends on observation, measurement, prediction, experimentation, and verification, and distinguishes science from other fields of knowledge. (See Philosophy of science).

The notion of 'a' scientific method has been criticized for providing an overly simplistic, and perhaps misleading, view of what scientists do, and how they view the world. For example, not all scientists actually work the same way; some seem to use 'insight', others are more methodical. There are also widely varying capacities to apply empirical methods crucial to scientific trust and consensus, e.g. scientists in developing nations or corporate laboratories do not have the same incentives or pressures on their work as those in academic settings in developed nations. It is hard to separate the scientific method from the political economy in which it operates, e.g. value placed on life, which may restrict testing on humans or other primates, or on whole ecologies.

Attempting to understand scientific method is essentially an attempt to make philosophical sense of the underlying mechanism(s), concentrating on the common elements all scientists use. In the 1960s, an alternative model for the process of science -- including the scientific method and its underlying empirical methods -- was proposed by Thomas Kuhn, who suggested that sociological mechanisms were important, even central, in science. The word paradigm occurs in his work, but Kuhn later rejected the word, in favor of a critique of what he called 'normal science', that is, incremental science that does not seek breakthroughs or deep changes to the organization of scientists themselves.

Other viewpoints hold that the scientific method is not so much a way to acquire knowledge as it is simply a procedure for validating knowledge already gathered. Regardless, there is a general consensus that the scientific method provides at least a mechanism to improve existing or proposed knowledge about some things and to eliminate errors and cultural biases. Post-20th-century thought on these topics has focused on quasi-empirical methods, e.g. peer review, spread of notations, which are the key common concern of philosophy of science and philosophy of mathematics. In the presentation of the 'ideal' scientific method that follows, one must keep in mind that many parties are simultaneously executing empirical methods and reproducing work of others, and that social and linguistic processes play key roles in deciding the degree of examination that any given hypothesis will receive in practice.

History is replete with examples of accurate theories ignored by peers, and inaccurate ones propagated unduly, due to social factors that no 'scientific method' would choose to promote - but which are inevitable aspects of being fallible social humans. Concepts like 'validating knowledge already gathered' or 'improving knowledge' and 'eliminating error' or 'bias' imply that some kind of value system, or some core moral distinction between 'good' and 'bad', is in effect. These are usually socially determined or at least socially censored.

Scientists vary on how 'real' their models of reality are - the traditional concern of philosophy of science itself. Extreme skeptics argue that no empirical methods are so truly accurate as to be able to 'validate' any given theory, and therefore all of science must be seen as quasi-empirical. In effect, they argue that mathematics is just another science, and science is just another human construction, and that the scientific method itself is a way that human cultures come to agree on facts, notations, and even predictions.

An interesting piece of evidence for this view is that game theory came gradually to be accepted by the diplomats and scientists of both the U.S.A. and U.S.S.R. as describing the reality of their shared nuclear standoff. The doctrine of mutual assured destruction, advanced by game theorists, was instrumental in convincing both parties to avoid launching a pre-emptive attack, and eventually in ending the Cold War. One can reasonably argue that there would be no mathematicians or scientists left to pursue any other view of science, or any method at all, had the game theorists failed to convince both nations!

History of the Scientific Method

Before the development of scientific method the tools of knowledge development and testing included Aristotelian logic, the Socratic method, and even divine inspiration. The earliest explicit foundations of the scientific method are often credited to Roger Bacon and Galileo Galilei. Later contributions by Francis Bacon, Rene Descartes, Karl Popper, and others added to the understanding of scientific method. (See History of Science and Technology, philosophy of science).

A Popular Description of the Method

The scientific method is often described today as comprising these main actions:

Observe: Collect evidence and make measurements relating to the phenomenon you intend to study.
Hypothesize: Invent a hypothesis explaining the phenomenon that you have observed.
Predict: Use the hypothesis to predict the results of new observations or measurements.
Verify: Perform experiments to test those predictions. "Testing", or attempting to experimentally falsify, is thought by many to be a better choice of term here.
Evaluate: If the experiments contradict your hypothesis, reject it and form another. If they confirm it, make more predictions and test it further.

These steps are repeated continually, building a larger and larger set of well-tested hypotheses to explain more and more phenomena. They are generally performed in an orderly manner perhaps as listed above, but not necessarily: for example, theoretical physicists often invent totally new hypotheses before using them to decide what phenomena to observe.
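The cycle above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the fall-distance measurements, the two candidate hypotheses, the tolerance, and the function name surviving_hypotheses are all hypothetical, and real evaluation of evidence is far less mechanical than a single threshold test.

```python
from typing import Callable, Dict, List, Tuple

Hypothesis = Callable[[float], float]  # maps an input to a predicted measurement


def surviving_hypotheses(
    candidates: Dict[str, Hypothesis],
    observations: List[Tuple[float, float]],
    tolerance: float,
) -> List[str]:
    """Return the names of hypotheses not contradicted by any observation."""
    survivors = []
    for name, predict in candidates.items():
        # Predict, test, evaluate: one failed prediction rejects the hypothesis.
        if all(abs(predict(x) - y) <= tolerance for x, y in observations):
            survivors.append(name)
    return survivors


# Hypothetical observations: (time in seconds, distance fallen in meters).
observations = [(1.0, 4.9), (2.0, 19.6), (3.0, 44.1)]

# Two hypothetical explanations of the same phenomenon.
candidates = {
    "distance grows linearly with time": lambda t: 4.9 * t,
    "distance grows with time squared": lambda t: 4.9 * t ** 2,
}

print(surviving_hypotheses(candidates, observations, tolerance=0.5))
```

Note that the surviving hypothesis is not thereby proven; it has merely not yet been contradicted, and a new observation could still reject it.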

Observation

Scientific observation consists mostly of making careful measurements (See Measurement). It is important that the methods of gathering the evidence be disclosed, particularly when the evidence being presented has not been previously reported (as with the results of previous experimentation). This makes it possible for others to repeat the observations independently to check for bias. Failure to disclose methods and techniques has caused several famous scandals, for instance P. Kammerer's discredited work with toads.

Scientists also try to use operational definitions of their measurements. That is, measurements and other criteria for observation are defined in terms of physical actions that can be performed by anyone, rather than being defined in terms of abstract ideas or common understanding. For example, the term "day" is useful in ordinary life and we don't have to define it precisely to make use of it. But in studying the motion of the Earth, you have to be more careful, lest your measurements be so sloppy as to be useless, so science makes two operational definitions of a day: a solar day is the time between observing the sun at a particular position in the sky and observing it in the same position the next time; a sidereal day is the time between observing a specific star in the night sky at a specific position, and that same observation made the next time. The distinction is useful because the two differ slightly as a result of how the Earth moves, and properly using one or the other avoids problems. In particular, you will come to notice that the length of the solar day varies over the course of a year; you can then make a new operational definition of mean solar day as the average of these and study further. And so on.
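The difference between the two definitions of a day can be worked out with a little arithmetic. The sketch below assumes a mean solar day of 86,400 seconds and a year of about 365.2422 mean solar days (figures not given in the text above), and uses the fact that over one year the Earth turns once more relative to the stars than relative to the sun. The function name is hypothetical.

```python
SOLAR_DAY_SECONDS = 86_400.0     # assumed length of the mean solar day
SOLAR_DAYS_PER_YEAR = 365.2422   # assumed length of the year, in mean solar days


def sidereal_day_seconds(
    solar_day: float = SOLAR_DAY_SECONDS,
    days_per_year: float = SOLAR_DAYS_PER_YEAR,
) -> float:
    """Estimate the sidereal day from the solar day and the length of the year."""
    # Orbiting the sun adds one extra rotation per year relative to the stars.
    rotations_per_year = days_per_year + 1.0
    return solar_day * days_per_year / rotations_per_year


sidereal = sidereal_day_seconds()
difference = SOLAR_DAY_SECONDS - sidereal
print(f"sidereal day: {sidereal:.1f} s, shorter than the solar day by {difference:.1f} s")
```

Under these assumptions the sidereal day comes out a little under four minutes shorter than the solar day, which is why a given star rises slightly earlier each night.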

Hypothesis

In the hypothetical stage, scientists use their own creativity (currently not well understood), or any other methods available, to invent possible explanations for the phenomenon under study. For some philosophers of science the most important aspect of an explanation is that it must be falsifiable, whereby a contrary fact from an experiment must be possible (in other words, if no experiment can ever demonstrate the hypothesis to be false, the hypothesis is unscientific though perhaps true).

The scientist should also be--but need not be and often is not--impartial, considering all known evidence, and not merely the evidence which supports the hypothesis being developed. This makes it more likely that the hypotheses formed will be relevant and useful.

Explanations should also satisfy the principle of Occam's Razor; i.e., the hypothesis is expected to contain the least possible number of unproven assumptions. For example, after a storm a tree is noticed to have fallen. Based on this evidence of "a storm" and "a fallen tree" a reasonable hypothesis would be "a lightning bolt has hit the tree"--a hypothesis which requires only one assumption--that it was, in fact, a lightning bolt (as opposed to a strong wind or an elephant) which knocked over the tree. The hypothesis that "the tree was knocked over by marauding 200 meter tall space aliens" requires several additional assumptions (e.g., concerning the very existence of aliens, their ability to travel interstellar distances, and an alien biology that allows them to be 200 meters tall in terrestrial gravity) and is therefore inferior. Certainly more than one hypothesis can be entertained to explain the same phenomena, and some of them might even be complex and require 'too many' assumptions for comfort, but Occam's Razor is only a rule of thumb for quickly evaluating which hypotheses are likely to be fruitful; it is not a strict rule, nor an invariable aspect of the scientific method.

It was once thought that science was based on inductive reasoning; that is, if one observes the same thing many times without observing an exception, one can conclude from that observation alone that the phenomenon is consistent. This was the view of Francis Bacon and some of the other empiricists, for example. David Hume's critique of induction, however, undermined its use for validation or proof. In the modern understanding of scientific method, induction serves only as a means of suggesting hypotheses; these still must be tested by experiment and evaluated in the same way as other hypotheses.

Prediction

A hypothesis is also considered superior to its rivals if it has more predictive power; that is, if there are many possible observations one might make that would falsify it. The hypothesis that "all matter turns into chocolate when no one is looking, and then turns back if anyone looks" cannot be refuted, since its very definition precludes testing (i.e., it makes no testable prediction), and is therefore not a proper scientific hypothesis. A hypothesis that predicts that "light bends in a strong gravitational field" (i.e., one aspect of Einstein's theory of general relativity) is a strong hypothesis, as it suggests concrete measurements which can be conducted to support or refute the claim. Using the prior "fallen tree" example, the lightning hypothesis 'predicts' that the fallen tree will exhibit scorch marks or similar markings consistent with a lightning strike, and that meteorological records of the storm are likely to show that lightning occurred.

Note that deductive reasoning is generally used to predict the results of the hypothesis. That is, in order to predict what measurements one might find when conducting an experiment, one treats the hypothesis as a premise, reasons deductively from it to some not currently obvious conclusion, and then tests that conclusion. For example, Einstein's equations implied that time operated differently than had been thought, but that the difference was one which could be tested only under conditions that humans had never seen. By reasoning deductively from his model and its equations, one could see that, if Einstein's special relativity model were correct, a clock sent on a fast spaceship would slow down compared to an identical clock left on Earth, while if it were wrong, the clocks should stay synchronized, or at least not go out of synch in the way predicted. In 1905, when Einstein published his first special relativity paper, spaceships were purely fantasy. They became less so after World War II and this test became possible. A sufficiently quickly moving clock (i.e., one in Earth orbit) does indeed slow down with respect to its stationary twin (i.e., one still on the surface of the Earth). Every such experiment performed since has shown the same effect.
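The deduction described above, from hypothesis to a concrete numerical prediction, can be illustrated with the special-relativistic time dilation formula. This is a minimal sketch: the function names and the example speed are chosen for illustration, and it deliberately ignores gravitational effects, which also matter for real clocks in orbit.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # meters per second


def lorentz_factor(speed: float) -> float:
    """Return the factor by which a clock moving at `speed` runs slow."""
    beta = speed / SPEED_OF_LIGHT
    if not 0.0 <= beta < 1.0:
        raise ValueError("speed must be non-negative and below the speed of light")
    return 1.0 / math.sqrt(1.0 - beta ** 2)


def moving_clock_elapsed(earth_elapsed: float, speed: float) -> float:
    """Predict the time shown by the moving clock after `earth_elapsed` seconds on Earth."""
    return earth_elapsed / lorentz_factor(speed)


# Illustrative case: a clock moving at 60% of the speed of light for one Earth hour.
speed = 0.6 * SPEED_OF_LIGHT
print(lorentz_factor(speed))
print(moving_clock_elapsed(3600.0, speed))
```

The prediction is falsifiable in exactly the sense discussed here: if the moving clock were found to agree with the Earth clock, the hypothesis would be in trouble.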

Verification

Probably the most important and universal aspect of scientific reasoning is verification: every hypothesis must be tested by performing appropriate physical experiments and measuring the results. Since measurements are inherently imperfect (from human involvement if nothing else), and since measuring equipment has been getting better and better, new measurements are often more precise than their predecessors. Such improvements are useful as a practical matter (e.g., in chemical engineering or planetary exploration), and they have sometimes demonstrated previously unknown deviations from currently accepted theory (e.g., the parity-violation experiments of the 1950s, proposed by Lee and Yang and carried out by Wu, which forced fundamental changes in much of particle physics). Ideally, the experiments performed should be fully described so that anyone can reproduce them, and many scientists should independently verify every theory with multiple experiments. This is known as reproducibility.

Scientists should also attempt to design their experiments carefully. For example, if the measurements to be taken are difficult or more than ordinarily subject to observer bias, one must be careful to avoid distorting the results by the experimenter's wishes. When experimenting on complex systems, one must be careful to isolate the effect being tested from other possible causes of the intended effect (this is called a controlled experiment). In testing a drug, for example, it is important to carefully test that the supposed effect of the drug is produced only by the drug itself, and not by the placebo effect or by random chance. Doctors do this with what is called a double-blind study: two groups of patients are compared, one of which receives the drug and one of which receives a placebo. No patient in either group knows whether or not they are getting the real drug; even the doctors or other personnel who interact with the patients don't know which patient is getting the drug under test and which is getting a fake drug (often sugar pills), so their knowledge can't influence the patients either.
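The bookkeeping behind a double-blind study can be sketched as follows. This is a simplified illustration, not a description of any real trial protocol: the patient identifiers, the numeric kit codes, and the function name allocate_double_blind are hypothetical. The idea is that patients and staff see only an uninformative code, while the key linking codes to treatments is kept sealed until the measurements are complete.

```python
import random
from typing import Dict, Iterable, Optional, Tuple


def allocate_double_blind(
    patient_ids: Iterable[str], seed: Optional[int] = None
) -> Tuple[Dict[str, int], Dict[int, str]]:
    """Randomly split patients into drug and placebo groups behind coded labels.

    Returns (blinded, sealed_key): `blinded` maps each patient to a kit code and
    is all that patients and staff see; `sealed_key` maps kit codes to treatments
    and stays hidden until the study is unblinded.
    """
    rng = random.Random(seed)
    patients = list(patient_ids)
    rng.shuffle(patients)  # random assignment guards against selection bias

    half = len(patients) // 2
    treatments = ["drug"] * half + ["placebo"] * (len(patients) - half)

    # Unique random codes that reveal nothing about the treatment they label.
    kit_codes = rng.sample(range(10_000, 100_000), len(patients))

    blinded = dict(zip(patients, kit_codes))
    sealed_key = dict(zip(kit_codes, treatments))
    return blinded, sealed_key


blinded, sealed_key = allocate_double_blind(
    ["P01", "P02", "P03", "P04", "P05", "P06"], seed=42
)
print(blinded)  # what the doctors and patients see: codes only
```

Only after all results are recorded would the sealed key be consulted to compare the two groups.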

Note, however, that "verification" may be a misleading word, in that we don't really "confirm" or "verify" a hypothesis so much as we fail to refute it. We do not understand enough about the natural world to be certain that our current understanding of it (or some part of it) is correct. There have been many instances in the history of science in which one or another important scientist announced that there was no more to discover about some subject. These announcements have been, sooner or later, uniformly embarrassing. We may indeed understand the fundamental nature of some natural phenomena, but we know of no way to realize this--even if true. A better word, perhaps, would be "check". Too many "final understandings" have been torpedoed to claim anything stronger.

Evaluation

Any hypothesis, no matter how respected or time-honored, must be discarded once it is contradicted by new reliable evidence. Hence all scientific knowledge is always in a state of flux, for at any time new evidence could be presented that contradicts long-held hypotheses. A classic example is the Wave Theory of Light--although it had been held to be incontrovertible for many decades, it was refuted by the discovery of the photoelectric effect. The currently held theory of light holds that photons (the 'particles' of light) also behave as waves under some circumstances. In the earlier tree example, the lack of scorch marks or of reports of lightning, combined with reports of hurricane-force winds, would cause the original hypothesis to be re-evaluated as less probable and a new one ("The tree was knocked over by strong winds") to be proposed. Choosing between the two would require additional tests. Note, however, that the tree example involves "historical tests" and illustrates one of the differences between an experimental science (e.g., physics), in which the phenomena being investigated can be reproduced as needed (or as can be afforded, for some branches of physics), and an observational one (e.g., paleontology or stellar evolution), in which the only available 'experiments' are those conducted by 'nature' and which we might be able to observe.

Further, the experiments that reject a hypothesis should be performed by as many different scientists as possible to guard against bias, misunderstanding, and fraud. Scientific journals use a process of peer review, in which scientists submit their results to a panel of fellow scientists (who may or may not know the identity of the writer) for evaluation. Scientists are rightly suspicious of results that do not go through this process; for example, the cold fusion experiments of Fleischmann and Pons were never peer reviewed--they were announced directly to the press, before any other scientists had tried to reproduce the results or evaluate their efforts. They have not been reproduced elsewhere as yet, and the press announcement was regarded, by most nuclear physicists, as very likely wrong. Proper peer review would, most likely, have turned up problems and led to a closer examination of the experimental evidence Fleischmann, Pons, et al. believed they had. Much embarrassment, and wasted effort worldwide, would have been avoided.

Scientific Models, Theories and Laws

The terms "hypothesis", "model", "theory", and "law" are often used incorrectly when applied to scientific ideas. (Let alone that often a hypothesis becomes a dogma or a taboo issue by the passing of the centuries and the immense inertia represented by the huge number of its desperate supporters.)

In general a hypothesis is a contention that has not (yet) been sustained or refuted, as one or more predictions made from it have not yet been tested. However, once the predictive phase has been carried out (at least to some degree) and there is some experimental evidence that supports the hypothesis then it will often begin to be referred to as a "model".

Groups of models may be combined into a "theory"; such as the theory of evolution by natural selection, or the theory of electromagnetism.

Models and theories that have withstood the test of time (and many experimental tests), and that have not been falsified by credible, repeatable experimental evidence or observation, may eventually acquire the 'status' of a "law".

It is a fundamental tenet of the scientific method that all "results" are provisional, and this must include the so-called "laws". Newton's "law of gravitation" is a famous example of a "law" that has been found to be only partially correct (see the general relativity description of gravity and of the behavior of matter in motion).

Uninformed observers often have the impression that scientific laws are immutable, having been passed, promulgated, or decreed by some higher body or Being. Insofar as science is a human endeavour (and it certainly is, judging by the number of blunders its practitioners have committed throughout its history), this is simply wrong. A scientific law is just a theory that lots of us believe to be correct. We can hope that 'us' in this case includes well informed, highly capable, people. When it does not, the result has too often been egregious nonsense.

Philosophical Foundations of the Scientific Method

One school of thought asserts that the scientific method (and science in general) relies upon basic axioms or "self-evident truths" such as realism and consistency. While it is true that many scientists believe these things and do assume them in their everyday work, the method itself does not rely on them: all such assumptions are just part of the hypotheses being tested, and many of them are subject to test as well. For example, one of the "common sense" ideas that scientists believed for a long time is that any measurable property of an object is something that exists in the object before it is measured, and that our measurements are merely observations of that pre-existing condition; quantum mechanics rejects this, because experiments have contradicted it.

Some believe that scientific principles have been "solidly" established, beyond question. Some scientists themselves may indeed feel that way, having come to rely upon many of the results of science without having done all the experiments themselves; after all, one cannot expect every individual scientist to repeat hundreds of years' worth of experiments. Many scientists even encourage an attitude of skepticism toward claims that contradict the current state of common knowledge; but that only means such claims must meet a higher burden before being accepted, not that they can never be accepted. In the extreme, some, including some scientists, may believe in this or that scientific principle, or even "science" itself, as a matter of faith in a manner similar to that of religious believers. However, neither science nor the scientific method itself relies on faith; all scientific facts (i.e., measurements) and explanations (i.e., hypotheses) are subject to test, and will be rejected as the best available hypothesis once new evidence falsifies them. (See more under falsificationism.)

This is the reason that political, religious, or social enforcement of scientific convictions is inherently pernicious. Examples include the Roman Catholic Church's action against Galileo's non-Aristotelian discoveries about the behavior of the planets (they violated some prestigious, and ancient, philosophical speculation the Church had promoted to dogma), and Stalin's support for Lysenko's biological and genetic beliefs (what was wrong with standard genetics in Stalin's view is not clear; Lysenko was either a deliberate con man or incapable of following standard genetics).

Criticisms of the Existence of a Scientific Method

It is unlikely that anyone would dispute that the application of the "scientific method" is a standard approach for (retroactively) testing the status of any scientific hypothesis or theory. Certainly no respectable peer-reviewed journal would publish any scientific work if it could not be demonstrated that its hypotheses can (in principle) be shown to be in accordance with the method as presented here. Except by mistake. What is regularly disputed is the contention that scientific research is, in fact, consistently carried out in the procedural manner described above.

The scientific method, as presented, offers no guidelines for the production of new hypotheses. Scientific folklore is strewn with stories of scientists describing a "flash of inspiration" which then motivated them to look for evidence to support their assertion. Some accounts tell of scientists operating on a "hunch" or a "gut instinct" prior to obtaining any supporting evidence for their hypotheses. Likewise, many scientists will follow a theory because it is "elegant" or "beautiful"; or choose not to follow a theory because it is "counter-intuitive". These psychological reactions are quite common in scientists as with us all. Thomas Edison, primarily an engineer and not a scientist, spoke of "99% perspiration and 1% inspiration", which is most amusing in English. But regardless of the source(s) of hypotheses, it is fundamental that science (through its practitioners, scientists) test all hypotheses (or theories if you prefer); the results of those tests are the ONLY scientific criteria for retaining a hypothesis. Neither beauty nor intuitive conviction nor prestige of proposer nor political or religious support is acceptable as a substitute.

Another criticism of the scientific method (as here presented) is that it fails to acknowledge the incalculable impact that mathematics has had on scientific research and direction. A hypothesis about the physical world that is based solely on implications derived from mathematical analysis can hardly be said to be in accordance with the "observational" phase of the scientific method (a purely mathematical property cannot properly be called a "fact" about the physical universe). Nonetheless scientific history includes hundreds of occasions on which a scientific theory has been proposed based solely on mathematics, notably some aspects of quantum mechanics and the application of fractal geometry to certain areas within biology, among others. A simple reply to this is that it does not affect the scientific method itself to add an extra prior step: that is, to use mathematical results to choose what to observe and to inspire hypotheses.

Imre Lakatos showed how practitioners and philosophers of science throughout the ages have constructed historical accounts to suit their pet philosophies and methods. This "rational reconstruction", as it is known, of the history of science is then used to justify certain ideological assumptions, producing what might tentatively be called a mythology of science. This criticism was also levelled by Paul Feyerabend against all attempts to produce demarcation criteria. According to Feyerabend, proposed scientific methods fail on two counts. Firstly they fail as a descriptive account of the historical record. Secondly he objected to any single prescriptive scientific method on the grounds that science has no single aim. Without a fixed ideology, or the introduction of religious tendencies, the only approach which does not inhibit progress (using whichever definition of progress you see fit) is "anything goes": "'anything goes' is not a 'principle' I hold [...] but the terrified exclamation of a rationalist who takes a closer look at history." (Feyerabend, 1975).

A further criticism of the scientific method is that it provides no guidelines for choosing between two equally possible hypotheses that meet all the other requirements for simplicity, evidential compliance, etc. Any scientist in such a situation will tend to support the hypothesis which "feels the best", and hence is likely to make a subjective selection influenced by cultural and/or personal bias. Of course, if there is no physical experiment to distinguish one scientific hypothesis from another, then it cannot matter in one's ordinary life which one chooses to support. Nor would it matter in science; either would be acceptable until further data is available which falsifies one or both.

It is not the goal of science to answer all questions, nor even to 'explain' any phenomena which are not experimentally accessible. Science does not produce truth, it merely improves the currently best hypothesis about some aspect of reality. It cannot therefore be a source of value judgements. It can certainly speak to matters of ethics and public policy by pointing to the likely consequences of actions; it simply can't tell us which of those consequences to desire or which is 'best'. What one projects from the currently most reasonable scientific hypothesis into other realms of interest is not a strictly scientific question and the scientific method offers no assistance for those who wish to do so. They often claim scientific justification, nevertheless.

Scientific Method and Public Policy Questions

In matters of public policy, the quality of 'scientific support' claimed for a position is generally inversely related to that position's benefit to the claimer. In short, if 'junk science' will help a position that will benefit me, only considerable ethical uprightness will prevent me from using it. Such ethical standards are regrettably less common than we would all hope. Since the audience (i.e., everyone, for some such debates) is rarely in a position to independently evaluate the scientific support claimed by anyone, much 'junk science' has achieved prominence. Without mastering the underlying science, about the only thing the non-scientist can do, as a proxy for evaluating the quality of the science, is attempt to filter out economic and social interests, taking seriously only those who don't seem to have a stake in having one or another position adopted. For instance, a chemical company caught dumping something in a local stream claims it has scientific support for the harmlessness of the dumping and therefore nothing should be done, certainly not at its expense, about the dumping. The local law provides that those who dump dangerous stuff should clean it up. Local environmentalists claim to have scientific support for the danger and that therefore the company should be compelled to clean up the contamination. What should local government do? How should the citizenry judge the government's performance? A first evaluation is probably to look to 'the science'. But whose science is correct? Perhaps neither, but as a first attempt to decide between the two positions, the company's financial interest indicates that its scientific support need not be believed out of hand. It has a higher burden of 'disbelief' because of that interest. In such cases, governments often call for an independent scientific evaluation and announce they will take action based on that report. At which point the dispute will change into an attempt to find 'independent' scientists who are believed to be likely to support one side or the other.

Actual science has little place in such disputes since they are essentially economic or social, not scientific. Scientific method has even less, for the same reason.

External links

An Introduction to Science: Scientific Thinking and the Scientific Method by Steven D. Schafersman.
The Myth of the Scientific Method by Dr. Terry Halwes
Rational Reconstruction and Historical Reconstruction, Horus Publications

References

Feyerabend, 1975. Against Method. London: Verso. (ISBN 0860916464)
Feyerabend, Lakatos, 2000. For and Against Method. University of Chicago Press. (ISBN 0226467759)