An Introduction to Daubert v. Merrell Dow
This is a slight abridgement of an article from The Florida Bar Journal, April 1999. I have edited and updated it slightly. This law is moving very quickly: less than a month after this article was published, the Court handed down Kumho Tire v. Carmichael, __ U.S.__ (1999). Kumho extends Daubert to non-scientific testimony, sometimes in somewhat conflicted ways. See The Impact of Kumho Tire.
The Impact of Daubert v. Merrell Dow Pharmaceuticals, Inc., on Expert Testimony:
With Applications to Securities Litigation
Stephen Mahle, April 1999.
When the United States Supreme Court handed down its opinion in Daubert v. Merrell Dow Pharmaceuticals, Inc., it began a wide-ranging debate about the rules that govern the admissibility of expert testimony in both state and federal trials. The courts of nineteen states have adopted Daubert, and those of eleven, including Florida, have apparently rejected it. Over one hundred articles have been published in response to the decision, and there are several Daubert web sites, including one sponsored by Harvard Law School. This vigorous response is not surprising, because Daubert held that the Federal Rules of Evidence had displaced the fifty-year-old Frye "generally accepted" standard for the admissibility of scientific testimony in federal trials and then determined a new "standard for admitting expert scientific testimony in a federal trial." Despite substantial disagreement in the legal community about what Daubert really means, this article's perspective is that the meaning of the Court's scientific dialogue is fairly clear and that, even though the scientific principles that the Court articulates are ultimately discussed as scientific and statistical concepts that are somewhat alien to the legal system, the Court's holding has its basis in principles of philosophy and logic which have long informed the legal system.
The philosophy of science that the Court draws so heavily upon focuses on the nature of scientific investigation and informs virtually all of modern scientific inquiry, from DNA testing (do the blood samples match?), through medicine (does smoking cause lung cancer?), epidemiology (does Bendectin cause birth defects in human embryos?), economics (does spending rise with income?), and finance (did the release of fraudulent information cause the firm's stock to rise?). The philosophy of science provides the framework that practitioners in all of these disciplines use to analyze data to find out whether their theories (smoking causes cancer, the release of fraudulent information caused the stock's price to rise) are correct. Once one understands the philosophical basis of science upon which the Court relies, much of the statistical part of scientific testimony just plain makes sense, and that is half, and perhaps more, of understanding the entirety of the expert testimony that is offered in courts today.
This article begins by outlining the Court's holding and discussing the scientific framework that is the basis of the Court's analysis. Included in this discussion of the scientific framework employed by the Court is a discussion of how the fundamental statistical concepts that experts use in their testimony have evolved from the scientific framework that the Court articulates. Most of the article's statistical analysis of existing cases is from disciplines like epidemiology and DNA testing, since those are the areas that have most notably made their way into the legal system. However, the article also draws parallels from the techniques used by epidemiologists and DNA analysts to analogous techniques used in other fields, especially finance and economics, and cites to other articles that draw similar parallels to analogous techniques used in medical research and accounting. Demonstrating the similarity of the scientific techniques employed in these diverse branches of science supports the article's contention that once one becomes conversant in the scientific techniques used in any one of these disciplines, that knowledge goes a long way to understanding the scientific techniques used in the others. The article concludes with a discussion of the role of Daubert in Florida Courts, which appears to be somewhat more extensive than what some Florida Courts believe it to be.
The Daubert Court begins its explanation of the criteria that trial courts should use to screen "purportedly scientific evidence" by parsing Rule 702, focusing on the meanings of "scientific" and "knowledge." An important key to understanding the Court's reliability-based analysis of the admissibility of expert testimony lies in the Court's focus on the requirement that, in order for expert testimony to be admissible, "[t]he subject of an expert's testimony must be 'scientific . . . knowledge,'" because it is "the requirement that an expert's testimony pertain to 'scientific knowledge'" that "establishes a standard of evidentiary reliability" (emphasis added). But, "in order to qualify as 'scientific knowledge,' an inference or assertion must be derived by the scientific method . . . ." In brief, since only scientific knowledge can be offered as expert testimony, and the Court regards as scientific knowledge only that which is derived by the scientific method, only inferences that are derived by the scientific method can be offered as expert opinion testimony.
The Court repeatedly uses the phrase "the scientific method." This is a term of art with a specific meaning in the scientific community, and the Court's discussion of the scientific method quotes from seminal works on scientific inquiry more than enough to make it clear that the Court is using the term in that manner. Indeed, much of the language relied upon by the Court in its discussion of the scientific method is strikingly similar to the language used in several amicus briefs filed by or on behalf of scientists from industry and academia. The Court stated that:
"Ordinarily, a key question to be answered in determining whether a theory or technique is scientific knowledge that will assist the trier of fact will be whether it can be (and has been) tested. 'Scientific methodology today is based on generating hypotheses and testing them to see if they can be falsified; indeed, this methodology is what distinguishes science from other fields of human inquiry.'" (emphasis added)
The testing of hypotheses that the Court's emphasized language requires is called "hypothesis testing" in the scientific community and, as the Court's quotations indicate, hypothesis testing is the essence of the scientific method. It is noteworthy that the Daubert Court required that experts follow this "scientific method" even before it turned to the four factors that commentators and lower courts have fixed upon. This is noteworthy both because the scientific method is the cornerstone of the philosophy of science and because testimony that proceeds in accordance with the scientific method will always satisfy the Court's first two criteria, which seem primarily statistical, but which have their basis in philosophical tenets that have been generally accepted within the scientific community for hundreds of years. The four Daubert criteria for evaluating the admissibility of expert testimony are: (1) whether the methods upon which the testimony is based are centered upon a testable hypothesis; (2) the known or potential rate of error associated with the method; (3) whether the method has been subject to peer review; and (4) whether the method is generally accepted in the relevant scientific community. Given the rest of the opinion, it seems appropriate that the first two of the Court's four criteria amount to asking whether the techniques upon which the testimony is based are grounded in the scientific method. It is no less appropriate that virtually no expert testimony will satisfy the last two factors unless it satisfies the first two.
The Scientific Method and Daubert's Four Factors for Admissibility of Expert Testimony
Hypothesis testing: Hypothesis testing is the process of deriving some proposition (or hypothesis) about an observable group of events from accepted scientific principles, and then investigating whether, upon observation of data regarding that group of events, the hypothesis seems true. Because it is hypothesis testing that distinguishes the scientific method of inquiry from non-scientific methods, and because the scientific method of inquiry is required for the resulting inferences to be the basis of admissible expert testimony, hypothesis testing would be deserving of careful consideration even if it were not one of the Court's four enumerated factors. The basic technique of hypothesis testing has been well settled for decades, and an example demonstrates the technique.
A simple example of hypothesis testing: Looking at a single six-sided die might lead one to the proposition (or hypothesis) that each of the six numbers is equally likely to be rolled on each roll of the die. This hypothesis is tested scientifically by proposing the "null hypothesis" that each number is equally likely to land face up, and then rolling the die (say) 600 times and recording the number of times that each number is actually found face up. If an appropriate statistical test is used, and if each number occurred about 100 times, the statistical test will be unable to reject the null hypothesis of equal probabilities, and the scientist will be left with the likelihood that the die is fair. However, if the number '3' occurs a disproportionate number of times, say 200 times out of 600 rolls, then the statistical test will be likely to reject the null hypothesis of equal probabilities and the scientist will interpret this as evidence that the die is loaded and reject the null hypothesis.
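The die example can be sketched in a few lines of Python. The article does not name the "appropriate statistical test"; this sketch assumes a chi-square goodness-of-fit test, which is one common choice. The roll counts and function names are hypothetical illustrations, and the critical values are the standard table values for 5 degrees of freedom.

```python
# Assumption: a chi-square goodness-of-fit test stands in for the article's
# unnamed "appropriate statistical test". Critical values are standard table
# values for 5 degrees of freedom (six faces minus one).
CRITICAL_5_PERCENT = 11.070
CRITICAL_1_PERCENT = 15.086


def chi_square_statistic(observed):
    """Sum of (observed - expected)^2 / expected under equal probabilities."""
    total = sum(observed)
    expected = total / len(observed)
    return sum((count - expected) ** 2 / expected for count in observed)


def reject_fair_die(observed, critical_value=CRITICAL_1_PERCENT):
    """Reject the null hypothesis of a fair die if the statistic is too large."""
    return chi_square_statistic(observed) > critical_value


# Hypothetical counts from 600 rolls (faces 1 through 6).
fair_looking = [98, 103, 100, 97, 104, 98]
loaded_looking = [80, 80, 200, 80, 80, 80]  # the '3' comes up 200 times

for label, counts in (("fair-looking", fair_looking), ("loaded-looking", loaded_looking)):
    stat = chi_square_statistic(counts)
    print(f"{label}: statistic = {stat:.2f}, reject at 1% level = {reject_fair_die(counts)}")
```

With counts close to 100 per face the statistic stays far below the critical value, so the null hypothesis of a fair die is not rejected; with 200 threes the statistic is far above it, so the null hypothesis is rejected.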
Examples of hypothesis testing and, therefore, the scientific method alluded to by the Court in Daubert are plentiful in all of the most highly regarded professional journals that publish empirical research.
The known or potential error rate: The second parameter that Daubert suggests that trial judges use in evaluating the scientific validity and, therefore, evidentiary reliability of "purported scientific testimony" is the "known or potential rate of error" associated with using the particular scientific technique. In plain language, this is the likelihood of being wrong that the scientist associates with the assertion that an alleged cause has a particular effect. Most scientists routinely require that this error rate be very small, usually between one and five percent.
There are two types of error rates in testing hypotheses and they are denoted as "Type I error" and "Type II error." Type I error is the test's propensity for false positives while Type II error is the test's propensity for false negatives. For example, if a drug test for a substance comes back positive, but the tested individual has not actually used the drug, a lay person would call that a false positive, while a scientist would call it a Type I error. This Type I error is the most commonly cited component of the "error rate" in hypothesis testing. This error rate is also known as the level of statistical significance of the test's result, and it is the complement of the test's "level of confidence" (a 1% significance level corresponds to a 99% level of confidence). Determining this error rate is actually part of conducting a hypothesis test. A common assertion in scientific research is that "the null hypothesis is rejected at the 1% level," or equivalently "the result is statistically significant at the 1% level," which means that the statistical technique used to test the hypothesis, if applied to data where the null hypothesis is true, would reject the null hypothesis only 1% of the time. If such a statement were made about the example of the single die above, it would mean that if the die were not loaded and the experiment of rolling it 600 times and testing the null hypothesis that the die was fair were done 100 times, one would expect about 99 of those tests to correctly indicate that the die is fair, while about 1 of those tests would incorrectly show the die to be loaded.
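The meaning of a 1% Type I error rate can be illustrated by simulation: roll a die that is known to be fair, test it, and repeat many times to see how often the test wrongly declares it loaded. This is a minimal sketch that assumes a chi-square goodness-of-fit test at the 1% level (the article does not specify the test); the function names, the seed, and the number of repetitions are arbitrary illustrative choices.

```python
import random

# Standard table value: chi-square, 5 degrees of freedom, 1% significance level.
CRITICAL_1_PERCENT = 15.086


def chi_square_statistic(observed):
    """Sum of (observed - expected)^2 / expected under equal probabilities."""
    expected = sum(observed) / len(observed)
    return sum((count - expected) ** 2 / expected for count in observed)


def roll_fair_die(rolls, rng):
    """Return the count of each face after rolling a genuinely fair die."""
    counts = [0] * 6
    for _ in range(rolls):
        counts[rng.randrange(6)] += 1
    return counts


def false_positive_rate(experiments, rolls, rng):
    """Fraction of experiments in which a fair die is wrongly called loaded."""
    rejections = sum(
        chi_square_statistic(roll_fair_die(rolls, rng)) > CRITICAL_1_PERCENT
        for _ in range(experiments)
    )
    return rejections / experiments


# Hypothetical setup: 2,000 repetitions of the 600-roll experiment.
rate = false_positive_rate(experiments=2000, rolls=600, rng=random.Random(1999))
print(f"Share of fair-die experiments rejected at the 1% level: {rate:.3%}")
```

Because the die in the simulation really is fair, every rejection is a false positive, and the printed share should land in the neighborhood of 1%, varying somewhat with the random seed.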
The relationship between the Court's first two criteria, the hypothesis test and the error rate, is so close that it is virtually unheard of for a scientist to report that a hypothesis was rejected without stating the level of significance at which it was rejected. Indeed, such a report would be completely meaningless. The Type II error is more subtle and not commonly reported in scientific studies.
Peer review and publication: The third criterion that the Supreme Court suggested for use by trial courts in determining whether expert testimony reaches the trier of fact is "whether the theory or technique has been subjected to peer review and publication." Publication is typically the purpose for which research is offered up for peer review, and passing the peer review is required for publication. "Peer review and publication" of a scientist's work is largely a term of art that means that the scientist's peers have sanctioned the work as credible and accepted it for publication. Publication then exposes the work to further review by other scientists whose responses to the research indicate their agreement or disagreement with the methods and results of the work. A scientist's peers often express agreement with the work by citing it with approval or as authority, or by extending it. Properly executed hypothesis tests with their attendant error rates are the essence of scientific method and are very nearly necessary conditions for peer review to result in publication.
General Acceptance: Like the Court's third criterion, this is a summary measure of the extent to which the expert's methods produce information that qualifies as scientific knowledge. Scientific methods begin the process of becoming generally accepted in the scientific community by bringing appropriate hypothesis testing techniques to bear on questions (or hypotheses) of interest to the scientific community in a fashion that results in the peer approval required for publication. They move toward general acceptance by then withstanding the scrutiny of the broader scientific community to which publication exposes the methods.
Of course there are numerous odd propositions for which there exists some collection of individuals who will assert that they comprise a relevant scientific community and that the proposition is generally accepted within their community. The Court discussed briefly the characteristics of a "relevant scientific community," citing the analysis of United States v. Downing, which says that the inquiry should focus on "the non-judicial uses to which the scientific techniques are put." The Downing court's elaboration of that point noted that the absence of non-litigation uses for a scientific technique is taken as evidence of a lack of reliability while the existence of non-litigation uses for a technique is taken as evidence of the reliability of the technique. The inescapable conclusion is that the relevant scientific community within which the technique finds acceptance must be the community of real world scientists who pursue science for non-litigation purposes and that finding general acceptance within the community of forensic scientists does not constitute general acceptance in the relevant scientific community. Of course, this must be the rule, for were it otherwise, defendants' hired experts could generally accept one sham technique that serves their purposes, while plaintiffs' hired experts could generally accept another sham technique that serves their purposes, and both would be supported for admissibility by the general acceptance criterion despite the fact that they were both sham techniques.
It is interesting to note that a scientist reading Frye and Daubert might say that Daubert explains Frye at least as much as it displaces it. Frye defined the evidentiary issue as reliability and then deferred fully to an amorphously defined "scientific community" for its "general acceptance," which it used as a proxy for evidentiary reliability. Daubert still looks to the scientific community for its general acceptance as an indicator of evidentiary reliability, but it goes further and defines that general acceptance in two ways. First, it addresses the characteristics of the scientific community to which it defers for the general acceptance that it uses as a determinant of evidentiary reliability. Second, it recognizes that there is a basic structure of inquiry known as the scientific method that is the standard used across different branches of science for their scientific investigations, and it requires that the science proffered to the federal bench be grounded in that basic structure. This requires posing and testing hypotheses and specifying the rates of error for those hypothesis tests. Daubert first specified these criteria indirectly by requiring that expert testimony adhere to the scientific method, and then subsequently posed them explicitly as the first two of its four Daubert factors. Finally, the Court specified them indirectly again when it posed general acceptance and publication in peer-reviewed journals as criteria for admissibility of expert testimony, because peer review, publication and general acceptance require hypothesis testing and error rates. In this light it is interesting to note that while the Court introduced its list of four factors as not being a definitive list, testimony that is not based upon a test of a hypothesis will tend to fail all four of the Court's criteria and therefore be inadmissible.
An example from securities litigation: Current events provide an example of how the principles that underlie Daubert are applied in a securities litigation context. A large Florida corporation has recently been in the news because of allegations that the company's financial statements reported exaggerated sales figures. Following the release of this information, the corporation's stock fell sharply and several pending lawsuits allege that a class of the corporation's stockholders has been damaged by purchasing stock whose price was inflated by the alleged overstatements. If this litigation proceeds, economists will estimate the damages that were alleged to have been suffered by this class of stockholders.
Economists have a generally accepted technique for measuring the impact of the release of new information on the price of a publicly traded security. This technique, called an event study, has been the basis of hundreds of articles that have been published in peer reviewed journals.
Economists believe that the current value of a security is equal to the present value of all of the payments that the security will make to its owners throughout its life, and that the value of a security changes when new information is released into the market that changes the market's assessment of the future payments that the security will make to its holder. When information comes into the market that is hypothesized to affect the value of a particular stock, economists test that hypothesis by comparing how that particular stock performed right after the release of the information to how the stock would have been expected to perform in the absence of the release of the new information.
The event study technique ascribes this change in the stock's performance or value to the event that the information disclosed. In the case of the Florida corporation mentioned here, this information would be the release of allegations that its reported sales figures were inflated. This is the financial economist's standard technique for determining the impact of mergers, dividend and earnings announcements, management changes, and a host of other phenomena upon the value of the subject firm's stock, so it has well established non-litigation uses. The heart of the technique is a test of the null hypothesis that the information had no impact upon the price of the stock. The economist will reject this null hypothesis if and only if the hypothesis test yields both an estimate of the change in the stock's value that is non-zero, and an error rate of the test that convinces the economist that sampling error has not caused the non-zero estimate of the change in the stock's value. This technique meets all of the Daubert criteria: it poses and tests a hypothesis, reports the pertinent error rates, and is based upon peer reviewed and published techniques that are so pervasively used within the relevant scientific community that they are regarded as generally accepted as the best existing tool for evaluating the impact of the release of new information upon the value of a publicly traded security.
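The logic of an event study can be sketched as follows. The article does not specify how the "expected" performance is modeled; this sketch assumes the widely used market model, in which the stock's expected return is a linear function of the market's return estimated over a period before the event. All of the returns are simulated, hypothetical numbers, the function names are illustrative, and the test statistic is simplified in that it ignores the estimation error in the fitted parameters.

```python
import random
import statistics


def fit_market_model(stock, market):
    """Least-squares fit of stock return = alpha + beta * market return."""
    mean_s = statistics.mean(stock)
    mean_m = statistics.mean(market)
    cov = sum((m - mean_m) * (s - mean_s) for m, s in zip(market, stock))
    var = sum((m - mean_m) ** 2 for m in market)
    beta = cov / var
    alpha = mean_s - beta * mean_m
    residuals = [s - (alpha + beta * m) for m, s in zip(market, stock)]
    # Two parameters were estimated, so divide by n - 2.
    resid_sd = (sum(r * r for r in residuals) / (len(stock) - 2)) ** 0.5
    return alpha, beta, resid_sd


def abnormal_return(alpha, beta, resid_sd, stock_event, market_event):
    """Event-day return minus the return expected absent the news, and its
    simplified test statistic (abnormal return / residual std. deviation)."""
    expected = alpha + beta * market_event
    ar = stock_event - expected
    return ar, ar / resid_sd


# Hypothetical estimation window: 120 trading days of simulated daily returns.
rng = random.Random(0)
market = [rng.gauss(0.0, 0.01) for _ in range(120)]
stock = [0.0005 + 1.2 * m + rng.gauss(0.0, 0.01) for m in market]
alpha, beta, resid_sd = fit_market_model(stock, market)

# Hypothetical event day: the market slips 1%, the stock falls 15%.
ar, t_stat = abnormal_return(alpha, beta, resid_sd, stock_event=-0.15, market_event=-0.01)

# 2.576 is the approximate two-sided 1% critical value of the standard normal.
print(f"abnormal return = {ar:.4f}, test statistic = {t_stat:.2f}, "
      f"reject 'no impact' at 1% level = {abs(t_stat) > 2.576}")
```

The null hypothesis is that the news had no impact, so the abnormal return should be indistinguishable from ordinary day-to-day noise. A test statistic far outside the critical value means a drop of that size would rarely arise from sampling error alone.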
A few notes on the Florida Frye test
In Brim v. State, the Florida Supreme Court rejected Daubert, writing that "despite the federal adoption of a more lenient standard in Daubert . . . we have maintained the higher standard of reliability as dictated by Frye." However, in rejecting Daubert, the Court seems to move in the direction of construing Frye, itself a reliability driven test, into a sort of virtual Dauberthood.
Florida's Frye test relies on general acceptance as a proxy for evidentiary reliability and publication in peer-reviewed journals as an indication of that general acceptance. But, as noted earlier, publication is part of the process by which a scientific technique becomes generally accepted, and such publication in scientific journals is very rare unless the study's conclusions are derived by posing hypotheses and testing them at given levels of statistical significance. In short, when Florida Courts require that the expert's methods be accorded general acceptance, and perhaps publication in peer-reviewed journals as evidence of that acceptance, they seem implicitly to require that the expert's methods be based upon hypothesis testing and the error rates of those tests.
Analyzing Brim's Frye-test logic seems to confirm the existence of this implicit requirement. The court observes that "the DNA testing process consists of two distinct steps," that the first step relies on biology and chemistry, while "a second statistical step is needed to give significance to a match" because "to say that two patterns match without any scientifically valid estimate of the frequency with which such matches might occur by chance is meaningless." This "chance matching" is precisely the error rate of the test of the hypothesis that the DNA profiles match. Since the calculation of such an error rate requires that the hypothesis be tested, the de facto requirement posed here by the court is that the expert conduct the hypothesis test and report its error rate.
In its discussion of the need for the statistical step the court relies substantially on The Evaluation of Forensic DNA Evidence, published by the National Academy of Sciences and authored by a blue ribbon panel of scientists, professors and lawyers. Chapter 5 of this volume describes the statistical techniques used in DNA research, which involve posing hypotheses, testing them and specifying their error rates. The chapter begins its discussion by suggesting that the scientist "evaluate the probability of finding" a false positive. Again, this probability of finding a false positive is the Type I error or, to use its more familiar name, the level of statistical significance at which a hypothesis is tested. Of course, since finding the probability of rejecting a true hypothesis requires testing that hypothesis, this necessarily tells DNA testers to test an appropriate hypothesis and report the results of the hypothesis test and its level of statistical significance, which is of course the probability of finding a false positive.
Chapter 5 continues by discussing the notion of Confidence Intervals, which is the more intuitive but mathematically identical twin of the technique of testing a hypothesis at a given level of significance.
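The correspondence between a confidence interval and a hypothesis test can be illustrated with the die example from earlier in the article: a test at the 1% level rejects a hypothesized value exactly when that value falls outside the matching 99% confidence interval. This is a minimal sketch that assumes the normal-approximation (Wald) interval and test for a proportion; it is not taken from the DNA report, and the counts and function names are hypothetical.

```python
import math

# Approximate two-sided 1% critical value of the standard normal distribution.
Z_99 = 2.576


def standard_error(successes, trials):
    """Normal-approximation standard error of a sample proportion."""
    p_hat = successes / trials
    return math.sqrt(p_hat * (1 - p_hat) / trials)


def wald_interval(successes, trials, z=Z_99):
    """99% confidence interval for the true proportion."""
    p_hat = successes / trials
    margin = z * standard_error(successes, trials)
    return p_hat - margin, p_hat + margin


def wald_test_rejects(successes, trials, p_null, z=Z_99):
    """Two-sided test of the null hypothesis that the true proportion is p_null."""
    p_hat = successes / trials
    return abs(p_hat - p_null) / standard_error(successes, trials) > z


def interval_excludes(successes, trials, p_null, z=Z_99):
    """True when the hypothesized value lies outside the confidence interval."""
    low, high = wald_interval(successes, trials, z)
    return not (low <= p_null <= high)


# How often did the '3' come up in 600 rolls? A fair die implies 1/6.
for threes in (100, 200):
    low, high = wald_interval(threes, 600)
    print(f"{threes} threes: 99% interval = ({low:.3f}, {high:.3f}), "
          f"test rejects 1/6 = {wald_test_rejects(threes, 600, 1 / 6)}, "
          f"interval excludes 1/6 = {interval_excludes(threes, 600, 1 / 6)}")
```

In both cases the two approaches agree: 100 threes yields an interval that contains 1/6 and a test that does not reject, while 200 threes yields an interval that excludes 1/6 and a test that rejects.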
Even though in Brim the Florida Supreme Court has declared its rejection of Daubert, the opinion cites with approval scientific and statistical language that ultimately instructs the empirical scientist/expert witness to pose a hypothesis that is implied by generally accepted biology or chemistry and then instructs them to test that hypothesis, reporting the results of the test and the error rate associated with the test. In short, the sources that the Brim court cites tell experts to conform to Daubert's first two criteria. Since Florida's Frye test already rests on Daubert's third and fourth criteria, it is becoming increasingly difficult to distinguish between testimony that will satisfy Daubert and testimony that will satisfy Florida's Frye progeny. Of course this is only one study cited by the Florida Supreme Court, but as long as the Florida Courts consult learned works produced by scientists and scientifically sophisticated lawyers, they are going to see those works being couched in terms of the hypothesis tests and error rates that facilitate publication in peer-reviewed journals, and Daubert's requirements will continue to leak into the opinions of Florida Courts, bringing in by the back door what the Florida Supreme Court has, so far, declined to admit by the front.