Apr 24 2015

The FBI, Forensic Science, and the CSI Effect

The FBI recently acknowledged that, over a two-decade period prior to 2000, it used a flawed forensic technique in its investigations: hair analysis. As reported in the Washington Post:

Of 28 examiners with the FBI Laboratory’s microscopic hair comparison unit, 26 overstated forensic matches in ways that favored prosecutors in more than 95 percent of the 268 trials reviewed so far, according to the National Association of Criminal Defense Lawyers (NACDL) and the Innocence Project, which are assisting the government with the country’s largest post-conviction review of questioned forensic evidence.

That is shocking and disappointing, but I don’t think it’s an isolated case.

Our society has come to expect high-tech investigative techniques, especially at the level of the FBI and in high-stakes criminal cases such as murder trials. This is partly due to shows like CSI, which showcase such technology and exaggerate the speed and precision with which forensic scientists can tease information out of trace evidence. This "CSI effect" may even be influencing juries, who expect any murder trial to be accompanied by such evidence.

Crime dramas may have given us an unwarranted confidence in the state of many types of forensic science, as this recent hair analysis revelation demonstrates.

The problem in this case was overstating the confidence with which a match can be made between hair taken from a crime scene and a hair sample from a suspect. The technique involves looking at the hair under a microscope and analyzing it not only for color but for structure and patterns. In this way the microscopic picture of a hair is treated like a fingerprint.

This technique, it is now coming to light, is more useful for ruling out a match than for making one. If the sample hair does not match the target sample, then the target can be ruled out. If, however, the sample hair matches the hair from the suspect, that does not establish that the hair came from that specific suspect and no one else. The hair may match many people. In other words, this type of analysis has a massive false positive rate.
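To make the false positive problem concrete, here is a rough sketch of the arithmetic in Python. The numbers are invented for illustration and are not taken from the FBI review. The sketch assumes a hypothetical pool of people who could have left the hair, exactly one of whom did, and a hypothetical fraction of everyone else whose hair would also be judged a match.

```python
def posterior_source_probability(pool_size: int, match_fraction: float) -> float:
    """Probability that a matching suspect is the true source of the hair.

    Assumptions (all hypothetical, for illustration only):
      - exactly one person in the pool left the hair,
      - before the analysis, everyone in the pool is equally likely to be that person,
      - the true source always matches,
      - every other person matches with probability `match_fraction`.
    """
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    if not 0.0 <= match_fraction <= 1.0:
        raise ValueError("match_fraction must be between 0 and 1")

    # Bayes' theorem with prior 1/pool_size reduces to:
    #   1 / (1 + expected number of innocent people who also match)
    expected_false_matches = (pool_size - 1) * match_fraction
    return 1.0 / (1.0 + expected_false_matches)


if __name__ == "__main__":
    # Hypothetical: 10,001 possible sources, 1% of whom would match by coincidence.
    print(posterior_source_probability(10_001, 0.01))
```

Under these made-up numbers, a "match" leaves about a 1 in 101 chance that the suspect is the source, because roughly 100 other people in the pool would match as well. A non-match, by contrast, rules the suspect out, which is why the technique works better for exclusion.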

False positive matches favor prosecutors, which is what the current review has found. This also seems to be a generic problem of forensic techniques based on subjective pattern matching. A recent review by the National Academy of Sciences found:

Those advances, however, also have revealed that, in some cases, substantive information and testimony based on faulty forensic science analyses may have contributed to wrongful convictions of innocent people. This fact has demonstrated the potential danger of giving undue weight to evidence and testimony derived from imperfect testing and analysis. Moreover, imprecise or exaggerated expert testimony has sometimes contributed to the admission of erroneous or misleading evidence.

Subjective techniques include not only hair matching, but fingerprint analysis, bite mark analysis, tire and shoe impressions, handwriting, and analysis of bullets and firearms. These are all essentially exercises in pattern recognition, which has a well-recognized false-positive bias.

The NAS review makes a number of recommendations, which reflect basic skeptical principles. First, we need to separate objective from subjective types of evidence. DNA analysis, for example, is very objective, and we have good statistical information about how predictive apparent matches are. However, such objective evidence is only available in about 20% of cases. Most of the time prosecutors are relying upon subjective evidence, and experts (the NAS review found) systematically oversell the power of such subjective analysis.

We need more research into the predictive value of subjective matches. How many people share the same basic hair color and structure and would be likely to match a sample?

Forensic analysis should also be much more rigorous. Samples and targets should be assessed in a blinded fashion, with controls. If the expert knows the target, they can convince themselves of a match. If they have to pick the target out of 20 controls, that would be much more convincing.
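How much more convincing? Here is a back-of-the-envelope sketch, assuming the examiner is shown one true target mixed in with 20 controls (21 samples in all) and, in the worst case, is simply guessing. The function names and the repeated-lineup setup are illustrative assumptions, not a protocol taken from the NAS report.

```python
from math import comb


def chance_hit_probability(n_controls: int) -> float:
    """Chance of picking the one true target by pure guessing
    from a lineup of one target plus `n_controls` controls."""
    if n_controls < 0:
        raise ValueError("n_controls must be non-negative")
    return 1.0 / (n_controls + 1)


def prob_at_least(hits: int, trials: int, p: float) -> float:
    """Binomial tail: probability of at least `hits` correct picks
    in `trials` independent lineups, each with guessing probability `p`."""
    return sum(
        comb(trials, k) * p**k * (1.0 - p) ** (trials - k)
        for k in range(hits, trials + 1)
    )


if __name__ == "__main__":
    p = chance_hit_probability(20)  # 1 target among 20 controls
    print(f"Single blinded lineup, guessing: {p:.4f}")
    # Hypothetical: the same examiner faces 5 independent lineups.
    print(f"At least 3 of 5 correct by guessing: {prob_at_least(3, 5, p):.6f}")
```

An examiner who knows which sample is the target will be "right" every time whether or not the technique works. In a blinded lineup, a guesser succeeds only about 1 time in 21, so repeated correct picks are hard to explain by chance.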

Important analyses like these should also be replicated by multiple people. Replication can in turn be used to develop inter-rater reliability ratings: if 10 experts make the same analysis, how often will they agree?
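One common way to put a number on that agreement is Fleiss' kappa, which compares how often raters agree with how often they would agree by chance. This is a minimal sketch of that statistic, not something the NAS report prescribes, and the example data are hypothetical: each row is one hair comparison, and the two columns count how many of 10 examiners called it "match" versus "no match".

```python
def fleiss_kappa(counts: list[list[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    who assigned item i to category j. Every item must have the same
    number of raters (at least 2)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_items == 0 or n_raters < 2:
        raise ValueError("need at least one item and two raters")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must be rated by the same number of raters")

    # Observed agreement: for each item, the share of rater pairs that agree.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    observed = sum(per_item) / n_items

    # Chance agreement: based on how often each category is used overall.
    n_categories = len(counts[0])
    total = n_items * n_raters
    proportions = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    expected = sum(p * p for p in proportions)

    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is ever used")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical data: 4 comparisons, 10 examiners, [match, no match].
    ratings = [[10, 0], [9, 1], [5, 5], [1, 9]]
    print(fleiss_kappa(ratings))
```

A kappa of 1 means the examiners always agree, and a value near 0 means they agree no more often than chance would predict. High agreement alone does not show that the examiners are right, only that they are consistent, which is why it belongs alongside blinding and controls.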

These are the kinds of safeguards routinely used in rigorous clinical trials: blinded controls, inter-rater reliability scores, and independent analysis.

Finally, the NAS report recommends that subjective forensic evidence, when presented in court, be presented appropriately to juries, without overselling its reliability or predictive value.


Subjective forensic analysis is a huge problem in our criminal justice system because it tends to produce false positives and therefore false convictions. This bias dovetails with other documented courtroom biases, such as gender and racial bias, and magnifies their effects.

It does seem, however, that this problem has recently been gaining attention, which will hopefully lead to some much-needed fixes. These types of analyses, including but not limited to hair analysis for matching samples to suspects, need to be held to a much more rigorous standard, and their real predictive and statistical value needs to be correctly communicated to jurors in the courtroom.

I would also like to see (but I’m not holding my breath) popular forensic dramas, like CSI, address this issue.

In any case, this is yet another situation in which a solid understanding of cognitive biases and critical thinking skills is invaluable as a check against pseudoscience. Many lives have been ruined by pseudoscience in the courtroom, and we know how to remedy the situation. We just need the awareness and political will to do so, and hopefully we are heading in that direction.
