Feb 02 2024

How To Prove Prevention Works

Homer: Not a bear in sight. The Bear Patrol must be working like a charm.
Lisa: That’s specious reasoning, Dad.
Homer: Thank you, dear.
Lisa: By your logic I could claim that this rock keeps tigers away.
Homer: Oh, how does it work?
Lisa: It doesn’t work.
Homer: Uh-huh.
Lisa: It’s just a stupid rock.
Homer: Uh-huh.
Lisa: But I don’t see any tigers around, do you?
[Homer thinks of this, then pulls out some money]
Homer: Lisa, I want to buy your rock.
[Lisa refuses at first, then takes the exchange]


This memorable exchange from The Simpsons is one of the reasons the fictional character Lisa Simpson is a bit of a skeptical icon. From time to time on the show she does a decent job of defending science and reason, even toting a copy of “Jr. Skeptic” magazine (which was fictional at the time and was later created as a companion to Skeptic magazine).

What the exchange highlights is that it can be difficult to demonstrate (let alone “prove”) that a preventive measure has worked. This is because we cannot know for sure what the alternate history or counterfactual would have been. If I take a measure to prevent contracting COVID and then I don’t get COVID, did the measure work, or was I not going to get COVID anyway? Historically, one time this happened on a big scale was Y2K, a computer glitch set to go off when the year changed to 2000. Most computer code encoded the year as only two digits, assuming the first two digits were 19, so 1995 was encoded as 95. When the year changed to 2000, computers around the world would think it was 1900 and chaos would ensue. Between $300 billion and $500 billion was spent worldwide to fix this bug by upgrading millions of lines of code to a four-digit year stamp.
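To make the mechanism concrete, here is a minimal sketch of the kind of date arithmetic that goes wrong. The function names are my own invention for illustration, not taken from any real legacy system, and real systems varied widely in how they stored and used dates.

```python
def expand_two_digit_year(yy: int) -> int:
    """Legacy-style expansion: assume the century is always 19xx."""
    return 1900 + yy


def years_elapsed(start_yy: int, end_yy: int) -> int:
    """Elapsed years computed from two-digit year stamps (hypothetical helper)."""
    return expand_two_digit_year(end_yy) - expand_two_digit_year(start_yy)


def years_elapsed_fixed(start_year: int, end_year: int) -> int:
    """The same calculation once years are stored with all four digits."""
    return end_year - start_year


# A record created in 1995 and checked in 1999 behaves sensibly.
print(years_elapsed(95, 99))            # 4

# Checked in 2000 (stored as 00), the record appears to be from the future.
print(years_elapsed(95, 0))             # -95

# With four-digit years the arithmetic is correct again.
print(years_elapsed_fixed(1995, 2000))  # 5
```

Anything downstream that depends on the elapsed time (interest, expiration dates, sorting by date) would inherit the nonsensical negative value.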

Did it work? Well, the predicted disasters did not happen, so from that perspective it did. But we can’t know for sure what would have happened if we had not fixed the code. This has led to speculation and even criticism about wasting all that time and money fixing a non-problem. There is good reason to think that the preventive measures worked, however.

At the other end of the spectrum, doomsday cults that predict the world will end in some way on a specific date often have to deal with the day after. One strategy is to say that the faith of the group prevented doomsday (the tiger-rock strategy). They can now celebrate and start recruiting to prevent the next doomsday.

The question is: how do we know whether our preventive efforts were successful or were never needed? In either scenario above you can use the absence of anything bad happening as evidence either that the problem was fake all along or that the preventive measures worked. The absence of disaster fits both narratives. The problem can get very complicated. When preventive measures are taken and negative outcomes happen anyway, can we argue that it would have been worse? Did the school closures during COVID prevent any deaths? What would have happened if we tried to keep schools open? The absence of a definitive answer means that anyone can use the history to justify their ideological narrative.

How do we determine if a preventive measure works? There are several valid methods, mostly involving statistics. There is no definitive proof (you can’t run history back again to see what happens), but you can show convincing correlation. Ideally the correlation will be repeatable with at least some control of confounding variables. For public health measures, for example, we can compare data from a time or a place without the preventive measures to data from a time or place with them. The comparison can be made across states, provinces, countries, regions, demographic populations, or over historic time. In each country where the measles vaccine is rolled out, for example, there is an immediate sharp decline in the incidence of measles. And if vaccine compliance decreases, there is a rise in measles. If this happens often enough, the statistical data can be incredibly robust.
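As a rough sketch of what such a comparison looks like, here is one simple way to express it as an incidence rate ratio. The region names and all of the numbers are made up purely for illustration; they are not real measles data, and a real analysis would also need to control for confounders and reporting differences.

```python
def incidence_per_100k(cases: int, population: int) -> float:
    """Cases per 100,000 people."""
    return cases / population * 100_000


def rate_ratio(cases_with: int, pop_with: int,
               cases_without: int, pop_without: int) -> float:
    """Incidence with the measure divided by incidence without it.

    A ratio well below 1 means incidence was lower with the measure.
    """
    return (incidence_per_100k(cases_with, pop_with)
            / incidence_per_100k(cases_without, pop_without))


# HYPOTHETICAL data: (cases, population) before and after a rollout.
regions = {
    "region_a": {"before": (500, 1_000_000), "after": (50, 1_000_000)},
    "region_b": {"before": (1200, 3_000_000), "after": (90, 3_100_000)},
    "region_c": {"before": (80, 250_000), "after": (10, 260_000)},
}

for name, data in regions.items():
    ratio = rate_ratio(*data["after"], *data["before"])
    print(f"{name}: rate ratio after vs. before = {ratio:.2f}")
```

One low ratio in one place is weak evidence. The same drop repeated in many independent places and times, each following the rollout, is what makes the correlation hard to explain any other way.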

This relates to a commonly invoked (but often misunderstood) logical fallacy, the confusion of correlation with causation. Often people will say “correlation does not equal causation”. This is true but can be misleading. Correlation is not necessarily due to a specific causation, but it can be. Overapplying this principle is a way to dismiss correlational data as useless, but it isn’t. The way scientists use correlation is to look for multiple correlations and triangulate to the one causation that is consistent with all of them. Smoking correlates with an increased risk of lung cancer. But duration and intensity of smoking also correlate with risk, as does filtered vs unfiltered, and quitting correlates with reduced risk over time back to baseline. There are multiple correlations that only make sense in total if smoking causes lung cancer. Interestingly, the tobacco industry argued for decades that this data does not prove smoking causes cancer, because it was just correlation.

Another potential line of evidence is simulations. We cannot rerun history, but we can simulate it to some degree. Our ability to do so is growing fast, as computers get more powerful and AI technology advances. So we can run the counterfactual and ask, what would have happened if we had not taken a specific measure. But of course, these conclusions are only as good as the simulations themselves, which are only as good as our models. Are we accounting for all variables? This, of course, is at the center of the global climate change debate. We can test our models both against historical data (would they have predicted what has already happened) and future data (did they predict what happened after the prediction). It turns out, the climate models have been very accurate, and are getting more precise. So we should probably pay attention to what they say is likely to happen with future release of greenhouse gases.

But I predict that if by some miracle we are able to prevent the worst of climate change through a massive effort of decarbonizing our industry, future deniers will argue that climate change was a hoax all along, because it didn’t happen. It will be Y2K all over again but on a more massive scale. That’s a problem I am willing to have, however.

Another way to evaluate claims for prevention is plausibility. The tiger rock example that Lisa gives is brilliant for two reasons. First, the rock is clearly “just a stupid rock” that she randomly picked up off the ground. Second, there is no reason to think that there are any tigers anywhere near where they are. For any prevention claim, the empirical data from correlation or simulations has to be put into the context of plausibility. Is there a clear mechanism? The lower the plausibility (or prior probability, in statistical terms), the greater the need for empirical evidence to show probable causation.
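The role of prior probability can be sketched with Bayes’ theorem. This is only an illustration: the function name is mine, and every probability below is an assumed, made-up value chosen to show the shape of the reasoning, not a measured quantity.

```python
def posterior(prior: float,
              p_evidence_if_works: float,
              p_evidence_if_not: float) -> float:
    """Bayes' theorem: probability the measure works, given the evidence."""
    numerator = prior * p_evidence_if_works
    denominator = numerator + (1.0 - prior) * p_evidence_if_not
    return numerator / denominator


# Tiger rock: "no tigers around" is just as likely whether or not the rock
# works, so the evidence is uninformative and the posterior equals the prior.
print(f"tiger rock:         {posterior(0.001, 1.0, 1.0):.4f}")

# Plausible mechanism (assumed prior 0.5) with evidence that is ten times
# more likely if the measure works than if it does not.
print(f"plausible measure:  {posterior(0.5, 0.9, 0.09):.4f}")

# Same strength of evidence, but an implausible mechanism (assumed prior 0.001).
print(f"implausible claim:  {posterior(0.001, 0.9, 0.09):.4f}")
```

The same evidence that makes a plausible claim quite likely barely moves an implausible one, which is why a low prior demands much stronger evidence.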

For Y2K, there was a clear and fully understood mechanism at play. They could also easily simulate what would happen, and computer systems did crash. For global climate change, there is a fairly mature science with thousands of papers published over decades. We have a pretty good handle on the greenhouse effect. We don’t know everything (we never do) and there are error-bars on our knowledge (climate sensitivity, for example) but we also don’t know nothing. Carbon dioxide does trap heat, and more CO2 in the atmosphere does increase the equilibrium point of the total heat in the Earth system. There is no serious debate about this, only about the precise relationship. Regarding smoking, we have a lot of basic science data showing how the carcinogens in tobacco smoke can cause cancer, so it’s no surprise that it does.

But if the putative mechanism is magic, then a simple unidirectional correlation would not be terribly convincing, and certainly not the absence of a single historical event.

Of course there are many complicated examples about which sincere experts can disagree, but it is good to at least understand the relevant logic.
