Nov 17 2008

Through the Looking Glass of Acupuncture Research

Clinical research tends to follow a certain arc: first smaller and preliminary studies are done to see if there is a potential for a new treatment or approach, then larger and more tightly designed studies are done exploring the relevant research questions, and finally large, double-blind, placebo-controlled consensus trials are completed and the basic question of efficacy is settled.

In scientific jargon we often talk about the null hypothesis, the hypothesis that a new claim is not true, or in the context of medicine that a new treatment does not work. The question for a study is framed as follows: does the data support the rejection of the null hypothesis? This is not a subtle or unimportant distinction; it puts the burden of proof on demonstrating the positive new claim – that a treatment works. Unless the data compels us to reject the null hypothesis, it is retained as the default conclusion. Therefore, in these large and well-controlled trials, if the treatment does not consistently perform better than placebo to a degree that is both clinically and statistically significant, we do not reject the null hypothesis. In practice we conclude that the treatment does not work and it is appropriately discarded in favor of better treatments or new ideas.

Unless of course you live in the alternate universe of acupuncture research (or more generally that of complementary and alternative medicine – CAM).

In this topsy-turvy world up is down and black is white. When a treatment does not work better than a placebo (an inactive control), then that must mean that the placebo is magically not inert.

Take this latest study of acupuncture in in-vitro fertilization (IVF) – a technique used to fertilize a woman’s egg with sperm in a dish and then implant the embryo back in the womb. IVF is a tricky and expensive procedure (not to mention emotionally draining for the patients), so anything that increases the success rate is very valuable.

Acupuncture is the practice of sticking thin needles into specific locations of the body for imagined health effects.  The traditional explanation is that the needles manipulate chi – the magical life energy of Eastern culture. Placing and manipulating the needles frees the flow of chi or balances the chi. Of course, this is little more than superstition. Chi has never been demonstrated scientifically, and the notion of a life force or vitalism was rejected over a century ago as unnecessary.

Acupuncture was exported to the West primarily through promotion by a credulous media, but also by riding the wave of so-called CAM. In the West proponents have tried to find more scientific language with which to defend a possible physiological effect from acupuncture, but this effort has also largely failed. There remains no proven specific physiological effect from acupuncture, and (more importantly) acupuncture remains unproven as effective for any specific indication.

The notion that acupuncture may help with IVF is not based upon any plausible biology or physiology. Even the more plausible (if hand-waving) mechanisms proposed for acupuncture would have no specific effect on IVF. At best one could argue that the ritual of acupuncture (which may involve relaxation and even gentle touch as the acupuncture points are palpated) helps reduce anxiety, and reduced anxiety leads to higher IVF success. But even this plausible explanation lacks evidence (including from this new study).

A recent systematic review of acupuncture in IVF concluded:

Currently available literature does not provide sufficient evidence that adjuvant acupuncture improves IVF clinical pregnancy rate.

That’s conservative science speak for – it doesn’t work.

In short – acupuncture is a highly implausible and prescientific modality and its application to IVF is doubly implausible, and current evidence shows it does not work.

Despite the lack of plausibility and evidence, acupuncture remains somewhat popular for IVF in Europe and especially in Asia. For this reason there is still interest in researching this treatment, and the clinical trials are going through the typical evolution toward better design. In acupuncture this means beginning with no control group, then including a standard care control group, and then progressing to sham acupuncture as a control group.

Sham acupuncture means that needles are inserted, but not in the prescribed locations. These studies gave mixed results, which is expected from an ineffective treatment. Negative trials could be dismissed by saying that the sham acupuncture was effective – after all, needles are being inserted. Positive trials could be dismissed by arguing that the treatment was not properly blinded. After all, the acupuncturist knows whether they are giving true or sham acupuncture and may convey that to the subject. So while the evidence generally favors the null hypothesis (no effect), and the better trials showed at least that it does not matter where you stick the needles (arguing strongly against the idea that acupuncture is a specific technique), there was still plenty of wiggle room on both sides.

This led to the development of placebo acupuncture – where a dull needle is encased in an opaque sheath and does not penetrate the skin, so neither the patient nor the acupuncturist knows if real or placebo acupuncture is being given. So far, these best designed studies of acupuncture have been negative (at least all the high-profile ones I have seen; if anyone knows of a positive example, send it along).

This was the design of this new study, completed by Emily Wing Sze So and others at the University of Hong Kong. Here are the methods:

On the day of embryo transfer (ET), 370 patients were randomly allocated to either real or placebo acupuncture according to a computer-generated randomization list in sealed opaque envelopes. They received 25 min of real or placebo acupuncture before and after ET. The endometrial and subendometrial vascularity, serum cortisol concentration and the anxiety level were evaluated before and after real and placebo acupuncture.

The results – 55.1% of those getting placebo acupuncture became pregnant versus 43.8% of those getting real acupuncture. The P value was 0.038 (0.05 is generally accepted as a reasonable cutoff, which translates to a 1 in 20 chance of falsely rejecting a true null hypothesis). So this is statistically significant. Other endpoints measured, like changes to endometrial and subendometrial vascularity, serum cortisol concentration and the anxiety level, were not significantly different.
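To make the arithmetic behind that P value concrete, here is a minimal sketch of a two-group comparison on the reported pregnancy rates. It rests on assumptions that are not in the paper's abstract: that the 370 patients were split evenly (185 per arm), that the counts can be back-calculated from the percentages, and that the test was a chi-square test with Yates' continuity correction. The function name `yates_chi2_p` is my own.

```python
import math

def yates_chi2_p(a, b, c, d):
    """Two-sided p-value for the 2x2 table [[a, b], [c, d]] using the
    chi-square test with Yates' continuity correction (1 degree of freedom)."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Continuity-corrected numerator; clamp at zero for near-balanced tables.
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the chi-square tail equals erfc(sqrt(chi2 / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))

# ASSUMPTION: an even split of the 370 patients; the per-arm numbers are not
# given in the text above, so these counts are back-calculated, not reported.
PER_ARM = 185
placebo_pregnant = round(0.551 * PER_ARM)
real_pregnant = round(0.438 * PER_ARM)

chi2, p = yates_chi2_p(
    placebo_pregnant, PER_ARM - placebo_pregnant,
    real_pregnant, PER_ARM - real_pregnant,
)
print(f"placebo {placebo_pregnant}/{PER_ARM}, real {real_pregnant}/{PER_ARM}")
print(f"chi-square = {chi2:.2f}, two-sided p = {p:.3f}")
```

Under those assumptions the result lands close to the reported 0.038, which is just under the 0.05 cutoff. The test is two-sided: it only says the two arms differ by more than chance would usually produce, not which arm "works".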

If you dig deeper into the data you also see that some cherry picking was going on. The only measure of pregnancy that was statistically significant was urine pregnancy testing, which is notoriously inaccurate. All other measures were not statistically significant – these include clinical pregnancy, ongoing pregnancy, and the all-important (actually the only one that really matters) live birth rate. The live birth rate was 29% for acupuncture and 38% for placebo acupuncture, but this difference was not statistically significant, and both rates are in line with typical IVF success rates of about 35%.

Scientific conclusion – null hypothesis not rejected, i.e. acupuncture does not work for IVF.

Alternate universe CAM conclusion – “Placebo acupuncture may not be inert.”

That was the only conclusion in the abstract of the paper – that is the one the authors chose to emphasize.  It is also the least plausible conclusion to be drawn from this study. This has also been their emphasis in dealing with the media. In an article in the Washington Post about this study the headline reads: “Placebo Acupuncture Tied to Higher IVF Pregnancies.” In this article study author Ernest Hung Yu Ng is quoted as saying:

“Placebo acupuncture is similar to acupressure and therefore is good enough to improve the pregnancy rate,” said Ng, who added it’s also possible that real acupuncture may, in some way, reduce the pregnancy rate.

Those were the ONLY two possibilities discussed – that placebo acupuncture somehow works, or that “real” acupuncture has a negative effect. There was no discussion of the most plausible interpretation of this research, and the one strictly required by standard scientific tradition – that acupuncture does not work for IVF. What we are seeing is the cherry picking of a statistical fluke in the comparison of two inert treatments (real and placebo acupuncture). Real acupuncture was just as likely to have come out a bit on top.

It should also be noted that the physiological and anxiety measures were no different. For a physiologically active treatment we like to see objective measures of physiological correlates go along with the main clinical outcome. If they do not, then we have to revise our thinking about mechanism (and there really isn’t any plausible notion of mechanism in this case) or we conclude that any effects we are seeing are random statistical noise.

Further, it should be noted that the results are barely statistically significant – with a P value of 0.038, about 1 in 26 such trials will show a difference this large in one direction or the other by chance alone. In fact the odds of a random positive result are even higher if you factor in the multiple comparisons that were made, only one of which was significant. Of course, when such results are positive they are trumpeted as verification for the treatment. When they are negative, as here, we are given an even more fanciful interpretation.
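The multiple comparisons point can be put in rough numbers. The sketch below uses the standard formula for the chance of at least one false positive among k tests at a 0.05 cutoff. The values of k are illustrative assumptions (1 for a single test, 4 for the four pregnancy measures named above, 7 if the physiological and anxiety endpoints are counted as well), and the formula treats the tests as independent, which these overlapping outcomes are not.

```python
def familywise_error_rate(alpha, k):
    """Chance of at least one false positive among k independent tests
    when every null hypothesis is actually true."""
    return 1 - (1 - alpha) ** k

ALPHA = 0.05  # the conventional significance cutoff

# ASSUMPTION: the values of k are illustrative counts of the endpoints
# mentioned in the post, not a tally taken from the paper itself.
for k in (1, 4, 7):
    rate = familywise_error_rate(ALPHA, k)
    print(f"{k} comparison(s): {rate:.1%} chance of at least one fluke")
```

Because the pregnancy measures are strongly correlated with one another, the true figure sits somewhere between the single-test 5% and these upper estimates. Either way, one "significant" result out of several looked-at endpoints is weaker evidence than its P value alone suggests.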

This reflects the need to interpret the literature and not just one study. When we look at the acupuncture in IVF literature we find a distinct lack of a consistent and compelling signal in the noise – the better the study the smaller the effect, and the best studies are negative. Any intellectually honest researcher should conclude from this study, when put in the context of all the acupuncture and IVF research, that acupuncture simply does not improve the success rate of IVF – it does not work.

Concluding from this study that placebo acupuncture is not inert is absurd. At best you can say that non-specific aspects of the process of getting acupuncture are having an effect – but that is the definition of a placebo effect. The point of placebo treatments is to control for all the other stuff – everything except for a physiological response to the active treatment.

Some clinical studies do suffer from the real problem of having an active placebo – but that requires a control group that is getting some active intervention. This does not really apply here, because both arms were the same (we are told) except for the one variable of whether or not the needle pierced the skin. So really all we can conclude is that this one variable either has a negative effect, or has no effect and any difference was a fluke.

So even if you try to put the best face on acupuncture research (and why would you, unless you had an ideological bias), all you can say is that there is some effect from the ritual of acupuncture, but it does not matter where you put the needles or whether the needles pierce the skin. But that pretty much kills any claims for the traditional interpretation of acupuncture or any claims for a specific physiological effect from acupuncture. The needles, it seems, are irrelevant – and what is acupuncture without the needles? It’s relaxation, and maybe some gentle palpation.

Acupuncture does not work for IVF, and in my opinion for anything else. If it were a drug it would not get past the FDA. In the world of science-based medicine, it should be discarded as a failed approach. But it survives in the alternate universe of CAM.

Coming next – functional studies of acupuncture in IVF. This means that the control arms will be discarded in favor of studying acupuncture for IVF in a more “natural and conducive” setting. Well-controlled studies are negative, so proponents will go back to doing poorly-controlled studies where their biases can ensure a favorable outcome, and they will justify this travesty of science and reason with pleasing language about studying acupuncture in its more “pragmatic” context. This has already happened for acupuncture in other applications, like for headaches.

Meanwhile the mainstream media will eat up the press releases written by proponents, spinning negative or simply worthless studies into more and more free advertising for a failed treatment.


David Gorski has also reviewed this study over at Science-Based Medicine.

