Prozac and Placebos: Review of a Media Maelstrom
Last week we had the interesting experience of watching how a PLoS Medicine meta-analysis of anti-depressant drug trials generated a furore in the media. It featured on the front page of four UK national newspapers (the Guardian, the Telegraph, the Independent and the Times), was the leading item on the BBC News and prompted stories in Time, the Wall Street Journal and the Economist.
The paper not only posed questions about the benefits of antidepressants; it also revealed how many clinical trial results never see the light of day. But whilst the issues it raises continue to be debated – a discussion of the study leads the Guardian Science Weekly Podcast this week – some of the headlines in the media maelstrom misrepresented the study.
Irving Kirsch and colleagues used Freedom of Information legislation in the United States (see the methods section) to get access to both published and unpublished clinical trials of several SSRI/SNRI antidepressants, including fluoxetine (Prozac), that had been submitted to the Food and Drug Administration for approval. Analyzing the full dataset, which included studies of varying duration and quality, the researchers found no clinically significant difference between a patient’s response to the placebo and these antidepressants for most depressed people. Their analysis did find clinically relevant effects for a subset of the most severely depressed patients.
The headlines started appearing at 1am GMT on Tuesday 26th, as soon as the embargo ended. The front page of the Independent (“Antidepressant drugs don’t work – official study”), the Guardian (“Prozac, used by 40 million people, does not work”) and the Times (“Depression drugs don’t work, finds data review”) all opted for an outright statement that antidepressants don’t work. But the study does not show that antidepressants do not work. Rather, the evidence reviewed in this analysis did not show these antidepressants to produce enough of a beneficial effect over the placebo to be termed clinically significant. Language Log, a linguistics blog, gives an account of how some journalists got tangled up in their own sentences when trying to describe the results. (Thanks to Rebecca Walton for pointing this one out).
Time Magazine opted for a slightly different presentation (“Antidepressants Hardly Help”), pointing out that there is a difference between “statistical significance” and “clinical significance.” The article quotes the authors directly to say that only those “at the upper end of the very severely depressed category” get a clinically significant benefit. It also makes reference to the Hamilton Depression Rating Scale: the tool used by UK authorities to determine clinical significance.
The interview on the BBC Radio 4 Today programme touched upon the crucial issues in the space of five minutes. It features a debate between Irving Kirsch and Richard Tiner of the Association of British Pharmaceutical Industry. Irving Kirsch summarizes the results of the study, before they dispute whether or not NICE (National Institute for Health and Clinical Excellence), the body guiding clinicians in the UK, could get access to all of the information that they needed. This item makes the point – missing from some of the other coverage – that people taking antidepressants should not change their behavior on the basis of this report or the headlines. If worried, they should consult their doctor.
So some of the headlines were off the mark, but one good thing that can come out of the coverage is a re-ignition of the debate in the media about the importance of having access to all clinical trials data – not just the positive results pushed for publication by pharmaceutical companies.
In Bad Science, Ben Goldacre also argues that the real importance of the study lies in the “fascinating story of buried data.” Noting that the authors had to use the Freedom of Information Act to get all the data from the FDA, he says the fact that “medical academics should need to use that kind of legislation to obtain information about trials on pills which are prescribed to millions of people is absurd.” The Independent ran an article explaining publication bias to a wider audience. In her Comment is Free piece, Sarah Boseley asks whether we can trust that the licensing authorities have all the data they need to approve drugs.
In comparison with the wall-to-wall headlines the study got in the UK, there was less coverage in the United States. Although the study was covered in North America (Fox News linked to the paper and CTV covered it in Canada, amongst others), of the major American papers we found original coverage only in the Wall Street Journal. As Washington Monthly remarked (with bloggers theorizing), this is interesting considering that the study uses clinical trials data from the Food and Drug Administration, the licensing authority in the United States.
Pulse reports that NICE are to consider unpublished data on antidepressants before their guidance on depression is published later this year, so the ramifications of the study are set to continue. In the light of the PLoS Medicine study, the Guardian returned to the earlier story that the UK government plans to train additional therapists to combat the difficulties that patients have in accessing non-pharmaceutical forms of therapy for depression.
Hopefully some informed debate about drugs and depression in general can come out of all of this coverage. Read some of the reader responses that we’ve received – in particular the compelling response by Jeanne Lenzer and Shannon Brownlee, which does a good job of summing up the most important implications of the study. They write:
“The take home message of Kirsch’s analysis is that it is difficult if not impossible to come to conclusions about the relative merits and risk of medications when only parts (usually positive parts) of the data are available. The problem of publication bias is so powerful that it has certainly distorted interventions besides antidepressants – a problem we discuss in our commentary in this week’s BMJ.”
It would be misleading to attribute all the inaccurate coverage of this paper purely to the media.
The results themselves, and the interpretation given to them by the authors and PLoS editors’ summary, set the tone.
It is unfortunate that these results are somewhat less straightforward than they appear, because obtaining the details of all these pre-licensing studies was a valuable service by Kirsch et al, and it remains imperative to secure access to all data of this kind if we are to truly assess the clinical benefits of medical treatments.
In particular, the recommendations in this paper about the suitability of newer anti-depressants for the APA/NICE categories of ‘moderate’ and ‘severe’ depression are made on the basis of essentially one or two studies and the extrapolation of a regression line, because all the other studies included in this analysis were in the ‘very severe’ range. Rather than make recommendations on such flimsy data, it would have been better for Kirsch et al to point out that the FDA did not have strong evidence available at the time of licensing to justify the use of these newer anti-depressants in milder forms of depression.
The use by Kirsch et al of a standardised mean difference (SMD) as the outcome measure for each group (rather than the raw change score) is unusual, and the resulting comparison (the drug SMD minus the placebo SMD) is difficult to interpret, particularly with respect to the NICE effect-size criterion for ‘clinical significance’. More conventional outcome measures would be the drug change in HRSD minus the placebo change in HRSD (the more obvious choice, since all the studies here used the same instrument), or the SMD of the change, which is more appropriate for studies using different instruments. Carrying out separate regressions for the drug and placebo groups, rather than on the overall effect size, is also unusual.
Many of the findings in this paper depend on the authors’ use of the SMD. In particular, their finding that placebo responses decrease while drug responses remain constant with increasing baseline HRSD severity does not hold when the raw HRSD change score is used; in fact the reverse is the case, with placebo response remaining fairly constant while drug responsiveness increases.
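The sensitivity of the conclusions to the choice of outcome measure can be illustrated with a small sketch. The numbers below are invented for illustration (they are not taken from the paper or the FDA trials); the point is only that dividing each group's mean change by its own standard deviation means the SMD difference and the raw-change difference need not even share a sign when the two groups' variability differs:

```python
# Hypothetical illustration: difference of per-group standardised mean
# differences (SMDs) versus difference of raw HRSD change scores.
# All numbers are invented for demonstration purposes.

def smd(mean_change, sd):
    """Standardised mean difference: mean change divided by its SD."""
    return mean_change / sd

# Invented summary statistics for the two arms of one trial:
# mean HRSD improvement and its standard deviation.
drug_change, drug_sd = 10.0, 8.0      # drug arm: larger change, more spread
placebo_change, placebo_sd = 8.0, 5.0  # placebo arm: smaller change, less spread

# The conventional measure: difference in raw HRSD change scores.
raw_difference = drug_change - placebo_change  # 2.0 HRSD points in drug's favour

# The measure used when each group is standardised separately:
smd_difference = smd(drug_change, drug_sd) - smd(placebo_change, placebo_sd)
# 10/8 - 8/5 = 1.25 - 1.60 = -0.35, i.e. apparently in placebo's favour

print(raw_difference)            # 2.0
print(round(smd_difference, 2))  # -0.35
```

With these invented numbers the drug beats placebo by 2 HRSD points on the raw scale, yet the difference of per-group SMDs comes out negative, which is why comparing groups after standardising each one separately is hard to relate to a criterion framed in HRSD points.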
Also, when raw HRSD change scores are analysed, the NICE criteria for ‘clinical significance’ are met by paroxetine and venlafaxine, and an overall regression on these scores suggests that the NICE criteria are exceeded at a lower baseline HRSD score than Kirsch et al report.
It is probably also worth noting that Moncrieff & Kirsch have previously noted the arbitrary nature of the NICE ‘clinical significance’ criteria and that the APA/NICE severity category of ‘severe’ depression would be considered moderate in clinical practice.