Asking for evidence from All Trials

I’ve written before about how the statistic “50% of all clinical trials are unpublished”, much beloved of the All Trials campaign, is simply not evidence based.

The charity Sense About Science, who run the All Trials campaign, also run a rather splendid website called "Ask for Evidence", which encourages people to ask for evidence when they come across dodgy claims.

So I used Sense About Science’s website to ask for evidence from Sense About Science for their claim that 50% of all trials are unpublished.

To their credit, they responded promptly, and pointed me to this article, which they claimed provided the evidence behind their claim.

So how well does it support their claim?

Interestingly, the very first sentence states that they don’t really have evidence. The first paragraph of the document reads as follows:

“We may never know the answer to this question. In some ways, maybe it doesn’t matter. Even one clinical trial left unreported is unacceptable.”

So if they don’t know, why are they making such a confident claim?

As an aside, they are of course spot on in the rest of that paragraph. Even if the proportion of unpublished trials is substantially less than 50%, if it’s greater than zero it’s still too high.

They go on to further emphasise the point that they really don’t know what proportion of trials are unpublished:

“It is clearly not a statistic, and we wouldn’t advocate trying to roll up the results of all the studies listed below to produce something spuriously precise.”

They then go on to explain the complexity of estimating the proportion of unpublished trials. It certainly is complex, and they give a good explanation of why. It's not a bad document, and even includes some studies showing much higher rates of disclosure, which they don't admit to in the article on their main website.

But if they understand, as they clearly do, that the claim that half of all trials are unpublished is spuriously precise and that it would be wrong to claim that, why do they do so anyway?

Not only do they make the claim very confidently on their own website, but it also appears frequently in the many articles that their PR machine churns out. These articles state it as fact, and do not acknowledge the problems that they describe in their background document. You will never see those articles citing the most recent research showing greater than 90% disclosure rates.

This still seems dishonest to me.

The Dianthus blog

Those of you who have known me for some time will know that I used to blog on the website of my old company, Dianthus Medical.

Well, Dianthus Medical is no more, but I have preserved the blog for posterity. You can find it again at its original home. If by some remote chance you happen to have any links to any of the old blogposts, they should work again now: all the blogposts have kept their original URLs.

The Dianthus blog no longer accepts comments, but if you have an urgent need to leave a comment on anything posted there, you are welcome to leave a comment on this page.

Sugar tax

One of the most newsworthy features of yesterday’s budget was the announcement that the UK will introduce a tax on sugary drinks.

There is reason to think this may have been done primarily as a dead cat move, to draw attention away from the fact that the Chancellor is missing all his deficit reduction targets and cutting disability benefits (though apparently he can still afford tax cuts for higher-rate taxpayers).

But what effect might a tax on sugary drinks have?

Obviously it will reduce consumption of sugary drinks:  it’s economics 101 that when the price of something goes up, consumption falls. But that by itself is not interesting or useful. The question is what effect will that have on health and well-being?

The only honest answer to that is we don’t know, as few countries have tried such a tax, and we do not have good data on what the effects have been in countries that have.

For millionaires such as George Osborne and Jamie Oliver, the tax is unlikely to make much difference. Sugary drinks are such a tiny part of their expenditure that they will probably not notice.

But what about those at the other end of the income scale? While George Osborne may not realise this, there are some people for whom the weekly grocery shop is a significant proportion of their total expenditure. For such people, taxing sugary drinks may well have a noticeable effect.

For a family who currently spends money on sugary drinks, 3 outcomes are possible.

The first possibility is that they continue to buy the same quantity of sugary drinks as before (or sufficiently close to the same quantity that their total expenditure still rises). They will then be worse off, as they will have less money to spend on other things. This is bad in itself, but poverty is also one of the strongest determinants of ill health, so taking money away from people is unlikely to make them healthier.

The second possibility is that they reduce their consumption of sugary drinks by an amount roughly equivalent to the increased price. They will then be no better or worse off in terms of the money left in their pocket after the weekly grocery shopping, but they will be worse off in welfare terms, as they will have less of something that they value (sugary drinks). We know that they value sugary drinks, because if they didn’t, they wouldn’t buy them in the first place.

Proponents of the sugar tax will argue that they will be better off in health terms, as sugary drinks are bad for you, and they are now consuming less of them. Well, maybe. But that really needs a great big [citation needed]. This would be a relatively modest decrease in sugary drink consumption, and personally I would be surprised if it made much difference to health. There is certainly no good evidence that it would benefit health, and given that you are harming people by depriving them of something they value, I think it is up to proponents of the sugar tax to come up with evidence that the benefits outweigh those harms. It seems rather simplistic to suppose that obesity, diabetes, and the other things the sugar tax is supposed to address are primarily a function of sugary drink consumption, when there are so many other aspects of diet, and of course exercise, which the sugar tax will not affect.

The third possibility is that they reduce their consumption by more than the amount of the price increase. They will now have more money in their pocket at the end of the weekly grocery shop. Perhaps they will spend that money on vegan tofu health drinks and gym membership, and be healthier as a result, as the supporters of the sugar tax seem to believe. Or maybe they’ll spend it on cigarettes and boiled sweets. We simply don’t know, as there are no data to show what happens here. The supposed health benefits of the sugar tax are at this stage entirely hypothetical.

But whatever they spend it on, they would have preferred to spend it on sugary drinks, so we are again making them worse off in terms of the things that they value.

All these considerations are trivial for people on high incomes. They may not be for people on low incomes. What seems certain is that the costs of the sugar tax will fall disproportionately on the poor.

You may think that’s a good idea. George Osborne obviously does. But personally, I’m not a fan of regressive taxation.

Are a fifth of drug trials really designed for marketing purposes?

A paper by Barbour et al was published in the journal Trials a few weeks ago making the claim that “a fifth of drug trials published in the highest impact general medical journals in 2011 had features that were suggestive of being designed for marketing purposes”.

That would be bad if it were true. Clinical trials are supposed to help to advance medical science and learn things about drugs or other interventions that we didn’t know before. They are not supposed to be simply designed to help promote the use of the drug. According to an editorial by Sox and Rennie, marketing trials are not really about testing hypotheses, but “to get physicians in the habit of prescribing a new drug.”

In my opinion, marketing trials are clearly unethical, and the question of how common they are is an important one.

Well, according to Barbour et al, 21% of trials in high impact medical journals were designed for marketing purposes. So how did they come up with that figure?

That, unfortunately, is where the paper starts to go downhill. They chose a set of criteria which they believed were associated with marketing trials. Those criteria were:

“1) a high level of involvement of the product manufacturer in study design 2) data analysis, 3) and reporting of the study, 4) recruitment of small numbers of patients from numerous study sites for a common disease when they could have been recruited without difficulty from fewer sites, 5) misleading abstracts that do not report clinically relevant findings, and 6) conclusions that focus on secondary end-points and surrogate markers”

Those criteria appear to be somewhat arbitrary. Although Barbour et al give 4 citations to back them up, none of the papers cited provides any data to validate them.

A sample of 194 papers from 6 top medical journals was then assessed against those criteria by 6 raters (or sometimes 5, as raters who were journal editors didn't assess papers that came from their own journal), and each rater rated each paper as "no", "maybe", or "yes" for how likely it was to be a marketing trial. Trials rated "yes" by 4 or more raters were considered to be marketing trials, and trials with fewer than 4 "yes" ratings could also be considered marketing trials if they had no more than 3 "no" ratings and a subsequent consensus discussion decided they should be classified as marketing trials.
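
For concreteness, here is roughly how that decision rule works, sketched in Python (the function name and interface are mine, not the authors'):

```python
from collections import Counter

def classify_trial(ratings, consensus_says_marketing=False):
    """Apply the decision rule described above to one trial's ratings.

    ratings: list of "yes"/"maybe"/"no" strings, one per rater (5 or 6 raters).
    consensus_says_marketing: outcome of the consensus discussion, which only
    matters for borderline trials.  Illustrative only; the paper does not
    publish code.
    """
    counts = Counter(ratings)
    if counts["yes"] >= 4:
        return "marketing"
    # Borderline case: fewer than 4 "yes" ratings but no more than 3 "no"
    # ratings, resolved by a consensus discussion among the raters.
    if counts["no"] <= 3 and consensus_says_marketing:
        return "marketing"
    return "not marketing"

print(classify_trial(["yes", "yes", "yes", "yes", "maybe", "no"]))  # marketing
print(classify_trial(["yes", "maybe", "no", "no", "no", "no"]))     # not marketing
```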

The characteristics of marketing trials were then compared with other trials. Not surprisingly, the characteristics described above were more common in the trials characterised as marketing trials. Given that that’s how the “marketing” trials were defined, that outcome was completely predictable. This is a perfectly circular argument. Though to be fair to the authors, they do acknowledge the circularity of their argument in the discussion.

One of the first questions that came to my mind was how well the 6 raters agreed. Unfortunately, no measure of inter-rater agreement is presented in the paper.

Happily, the authors get top marks for their commitment to transparency here. When I emailed to ask for their raw data so that I could calculate the inter-rater agreement myself, the raw data was sent promptly. If only all authors were so co-operative.

So, how well did the authors agree? Not very well, it turns out. The kappa coefficient for agreement among the raters was a mere 0.36 (kappa values range up to 1, where 1 is perfect agreement and 0 means agreement no better than chance, with values above about 0.7 generally considered to be acceptable agreement). This does not suggest that the determination of what counted as a marketing trial was obvious.
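
For anyone who wants to try this themselves, a kappa statistic for multiple raters can be computed in a few lines of Python. This sketch uses Fleiss' kappa (one standard choice for more than two raters) via statsmodels, with made-up ratings standing in for the real raw data:

```python
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Illustrative only: a trials x raters matrix of ratings coded as
# 0 = "no", 1 = "maybe", 2 = "yes".  The real calculation would use the
# raw data the authors kindly supplied (194 trials, 5 or 6 raters each).
ratings = np.array([
    [2, 2, 2, 2, 1, 0],
    [0, 0, 1, 0, 0, 0],
    [2, 1, 1, 0, 2, 2],
    [1, 1, 0, 0, 1, 2],
])

# aggregate_raters converts per-rater ratings into counts per category,
# which is the input format fleiss_kappa expects.
table, _ = aggregate_raters(ratings)
print(fleiss_kappa(table))
```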

To look at this another way, of the 41 trials characterised as marketing trials, only 4 of those trials were rated “yes” by all raters, and only 9 were rated “yes” by all but one. This really doesn’t suggest that the authors could agree on what constituted a marketing trial.

So what about those 4 trials rated “yes” by all reviewers? Let’s take a look at them and see if the conclusion that they were primarily for marketing purposes stacks up.

The first paper is a report of 2 phase III trials of linaclotide for chronic constipation. This appears to have been an important component of the clinical trial data leading to the licensing of linaclotide for that indication, as the trials are mentioned in the press release where the FDA describes the licensing of the drug. So the main purpose of the study seems to have been to get the drug licensed. And in contrast to point 6) in the criteria for determining a marketing study, the conclusions were based squarely on the primary endpoint. As for point 5), obviously the FDA thought the findings were clinically relevant, as they were prepared to grant the drug a license on the back of them.

The second is a report of 2 phase III trials of rifaximin for patients with irritable bowel syndrome. Again, the FDA press release shows that the main purpose of the studies was to get the drug licensed. And again, the conclusions were based on the primary endpoint and were clearly considered clinically relevant by the FDA.

The third paper reports a comparative trial of tiotropium versus salmeterol for the prevention of exacerbations of COPD. Tiotropium was already licensed when this trial was done, so the trial was not for the purposes of original licensing, but it does appear to have been important in subsequent changes to the licensing, as it is specifically referred to in the prescribing information. Again, the conclusions focussed on the primary outcome measure, which was prevention of exacerbations: certainly a clinically important outcome in COPD.

The fourth paper also reports a study done after the drug, in this case eplerenone, was originally licensed. The study looked at overall mortality in patients with heart failure. Again, the study is specifically referenced in the prescribing information, and again, the study's main conclusions are based on the primary outcome measure. In this case, the primary outcome measure was overall mortality. How much more clinically relevant do you want it to be?

Those 4 studies are the ones with the strongest evidence of being designed for marketing purposes. I haven’t looked at any of the others, but I think it’s fair to say that there is really no reason to think that those 4 were designed primarily for marketing.

Of course in one sense, you could argue that they are all marketing studies. You cannot market a drug until it is licensed. So doing studies with the aim of getting a drug licensed (or its licensed indications extended) could be regarded as for marketing purposes. But I’m pretty sure that’s not what most people would understand by the term.

So unfortunately, I think Barbour et al have not told us anything useful about how common marketing studies are.

I suspect they are quite rare. I have worked in clinical research for about 20 years, and have worked on many trials in that time. I have never worked on a study that I would consider to be designed mainly for marketing. All the trials I have worked on have had a genuine scientific question behind them.

This is not to deny, of course, that marketing trials exist. Barbour et al refer to some well documented examples in their paper. Also, in my experience as a research ethics committee member, I have certainly seen studies that seem to serve little scientific purpose, and for which the accusation of being designed mainly for marketing would be reasonable.

But again, they are rare: certainly nothing like 1 in 5. I have been an ethics committee member for 13 years, and typically review about 50 or so studies per year. The number of studies I have suspected of being marketing studies in that time could be counted on the fingers of one hand. If it had been up to me, I would not have given those studies ethical approval, but other members of my ethics committee did not share my views on the ethics of marketing trials, so I was outvoted and the trials were approved.

So although Barbour et al ask an important question, it does not seem to me that they have answered it. Still, by being willing to share their raw data, they have participated fully in the scientific process. Publishing something and letting others scrutinise your results is how science is supposed to be done, and for that they deserve credit.




Solving the economics of personalised medicine

It’s a well known fact that many drugs for many diseases don’t work very well in many patients. If we could identify in advance which patients will benefit from a drug and which won’t, then drugs could be prescribed in a much more targeted manner. That is actually a lot harder to do than it sounds, but it’s an active area of research, and I am confident that over the coming years and decades medical research will make much progress in that direction.

This is the world of personalised medicine.

Although giving people targeted drugs that are likely to be of substantial benefit to them has obvious advantages, there is one major disadvantage. Personalised medicine simply does not fit the economic model that has evolved for the pharmaceutical industry.

Developing new drugs is expensive. It’s really expensive. Coming up with a precise figure for the cost of developing a new drug is controversial, but some reasonable estimates run into billions of dollars.

The economic model of the pharmaceutical industry is based on the idea of a “blockbuster” drug. You develop a drug like Prozac, Losec, or Lipitor that can be used in millions of patients, and the huge costs of that development can be recouped by the  huge sales of the drug.

But what if you are developing drugs based on personalised medicine for narrowly defined populations?  Perhaps you have developed a drug for patients with a specific variant of a rare cancer, and it is fantastically effective in those patients, but there may be only a few hundred patients worldwide who could benefit. There is no way you’re going to be able to recoup the costs of a billion dollars or more of development by selling the drug to a few hundred patients, without charging sums of money that are crazily unaffordable to each patient.

Although the era of personalised medicine is still very much in its infancy, we have already seen this effect at work with drugs like Kadcyla, which works for only a specific subtype of breast cancer patients, but at £90,000 a pop has been deemed too expensive to fund in the NHS. What happens when even more targeted drugs are developed?

I was discussing this question yesterday evening over a nice bottle of Chilean viognier with Chris Winchester. I think between us we may have come up with a cunning plan.

Our idea is as follows. If a drug is being developed for a sufficiently narrow patient population that it could reasonably be considered a "personalised medicine", different licensing rules would apply. You would no longer have to obtain such a convincing body of evidence of efficacy and safety before licensing. You would need some evidence, of course, but the bar would be set much lower. Perhaps some convincing laboratory studies followed by some small clinical trials that could be done much more cheaply than the typical phase III trials that enrol hundreds of patients and cost many millions to run.

At that stage, you would not get a traditional drug license that would allow you to market the drug in the normal way. The license would be provisional, with some conditions attached.

So far, this idea is not new. The EMA has already started a pilot project of “adaptive licensing“, which is designed very much in this spirit.

But here comes the cunning bit.

Under our plan, the drug would be licensed to be marketed as a mixture of the active drug and placebo. Some packs of the drug would contain the active drug, and some would contain placebo. Neither the prescriber nor the patient would know whether they have actually received the drug. Obviously patients would need to be told about this and would then have the choice to take part or not. But I don’t think this is worse than the current situation, where at that stage the drug would not be licensed at all, so patients would either have to find a clinical trial (where they may still get placebo) or not get the drug at all.

In effect, every patient who uses the drug during the period of conditional licensing would be taking part in a randomised, double-blind, placebo-controlled trial.  Prescribers would be required to collect data on patient outcomes, which, along with a code number on the medication pack, could then be fed back to the manufacturer and analysed. The manufacturer would know from the code number whether the patient received the drug or placebo.

Once sufficient numbers of patients had been treated, then the manufacturer could run the analysis and the provisional license could be converted to a full license if the results show good efficacy and safety, or revoked if they don’t.
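
To make the mechanics a little more concrete, here is a toy sketch of how the pack coding and unblinding might work. All the names and numbers are illustrative; this is a thought experiment, not a system design:

```python
import random

random.seed(1)

# The manufacturer pre-randomises pack codes to active drug or placebo and
# keeps the key private -- prescribers and patients see only the code.
pack_codes = [f"PK{n:05d}" for n in range(1000)]
allocation_key = {code: random.choice(["active", "placebo"]) for code in pack_codes}

# Prescribers report outcomes against the pack code, without knowing
# what the pack contained.
reported_outcomes = [
    {"pack_code": "PK00042", "outcome": "improved"},
    {"pack_code": "PK00137", "outcome": "no change"},
]

# Once enough outcomes are in, the manufacturer unblinds using the key
# and compares active against placebo.
for report in reported_outcomes:
    arm = allocation_key[report["pack_code"]]
    print(report["pack_code"], arm, report["outcome"])
```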

This wouldn’t work in all cases. There will be times when other drugs are available but would not be compatible with the new drug. You could not then ethically put patients in a position where a drug is available but they get no drug at all. But in cases where no effective treatment is available, or the new drug can be used in addition to standard treatments, use of a placebo in this way is perfectly acceptable from an ethical point of view.

Obviously even when placebo treatment is a reasonable option, there would be logistical challenges with this approach (for example, making sure that the same patient gets the same drug when their first pack of medicine runs out). I don’t pretend it would be easy. But I believe it may be preferable to a system in which the pharmaceutical industry has to abandon working on personalised medicine because it has become unaffordable.

Made up statistics on sugar tax

I woke up this morning to the sound of Radio 4 telling me that Cancer Research UK had done an analysis showing that a 20% tax on sugary drinks could reduce the number of obese people in the UK by 3.7 million by 2025. (That could be the start of the world’s worst ever blues song, but it isn’t.)

My first thought was that was rather surprising, as I wasn’t aware of any evidence on how sugar taxes impact on obesity. So I went hunting for the report with interest.

Bizarrely, Cancer Research UK didn’t link to the full report from their press release (once you’ve read the rest of this post, you may conclude that perhaps they were too embarrassed to let anyone see it), but I tracked it down here. Well, I’m not sure even that is the full report. It says it’s a “technical summary”, but the word “summary” makes me wonder if it is still not the full report. But that’s all that seems to be made publicly available.

There are a number of problems with this report. Christopher Snowdon has blogged about some of them here, but I want to focus on the extent to which the model is based on untested assumptions.

It turns out that the conclusions were indeed not based on any empirical data about how a sugar tax would impact on obesity, but on a modelling study. This study made assumptions about several things, principally the following:

  1. The price elasticity of demand for sugary drinks (ie the extent to which an increase in price reduces consumption)
  2. The extent to which a reduction in sugary drink consumption would reduce total calorie intake
  3. The effect of total calorie intake on body mass

The authors get 0/10 for transparent reporting for the first of those, as they don’t actually say what price elasticity they used. That’s pretty basic stuff, and not to report it is somewhat akin to reporting the results of a clinical trial of a new drug and not saying what dose of the drug you used.

However, the report does give a reference for their price elasticity data, namely this paper. I must say I don’t find the methods of that paper easy to follow. It’s not at all clear to me whether the price elasticities they calculated were actually based on empirical data or themselves the results of a modelling exercise. But the data that are used in that paper come from the period 2008 to 2010, when the UK was in the depths of  recession, and when it might be hypothesised that price elasticities were greater than in more economically buoyant times. They don’t give a single figure for price elasticity, but a range of 0.8 to 0.9. In other words, a 20% increase in the price of sugary drinks would be expected to lead to a 16-18% decrease in the quantity that consumers buy. At least in the depths of the worst recession since the 1930s.

That figure for price elasticity is a crucial input to the model, and if it is wrong, then the answers of the model will be wrong.

The next input is the extent to which a reduction in sugary drink consumption reduces total calorie intake.  Here, an assumption is made that total calorie intake is reduced by 60% of the amount of calories not consumed in sugary drinks. Or in other words, that if you forego the calories of a sugary drink, you only make up 40% of those from elsewhere.

Where does that 60% figure come from? Well, they give a reference to this paper. And how did that paper arrive at the 60% figure? Well, they in turn give a reference to this paper. And where did that get it from? As far as I can tell, it didn’t, though I note it reports the results of a clinical study in people trying to lose weight by dieting. Even if that 60% figure is based on actual data from that study, rather than just plucked out of thin air, I very much doubt that data on calorie substitution taken from people trying to lose weight would be applicable to the general population.

What about the third assumption, the weight loss effects of reduced calorie intake? We are told that reducing energy intake by 100 kJ per day results in 1 kg body weight loss. The citation given for that information is this study, which is another modelling study. Are none of the assumptions in this study based on actual empirical data?
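
To see how directly the final answer depends on these three inputs, here is a back-of-the-envelope sketch of the chain of calculations in Python. Every value below is illustrative: the report does not publish its model code, and, as noted above, does not even say which elasticity it used, so I have simply taken the midpoint of the 0.8 to 0.9 range from the paper it cites.

```python
# A toy reproduction of the chain of assumptions, with illustrative inputs.
price_increase = 0.20          # 20% tax
elasticity = 0.85              # price elasticity of demand (assumption 1)
substitution = 0.60            # share of foregone drink calories not replaced (assumption 2)
kj_per_kg = 100.0              # kJ/day of sustained reduction per kg of weight lost (assumption 3)

baseline_drink_kj_per_day = 400.0   # illustrative daily energy intake from sugary drinks

drink_kj_avoided = baseline_drink_kj_per_day * price_increase * elasticity
net_kj_reduction = drink_kj_avoided * substitution
predicted_weight_loss_kg = net_kj_reduction / kj_per_kg

print(f"Net reduction in intake: {net_kj_reduction:.0f} kJ/day")
print(f"Predicted weight loss:   {predicted_weight_loss_kg:.2f} kg")
```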

A really basic part of making predictions by mathematical modelling is to use sensitivity analyses. The model is based on various assumptions, and sensitivity analyses answer the questions of what happens if those assumptions were wrong. Typically, the inputs to the model are varied over plausible ranges, and then you can see how the results are affected.

Unfortunately, no sensitivity analysis was done. This, folks, is real amateur hour stuff. The reason for the lack of sensitivity analysis is given in the report as follows:

“it was beyond the scope of this project to include an extensive sensitivity analysis. The microsimulation model is complex involving many thousands of calculations; therefore sensitivity analysis would require many thousands of consecutive runs using super computers to undertake this within a realistic time scale.”

That has to be one of the lamest excuses for shoddy methods I’ve seen in a long time. This is 2016. You don’t have to run the analysis on your ZX Spectrum.
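
And for what it's worth, a crude sensitivity analysis of that toy chain takes a handful of lines and runs in a fraction of a second on an ordinary laptop, never mind a supercomputer:

```python
# Sweep the two most doubtful inputs over plausible ranges and watch how
# much the prediction moves.  All values remain purely illustrative.
def predicted_weight_loss_kg(elasticity, substitution,
                             price_increase=0.20,
                             baseline_drink_kj_per_day=400.0,
                             kj_per_kg=100.0):
    drink_kj_avoided = baseline_drink_kj_per_day * price_increase * elasticity
    return drink_kj_avoided * substitution / kj_per_kg

for elasticity in (0.4, 0.6, 0.8, 0.9):
    for substitution in (0.2, 0.4, 0.6):
        loss = predicted_weight_loss_kg(elasticity, substitution)
        print(f"elasticity={elasticity:.1f} substitution={substitution:.1f} "
              f"-> {loss:.2f} kg")
```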

So this result is based on a bunch of heroic assumptions which have little basis in reality, and the sensitivity of the model to those assumptions was not tested. Forgive me if I'm not convinced.


The dishonesty of the All Trials campaign

The All Trials campaign is very fond of quoting the statistic that only half of all clinical trials have ever been published. That statistic is not based on good evidence, as I have explained at some length previously.

Now, if they are just sending the odd tweet or writing the odd blogpost with dodgy statistics, that is perhaps not the most important thing in the whole world, as the wonderful XKCD pointed out some time ago:

[XKCD cartoon: "Wrong on the internet"]

But when they are using dodgy statistics for fundraising purposes, that is an entirely different matter. On their USA fundraising page, they prominently quote the evidence-free statistic about half of clinical trials not having been published.

Giving people misleading information when you are trying to get money from them is a serious matter. I am not a lawyer, but my understanding is that the definition of fraud is not dissimilar to that.

The All Trials fundraising page allows comments to be posted, so I posted a comment questioning their "half of all clinical trials unpublished" statistic. Here is a screenshot of the comments section of the page after I posted my comment, in case you want to see what I wrote:

[Screenshot of the comments section, 2 February 2016]

Now, if the All Trials campaign genuinely believed their “half of all trials unpublished” statistic to be correct, they could have engaged with my comment. They could have explained why they thought they were right and I was wrong. Perhaps they thought there was an important piece of evidence that I had overlooked. Perhaps they thought there was a logical flaw in my arguments.

But no, they didn’t engage. They just deleted the comment within hours of my posting it. That is the stuff of homeopaths and anti-vaccinationists. It is not the way that those committed to transparency and honesty in science behave.

I am struggling to think of any reasonable explanation for this behaviour other than that they know their “half of all clinical trials unpublished” statistic to be on shaky ground and simply do not wish anyone to draw attention to it. That, in my book, is dishonest.

This is such a shame. The stated aim of the All Trials campaign is entirely honourable. They say that their aim is for all clinical trials to be published. This is undoubtedly important. All reasonable people would agree that to do a clinical trial and keep the results secret is unethical. I do not see why they need to spoil the campaign by using exactly the sort of intellectual dishonesty themselves that they are campaigning against.

New alcohol guidelines

It has probably not escaped your attention that the Department of Health published new guidelines for alcohol consumption on Friday. These guidelines recommend lower limits than the previous guidelines, namely no more than 14 units per week. The figure is the same for men and women.

There are many odd things about these guidelines. But before I get into that, I was rightly picked up on a previous blogpost for not being clear about my own competing interests, so I’ll get those out of the way first, as I think it’s important.

I do not work either for the alcohol industry or in public health, so professionally speaking, I have no dog in this fight. However, at a personal level, I do like a glass of wine or two with my dinner, which I have pretty much every day. So my own drinking habits fall within the recommended limits of the previous guidelines (no more than 4 units per day for men), but under the new guidelines I would be classified as an excessive drinker. Do bear that in mind when reading this blogpost. I have tried to be as impartial as possible, but we are of course all subject to biases in the way we assess evidence, and I cannot claim that my assessment is completely unaffected by being classified as a heavy drinker under the new guidelines.

So, how were the new guidelines developed? This was a mixture of empirical evidence, mathematical modelling, and the judgement of the guidelines group. They were reasonably explicit about this process, and admit that the guidelines are “both pragmatic and evidence based”, so they get good marks for being transparent about their overall thinking.

However, it was not always easy to figure out what evidence was used, so they get considerably less good marks for being transparent about the precise evidence that led to the guidelines. It's mostly available if you look hard enough, but the opacity of the referencing is disappointing. Very few statements in the guidelines document are explicitly referenced. But as far as I can tell, most of the evidence comes from two other documents: "A summary of the evidence of the health and social impacts of alcohol consumption" (see the document "Appendix 3 CMO Alcohol Guidelines Summary of evidence.pdf" within the zip file that you can download here), and the report of the Sheffield modelling group.

The specific way in which “14 units per week” was derived was as follows. The guidelines team investigated what level of alcohol consumption would be associated with no more than an “acceptable risk”, which is fair enough. Two definitions of “acceptable risk” were used, based on recent work in developing alcohol guidelines in Canada and Australia. The Canadian definition of acceptable risk was a relative risk of alcohol-related mortality of 1, in other words, the point at which the overall risk associated with drinking, taking account of both beneficial and harmful effects, was the same as the risk for a non-drinker. The Australian definition of acceptable risk was that the proportion of deaths in the population attributable to alcohol, assuming that everyone in the population drinks at the recommended limit, is 1%. In practice, both methods gave similar results, so choosing between them is not important.

To calculate the levels of alcohol that would correspond to those risks, a mathematical model was used which incorporated empirical data on 43 diseases known to be associated with alcohol consumption. Risks for each were considered, and the total mortality attributable to alcohol was calculated from those risks (although the precise mathematical calculations used were not described in sufficient detail for my liking).

These results are summarised in the following table (table 1 in both the guidelines document and the Sheffield report). Results are presented separately for men and women, and also separately depending on how many days each week are drinking days. The more drinking days you have per week for the same weekly total, the less you have on any given day. So weekly limits are higher if you drink 7 days per week than if you drink 1 day per week, because of the harm involved with binge drinking if you have your entire weekly allowance on just one day.

[Table 1 from the guidelines document and the Sheffield report: weekly consumption thresholds for men and women, by number of drinking days per week]

Assuming that drinking is spread out over a few days a week, these figures are roughly in the region of 14, so that is where the guideline figure comes from. The same figure is now being used for men and women.

Something you may have noticed about the table above is that it implies the safe drinking limits are lower for men than for women. You may think that’s a bit odd. I think that’s a bit odd too.

Nonetheless, the rationale is explained in the report. We are told (see paragraph 46 of the guidelines document) that the risks of immediate harm from alcohol consumption, usually associated with binge-drinking in a single session, “are greater for men than for women, in part because of men’s underlying risk taking behaviours”. That sounds reasonably plausible, although no supporting evidence is offered for the statement.

To be honest, I find this result surprising. According to table 6 on page 35 of the Sheffield modelling report, deaths from the chronic effects of alcohol (eg cancer) are about twice as common as deaths from the acute effects of alcohol (eg getting drunk and falling under a bus). We also know that women are more susceptible than men to the longer-term effects of alcohol. And yet it appears that the acute effects dominate this analysis.

Unfortunately, although the Sheffield report is reasonably good at explaining the inputs to the mathematical model, specific details of how the model works are not presented. So it is impossible to know why the results come out in this surprising way and whether it is reasonable.

There are some other problems with the model.

I think the most important one is that the relationship between alcohol consumption and risk was often assumed to be linear. This strikes me as a really bad assumption, perhaps best illustrated with the following graph (figure 11 on page 45 of the Sheffield report).

[Figure 11 from the Sheffield report: relative risk of hospital admission for acute alcohol-related causes as a function of peak day consumption]

This shows how the risk of hospital admission for acute alcohol-related causes increases as a function of peak day consumption, ie the amount of alcohol drunk in a single day.

A few moments’ thought suggests that this is not remotely realistic.

The risk is expressed as a relative risk, in other words how many times more likely you are to be admitted to hospital for an alcohol-related cause than you are on a day when you drink no alcohol at all. Presumably they consider that there is a non-zero risk when you don’t drink at all, or a relative risk would make no sense. Perhaps that might be something like being injured in a road traffic crash where you were perfectly sober but the other driver was drunk.

But it’s probably safe to say that the risk of being hospitalised for an alcohol-related cause when you have not consumed any alcohol is low. The report does not make it clear what baseline risk they are using, but let’s assume conservatively that the daily risk is 1 in 100, or 1%. That means you would expect to be admitted to hospital for an alcohol-related cause about 3 times a year even if you don’t drink at all. I haven’t been admitted to hospital 3 times in the last year (or even once, in fact) for an alcohol-related cause, even though I’ve drunk alcohol on most of those days. I doubt my experience of lack of hospitalisation is unusual. So I think it’s probably safe to assume that 1% is a substantial overestimate of the true baseline risk.

Now let’s look at the top right of the graph. That suggests that my relative risk of being admitted to hospital for an alcohol-related cause would be 6 times higher if I drink 50 units in a day. In other words, that my risk would be 6%. And remember that that is probably a massive overestimate.

Now, 50 units of alcohol is roughly equivalent to a bottle and a half of vodka. I don’t know about you, but I’m pretty sure that if I drank a bottle and a half of vodka in a single session then my chances of being hospitalised – if I survived that long – would be close to 100%.

So I don’t think that a linear function is realistic. I don’t have any data on the actual risk, but I would expect it to look something more like this:

[Illustrative graph: my guess at a more plausible risk curve]

Here we see that the risk is negligible at low levels of alcohol consumption, then increases rapidly once you get into the range of serious binge drinking, and approaches 100% as you consume amounts of alcohol unlikely to be compatible with life. The precise form of that graph is something I have just guessed at, but I’m pretty sure it’s a more reasonable guess than a linear function.
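
To illustrate the difference between the two shapes, here is a small sketch comparing the linear relative-risk function from the report's figure (applied to my deliberately generous 1% baseline) with the kind of saturating curve I have sketched above. All the numbers are illustrative guesses, not estimates of the real risk:

```python
import numpy as np

# Linear relative risk (roughly RR = 6 at 50 units, as in the report's
# figure) applied to a generous 1% baseline daily risk, versus a
# saturating curve that stays negligible at low doses and approaches
# certainty at very high doses.
units = np.linspace(0, 50, 6)              # peak-day consumption, UK units

baseline_daily_risk = 0.01
linear_risk = baseline_daily_risk * (1 + 0.1 * units)

# Saturating alternative: negligible until heavy binge-drinking levels,
# then rising steeply towards 100%.
saturating_risk = 1 / (1 + np.exp(-(units - 35) / 4))

for u, lin, sat in zip(units, linear_risk, saturating_risk):
    print(f"{u:4.0f} units: linear {lin:6.2%}   saturating {sat:6.2%}")
```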

A mathematical model is only as good as the data used as inputs to the model and the assumptions used in the modelling. Although the data used are reasonably clearly described and come mostly from systematic reviews of the literature, the way in which the data are modelled is not sufficiently clear, and also makes some highly questionable assumptions. Although some rudimentary sensitivity analyses were done, no sensitivity analyses were done using risk functions other than linear ones.

So I am not at all sure I consider the results of the mathematical modelling trustworthy. Especially when it comes up with the counter-intuitive result that women can safely drink more than men, which contradicts most of the empirical research in this area.

But perhaps more importantly, I am also puzzled why it was felt necessary to go through a complex modelling process in the first place.

It seems to me that the important question here is how does your risk of premature death depend on your alcohol consumption. That, at any rate, is what was modelled.

But there is no need to model it: we actually have empirical data. A systematic review of 34 prospective studies by Di Castelnuovo et al published in 2006 looked at the relationship between alcohol consumption and mortality. This is what it found (the lines on either side of the male and female lines are 99% confidence intervals).

[Figure from Di Castelnuovo et al: relative mortality risk as a function of alcohol consumption for men and women, with 99% confidence intervals]

This shows that the level of alcohol consumption associated with no increased mortality risk compared with non-drinkers is about 25 g/day for women and 40 g/day for men. A standard UK unit is 8 g of alcohol, so that converts to about 22 units per week for women and 35 units per week for men: not entirely dissimilar to the previous guidelines.
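
The arithmetic is easy enough to check:

```python
GRAMS_PER_UK_UNIT = 8

for group, grams_per_day in [("women", 25), ("men", 40)]:
    units_per_week = grams_per_day * 7 / GRAMS_PER_UK_UNIT
    print(f"{group}: about {units_per_week:.0f} units per week")
# women: about 22 units per week
# men: about 35 units per week
```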

Some attempt is made to explain why the data on all-cause mortality have not been used, but I do not find it convincing (see page 7 of the summary of evidence).

One problem, we are told, is that "most of the physiological mechanisms that have been suggested to explain the protective effect of moderate drinking only apply for cohorts with overall low levels of consumption and patterns of regular drinking that do not vary". That seems a bizarre criticism. The data show that there is a protective effect only at relatively low levels of consumption, and that once consumption increases, so does the risk. So of course the protective effect only applies at low levels of consumption. As for the "patterns of regular drinking", the summary makes the point that binge drinking is harmful. Well, we know that. The guidelines already warn of the dangers of binge drinking. It seems odd, therefore, to also reject the findings for people who split their weekly consumption evenly over the week and avoid binge drinking, as this is exactly what the guidelines say you should do.

I do not understand why studies which apply to people who follow safe drinking guidelines are deemed to be unsuitable for informing safe drinking guidelines. That makes no sense to me.

The summary also mentions the “sick quitter hypothesis” as a reason to mistrust the epidemiological data. The sick quitter hypothesis suggests that the benefits of moderate drinking compared with no drinking may have been overestimated in epidemiological studies, as non-drinkers may include recovering alcoholics and other people who have given up alcohol for health reasons, and therefore include an unusually unhealthy population.

The hypothesis seems reasonable, but it is not exactly a new revelation to epidemiologists, and has been thoroughly investigated. The systematic review by Di Castelnuovo reported a sensitivity analysis including only studies which excluded former drinkers from their no-consumption category. That found a lower beneficial effect on mortality than in the main analysis, but the protective effect was still unambiguously present. The point at which drinkers had the same risk as non-drinkers in that analysis was about 26 units per week (this is an overall figure: separate figures for men and women were not presented in the sensitivity analysis).

A systematic review specifically of cardiovascular mortality by Ronksley et al published in 2011 also ran a sensitivity analysis where only lifelong non-drinkers were used as the reference category, and found it made little difference to the results.

So although the “sick quitter hypothesis” sounds like a legitimate concern, in fact it has been investigated and is not a reason to distrust the results of the epidemiological analyses.

So all in all, I really do not follow the logic of embarking on a complex modelling exercise instead of using readily available empirical data. Granted, the systematic review by Di Castelnuovo et al is 10 years old now, but surely a more appropriate response to that would have been to commission an updated systematic review rather than ignore the systematic review evidence on mortality altogether and go down a different and problematic route.

Does any of this matter? After all, the guidelines are not compulsory. If my own reading of the evidence tells me I can quite safely drink 2 glasses of wine with my dinner most nights, I am completely free to do so.

Well, I think this does matter. If the government are going to publish guidelines on healthy behaviours, I think it is important that they be as accurate and evidence-based as possible. Otherwise the whole system of public health guidelines will fall into disrepute, and then it is far less likely that even sensible guidelines will be followed.

What is particularly concerning here is the confused messages the guidelines give about whether moderate drinking has benefits. From my reading of the literature, it certainly seems likely that there is a health benefit at low levels of consumption. That, at any rate, is the obvious conclusion from Di Castelnuovo et al’s systematic review.

And yet the guidelines are very unclear about this. While even the Sheffield model used to support the guidelines shows decreased risks at low levels of alcohol consumption (and those decreased risks would extend to substantially higher drinking levels if you base your judgement on the systematic review evidence), the guidelines themselves say that such decreased risks do not exist.

The guideline itself says "The risk of developing a range of diseases (including, for example, cancers of the mouth, throat, and breast) increases with any amount you drink on a regular basis". That is true, but it ignores the fact that the same is not true for other diseases. To mention only the harms of alcohol and ignore the benefits in the guidelines seems a dishonest way to present data. Surely the net effect is what is important.

Paragraph 30 of the guidelines document says “there is no level of drinking that can be recommended as completely safe long term”, which is also an odd thing to say when moderate levels of drinking have a lower risk than not drinking at all.

There is no doubt that the evidence on alcohol and health outcomes is complex. For obvious reasons, there have been no long-term randomised controlled trials, so we have to rely on epidemiological research with all its limitations. So I do not pretend for a moment that developing guidelines on what is a safe amount of alcohol to drink is easy.

But despite that, I think the developers of these guidelines could have done better.

Dangerous nonsense about vaping

If you thought you already had a good contender for “most dangerous, irresponsible, and ill-informed piece of health journalism of 2015”, then I’m sorry to tell you that it has been beaten into second place at the last minute.

With less than 36 hours left of 2015, I am confident that this article by Sarah Knapton in the Telegraph will win the title.

The article is titled “E-cigarettes are no safer than smoking tobacco, scientists warn”. The first paragraph is

“Vaping is no safer that [sic] smoking, scientists have warned after finding that e-cigarette vapour damages DNA in ways that could lead to cancer.”

There are such crushing levels of stupid in this article it’s hard to know where to start. But perhaps I’ll start by pointing out that a detailed review of the evidence on vaping by Public Health England, published earlier this year, concluded that e-cigarettes are about 95% less harmful than smoking.

If you dig into the detail of that review, you find that most of the residual 5% is the harm of nicotine addiction. It’s debatable whether that can really be called a harm, given that most people who vape are already addicted to nicotine as a result of years of smoking cigarettes.

But either way, the evidence shows that vaping, while it may not be 100% safe (though let’s remember that nothing is 100% safe: even teddy bears kill people), is considerably safer than smoking. This should not be a surprise. We have a pretty good understanding of what the toxic components of cigarette smoke are that cause all the damage, and most of those are either absent from e-cigarette vapour or present at much lower concentrations.

So the question of whether vaping is 100% safe is not the most relevant thing here. The question is whether it is safer than smoking. Nicotine addiction is hard to beat, and if a smoker finds it impossible to stop using nicotine, but can switch from smoking to vaping, then that is a good thing for that person’s health.

Now, nothing is ever set in stone in science. If new evidence comes along, we should always be prepared to revise our beliefs.

But obviously to go from a conclusion that vaping is 95% safer than smoking to concluding they are both equally harmful would require some pretty robust evidence, wouldn’t it?

So let’s look at the evidence Knapton uses as proof that all the previous estimates were wrong and vaping is in fact as harmful as smoking.

The paper it was based on is this one, published in the journal Oral Oncology.  (Many thanks to @CaeruleanSea for finding the link for me, which had defeated me after Knapton gave the wrong journal name in her article.)

The first thing to notice about this is that it is all lab based, using cell cultures, and so tells us little about what might actually happen in real humans. But the real kicker is that if we are going to compare vaping and smoking and conclude that they are as harmful as each other, then the cell cultures should have been exposed to equivalent amounts of e-cigarette vapour and cigarette smoke.

The paper describes how solutions were made by drawing either the vapour or smoke through cell media. We are then told that the cells were treated with the vaping medium every 3 days for up to 8 weeks. So presumably the cigarette medium was also applied every 3 days, right?

Well, no. Not exactly. This is what the paper says:

“Because of the high toxicity of cigarette smoke extract, cigarette-treated samples of each cell line could only be treated for 24 h.”

Yes, that’s right. The cigarette smoke was applied at a much lower intensity, because otherwise it killed the cells altogether. So how can you possibly conclude that vaping is no worse than smoking, when smoking is so harmful it kills the cells altogether and makes it impossible to do the experiment?

And yet despite that, the cigarettes still had a larger effect than the vaping. It is also odd that the results for cigarettes are not presented at all for some of the assays. I wonder if that’s because it had killed the cells and made the assays impossible? As primarily a clinical researcher, I’m not an expert in lab science, but not showing the results of your positive control seems odd to me.

But the paper still shows that the e-cigarette extract was harming cells, so that’s still a worry, right?

Well, there is the question of dose. It’s hard for me to know from the paper how realistic the doses were, as this is not my area of expertise, but the press release accompanying this paper (which may well be the only thing that Knapton actually read before writing her article) tells us the following:

“In this particular study, it was similar to someone smoking continuously for hours on end, so it’s a higher amount than would normally be delivered,”

Well, most things probably damage cells in culture if used at a high enough dose, so I don’t think this study really tells us much. All it tells us is that cigarettes do far more damage to cell cultures than e-cigarette vapour does. Because, and I can’t emphasise this point enough, THEY COULDN’T DO THE STUDY WITH EQUIVALENT DOSES OF CIGARETTE SMOKE BECAUSE IT KILLED ALL THE CELLS.

A charitable explanation of how Knapton could write such nonsense might be that she simply took the press release on trust (to be clear, the press release also makes the claim that vaping is as dangerous as smoking) and didn’t have time to check it. But leaving aside the question of whether a journalist on a major national newspaper should be regurgitating press releases without any kind of fact checking, I note that many people (myself included) have been pointing out to Knapton on Twitter that there are flaws in the article, and her response has been not to engage with such criticism, but to insist she is right and to block anyone who disagrees: the Twitter equivalent of the “la la la I’m not listening” argument.

It seems hard to come up with any explanation other than that Knapton likes to write a sensational headline and simply doesn’t care whether it’s true, or, more importantly, what harm the article may do.

And make no mistake: articles like this do have the potential to cause harm. It is perfectly clear that, whether or not vaping is completely safe, it is vastly safer than smoking. It would be a really bad outcome if smokers who were planning to switch to vaping read Knapton’s article and thought “oh, well if vaping is just as bad as smoking, maybe I won’t bother”. Maybe some of those smokers will then go on to die a horrible death of lung cancer, which could have been avoided had they switched to vaping.

Is Knapton really so ignorant that she doesn’t realise that is a possible consequence of her article, or does she not care?

And in case you doubt that anyone would really be foolish enough to believe such nonsense, I’m afraid there is evidence that people do believe it. According to a survey by Action on Smoking and Health (ASH), the proportion of people who believe that vaping is as harmful or more harmful than smoking increased from 14% in 2014 to 22% in 2015. And in the USA, the figures may be even worse: this study found 38% of respondents thought e-cigarettes were as harmful or more harmful than smoking. (Thanks again to @CaeruleanSea for finding the links to the surveys.)

I’ll leave the last word to Deborah Arnott, Chief Executive of ASH:

“The number of ex-smokers who are staying off tobacco by using electronic cigarettes is growing, showing just what value they can have. But the number of people who wrongly believe that vaping is as harmful as smoking is worrying. The growth of this false perception risks discouraging many smokers from using electronic cigarettes to quit and keep them smoking instead which would be bad for their health and the health of those around them.”

STAT investigation on failure to report research results

A news story by the American health news website STAT has appeared in my Twitter feed many times over the last few days.

The story claims to show that "prestigious medical research institutions have flagrantly violated a federal law requiring public reporting of study results, depriving patients and doctors of complete data to gauge the safety and benefits of treatments". They looked at whether results of clinical trials that should have been posted on the ClinicalTrials.gov results database actually were posted, and found that many of them were not. It's all scary stuff, and once again, shows that those evil scientists are hiding the results of their clinical trials.

Or are they?

To be honest, it’s hard to know what to make of this one. The problem is that the “research” on which the story is based has not been published in a peer reviewed journal. It seems that the only place the “research” has been reported is on the website itself. This is a significant problem, as the research is simply not reported in enough detail to know whether the methods it used were reliable enough to allow us to trust its conclusions. Maybe it was a fantastically thorough and entirely valid piece of research, or maybe it was dreadful. Without the sort of detail we would expect to see in a peer-reviewed research paper, it is impossible to know.

For example, the rather brief “methods section” of the article tells us that they filtered the data to exclude trials which were not required to report results, but they give no detail about how. So how do we know whether their dataset really contained only trials subject to mandatory reporting?

They also tell us that they excluded trials for which the deadline had not yet arrived, but again, they don't tell us how. That's actually quite important. If a trial has not yet reported results, then it's hard to be sure when the trial finished. ClinicalTrials.gov uses both actual and estimated dates of trial completion, and also has two different definitions of trial completion. We don't know which definition was used, and if estimated dates were used, we don't know if those estimates were accurate. In my experience, estimates of the end date of a clinical trial are frequently inaccurate.

Some really basic statistical details are missing. We are told that the results include "average" times by which results were late, but not whether these are means or medians. With skewed data such as time to report something, the difference is important.
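
To see why the distinction matters, here is a trivial example with invented reporting delays, in which a handful of very late trials drags the mean well above the median:

```python
import statistics

# Invented reporting delays in months: most trials a few months late,
# a handful very late indeed.
delays = [1, 2, 2, 3, 4, 5, 6, 48, 60]

print("mean:  ", round(statistics.mean(delays), 1))   # 14.6
print("median:", statistics.median(delays))           # 4
```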

It appears that the researchers did not determine whether results had been published in peer-reviewed journals. So the claim that results are being hidden may be totally wrong. Even if a trial's results were not posted on ClinicalTrials.gov, it's hard to support a claim that the results are hidden if they've been published in a medical journal.

It is hardly surprising there are important details missing. Publishing “research” on a news website rather than in a peer reviewed journal is not how you do science. A wise man once said “If you have a serious new claim to make, it should go through scientific publication and peer review before you present it to the media“. Only a fool would describe the STAT story as “excellent“.

One of the findings of the STAT story was that academic institutions were worse than pharmaceutical companies at reporting their trials. Although it’s hard to be sure if that result is trustworthy, for all the reasons I describe above, it is at least consistent with more than one other piece of research (and I’m not aware of any research that has found the opposite).

There is a popular narrative that says clinical trial results are hidden because of evil conspiracies. However, no-one ever has yet given a satisfactory explanation of how hiding their clinical trial results furthers academics’ evil plans for global domination.

A far more likely explanation is that posting results is a time consuming and faffy business, which may often be overlooked in the face of competing priorities. That doesn't excuse it, of course, but it does help to understand why results posting on ClinicalTrials.gov is not as good as it should be, particularly from academic researchers, who are usually less well resourced than their colleagues in the pharmaceutical industry.

If the claims of the STAT article are true and researchers are indeed falling below the standards we expect in terms of clinical trial disclosure, then I suggest that rather than getting indignant and seeking to apportion blame, the sensible approach would be to figure out how to fix things.

I and some colleagues published a paper about 3 years ago in which we suggest how to do exactly that. I hope that our suggestions may help to solve the problem of inadequate clinical trial disclosure.