The Trials Tracker and post-truth politics

The All Trials campaign was founded in 2013 with the stated aim of ensuring that all clinical trials are disclosed in the public domain. This is, of course, an entirely worthy aim. There is no doubt that sponsors of clinical trials have an ethical responsibility to make sure that the results of their trials are made public.

However, as I have written before, I am not impressed by the way the All Trials campaign misuses statistics in pursuit of its aims. Specifically, the statistic they keep promoting, “about half of all clinical trials are unpublished”, is simply not evidence based. Most recent studies show that the proportion of trials that are undisclosed is more like 20% than 50%.

The latest initiative by the All Trials campaign is the Trials Tracker. This is an automated tool that looks at all trials registered on ClinicalTrials.gov since 2006 and uses an algorithm to determine which of them have been disclosed. They found 45% were undisclosed (27% of industry-sponsored trials and 54% of non-industry trials). So, surely this is evidence to support the All Trials claim that about half of trials are undisclosed, right?


In fact it looks like the true figure for undisclosed trials is not 45%, but at most 21%. Let me explain.

The problem is that an automated algorithm is not very good at determining whether trials are disclosed or not. The algorithm can tell if results have been posted on ClinicalTrials.gov, and also searches PubMed for publications with a matching ID number. You can probably see the flaw in this already. There are many ways that results could be disclosed that would not be picked up by that algorithm.

Many pharmaceutical companies make results of clinical trials available on their own websites. The algorithm would not pick that up. Also, although journal publications of clinical trials should ideally make sure they are indexed by the ID number, in practice that system is imperfect. So the automated algorithm misses many journal articles that aren’t indexed correctly with their ID number.
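To make the blind spots concrete, here is a minimal sketch of the two checks described above. It is not the Trials Tracker's actual code: the `Trial` class, its field names, and both functions are hypothetical, invented purely to illustrate the logic.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    # All field names are hypothetical; they are not taken from the real dataset.
    registry_id: str
    results_on_registry: bool            # results posted on the registry entry
    pubmed_indexed_by_id: bool           # PubMed record indexed with the registry ID
    on_sponsor_website: bool = False     # results posted on the sponsor's own site
    in_journal_not_indexed: bool = False # published, but not indexed by the ID


def tracker_says_disclosed(trial: Trial) -> bool:
    """The two checks the automated approach can make."""
    return trial.results_on_registry or trial.pubmed_indexed_by_id


def actually_disclosed(trial: Trial) -> bool:
    """Disclosure by any route, including those the checks above cannot see."""
    return (
        tracker_says_disclosed(trial)
        or trial.on_sponsor_website
        or trial.in_journal_not_indexed
    )


# A made-up trial whose results appear only on the sponsor's website.
example = Trial("EXAMPLE-001", False, False, on_sponsor_website=True)
print(tracker_says_disclosed(example), actually_disclosed(example))
```

Any trial for which the two functions disagree is one the automated approach would wrongly count as undisclosed.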

So how bad is the algorithm?

The sponsor with the greatest number of unreported trials, according to the algorithm, is Sanofi. I started by downloading the raw data, picked the first 10 trials sponsored by Sanofi that were supposedly “undisclosed”, and tried searching for results manually.

As an aside, the Trials Tracker team get 7/10 for transparency. They make their raw data available for download, which is great, but they don’t disclose their metadata (descriptions of what each variable in the dataset represents), so it was rather hard work figuring out how to use the data. But I think I figured it out in the end, as after trying a few combinations of interpretations I was able to replicate their published results exactly.

Anyway, of those 10 “undisclosed” trials by Sanofi, 8 of them were reported on Sanofi’s own website, and one of the remaining 2 was published in a journal. So in fact only 1 of the 10 was actually undisclosed. I posted this information in a comment on the journal article in which the Trials Tracker is described, and it prompted another reader, Tamas Ferenci, to investigate the Sanofi trials more systematically. He found that 227 of the 285 Sanofi trials (80%) listed as undisclosed by Trials Tracker were in fact published on Sanofi’s website. He then went on to look at “undisclosed” trials sponsored by AstraZeneca, and found that 38 of the 68 supposedly undisclosed trials (56%) were actually published on AstraZeneca’s website. Ferenci’s search only looked at company websites, so it’s possible that more of the trials were reported in journal articles.

The above analyses only looked at a couple of sponsors, and we don’t know if they are representative. So to investigate more systematically the extent to which the Trials Tracker algorithm underestimates disclosure, I searched for results manually for 100 trials: a random selection of 50 industry trials and a random selection of 50 non-industry trials.

I found that 54% (95% confidence interval 40-68%) of industry trials and 52% (95% CI 38-66%) of non-industry trials that had been classified as undisclosed by Trials Tracker were available in the public domain. This might be an underestimate, as my search was not especially thorough. I searched Google, Google Scholar, and PubMed, and if I couldn’t find any results in a few minutes then I gave up. A more systematic search might have found more articles.
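For anyone who wants to check the confidence intervals, the sketch below uses the simple normal-approximation (Wald) interval. The post does not say which method was used, so that choice is an assumption, although it does reproduce the quoted ranges after rounding. The counts of 27 and 26 out of 50 are inferred from the percentages, and the function name `wald_ci` is my own.

```python
import math


def wald_ci(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation 95% confidence interval for a proportion."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# Counts inferred from the reported percentages (54% and 52% of 50 trials).
industry_ci = wald_ci(27, 50)
non_industry_ci = wald_ci(26, 50)

for label, (low, high) in [("industry", industry_ci), ("non-industry", non_industry_ci)]:
    print(f"{label}: {low:.1%} to {high:.1%}")
```

With only 50 trials per group the intervals are wide, which is worth bearing in mind when reading the corrected figures.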

If you’d like to check the results yourself, my findings are in a csv file here. This follows the same structure as the original dataset (I’d love to be able to give you the metadata for that, but as mentioned above, I can’t), but with the addition of 3 variables at the end. “Disclosed” specifies whether the trial was disclosed, and if so, how (journal, company website, etc). It’s possible that trials were disclosed in more than one place, but once I’d found a trial in one place I stopped searching. “Link” is a link to the results if available, and “Comment” is any other information that struck me as relevant, such as whether a trial was terminated prematurely or was of a product which has since been discontinued.

Putting these figures together with the Trials Tracker main results, this suggests that only 12% of industry trials and 26% of non-industry trials are undisclosed, or 21% overall (34% of the trials were sponsored by industry). And given the rough and ready nature of my search strategy, this is probably an upper bound for the proportion of undisclosed trials. A far cry from “about half”, and in fact broadly consistent with the recent studies showing that about 80% of trials are disclosed. It’s also worth noting that industry are clearly doing better at disclosure than academia. Much of the narrative that the All Trials campaign has encouraged is of the form “evil secretive Big Pharma deliberately withholding their results”. The data don’t seem to support this. It seems far more likely that trials are undisclosed simply because triallists lack the resources to write them up for publication. Research in industry is generally better funded than research in academia, and my guess is that the better funding explains why industry do better at disclosing their results. I and some colleagues have previously suggested that one way to increase trial disclosure rates would be to ensure that funders of research ringfence a part of their budget specifically for the costs of publication.
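The arithmetic behind the 12%, 26% and 21% figures can be reproduced with a few lines. This is only a sketch using the percentages quoted in this post; the variable names are my own, and the calculation assumes that my two samples of 50 are representative of all trials the Trials Tracker classed as undisclosed.

```python
# Proportion of trials the Trials Tracker classified as undisclosed.
tracker_undisclosed = {"industry": 0.27, "non_industry": 0.54}

# Proportion of those "undisclosed" trials that my manual search found results for.
found_disclosed = {"industry": 0.54, "non_industry": 0.52}

# Share of all trials in the dataset that were sponsored by industry.
industry_share = 0.34

# Corrected undisclosed rate = tracker rate x fraction that really were undisclosed.
corrected = {
    group: tracker_undisclosed[group] * (1 - found_disclosed[group])
    for group in tracker_undisclosed
}

# Overall rate is the sponsor-weighted average of the two groups.
overall = (
    industry_share * corrected["industry"]
    + (1 - industry_share) * corrected["non_industry"]
)

for group, rate in corrected.items():
    print(f"{group}: {rate:.1%}")
print(f"overall: {overall:.1%}")
```

Applying the same weighting to the uncorrected 27% and 54% gives roughly 45%, which matches the headline Trials Tracker figure.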

There are some interesting features of the 23 out of the 50 industry-sponsored trials that really did seem to be undisclosed. 9 of them were not trials of a drug intervention. Of the 14 undisclosed drug trials, 4 were of products that had been discontinued and a further 3 had sample sizes less than 12 subjects, so none of those 7 studies are likely to be relevant to clinical practice. It seems that undisclosed industry-sponsored drug trials of relevance to clinical practice are very rare indeed.

The Trials Tracker team would no doubt respond by saying that the trials missed by their algorithm have been badly indexed, which is bad in itself. And they would be right about that. Trial sponsors should update ClinicalTrials.gov with their results. They should also make sure that the ID number is included in the publication (although in several cases of published trials that were missed by the algorithm, the ID number was in fact included in the abstract of the paper, so this seems to be a fault of Medline indexing rather than any fault of the triallists).

However, the claim made by the Trials Tracker is not that trials are badly indexed. If they stuck to making only that claim, then the Trials Tracker would be a perfectly worthy and admirable project. But the problem is they go beyond that, and claim something which their data simply do not show. Their claim is that the trials are undisclosed. This is just wrong. It is another example of what seems to be all the rage these days, namely “post-truth politics”. It is no different from when the Brexit campaign said “We spend £350 million a week on the EU and could spend it on the NHS instead” or when Donald Trump said, well, pretty much every time his lips moved really.

Welcome to the post-truth world.


3 thoughts on “The Trials Tracker and post-truth politics”

  1. Fabulous work. I think the job you did of tracking down the pharma sponsored trials which Trial Tracker claimed were undisclosed would be a fantastic “citizen science” task which would get interested members of the public engaged with this issue better than All Trials and Open Trials have done up to now. As you point out, the main undisclosers are from academia not industry. Tracking down conflicts of interest could be another interesting task to crowd source.

  2. Your point about the lack of metadata for the csv is a good one. It would be good to have; csvy or csvw are two good options. However, I do think the Notebook they published on github “Examine unreported trials on” is what you are looking for in terms of “transparent and reproducible research”

  3. Very good work. Hats off to you.

    You may have underestimated the number of disclosed trials by pharma because, as you know, some trials are given in-house IDs which at times are difficult to tie up with other IDs such as NCTs.

    Poor science is in no one’s interest.

