Is data duplication a problem in cancer research?

25 Jun 2015

In a recent piece in The Times, ecancer's Founding Editor, Prof Gordon McVie, was asked to comment on a study that seemed to show high levels of data duplication in cancer research. As always, Prof McVie was pithy, but there is more depth to the issue worth considering.

Data duplication, the redundant publication of results already published, is of key interest to the open access publishing community. With information partitioned as it is, and important resources hidden behind paywalls or inaccessible due to language barriers, some duplication of results is bound to occur. Many scientists are unwittingly, and innocently, duplicating data because they cannot access all of the relevant information.

The AllTrials campaign, set up by Ben Goldacre and others, encourages scientists to publish the results of all trials, not just positive ones. The results from around half of all clinical trials remain hidden, which leads clinicians to repeat work that has already been done. Making all research results fully open access should reduce duplication of data, so it could become less of a problem in the future.

There is also a push to publish all data relating to research, with influential open access publishers such as PLOS enacting a new data policy to encourage authors to submit all their data. Access to these supplementary data files will allow researchers to see exactly which pieces of data have been published.

Inherent weaknesses in peer review are also part of the data duplication problem. There are serious flaws in academic publishing's current peer review system, and as a result articles with questionable data are slipping through the net.

Many publishers, particularly progressive open access ones, are looking for new ways to improve peer review; ideas like open peer review, post-publication peer discussion, and double-blind peer review are becoming more popular. 

As various recent “stings” have shown, a journal’s impact factor isn’t necessarily indicative of good-quality peer review, so the way that the field evaluates the quality of individual journals also needs to be overhauled.

While data duplication is an interesting and significant challenge, it also suggests some new opportunities for academic publishing, particularly in cancer research, to develop and grow.