Opening up science

Written by Pratik Kanjilal | Updated: Feb 6 2014, 08:51am hrs
New Jersey-based healthcare giant Johnson & Johnson took a leadership position in open science on Friday, leaving behind competitors reluctant to outgrow the current model of science and technology, which holds that the highest value accrues from exclusive ownership of knowledge, protected by a punitive patents regime, strong data security laws and a climate of corporate secrecy. That model cannot scale up to the ever-larger questions and opportunities that science and technology will soon encounter. Sharing across national borders and corporate and academic firewalls would be an easier and faster route.

J&J has forged an alliance with the Yale Open Data Access (YODA) Project of the Center for Outcomes Research and Evaluation (CORE) at Yale-New Haven Hospital to release clinical trials data to other researchers. Traditionally, pharmaceutical companies have held trials data closely, sharing it only with regulators as required. Negative results were especially unlikely to be released. Companies have feared that the data could be used by competitors to save development time, or to launch political attacks against them. Not all criticism of companies in the life sciences is in the public interest, and they are naturally not eager to shovel grist into the mill.

So J&J is being daringly progressive in turning the model on its head. YODA will act as a clearing house vetting requests for data coming in from researchers and practitioners all over the world. Credible agencies will be given access to raw data in bulk, scrubbed of patient identifiers. Starting with drug trials, the programme will eventually include medical devices. Crucially, all the data will be revealed, not only sets relating to positive results, or sets which are immediately germane to the query. Interesting possibilities will emerge, including some which may be unanticipated.

For starters, researchers would be able to verify company results independently. This would have implications for scientific issues which have been overtaken by politics, such as genetically modified foods. In such cases, there is room for politics rather than reason because the tests run are not seen to be entirely compelling, or the testers not entirely disinterested parties. Reappraisals of the raw data by third parties at arm's length could pave the way to acceptable resolutions.

But reanalysing old trials is only an elementary outcome, though it can have profound consequences for trust. More interestingly, large, cumulative volumes of released data would make big data approaches feasible. The beauty of the method is that it can tease meaning out of junk data. Technically, even the results of poorly designed tests, which failed to prove or disprove anything, could throw up patterns at high volumes. Big data can repurpose old research to reach conclusions which may have little to do with the purposes of the original researchers.

Indeed, the most interesting developments can emerge if multiple disciplines embrace open science. Data from the most diverse sources can be mashed up to make up fresh databases, and queries run over them to reveal unsuspected patterns in human knowledge. Something of the sort is already happening, though in a very tentative sort of way, in Wolfram Alpha, the computational engine which maps relations between facts across disciplines as diverse as algebra and materials science.

A better parallel may be seen in the massive massing of talent on which modern science depends. Universities compete globally to attract the best minds, and some of the most interesting work is the result of inter-departmental cooperation. The space race, which featured the most ambitious targets and the fastest development in the 20th century, was won by the US because it was able to accumulate the best minds in the field. In our century, the search for the Higgs boson has been the most ambitious project. Its discovery annihilated the vast gap which had opened up in physics between the blackboard and the laboratory, between the postulate and the experimental proof on which the method of science depends.

CERN, the European Organisation for Nuclear Research, whose Large Hadron Collider confirmed the existence of the Higgs boson, distributes the huge volumes of data it generates to multiple agencies all over the world for distributed analysis. It draws talent from over 600 universities and institutions in over 100 nations. It has given full membership to a nation outside Europe (Israel), and India hopes to become an associate member. This is the order of human multiplexing required by the world's biggest particle lab, which confirmed the boson's existence almost half a century after it was postulated in 1964.

CERN is a triumph of centralised science, but the model has physical limits. Open science aims to bypass them by sharing freely. It aims to show that an ancient adage is still valid, that casting your bread upon the waters remains a profitable activity.