DNAdigest is a nonprofit organization whose objectives include engaging, facilitating and educating the community about access to genomic data. In an interview with Dr. Mohammad Al-Ubaydli, DNAdigest covers the challenges of sharing data for medical research, especially the nuances of sharing genomic data.
This interview first appeared on DNAdigest.
Dr Mohammad Al-Ubaydli is the founder and CEO of Patients Know Best, an organisation that moves data custodianship into the hands of patients to facilitate data sharing and data access between patients and their clinicians or service providers.
We interviewed Mohammad about his views on data sharing in genetics research. He gave a number of examples of how data sharing is a multi-faceted problem, several aspects of which are not often discussed.
What is your background?
I’m a physician and programmer from Cambridge – I wrote six books about IT in health care, two of which explain how to share medical records with patients – and I’m a patient with a rare disease. Because of my interest in patients understanding their records and thus their health I started Patients Know Best, a social enterprise that puts patients in control of their data. It is currently used by over 100 customers across 8 different countries, including one customer rolling this out for over 1 million patients’ records.
“Data sharing for medical research is a good thing” – or not? What is your take?
I think making data available for medical research is a good thing, and I think everyone agrees in principle that data sharing for research will bring us all to cures for genetic diseases more cheaply and more quickly. And in theory, the more open the data and the earlier it is shared, the sooner this benefit should arrive.
However, you cannot just talk about ‘open open open’. There are some problems that no one wants to talk about, and they need to be addressed. There are some very practical considerations that are true blockers, but they are not being discussed enough.
What kind of blocking challenges are these?
I can see at least three different categories of challenges. The total number of challenges is larger, of course, but these are the ones that continue to be overlooked by the proponents of data sharing for research:
- Rosalind Franklin issue – in science it takes a very long time to generate the right data to test your research hypothesis. A large amount of hard work is required to design experiments, acquire samples and produce good data from them. However, it may take another scientist only a short time to find and publish results from that data. This is a general theme across many areas of research. For instance, Rosalind Franklin spent years collecting X-ray crystallography data to decipher the structure of DNA, but it was Crick and Watson who, once they got access to the data, soon published a landmark paper on the structure of DNA and were later awarded the Nobel Prize for their work. Franklin died before the award and Nobel Prizes cannot be awarded posthumously, so her hard work and scientific breakthroughs were not acknowledged. That story makes scientists extremely uncomfortable, because there is an ever-present fear of not being given credit for your work: if data that you spent a long time producing is made available to other researchers, there is a real possibility that they will use it to claim breakthrough findings instead of you. In other words, the real challenge is to give appropriate credit for the work and effort put into procuring, generating, curating and publishing scientific data.
- There are some field-specific differences in the way IP is treated that affect data sharing. For instance, in disciplines that lead to inventions, especially patentable hardware, scientists are told to keep their work secret, to get patents and to protect their IP. In other disciplines, where no patents exist, researchers are not allowed to “capture benefits”, although they contribute just as much to society as a whole. This second category includes findings about genetic sequences, which I believe rightly cannot be patented. But this means researchers are pushed to make their findings widely available, even if those findings then lose their association with the researcher who first made the discovery.
- There are clear concerns about data sharing from a data privacy perspective. The standard explanation to patients is that “data will be anonymized”, kept secure and governed in accordance with the consent given for its use, e.g. for scientific research. However, DNA is not anonymous. Your genome sequence is a unique identifier for you as an individual. Consider this in the context of personal information that used to be private and offline but is now public and online, e.g. your name, email, home address, postcode, work address and work relationships, much of it from social media. You did not think this would happen when you signed up to social media, but the way the technology works, and the way it is used, makes it possible to retrieve a lot of data about an individual. The same thing will happen with genetic data: it will become possible to join up all your data from different sources based on your DNA profile. This means that fully informed consent for data usage and governance is not possible, because nobody can be aware of all the future potential for pulling data together.
Would you contribute your own medical and genetic data for research? Would you share your data openly on the web?
I do share my data with all my doctors and regularly ask them if there are any trials I can contribute research materials to. The treatment I receive was created from past research trials, and I hope that the data I contribute will help the next generation of patients to benefit. It’s sad to me that the professionals are surprised to hear this request from me; it is a shame it is not more common. But I recommend it wholeheartedly for every patient. The researchers look so happy when you offer them data for their wonderful work, and it’s such a good feeling to help.
But I have not agreed to my whole genome sequence being published on the web (e.g. the 100,000 Genomes Project), because the privacy of my genetic information affects the privacy of my family. (Remember, anonymous today does not mean anonymous tomorrow.) So I cannot make that decision on their behalf and still keep their information private.