It really is not surprising that scientific data is disappearing all the time. But, oh, think of the value of this data – in terms of cost and knowledge. Irreplaceable.
The original article from Smithsonian.com is here: The vast majority of raw data from old scientific studies may now be missing.
A new survey of 20-year-old studies shows that poor archives and inaccessible authors make 90 percent of raw data impossible to find.
When a group of researchers tried to email the authors of 516 biological studies published between 1991 and 2011 and ask for the raw data, they were dismayed to find that more than 90 percent of the oldest data (from papers written more than 20 years ago) were inaccessible. In total, even including papers published as recently as 2011, they were only able to track down the data for 23 percent.
“Some of the time, for instance, it was saved on three-and-a-half inch floppy disks, so no one could access it, because they no longer had the proper drives,” Vines says. Because the basic idea of keeping data is so that it can be used by others in future research, this sort of obsolescence essentially renders the data useless.
And preserving data is so important, it’s worth remembering, because it’s impossible to predict in which directions research will move in the future.
Seems to me that the openEHR approach to data definitions is an excellent candidate for preventing the health data ‘black hole’ too!
Non-proprietary, open data specifications are a key component for future-proofing irreplaceable clinical and research data.