Accurate, Focused Research on Law, Technology and Knowledge Discovery Since 2002

More than 2 million research papers have disappeared from the Internet

Nature: “More than one-quarter of scholarly articles are not being properly archived and preserved, a study of more than seven million digital publications suggests. The findings, published in the Journal of Librarianship and Scholarly Communication on 24 January, indicate that systems to preserve papers online have failed to keep pace with the growth of research output. “Our entire epistemology of science and research relies on the chain of footnotes,” explains author Martin Eve, a researcher in literature, technology and publishing at Birkbeck, University of London. “If you can’t verify what someone else has said at some other point, you’re just trusting to blind faith for artefacts that you can no longer read yourself.” Eve, who is also involved in research and development at digital-infrastructure organization Crossref, checked whether 7,438,037 works labelled with digital object identifiers (DOIs) are held in archives. DOIs — which consist of a string of numbers, letters and symbols — are unique fingerprints used to identify and link to specific publications, such as scholarly articles and official reports. Crossref is the largest DOI registration agency, allocating the identifiers to about 20,000 members, including publishers, museums and other institutions. The sample of DOIs included in the study was made up of a random selection of up to 1,000 registered to each member organization. Twenty-eight percent of these works — more than two million articles — did not appear in a major digital archive, despite having an active DOI. Only 58% of the DOIs referenced works that had been stored in at least one archive. The other 14% were excluded from the study because they were published too recently, were not journal articles or did not have an identifiable source…”
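As a rough check on the figures quoted above, the short sketch below applies the three reported percentages to the 7,438,037 sampled works. The shares are rounded in the source, so the resulting counts are approximations rather than the study's exact numbers. The variable and function names are illustrative only and do not come from the study.

```python
# Approximate work counts implied by the rounded percentages in the Nature summary.
# Assumption: the 28% / 58% / 14% shares all apply to the full sample of 7,438,037 DOIs.
TOTAL_WORKS = 7_438_037

SHARES_PERCENT = {
    "not_in_major_archive": 28,    # active DOI, but no copy found in a major archive
    "in_at_least_one_archive": 58,
    "excluded_from_study": 14,     # too recent, not a journal article, or no identifiable source
}


def approximate_counts(total, shares_percent):
    """Return the whole-number count implied by each rounded percentage."""
    return {label: total * pct // 100 for label, pct in shares_percent.items()}


counts = approximate_counts(TOTAL_WORKS, SHARES_PERCENT)
for label, n in counts.items():
    print(f"{label}: about {n:,}")
```

The first figure comes out slightly above two million, which is consistent with the "more than two million articles" stated in the article.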

See also Crossref Blog Post by Martin Eve: “What do we know about DOIs – Crossref holds metadata for approximately 150 million scholarly artifacts. These range from peer reviewed journal articles through to scholarly books through to scientific blog posts. In fact, amid such heterogeneity, the only singular factor that unites such items is that they have been assigned a [digital] object identifier (DOI); a unique identification string that can be used to resolve to a resource pertaining to said metadata (often, but not always, a copy of the work identified by the metadata)…”
