Open Access in biology

A large and growing number of open access (OA) journals have become an important part of the publishing landscape in biology. In addition to publishers whose portfolios consist exclusively of OA journals, there are many others who publish both closed-access and OA journals, or who offer hybrid options. When it comes to using preprint repositories, biology is still lagging behind other disciplines. Besides publications in scholarly journals, access to data is provided by a large number of repositories and databases.

Open Access journals

The Directory of Open Access Journals (DOAJ) lists more than 250 OA journals in biology (as of August 2017). Given the large number, only selected representative examples and new developments will be described in what follows.

One of the locomotives of the OA movement in biology is the publisher Public Library of Science (PLOS). With the journal PLOS Biology, which was launched in 2003, it laid the foundation stone for the dissemination of the OA idea in the life sciences. Besides PLOS Biology, PLOS operates a number of other, specialised journals, for example PLOS Computational Biology and PLOS Neglected Tropical Diseases. One of the best-known PLOS journals is PLOS ONE, which published over 100 000 articles in the course of its first seven years. Based on the annual number of articles published, it is the largest scholarly journal in the world. One important feature of PLOS ONE is that submissions are assessed only on the basis of their scientific validity and are not rejected for lacking relevance, as happens at many other journals. Hence, PLOS ONE is a prototype of the so-called megajournal, a model that has since been adopted by many other publishers (e.g. the Scientific Reports of the Nature Publishing Group and the Royal Society journal Open Biology).

Another publisher that plays a pioneering role in OA is BioMed Central, which operates a large number of OA journals, for example BMC Biology, and is a co-founder of the Open Access Scholarly Publishers Association. The Hindawi Publishing Corporation, which also publishes numerous journals in the life sciences, had converted all its scholarly journals to OA by 2007. In 2012, the journal eLife was founded by the Howard Hughes Medical Institute, the Max Planck Society, and the Wellcome Trust with the aim of making it an OA flagship that, by being highly selective, offers a high-quality alternative to the “triumvirate” comprising Nature, Science, and Cell.

In addition, there are OA publishers who specialise in the life sciences. They include Pensoft Publishing, whose journals are closely integrated with databases such as ZooBank, IPNI and GBIF, and whose Biodiversity Data Journal is completely XML-based. A number of the journals published by the OA publisher Copernicus Publications, which focuses on the geosciences, cover subfields of biology. They include, for example, Biogeosciences and Fossil Record.

One criticism that is frequently voiced in the OA debate is that authors have to pay high fees to publish their articles in OA (e.g. US$1350 to publish in PLOS ONE and US$2900 to publish in PLOS Biology). The OA journal PeerJ, which was established in 2012, is challenging the industry with its new business model and very low author-side fees. Instead of charging per article, PeerJ charges a one-off contribution (“publishing plan”), which entitles the author to lifelong membership. A payment of US$99 allows an author to publish one article per year, US$199 covers two articles per year, and US$299 entitles the author to publish an unlimited number of articles. Every co-author of an article must have a publishing plan. However, if an article has more than 12 authors, only 12 must have a paid publishing plan, while the remaining authors can use a free publishing plan. In general, this model can reduce publication costs considerably. Numerous organisations, for example the Max Planck Society, have concluded institutional agreements with the publisher. Like many BioMed Central and Copernicus journals, PeerJ also offers the option of open peer review: if the referees and the authors agree, the complete peer review history is made available online. One OA journal that makes the peer review process completely transparent is F1000 Research, which has an open post-publication peer review model: submissions are published after minimal editorial assessment and processing and are then publicly reviewed by invited reviewers, whose names and affiliations are published alongside their reviews. Each reviewer report is made available online immediately after receipt. ScienceOpen has also adopted the post-publication peer-review model. At the same time, it has moved further away from the classical peer-review model practised by journals by including elements of social networks and facilitating collaborative work.
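The PeerJ fee arithmetic described above can be made concrete with a small calculation. The following is only an illustrative sketch: it uses the 2017 prices quoted in the text, assumes that no co-author already holds a publishing plan, and the function name, constant names, and tier labels are hypothetical rather than PeerJ's own terminology.

```python
# Illustrative sketch only. Prices are the figures quoted in the text (2017);
# all names and tier labels below are hypothetical, not PeerJ terminology.
PLAN_PRICES_USD = {"one_per_year": 99, "two_per_year": 199, "unlimited": 299}
MAX_PAID_AUTHORS = 12      # beyond 12 authors, the rest use a free plan
PLOS_ONE_APC_USD = 1350    # per-article fee quoted in the text, for comparison


def peerj_first_article_cost(n_authors, plan="one_per_year"):
    """Up-front cost in US$ if none of the co-authors has a plan yet."""
    if n_authors < 1:
        raise ValueError("an article needs at least one author")
    paid_authors = min(n_authors, MAX_PAID_AUTHORS)
    return paid_authors * PLAN_PRICES_USD[plan]


if __name__ == "__main__":
    for n in (1, 4, 12, 20):
        cost = peerj_first_article_cost(n)
        print(f"{n:>2} authors: US${cost} (PLOS ONE: US${PLOS_ONE_APC_USD})")
```

Because the plans are lifelong, the cost applies only once per author. Later articles by the same authors within their plan's yearly allowance would incur no further fee under this model.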

Disciplinary repositories

Unfortunately, the use of preprints is not as widespread in the biosciences as it is in other disciplines such as physics. Nonetheless, besides making works publicly available on group web pages or in university repositories, there are now a number of other ways of taking the green road to OA. For some years now, the preprint server arXiv, which is operated by Cornell University Library, has had a Quantitative Biology archive with 11 categories. Since 2013, the Cold Spring Harbor Laboratory has been operating bioRxiv, a preprint server for the biosciences. Moreover, PeerJ maintains its own preprint server, PeerJ PrePrints.

The main archives for postprint versions of journal literature in the biosciences are PubMed Central and Europe PubMed Central. The U.S. National Institutes of Health (NIH) require that all scholarly articles that arise from the projects they fund be submitted to PubMed Central upon acceptance for publication and that they be made publicly available no later than 12 months after publication. For this reason, a large number of articles can now be found in PubMed Central. However, they do not have to be made available under an open licence, nor must they be in the Open Access Subset, which can be separately accessed and searched.

In addition to repositories for scholarly articles, numerous databases serve as important tools in the everyday lives of researchers in the biological sciences. Via web pages and application programming interfaces (APIs), the National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI), in particular, offer structured access to various collections of data, for example sequences and structures of biomolecules such as DNA, RNA and proteins, metabolic pathways, and much more. The Registry of Research Data Repositories (re3data) lists over 600 reviewed repositories in biology (as of August 2017). In addition to such specialised repositories, platforms like Zenodo, figshare and Dryad enable research data for which there are no specific repositories (the so-called “long tail” of science) to be made publicly accessible on a long-term basis.

Literature searches in biology

Originally developed to index medical literature, the literature database MEDLINE now contains most of the literature metadata of biosciences journals. The corresponding search engine PubMed offers (unfortunately well-hidden) options to filter search results to show only OA articles. Alternative search engines for MEDLINE, such as GoPubMed, or transdisciplinary literature search engines such as BASE (Bielefeld Academic Search Engine) also offer such filter options.
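Such a filter can also be applied programmatically. The sketch below builds (but does not send) a search URL for the NCBI E-utilities `esearch` endpoint and appends PubMed's `free full text[sb]` subset filter to the query. It rests on assumptions: the helper name is hypothetical, the filter syntax should be checked against the current PubMed documentation, and the filter selects articles with free full text, which is not necessarily the same as OA with rights of reuse.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# NCBI E-utilities search endpoint (ESearch).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_free_fulltext_query(term, max_results=20):
    """Return an ESearch URL restricted to records with free full text.

    Hypothetical helper: it only assembles the URL and performs no request.
    """
    filtered_term = f"({term}) AND free full text[sb]"
    params = {
        "db": "pubmed",          # search the PubMed database
        "term": filtered_term,   # user query plus the subset filter
        "retmax": max_results,   # maximum number of IDs to return
        "retmode": "json",       # ask for a JSON response
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    url = build_free_fulltext_query("CRISPR")
    print(url)
    # Decode the query string again to show the term that would be sent.
    print(parse_qs(urlparse(url).query)["term"][0])
```

The resulting URL can be opened in a browser or fetched with any HTTP client. NCBI's usage guidelines (rate limits, API keys) apply to real requests.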

Key players

There is growing evidence that the major research funders now recognise the advantages of OA publications and openly licensed research results and data in all disciplines. All the major German research organisations, such as the Max Planck Society, the German Research Foundation (DFG), the Fraunhofer-Gesellschaft, the Leibniz Association, and the Helmholtz Association, were among the first signatories of the Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities, which has also been signed by numerous other research institutions and universities outside Germany. In November 2014, the DFG launched an Appeal for the use of open licences in science. Unfortunately, commitment to OA on the part of the professional societies in the field of biology in Germany has been relatively weak up to now. Organisations in other countries have moved further ahead in this respect. For example, the American Society for Microbiology operates mBio, an OA journal that is highly regarded in the scientific community. The NIH’s decision to require that scholarly articles arising from the projects they fund be made publicly available no later than 12 months after publication is likewise at least a first step towards making the results of publicly funded research accessible on a broad and long-term basis.

In degree programmes in the biosciences, the topics of “open access” and “open licences” are generally missing. However, universities must consider it their duty to ensure that future scientists are taught the basics in this regard (for example the importance of rights of reuse and Creative Commons licences). Lack of knowledge about the various licensing models leads to wrong decisions and is exploited by some publishing houses to blur the lines between OA and “merely” cost-free access, i.e. access without rights of reuse (so-called “open washing”).

Open Science

Open access refers mainly to the opening of access to the formal scholarly and scientific publications that generally represent the last step in the research process. However, on the way to this last step a lot of other information is generated, most of which remains hidden from the public. The fact that only a fraction of the information generated is actually published can be regarded as an historical artefact. Before the establishment of digital technologies such as the internet, data storage and dissemination were much more expensive than they are today. In recent years, a movement that is endeavouring to use the new technical possibilities in order to make all parts of the scientific process publicly accessible has gathered under the banner of “open science”. Reusability, transparency, and reproducibility are important objectives aimed at enhancing the quality of research and increasing the effectiveness of the use of research funds. Many projects, both within the biosciences and transdisciplinary, develop on a grass-roots basis in the respective communities and address various aspects of the scientific process. For example, there are projects dedicated to making the following resources and information publicly accessible online: research data (open data), source code of software (open source), laboratory notebooks (open notebook science), manuscript review reports (open peer review), teaching resources (open educational resources), the quantification of the influence of publications (open metrics), and research grant proposals and grant reviews. However, open science also includes opening up participation in scientific research to members of the general public (citizen science). In addition to groups such as the Open Knowledge Foundation’s Open Science Working Group, the Center for Open Science, and the Mozilla Science Lab, some journals recognise the need to increase the transparency of research. For example, since 2014 PLOS ONE has required that all data underlying published findings be made accessible. PLOS Biology and PLOS Genetics support the Research Resource Identification Initiative, with whose help research resources and methods receive persistent and unique identifiers.

It can be clearly seen that the existing technical possibilities are still far from being adequately exploited in order to guarantee an effective exchange of knowledge. This gap between possibilities and their actual realisation can be explained by the fact that, when recruiting employees and granting funding, the basis for assessing scientists and their works remains unchanged. In general, such evaluations are based only on formal publications in scholarly journals and on the journal impact factor, a metric that has many inherent weaknesses. This practice reduces the motivation to make research results, data, and methods publicly available at an early stage, and thereby prevents their wider and quicker use by others. It would be desirable for the major research organisations, the professional societies in the field of biology, and the universities to rethink their evaluation practices. In addition to making the necessary infrastructure available, incentives must be created to make the results of scientific endeavour publicly accessible as early as possible and as free as possible of technical and legal barriers.

Content editor of this web page: Dr Konrad Förstner