What are repositories?

Repositories are document servers operated at higher education or research institutions in which scientific and scholarly materials are archived and made accessible worldwide free of charge.

Documents that are made publicly available via a repository are frequently formally published by a publisher as well, for example in a journal or edited volume, or as a monograph. In contrast to publication in an open access (OA) journal, open content licences are of secondary importance when a work is deposited in an OA repository. This is because scholars and scientists who self-archive their work in a repository in addition to formal publication have often transferred exclusive rights of use in the published work to the publisher and are therefore not entitled to make it available under an open content licence. Rather, self-archiving a work that has already been formally published presupposes one of the following: that the author has retained the right to self-archive a version of the work in a repository; that the publisher has an OA policy that provides for this option; or that the work falls within the scope of the OA provisions of the licences of the Alliance of German Science Organisations (Alliance licences) or the relevant provisions of the copyright act (in Germany: the UrhG). Detailed information on self-archiving a version of formally published works in repositories can be found in the section entitled “Self-archiving documents in repositories”.

The use of repositories to provide OA to a version of published works represents the so-called green road to OA. The rights situation is simpler when repositories are used for the primary publication of scholarly works in OA - the gold road to OA - for example for the publication of conference proceedings, working paper series, or other texts that are not published by a publishing house at the same time. In such cases, the authors usually hold the exclusive rights in these works and are free to choose the conditions under which they make them available. Hence, the works can be made available under one of the aforementioned open content licences or simply under the provisions of the copyright act (in Germany: the UrhG). Primary publications via repositories can be found, for example, in the arXiv and Zenodo repositories. As repositories often contain content from different sources or from different disciplines, the use of a specialised search engine such as BASE (Bielefeld Academic Search Engine) for document searches is recommended.

Institutional and disciplinary repositories

A distinction is made between institutional and disciplinary repositories. Institutional repositories are document servers maintained by institutions (mostly university libraries, other infrastructure organisations, or research institutions) to enable their members to digitally publish or self-archive their scientific and scholarly documents. Disciplinary repositories, by contrast, are supra-institutional and subject-based, and are available to scholars and scientists for the publication and archiving of their works. Examples include peDOCS, a German-language full-text repository for educational science; the Social Science Open Access Repository (SSOAR); and arXiv, which specialises in articles from the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, and statistics. Both user access to, and self-archiving of, scientific and scholarly documents in institutional and disciplinary repositories are, as a rule, free of charge.

As of the beginning of September 2014, the Directory of Open Access Repositories (OpenDOAR) listed 2,730 repositories via which higher education and research institutions enable their scientists and scholars to provide OA to their scholarly documents.

Content and added value of repositories

There are many different types of repositories, and they have rich and varied functionalities. Besides the classical distinction between institutional and disciplinary repositories, the offerings of the various archives also differ in terms of their content, value-added services, and technical design. Many repositories provide for the archiving of all types of digital (text) documents, for example preprints/postprints of journal articles, working papers, monographs and book chapters, teaching and learning materials, and theses and dissertations. Other repositories exclude certain document types, such as theses and dissertations.

Networked repositories offer numerous value-added services via their interfaces and reference services. These include:

  • The generation of virtual collections of thematically related documents from different repositories in order to save users from having to conduct laborious searches in many different archives, and to afford the easiest possible access to scientific and scholarly texts.
  • Personalisation functions, for example the generation of profiles so that users can be informed of new documents in a particular discipline, on a particular subject, or from a particular institution or author.
  • The automatic generation of publication lists or bibliographies of individual authors or working groups, which can be dynamically integrated into personal websites or other online profiles, for example research information systems.
  • The identification of different versions of the same document by services such as INSPIRE or Google Scholar, so that users who do not have access to a formal, closed-access publication may find an OA version in a repository.
  • The download of bibliographic data, which can be imported into reference management software for convenient reuse in future publications.
  • The calculation of citation statistics for works stored in repositories by services such as INSPIRE and Google Scholar, which allows the citation impact of the documents to be determined.
  • The collection of information on the dissemination and impact of documents in repositories in the form of download statistics (e.g. by using the data generated by the DFG-funded project OA-Statistics [OAS] or another statistical solution) or altmetrics, which measure the circulation of scientific and scholarly information in academic and public networks and media.
  • Reciprocal links between related objects such as texts, research data, and software, by means of which scientists can make the entire output of a research project available to users. This facilitates the search for information and gives researchers the opportunity to present their published texts in context, together with other documents, data, or software.
  • Support for compliance with the guidelines of research funders that require publicly funded projects to provide OA to the scientific and scholarly publications relating to their results. Some repositories even enable self-archived documents to be automatically assigned to a particular project of a research funder by means of a project ID. For example, researchers who post documents to Zenodo or another OpenAIRE-compliant repository can include the EU grant ID in the metadata. In this way, they comply with the funding body’s requirements, and their documents are indexed by, and assigned to, the appropriate project in the OpenAIRE portal, which provides a single point of access to OA works generated by EU-funded projects.

The aim of the aforementioned value-added services and functionalities is to increase the visibility and dissemination of the publications in the repositories. When pursuing this aim, repositories rely in particular on networking and collaboration with other information services and storage systems such as search engines and subject-specific or multi-disciplinary databases. Of equal importance are links to information offerings such as personal or institutional websites, university information systems such as annual bibliographies or research information systems that index documents in repositories, and the support that repositories offer their users to enable them to comply with research funders’ OA guidelines.

Document discoverability - the OAI Protocol

Most repositories support the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), which greatly simplifies the search for scientific and scholarly documents and supports the aforementioned dissemination of the OA content. The OAI-PMH enables OAI service providers (e.g. OAIster or BASE) to harvest the metadata of documents in geographically dispersed repositories and to bring them together in a single database. A search in this database accesses the records of the digital resources of all the repositories covered by the service provider in question. Technically speaking, the OAI-PMH is an XML-based protocol that supports requests for, and harvesting of, metadata: the service provider harvests the metadata of the individual repositories (the data providers), processes these data, and makes them available for search queries. This improves the findability and dissemination - and therefore the visibility - of the scientific and scholarly texts.
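To make the harvesting step more concrete, the following is a minimal sketch in Python of what a service provider might do: build an OAI-PMH ListRecords request and extract identifiers and titles from the XML response. It rests on several assumptions. The base URL repository.example.org is hypothetical, the sample response is hand-written for illustration rather than taken from a real repository, and the function names are invented for this example. The sketch makes no network request and ignores paging of large result sets via resumption tokens.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH 2.0 responses and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def build_list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build a ListRecords request URL for one data provider.

    base_url is the repository's OAI-PMH endpoint (hypothetical here).
    from_date optionally restricts harvesting to records changed since then.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


def parse_records(xml_text):
    """Return (identifier, title) pairs found in a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.findall(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        records.append((identifier, title))
    return records


# Hand-written sample response (illustrative only, not real repository data).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1234</identifier>
        <datestamp>2014-09-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A hypothetical working paper</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


if __name__ == "__main__":
    print(build_list_records_url("https://repository.example.org/oai",
                                 from_date="2014-09-01"))
    for identifier, title in parse_records(SAMPLE_RESPONSE):
        print(identifier, "-", title)
```

A real service provider would send the request URL to each data provider, follow resumption tokens until every record has been retrieved, and merge the parsed metadata into its own search index.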

Quality standards of repositories - the DINI certificate

The German Initiative for Network Information (DINI) is a not-for-profit organisation committed to improving the information and communication services of higher education and research institutions and to promoting and supporting the implementation of the necessary developments. The DINI Certificate for Document and Publication Services, which was developed by DINI’s Electronic Publishing working group, facilitates the standardised assessment of repositories and their services and is intended as a quality control instrument for document and publication services. The criteria on which certification is based include the visibility of the overall service provided, the support offered to authors, the security, authenticity, and integrity of the technical system, and the long-term availability and discoverability of the archived documents. As of September 2014, 49 repositories had been certified by DINI.

Since September 2014, it has been possible to apply for the current DINI Certificate 2013, which, for the first time, also provides for the certification of OA journals and the pre-certification of technical platforms (as “DINI-ready”).

Repository overview lists

  • DINI overview of all German repositories, 49 of which have a DINI Certificate

Search engines for repositories