What Are Repositories?

Repositories are document servers, operated by universities or research institutions, on which scholarly materials are archived and made available worldwide free of charge.

Categories of Repositories

A distinction is made between institutional and disciplinary repositories. Institutional repositories are document servers operated by institutions (mainly university libraries, other infrastructural institutions, or research organisations) that enable their members to digitally publish scholarly documents. Disciplinary repositories, by contrast, are trans-institutional. They are available to scientists and scholars to publish and archive their works on specific subjects, for example, in a specialist discipline. Examples are media/rep/, an open access repository for publications in the field of media studies; Social Science Open Access Repository (SSOAR) for the social sciences; PubMed Central for the biomedical and life sciences; and arXiv for scholarly articles, especially from the fields of physics, mathematics, and computer science.

As a rule, authors do not have to pay a fee to make scholarly works available in institutional or disciplinary repositories, and users do not have to pay to access these works.

As of August 2020, the Directory of Open Access Repositories (OpenDOAR) listed 5,395 repositories, via which universities and research institutions enable scientists to make scholarly documents freely available to the public.


Rights in the Case of Self-Archiving

Documents that are made available in repositories are frequently also published formally with a publisher – for example, in a journal, an edited volume, or as a monograph. The use of a repository to make a work that has also been published with a publisher available to the public is referred to as “self-archiving”; it constitutes green open access. In this case – in contrast to open access journals, for example – content licences are of minor importance. Scientists and scholars who, in addition to formal publication, also make their works available on a document server have often already transferred all rights of use to the publisher, and can therefore no longer license the self-archived version under a content licence. Rather, the possibility of self-archiving documents that have already been published in a journal or with a publisher presupposes either that the author has reserved the right to make the work available in a repository, or that the work is covered by the relevant provisions of copyright law (in Germany, e.g., the so-called Zweitveröffentlichungsrecht). The publisher’s open access policy might provide for the option that the work will be covered by the open access provisions of the German Alliance licences. The infographic at the bottom of this page provides brief information on self-archiving formally published works in repositories.

Rights in the Case of Gold Open Access

The rights situation is much simpler when repositories are used to publish scholarly works for the first time (gold open access) – for example, to publish conference proceedings, series, or other texts that are not published in parallel with a publisher. In this case, the authors usually hold the exclusive rights in the content of the works, and are therefore completely free to choose the conditions under which they are made available. Hence, they can be licensed under one of the aforementioned content licences, or they can simply be made available in accordance with the provisions of copyright law. Works that are first published in repositories, which are often known as preprints, can be found, for example, on the document servers arXiv or Zenodo. Because the documents found in repositories often vary in terms of content or subject matter, it is expedient when searching for documents to use a specialised search engine, such as the Bielefeld Academic Search Engine (BASE), which accesses full texts in repositories.

Content and Added Value of Repositories

The designs and functionalities of repositories are varied and extensive. Besides the classical differentiation between institutional and disciplinary repositories, repositories also vary in terms of the content provided, additional services, and their technical configuration. Many document servers provide for the archiving of all types of electronic (text) documents – for example, preprints and post-prints of journal articles, working papers, books and contributions to books, theses, and teaching and learning materials – whereas others exclude some document types.

Via their interfaces and indexing services, networked repositories offer numerous value-added services:

  • the virtual aggregation of thematically related documents from different document servers, thereby avoiding the need for time-consuming searches on numerous different servers and allowing the easiest possible access to scholarly texts
  • personalisation functions, for example, the creation of profiles by means of which readers can be informed about new documents on a particular subject
  • the automatic creation of publication lists or bibliographies of authors or working groups that can be dynamically integrated into personal websites or other online profiles, for example, research information systems
  • services such as Unpaywall, INSPIRE, or Google Scholar that allow different versions of the same document to be identified, thereby possibly enabling users who do not have access to a formal closed access publication to find an open access version on a document server
  • Repositories often allow the downloading of bibliographic data, which can then be imported into reference management programs, making it convenient to reuse the information in future publications.
  • Services such as INSPIRE and Google Scholar make it possible to access citation data for works on document servers and thus to identify their citation impact.
  • Statistical data show the dissemination and impact of documents in repositories. This information can be obtained in the form of download statistics (e.g., by using the data of the DFG project Open-Access-Statistik (OAS) or another statistics solution). Altmetrics capture the circulation of scholarly information in scholarly and public networks and media.
  • Via mutual linking between related objects, such as texts and research data or software, scientists can make the entire output of a research project available in an integrated way.
  • Document servers enable authors to comply easily with the guidelines of research funders that require project-related publications to be made available in open access. Many repositories even enable publications to be assigned to a specific research project. For example, anyone who imports a document into Zenodo or another OpenAIRE-compatible repository can add the EU grant ID via the metadata.

The aim of the above-mentioned value-added services and functions is to increase the visibility and dissemination of the publications in the repositories. In this pursuit, the repositories rely mainly on networking and cooperation with other information services or search engines. Of equal importance is the linking of personal or institute websites, annual bibliographies, or current research information systems (CRIS).
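
As an illustration of how a version-finding service can lead a reader from a closed access publication to a copy on a document server, the following Python sketch scans a record shaped like a response from the Unpaywall API and returns the first location hosted in a repository. The field names (is_oa, oa_locations, host_type, url, url_for_pdf) reflect our understanding of that response format and should be checked against the service's documentation; the sample record, its DOI, and the function name find_repository_copy are hypothetical.

```python
from typing import Optional

# Hypothetical record, shaped like a version-finding service response.
# The DOI and URLs are placeholders, not real publications.
SAMPLE_RECORD = {
    "doi": "10.1234/example",
    "is_oa": True,
    "oa_locations": [
        {"host_type": "publisher", "url": "https://publisher.example.org/article", "url_for_pdf": None},
        {
            "host_type": "repository",
            "url": "https://repo.example.org/record/42",
            "url_for_pdf": "https://repo.example.org/record/42/file.pdf",
        },
    ],
}


def find_repository_copy(record: dict) -> Optional[str]:
    """Return the URL of the first repository-hosted copy, or None."""
    if not record.get("is_oa"):
        return None
    for location in record.get("oa_locations") or []:
        if location.get("host_type") == "repository":
            # Prefer a direct PDF link; fall back to the landing page.
            return location.get("url_for_pdf") or location.get("url")
    return None


if __name__ == "__main__":
    print(find_repository_copy(SAMPLE_RECORD))
```

A record without any repository-hosted location yields None, which a calling application could treat as "no self-archived version found".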

Findability of the Documents – The OAI Protocol

The fact that most repositories use the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) immensely facilitates the search for scholarly documents. Moreover, it ensures the aforementioned dissemination of openly accessible information. The OAI-PMH allows the metadata of documents in distributed repositories to be aggregated in one database. Technically speaking, the OAI-PMH is an XML-based protocol that serves to request and transmit metadata. An OAI service provider (e.g., OAIster or BASE) supports the search by using the OAI-PMH. The service provider harvests metadata from the individual repositories (the content providers), processes them, and makes them available for search queries. This guarantees the findability as well as the maximum dissemination, and thus the visibility, of the scholarly texts.
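
The harvesting step can be sketched in a few lines of Python. The example below builds a ListRecords request URL and extracts identifiers, titles, and creators from a Dublin Core (oai_dc) response. It is a minimal sketch, not a full harvester: the base URL https://repo.example.org/oai and the sample response are hypothetical, no network request is made, and resumption tokens (used for paging through large result sets) and error responses are not handled.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespaces used by OAI-PMH 2.0 and its Dublin Core metadata format.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, shortened response from a content provider.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repo.example.org:1234</identifier>
        <datestamp>2020-08-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Example Working Paper</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build the URL a service provider would request to harvest records."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # optional: restrict harvesting to one set
    return f"{base_url}?{urlencode(params)}"


def parse_records(xml_text):
    """Extract identifier, titles, and creators from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": record.findtext("oai:header/oai:identifier", default="", namespaces=NS),
            "titles": [el.text for el in record.iterfind(".//dc:title", NS)],
            "creators": [el.text for el in record.iterfind(".//dc:creator", NS)],
        })
    return records


if __name__ == "__main__":
    print(build_list_records_url("https://repo.example.org/oai"))
    for rec in parse_records(SAMPLE_RESPONSE):
        print(rec)
```

A service provider would repeat this request for each content provider, store the parsed metadata in its own database, and answer search queries from that aggregated index.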

Quality Standards for Repositories – The DINI Certificate

The Deutsche Initiative für Netzwerkinformation e. V. (German Initiative for Network Information, DINI) seeks to improve information and communication services and to promote and support the necessary developments. The DINI Certificate for Open Access Publication Services enables the standardised assessment of the reviewed document servers and the services they provide. DINI sees the certificate as quality control for document and publication services. The criteria on which certification is based include the visibility of the overall service; author support; the security, authenticity, and integrity of the technical system; and the long-term availability and findability of the archived documents. To date (as of September 2020), 64 repositories have been certified by DINI. In October 2019, the sixth edition of the DINI Certificate was released.

How to Use Repositories

Infographic on how to use open access repositories (CC BY 4.0 International).
Source: Harris, Emma & Issakson, Pelle (2019). Infographic: How to Use Open Access Repositories. Zenodo. https://doi.org/10.5281/zenodo.3666627