Research Data Management: Where to submit data

Where to submit data

The information provided below is from an Information Sheet compiled by the CONUL Research Group for researchers who want to share / submit their research data to a repository.

CONUL Research Group (2018). Where to submit data: CONUL Information Sheet.

CONUL is the Irish Consortium of National and University Libraries (www.conul.ie) and is the representative body of research libraries in Ireland and Northern Ireland. The CONUL Research Group is one of the groups within CONUL, with a remit of exploring and promoting best practice and providing guidance and expertise in a wide range of activities, including research data management, open access, scholarly communications, research impact, digital repositories, digitisation and digital preservation, common infrastructures, and digital scholarship.

Reasons for sharing data

Depositing your research data in a data archive or data repository will facilitate its discovery and preservation.

  • Impact & longevity: Your data may be cited by others. Open publications and data receive more citations, over longer periods
  • Compliance: Funders, publishers and institutions may require that you share your data
  • Transparency & quality: Your findings can be replicated and compared with other studies
  • Collaboration: Data sharing creates opportunities for follow-on research and collaboration
  • Re-use: Your data can be used in novel ways. Data collection can be funded and carried out once, and the data then used many times for a variety of purposes, including future / follow-on research and discovery
  • Efficiency: Data sharing is good research practice!

There may be reasons for not sharing your data, e.g. privacy and confidentiality issues or the commercial value of the data. Horizon 2020 has coined the phrase “As open as possible, as closed as necessary.”

If you are unable to share your data publicly, consider making it available internally to future researchers to facilitate follow-on research, and/or creating a metadata record in your chosen archive or repository. A metadata record describes your data and helps others to find out that it exists. For either option to be possible you will need to manage your data.

Advantages of a data repository or archive

A data repository allows researchers to upload and publish their data, thereby making the data available for other researchers to re-use. Similarly, a data archive allows users to deposit and publish data but will generally offer greater levels of curation to community standards, have specific guidelines on what data can be deposited and is more likely to offer long-term preservation as a service. Sometimes the terms data repositories and data archives are used interchangeably.

A data repository or archive will provide services such as:

  • A persistent identifier such as a “digital object identifier” or DOI; the presence of a DOI facilitates discoverability and citability
  • Assistance with metadata provision e.g. through the use of a template
  • The ability to apply a licence to your data
  • Support for compliance with the FAIR data principles (data that are Findable, Accessible, Interoperable, and Reusable), as data are published online with appropriate metadata and are assigned a persistent identifier; see Jones, Sarah, & Grootveld, Marjan. (2017, November). How FAIR are your data?. Zenodo. http://doi.org/10.5281/zenodo.1065991
  • Acceptance of a wide range of data types
  • Long-term access and, in some cases, long-term preservation
  • Useful search, navigation and visualisation functionality
  • Reach to a wider audience of potential users
  • Management of requests for data on your behalf
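As a rough illustration of how a persistent identifier and a metadata record work together, the sketch below describes the Jones & Grootveld reference cited above as a small record, derives its resolvable DOI link and builds a simple citation. The field names and the citation layout are assumptions made for this example only; a real repository will supply its own metadata template and citation guidance.

```python
import json

# Hypothetical field names, loosely modelled on common dataset metadata
# properties; your chosen repository defines the actual template.
record = {
    "identifier": {"type": "DOI", "value": "10.5281/zenodo.1065991"},
    "creators": ["Jones, Sarah", "Grootveld, Marjan"],
    "title": "How FAIR are your data?",
    "publisher": "Zenodo",
    "publication_year": 2017,
}


def doi_url(rec):
    """Return the link that resolves the record's DOI."""
    return "https://doi.org/" + rec["identifier"]["value"]


def citation(rec):
    """Build a simple citation string (layout is illustrative only)."""
    authors = " & ".join(rec["creators"])
    return (f"{authors} ({rec['publication_year']}). "
            f"{rec['title']} {rec['publisher']}. {doi_url(rec)}")


print(json.dumps(record, indent=2))  # the record as it might be exported
print(citation(record))
```

Because the identifier is stored in the record itself, the same record can drive both discovery (the metadata) and citation (the DOI link).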

When to select a data repository

Choose early so that you can familiarise yourself with the repository’s requirements.

Requirements may include:

  • depositing in certain file formats
  • using a specific metadata standard
  • inclusion of documentation to help describe your data.

Understanding such requirements will enable you to design your data collection materials for easier metadata and documentation creation.

Initial questions

How to select a data repository

Ask:

  • Is it reputable? Is it listed in Re3data thereby meeting their conditions of inclusion?
  • Is it appropriate to your discipline?
  • Will it take the data you want to deposit?
  • Is there a size limit?
  • Does it provide a DOI / persistent identifier?
  • Does it provide guidance on how the data should be cited?
  • Does it provide access control, where necessary, for your research data?
  • Does it ensure long-term preservation / curation?
  • Does it provide expert help with e.g. metadata provision, curation?
  • Is there a charge?

Other questions may pertain depending on your requirements. For more information see the UK’s Digital Curation Centre’s checklist: http://www.dcc.ac.uk/resources/how-guides-checklists/where-keep-research-data/where-keep-research-data  
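If you are comparing several candidate repositories, some of the questions above can be recorded and checked systematically. The sketch below is one possible way to do this; the repository names, the answers and the `unmet_requirements` helper are all hypothetical and only cover a subset of the questions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """Answers gathered for one repository (all values are hypothetical)."""
    name: str
    listed_in_re3data: bool
    accepts_data_type: bool
    provides_persistent_id: bool
    offers_access_control: bool
    size_limit_gb: Optional[float]  # None means no stated limit


def unmet_requirements(repo: Candidate, dataset_gb: float,
                       needs_access_control: bool) -> List[str]:
    """Return the checklist questions this repository fails, if any."""
    problems = []
    if not repo.listed_in_re3data:
        problems.append("not listed in re3data")
    if not repo.accepts_data_type:
        problems.append("does not take this kind of data")
    if repo.size_limit_gb is not None and dataset_gb > repo.size_limit_gb:
        problems.append("dataset exceeds size limit")
    if not repo.provides_persistent_id:
        problems.append("no DOI / persistent identifier")
    if needs_access_control and not repo.offers_access_control:
        problems.append("no access control")
    return problems


candidates = [
    Candidate("Repository A", True, True, True, False, 50.0),
    Candidate("Repository B", True, True, True, True, None),
]

for repo in candidates:
    issues = unmet_requirements(repo, dataset_gb=80.0, needs_access_control=True)
    print(repo.name, "->", issues or "meets the questions checked")
```

Questions that call for judgement, such as reputation, curation quality or charges, still need to be assessed by reading each repository's policies.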

Locate a data repository

Some universities have their own data repositories, which allow researchers to deposit, share and license their data resources for discovery and use by others. There are more than 600 discipline-specific data repositories worldwide with community-specific standards. They may also be called data centres or archives.

re3data.org (Registry of Research Data Repositories) is the primary place to locate a data repository.  You can search it by specific research discipline and then filter by access categories, data usage licenses, whether the repository gives the data a persistent identifier etc. 

Re3data uses a series of symbols to indicate key services.

To be registered in re3data.org a research data repository must:

  • be run by a legal entity, such as a sustainable institution (e.g. library, university)
  • clarify access conditions to the data and repository as well as the terms of use
  • have a focus on research data

Discipline-specific repositories

Discipline-specific repositories have the expertise and resources to deal with particular types of data. They have different policies and may charge for their services.

See also PLOS and Springer Nature recommended repositories

Multidisciplinary repositories

If there is no discipline-specific repository in your area, select a general repository. These can handle a variety of different data types. Charges may apply but can be included in a funding application. Key general repositories are listed below. This list is for information purposes only and is not exhaustive:

Data Hub provides free access to its core features letting you search for data, register published datasets, create and manage groups of datasets

Dataverse

Dryad hosts a wide range of data types. For some journals there is no charge to deposit in Dryad.

figshare archives data and software for all subjects and is suitable for small to medium sized projects that do not require specialised curation

GitHub is a code hosting site where you can store and share code for free

Open Science Framework

Zenodo is a multi-disciplinary data repository where researchers can deposit both publications and data and create links between them

See also:

ICPSR is an international consortium of more than 750 academic institutions and research organizations that maintains a data archive of more than 250,000 files of research in the social and behavioral sciences

Licensing and publishing data

When you make your data available you need to use a license so that potential users know what they are allowed to do with your data. 

A license states what can be done with the data and how that data can be redistributed e.g. Creative Commons Licences

Ball, A. (2014). ‘How to License Research Data’. DCC How-to Guides. Edinburgh: Digital Curation Centre.

Learn more about rights relating to research data from the UK Data Service 

UK Data Service, Best practice in governance of data for research: Licensing and accessing. Webinar, slides and Q&A, 18 April 2018

Code

GitHub is a widely used platform for hosting and reviewing code: https://github.com/

GitHub does not assign DOIs itself, but it integrates with the Zenodo and figshare repositories, which can archive your GitHub repository and assign it a DOI. This facilitates discoverability and citability and enables your GitHub repository to be cited in academic literature.

How to find datasets

  • Web of Science includes the Data Citation Index, which covers research data from repositories across disciplines and around the world. You can access it from the Library catalogue. It indexes data and provides links to the repositories where the data are stored. A short tutorial on the Data Citation Index is also available.

  • Scopus searches include simultaneous searches for relevant research data, and the search results page includes dataset results. Read more about this feature on the Scopus blog.

Data journals

Conventional journals may link to or embed research data within the structure of the scientific article. Data journals, by contrast, offer a platform for the publication of "data articles" or "dataset papers": typically short articles providing a technical description of a dataset. Some data journals also publish (i.e. host) the dataset themselves; others link to datasets hosted on dedicated data repositories. Where data journals link to external datasets there are often minimum requirements for the third-party hosting, e.g. Geoscience Data Journal specifies that the data centre must be able to mint a DOI. Some articles in conventional journals such as Nature also link to datasets. Peer-reviewed and citable data journals include Scientific Data (Nature) and the Geoscience Data Journal (Wiley).

Ware, Mark, & Mabe, Michael. (2015). The STM report: An overview of scientific and scholarly journal publishing. STM: International Association of Scientific, Technical and Medical Publishers.