To make research data openly available, we recommend registering it in a data repository. There are many data repositories for publishing and sharing data, and some research areas already have well-established subject-specific repositories.
Publishing research data in established repositories gives you several advantages as a researcher:
- You will "automatically" create standard metadata that makes your dataset interoperable and reusable.
- The dataset will get a persistent identifier (PID), which makes it accessible and enables correct citation when the dataset is linked to a publication.
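As a rough illustration of how descriptive metadata and an identifier work together, the sketch below builds a citation string from a small metadata record. It assumes the identifier is a DOI (one common type of PID) that resolves through doi.org. The `DatasetRecord` class, its field names, the citation layout and the example values are assumptions made for this illustration; they are not the schema or citation style of any particular repository.

```python
from dataclasses import dataclass


@dataclass
class DatasetRecord:
    """Minimal descriptive metadata for a published dataset (illustrative fields only)."""
    creators: list[str]
    year: int
    title: str
    publisher: str  # the repository that hosts the dataset
    doi: str        # the identifier assigned by the repository


def doi_url(doi: str) -> str:
    """Turn a bare DOI into a resolvable link via the doi.org resolver."""
    return f"https://doi.org/{doi}"


def format_citation(record: DatasetRecord) -> str:
    """Build a citation string: Creators (Year). Title. Publisher. DOI link."""
    creators = "; ".join(record.creators)
    return (
        f"{creators} ({record.year}). {record.title}. "
        f"{record.publisher}. {doi_url(record.doi)}"
    )


# Hypothetical example values; the DOI is a placeholder, not a real dataset.
record = DatasetRecord(
    creators=["Andersson, A.", "Berg, B."],
    year=2024,
    title="Example measurements from a pilot study",
    publisher="Zenodo",
    doi="10.1234/example.5678",
)
print(format_citation(record))
```

In practice the repository generates both the metadata record and a suggested citation for you; the point here is only that a stable identifier lets the citation keep pointing at the same dataset.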
Datasets containing personal data of any kind must not be uploaded to open repositories. You can still use a repository to create a metadata record that describes a dataset containing personal data, while the files themselves should be uploaded to a secure storage space at KI.
If you use scripts to analyse data, it is good practice to annotate the scripts and publish them openly. That way, it will be clear to other researchers what methods were used, regardless of whether the dataset is openly available or not.
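What an annotated script looks like will vary between fields and languages. As one possible sketch, the short Python example below states its purpose, input, output and requirements in a header and comments each analysis step. The function name `summarise_by_group` and the sample values are made up for illustration.

```python
"""Summarise a measurement per study group.

Purpose : illustrate how an analysis script can be annotated before publication.
Input   : rows of (group, value); a small made-up sample stands in for real data.
Output  : count, mean and sample standard deviation for each group.
Requires: Python 3.9+ (standard library only).
"""
import statistics
from collections import defaultdict


def summarise_by_group(rows):
    """Return {group: {"n": count, "mean": mean, "sd": sample standard deviation}}."""
    # Step 1: collect the values belonging to each group.
    values = defaultdict(list)
    for group, value in rows:
        values[group].append(value)

    # Step 2: compute descriptive statistics per group.
    # The sample standard deviation (n - 1 denominator) needs at least two values.
    summary = {}
    for group, vals in values.items():
        summary[group] = {
            "n": len(vals),
            "mean": statistics.mean(vals),
            "sd": statistics.stdev(vals) if len(vals) > 1 else None,
        }
    return summary


if __name__ == "__main__":
    # Made-up example data, not taken from any real study.
    sample = [("control", 4.0), ("control", 6.0), ("treated", 7.0), ("treated", 9.0)]
    for group, stats in summarise_by_group(sample).items():
        print(group, stats)
```

Stating the choice of statistic in a comment (here, the sample rather than the population standard deviation) is the kind of detail that lets other researchers follow the method even when they cannot access the data.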
Examples of repositories
- The Swedish National Data Service (SND) is a national platform where research data can be shared openly, or made findable through published descriptions of datasets that for various reasons cannot be shared openly. SND's system for describing and sharing research data is called DORIS. This is the best option for sharing research data that contains sensitive information, such as personal data. Read more here [link to the DORIS page].
- Zenodo is a repository created by CERN and OpenAIRE. In Zenodo you can upload up to 50 GB per dataset free of charge. It is possible to connect data in Zenodo with the associated source code made available through GitHub.
- Figshare is a general data repository used by and available to researchers from all disciplines. It is possible to upload up to 20 GB free of charge.
- BioStudies is a data repository for descriptions of biological studies. BioStudies is included in ELIXIR's list of recommended databases.
- GitHub is a web-based storage service for code, where you can publish research code. It is possible to use GitHub for version control and get a time-stamped copy with a persistent identifier through the Zenodo repository.
More repositories, both general and subject-specific, are listed at Re3data.org. Here you can search for information about repositories or browse and filter by topic, area, certification, metadata standards, etc.