Publish and share research data - an overview
Expectations that research data will be published and made available are rising, and the COVID-19 pandemic has accelerated this development. Both publishers and funders often require a reference to the underlying data when your article is published. The underlying data may also be requested while your manuscript is under peer review.
Since the autumn of 2020, a new service, the Data Access Unit (DAU), has been in place at the Karolinska Institutet University Library (KIB). We are a team with different areas of expertise, and our task is to support you as a researcher and employee at KI in the process of publishing and sharing your research data.
Checklist - What is important to think about?
Your research data ...
FAIR stands for Findable, Accessible, Interoperable and Reusable. According to the FAIR principles research data have to be findable, there must be information on how to access them, they have to be compatible with other data, and they must be reusable. The FAIR principles play an important role in the work for Open Science and describe some of the most central guidelines for good data management and for Open Access to research data.
A large share of the research data at KI is sensitive and contains various forms of personal data. Not all such data can be shared openly, but it can still be described, e.g. with metadata in an open data repository such as DORIS, which is offered via KI's Data Access Unit (DAU).
Personal data in research at KI
You will need to find a suitable data repository where you can describe and publish your dataset. If the data contains personal data or other information that needs protection, KI's repository linked to DORIS is the solution currently available to KI researchers. With this solution, a description of the dataset becomes openly available while access to the data itself is restricted. If your dataset does not contain information that needs protection, it can be published openly. In that case, there are many different repositories to choose from.
When your dataset is published in a data repository, it can be assigned a Persistent Identifier (PID). A PID is a unique and persistent reference that makes it possible to find and reuse digital material. A Digital Object Identifier (DOI) is a PID that is often used for research data and scientific articles. In the article publishing process, it is common for the publisher to request a DOI for the underlying data. The DOI is used to create a consistent link from the article to the research data on which the analysis is based. To make sure the DOI works as a link, the DOI code needs to be preceded by https://doi.org/. Example: https://doi.org/10.5878/6fcv-1795.
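As a minimal sketch of the rule above, the following Python function turns a DOI code into a working link by placing https://doi.org/ in front of it. The function name doi_to_url and the list of prefixes it strips are assumptions made for this illustration; they are not part of any KI or DOI tooling.

```python
DOI_RESOLVER = "https://doi.org/"

# Prefixes that sometimes already precede a DOI in reference lists
# (an illustrative selection, not an exhaustive one).
KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi: str) -> str:
    """Return a link of the form https://doi.org/<DOI code>."""
    cleaned = doi.strip()
    for prefix in KNOWN_PREFIXES:
        if cleaned.lower().startswith(prefix):
            # Drop the existing prefix so it is not duplicated.
            cleaned = cleaned[len(prefix):].strip()
            break
    # Every DOI code begins with the directory indicator "10."
    if not cleaned.startswith("10."):
        raise ValueError(f"Not a DOI code: {doi!r}")
    return DOI_RESOLVER + cleaned


print(doi_to_url("10.5878/6fcv-1795"))  # https://doi.org/10.5878/6fcv-1795
```

The same function can be applied to every DOI in a reference list before submission, so that each one resolves as a link.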
In the database Registry of Research Data Repositories you can find out whether a data repository assigns a persistent identifier to published datasets and, if so, which type of identifier, e.g. DOI.
As many research funders have increased their requirements for openly shared and published research data, journal publishers have also introduced offers, and occasionally requirements, for the publication of datasets. You may be informed about these when you upload your manuscript for peer review.
When choosing a journal, it may therefore be wise to check in advance how you are expected to publish or share the data related to the manuscript, and whether the publishing method the publisher recommends also complies with the requirements and instructions from your research funder.
The recommendations of the research funders and the journal publishers.
Do you need to know more about how to publish or share research data? The Data Access Unit (DAU) at the library offers tailor-made workshops on the subject for KI's research groups and departments. A number of scheduled online sessions are also offered each semester, which researchers and doctoral students can register for. All workshops are held in English.
Cite and refer to research data
Citing data means referring to a published collection of research data in the same way as you do for journal articles, reports, conference papers and other publications. A reference to a published data collection should include sufficient information to be able to find the correct version.
DataCite and CrossRef have developed a tool, the DOI Citation Formatter, in which you can choose the citation style.
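A formatted citation can also be requested programmatically. The sketch below assumes the DOI content negotiation service that CrossRef and DataCite provide at https://doi.org/, where the Accept header text/x-bibliography asks for a formatted citation and the style parameter names a citation style such as apa or vancouver. Whether a particular DOI supports this depends on where it is registered, and the function names here are hypothetical.

```python
import urllib.request


def build_citation_request(doi: str, style: str = "apa") -> urllib.request.Request:
    """Prepare a request for a formatted citation of the given DOI code.

    Assumes the DOI is registered with an agency that supports
    content negotiation for text/x-bibliography.
    """
    url = "https://doi.org/" + doi
    accept = f"text/x-bibliography; style={style}"
    return urllib.request.Request(url, headers={"Accept": accept})


def fetch_citation(doi: str, style: str = "apa", timeout: float = 10.0) -> str:
    """Send the request and return the citation text (needs network access)."""
    request = build_citation_request(doi, style)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

Calling fetch_citation("10.5878/6fcv-1795", style="vancouver") would then request a Vancouver-style citation for the example DOI used earlier. Check the returned text against the library's reference guides before using it.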
The citation must contain:
- Principal investigator(s) / organisation
- Data repository
To refer to datasets, use the library's reference guides for APA and Vancouver.
Licenses for research data
For research data that has been made openly available to be reusable, the material needs to be assigned a license. A license states what others may and may not do with the dataset. Creative Commons and Open Data Commons licenses are commonly used open licenses for research data.
Read more about Licenses, embargoes and restrictions for research data at the Swedish National Data Service.
Documenting research data and analyses makes it more likely that the analyses can be reproduced. This allows you, the members of your research group, and others to repeat the analyses and arrive at the same results.
Do you think this sounds complicated and want to know more? You are always welcome to contact us at the Data Access Unit (DAU) with your questions. You are also welcome to register for our workshops; current workshops are advertised in our calendar.