Topic, definition and scope
- How do repositories support FAIR?
- The topic concerns the capacity of digital repositories to ensure the findability, accessibility, interoperability and reusability of research data. This content is a summarised version, suitable as a recap for those with pre-existing knowledge of data science.
- Data repositories are key in putting the FAIR principles into practice. They not only enable findability and accessibility but also provide persistent identifiers, documentation, and metadata, thus fostering reusability for humans and machines.
- A “form” needs to be filled in on deposit, so metadata is provided by default.
- The form complies with a specific metadata standard.
- Metadata will then become machine-actionable and searchable in an online resource.
- A persistent identifier for the data is automatically generated.
- References to other data or metadata can be included.
- Authentication and authorization procedures are in place.
- Access can be regulated from closed to open.
- The provision of machine-readable licences enhances the reusability of the data.
- The use of standards and controlled vocabularies is enforced.
- Interfaces such as OAI-PMH allow external services to harvest the metadata of stored records.
- Background:
- A number of previous projects and working groups have discussed what a common set of attributes should be to enable FAIR data and to allow repository stakeholders to make their own decisions about which repository is best for them. Details of these previous efforts are summarised in the case statement of one existing cross-domain, worldwide effort under the auspices of the RDA: the RDA Data Repository Attributes Working Group. How FAIR is implemented in a repository, and how each FAIR principle aligns with a particular data attribute, can therefore be discovered from these efforts.
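The OAI-PMH harvesting interface mentioned in the feature list above can be sketched in a few lines of Python. This is a minimal illustration, not a full harvester: the endpoint `https://repository.example.org/oai` and the sample record are invented for the example, and `list_records_url` and `parse_records` are hypothetical helper names. Only the `ListRecords` verb, the `oai_dc` metadata prefix and the XML namespaces come from the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint: substitute the OAI-PMH base URL of your repository.
BASE_URL = "https://repository.example.org/oai"

# XML namespaces used by OAI-PMH responses carrying Dublin Core metadata.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented sample response, shaped like a ListRecords reply with one record.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example dataset</dc:title>
          <dc:identifier>https://doi.org/10.1234/example</dc:identifier>
          <dc:rights>https://creativecommons.org/licenses/by/4.0/</dc:rights>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build the URL of a ListRecords request for the given endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def parse_records(xml_text):
    """Extract identifier, title and rights from each harvested record."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        records.append({
            "oai_identifier": record.findtext(
                "oai:header/oai:identifier", namespaces=NS),
            "title": record.findtext(".//dc:title", namespaces=NS),
            "identifiers": [e.text for e in record.iterfind(".//dc:identifier", NS)],
            "rights": [e.text for e in record.iterfind(".//dc:rights", NS)],
        })
    return records


if __name__ == "__main__":
    print(list_records_url(BASE_URL))
    for rec in parse_records(SAMPLE_RESPONSE):
        print(rec)
```

A harvested record that carries a persistent identifier in `dc:identifier` and a licence URL in `dc:rights` is a quick sign that the repository exposes these fields in machine-readable form.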
Summary of Tasks / Actions
- Checklists
- Is your repository suitable for data FAIRification?
- Do you know of any challenges and how to remedy them?
- Assess data FAIRness using F-UJI in different repositories and data resources and explain the differences among them:
- FTP server (e.g. The-normalised-Sentinel-1-Global-Backscatter-Model-mapping-Earths-land-surface-with-C-band-microwaves.pdf)
- URL (e.g. https://researchdata.tuwien.ac.at/records/n2d1v-gqb91)
- DOI (e.g. https://doi.org/10.48436/n2d1v-gqb91)
- Can you explain what is missing in terms of FAIRness for each?
- Which one gives the best FAIR results? Why?
- Use the handout that is listed in **Materials / Equipment** and check which FAIR principles/sub-principles were not implemented in the repositories.
- Which is more FAIR: citing research data using a URL or a DOI? Why?
- Go through the FAIR_principles_translation_SNSF_logo (snf.ch) sheet to get familiar with the FAIR requirements that can be fulfilled by the repository (keep in mind that not all requirements can be managed by repositories; some are the researcher’s responsibility).
- Repositories and their implemented data standards (as well as the data policies that recommend their use) can all be discovered in FAIRsharing (documentation on searching within FAIRsharing).
- Use Case:
- Here is a link to the FAIRsharing documentation page that was created specifically for this lesson plan. It covers navigating FAIRsharing to discover suitable resources for a particular researcher’s use case (a user story of a multi-omics RA who is using the library services to help them figure out how to implement FAIR according to a particular funder’s data policy).
https://fairsharing.gitbook.io/fairsharing/how-to/unsure-where-to-start
- Match the following requirements to their corresponding FAIR principle/sub-principles:
- A “form” needs to be filled in on deposit, so metadata is provided by default.
- A persistent identifier for the data is automatically generated.
- References to other data or metadata can be included.
- Access can be regulated from closed to open.
- The use of standards and controlled vocabularies is enforced.
- A DOI is issued to every published record.
- The form complies with a specific metadata standard (DataCite).
- Metadata contains the PID.
- Create a user account in a repository.
| FAIR Principle | FAIR Sub-Principle | FAIR implementation in a Repository |
|---|---|---|
| Findable | F1: (meta)data are assigned a globally unique and persistent identifier | |
| | F2: data are described with rich metadata (defined by R1 below) | |
| | F3: metadata clearly and explicitly include the identifier of the data they describe | |
| | F4: (meta)data are registered or indexed in a searchable resource | |
| Accessible | A1: (meta)data are retrievable by their identifier using a standardised communications protocol | |
| | A1.1: the protocol is open, free, and universally implementable | |
| | A1.2: the protocol allows for an authentication and authorization procedure, where necessary | |
| | A2: metadata are accessible, even when the data are no longer available | |
| Interoperable | I1: (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation | |
| | I2: (meta)data use vocabularies that follow FAIR principles | |
| | I3: (meta)data include qualified references to other (meta)data | |
| Reusable | R1: (meta)data are richly described with a plurality of accurate and relevant attributes | |
| | R1.1: (meta)data are released with a clear and accessible data usage licence | |
| | R1.2: (meta)data are associated with detailed provenance | |
| | R1.3: (meta)data meet domain-relevant community standards | |
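For the URL-versus-DOI question in the tasks above, one practical difference is that a DOI can be resolved through doi.org, which supports content negotiation, so a machine can ask for structured metadata instead of a landing page. The sketch below only prepares such a request and does not send it. `extract_doi` and `metadata_request` are hypothetical helper names, the regular expression is a simplified DOI pattern rather than a complete validator, and the `application/vnd.datacite.datacite+json` media type is assumed to apply because the example DOI is taken to be registered with DataCite.

```python
import re
from urllib.request import Request

# Simplified DOI pattern (an assumption for this sketch, not a full validator).
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def extract_doi(reference):
    """Return the bare DOI found in a URL or citation string, or None."""
    match = DOI_PATTERN.search(reference)
    return match.group(0).rstrip(".,;") if match else None


def metadata_request(doi, media_type="application/vnd.datacite.datacite+json"):
    """Prepare (but do not send) a content-negotiation request for a DOI."""
    return Request(f"https://doi.org/{doi}", headers={"Accept": media_type})


if __name__ == "__main__":
    references = [
        "https://doi.org/10.48436/n2d1v-gqb91",
        "https://researchdata.tuwien.ac.at/records/n2d1v-gqb91",
    ]
    for ref in references:
        doi = extract_doi(ref)
        if doi is None:
            print(f"{ref}: no DOI found, only a location-based URL")
        else:
            req = metadata_request(doi)
            print(f"{ref}: request {req.full_url} with Accept={req.get_header('Accept')}")
```

Passing the prepared request to `urllib.request.urlopen` would perform the lookup; that step needs network access and is left out here. The plain repository URL yields no DOI, which illustrates why a citation by URL depends on the repository keeping that address stable.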
Materials / Equipment
- Understand the FAIR principles, e.g. https://www.go-fair.org/fair-principles/
- Have a trusted repository, or decide on using one, e.g. https://huspi.com/blog-open/software-code-repositories/
References
- https://www.openaire.eu/item/fair-data-and-trusted-repositories
- https://www.fairsfair.eu/news/fair-data-repositories-key-features-defined
- https://www.snf.ch/en/7GhWDP8omTMLZ00O/news/news-210122-open-research-data-which-data-repositories-can-be-used
- Zenodo
- FAIR_principles_translation_SNSF_logo (snf.ch)
- FAIRsharing’s educational factsheet on databases
Take home tasks/preparation
- Test your repository by FAIRifying one dataset using the handout above.
- Think of an example, similar to the use case explained above, of how to find what a particular role (e.g. Data Steward) needs in FAIRsharing.
For example, start with a requirement they have, e.g. a funder data policy, and move them step-by-step from that data policy to a shortlist of standards and/or databases that they will need to align with and/or submit to. This example has now been written here: https://fairsharing.gitbook.io/fairsharing/how-to/unsure-where-to-start