Data Repositories and FAIR

Status

Ready for review

FAIR elements

There are no prerequisites defined for this lesson plan.

After completing this lesson plan, participants will be able to:

Understand how digital repositories can offer support for metadata

Learn about repositories and interoperable file formats

Know how repositories aid persistent identifiers for future citations

Know that FAIRsharing records which persistent identifiers are used by which repository

Navigate some trustworthy repositories to understand how they implement FAIR

Understand which FAIR principles and sub-principles are fulfilled by the repository and which are the researcher's responsibility

Assess data FAIRness using F-UJI

Topic, definition and scope

How do repositories support FAIR?
The topic concerns the capacity of digital repositories to ensure that research data are findable, accessible, interoperable and reusable. This content is a summarised version, suitable as a recap for those with pre-existing knowledge of data science.
Data repositories are key in putting the FAIR principles into practice. They not only enable findability and accessibility but also provide persistent identifiers, documentation, and metadata, thus fostering reusability for humans and machines.
A “form” needs to be filled in (metadata by default).
The form complies with a specific metadata standard.
Metadata will then become machine-actionable and searchable in an online resource.
A persistent identifier for the data is automatically generated.
References to other data or metadata can be included.
Authentication and authorization procedures are in place.
Access can be regulated from closed to open.
The provision of machine-readable licences enhances the reusability of the data.
The use of standards and controlled vocabularies is enforced.
Interfaces for external services like OAI-PMH allow harvesting of metadata for stored records.
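
The metadata harvesting mentioned in the last point can be sketched in a few lines of Python. This is a minimal illustration, not a full harvester: the endpoint `https://repository.example.org/oai` and the sample response are hypothetical, and only the `ListRecords` verb, the `oai_dc` metadata prefix and the two XML namespaces are taken from the OAI-PMH 2.0 and Dublin Core specifications. No network call is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def extract_records(xml_text: str) -> list[dict]:
    """Collect the OAI identifier, titles and identifiers of each harvested record."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        records.append({
            "oai_identifier": record.findtext(
                "oai:header/oai:identifier", default="", namespaces=NS
            ),
            "titles": [e.text for e in record.iterfind(".//dc:title", NS)],
            "identifiers": [e.text for e in record.iterfind(".//dc:identifier", NS)],
        })
    return records


# Hypothetical response, shaped like what a repository could return for ListRecords.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1234</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example dataset</dc:title>
          <dc:identifier>https://doi.org/10.1234/example</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("https://repository.example.org/oai"))
print(extract_records(SAMPLE_RESPONSE))
```

Because the harvested record carries its persistent identifier inside the metadata, an external catalogue can index the record and still point back to the data.
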
Background:
- A number of previous projects and working groups have discussed what a common set of attributes should be to enable FAIR data, and to allow repository stakeholders to make their own decisions about which repository is best for them. Details of these previous efforts are summarised in the case statement of one existing cross-domain, worldwide effort under the auspices of the RDA: the RDA Data Repository Attributes Working Group. How FAIR is implemented in a repository, and how each FAIR principle aligns with a particular data attribute, can therefore be discovered from these efforts.

Summary of Tasks / Actions

Check lists
Is your repository suitable for data FAIRification?
Do you know of any challenges and how to remedy them?
Assess data FAIRness using F-UJI in different repositories and data resources and explain the differences among them:
- FTP server (e.g. The-normalised-Sentinel-1-Global-Backscatter-Model-mapping-Earths-land-surface-with-C-band-microwaves.pdf)
- URL (e.g. https://researchdata.tuwien.ac.at/records/n2d1v-gqb91)
- DOI (e.g. https://doi.org/10.48436/n2d1v-gqb91)
  - Can you explain what is missing in terms of FAIRness for each?
  - Which one gives the best FAIR results? Why?
  - Use the handout that is listed in **Materials / Equipment** and check which FAIR principles/sub-principles were not implemented in the repositories.
  - Which is more FAIR: citing research data using a URL or a DOI? Why?
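
When comparing several identifiers, it can help to script the F-UJI calls instead of using the web form. The sketch below only builds the requests and does not send them. It assumes a self-hosted F-UJI server at `http://localhost:1071` and the request fields `object_identifier`, `test_debug` and `use_datacite` as described in the F-UJI documentation; check these against the version you run. The credentials `user` and `secret` are placeholders.

```python
import base64
import json
from urllib.request import Request

# Assumption: a locally running F-UJI server at its documented default address.
FUJI_ENDPOINT = "http://localhost:1071/fuji/api/v1/evaluate"


def build_fuji_request(object_identifier: str, username: str, password: str,
                       endpoint: str = FUJI_ENDPOINT) -> Request:
    """Prepare (but do not send) a POST request asking F-UJI to assess one object."""
    payload = {
        "object_identifier": object_identifier,  # URL or DOI of the dataset
        "test_debug": True,                      # ask for debug messages per test
        "use_datacite": True,                    # let F-UJI look up DataCite metadata
    }
    # HTTP Basic authentication header built from the placeholder credentials.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# The URL and DOI examples from the task above, pointing at the same record.
targets = [
    "https://researchdata.tuwien.ac.at/records/n2d1v-gqb91",
    "https://doi.org/10.48436/n2d1v-gqb91",
]
requests_to_send = [build_fuji_request(t, "user", "secret") for t in targets]
for req in requests_to_send:
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Each prepared request can be sent with `urllib.request.urlopen(req)` and the JSON response decoded with `json.load`, which makes it easy to put the results for the URL and the DOI side by side.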
Go through the FAIR_principles_translation_SNSF_logo (snf.ch) sheet to get familiar with the FAIR requirements that can be fulfilled by the repository (keep in mind that not all requirements are manageable by repositories; some are the researcher's responsibility).
Repositories and their implemented data standards (as well as the data policies that recommend their use) can all be discovered in FAIRsharing (documentation on searching within FAIRsharing).
Use Case:
- The FAIRsharing documentation page linked here was created specifically for this lesson plan. It shows how to navigate FAIRsharing to discover suitable resources for a particular researcher's use case: a user story of a multi-omics RA who is using the library services to help them work out how to implement FAIR according to a particular funder's data policy.
https://fairsharing.gitbook.io/fairsharing/how-to/unsure-where-to-start
Match the following requirements to their corresponding FAIR principle/sub-principles:
- A “form” needs to be filled in (metadata by default).
- A persistent identifier for the data is automatically generated.
- References to other data or metadata can be included.
- Access can be regulated from closed to open.
- The use of standards and controlled vocabularies is enforced.
- A DOI is issued to every published record.
- The form complies with a specific metadata standard (DataCite)
- Metadata contains the PID
- Create a user account in a repository.

FAIR Principle	FAIR Sub-Principle	FAIR implementation in a Repository
Findable	F1: (meta)data are assigned a globally unique and persistent identifier
	F2: data are described with rich metadata (defined by R1 below)
	F3: metadata clearly and explicitly include the identifier of the data it describes
	F4: (meta)data are registered or indexed in a searchable resource
Accessible	A1: (meta)data are retrievable by their identifier using a standardised communications protocol
	A1.1: the protocol is open, free, and universally implementable
	A1.2: the protocol allows for an authentication and authorization procedure, where necessary
	A2: metadata are accessible, even when the data are no longer available
Interoperable	I1: (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
	I2: (meta)data use vocabularies that follow FAIR principles
	I3: (meta)data include qualified references to other (meta)data
Reusable	R1: (meta)data are richly described with a plurality of accurate and relevant attributes
	R1.1: (meta)data are released with a clear and accessible data usage licence
	R1.2: (meta)data are associated with detailed provenance
	R1.3: (meta)data meet domain-relevant community standards
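
Once the third column of the table has been filled in for a repository, the gaps can be listed mechanically. The following is a minimal sketch: the sub-principle codes come from the table above, while the function name `missing_sub_principles` and the example assessment are hypothetical and do not describe any real repository.

```python
# FAIR principles and their sub-principle codes, as listed in the table above.
FAIR_SUB_PRINCIPLES = {
    "Findable": ["F1", "F2", "F3", "F4"],
    "Accessible": ["A1", "A1.1", "A1.2", "A2"],
    "Interoperable": ["I1", "I2", "I3"],
    "Reusable": ["R1", "R1.1", "R1.2", "R1.3"],
}


def missing_sub_principles(implemented: set[str]) -> dict[str, list[str]]:
    """Return, per FAIR principle, the sub-principles a repository does not cover."""
    known = {code for codes in FAIR_SUB_PRINCIPLES.values() for code in codes}
    unknown = implemented - known
    if unknown:
        # Guard against typos such as "F5" in the filled-in handout.
        raise ValueError(f"Unknown sub-principle codes: {sorted(unknown)}")
    gaps = {}
    for principle, codes in FAIR_SUB_PRINCIPLES.items():
        not_covered = [code for code in codes if code not in implemented]
        if not_covered:
            gaps[principle] = not_covered
    return gaps


# Hypothetical assessment result for an imaginary repository.
example_assessment = {"F1", "F2", "F3", "F4", "A1", "A1.1", "I1", "R1.1"}
print(missing_sub_principles(example_assessment))
# {'Accessible': ['A1.2', 'A2'], 'Interoperable': ['I2', 'I3'], 'Reusable': ['R1', 'R1.2', 'R1.3']}
```

The remaining codes are the ones to discuss: either the repository should implement them, or they fall under the researcher's responsibility.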

Materials / Equipment

Understand the FAIR principles, e.g. https://www.go-fair.org/fair-principles/
Have a trusted repository or decide on using one, e.g. https://huspi.com/blog-open/software-code-repositories/

References

Take home tasks/preparation

Test your repository by FAIRifying one dataset using the handout above.
Think about an example, similar to the use case explained above, of how to find what a particular role (e.g. Data Steward) needs in FAIRsharing.

For example, start with a requirement they have, e.g. a funder data policy, and move them step-by-step from that data policy to a shortlist of standards and/or databases that they will need to align with and/or submit to. This example has now been written here: https://fairsharing.gitbook.io/fairsharing/how-to/unsure-where-to-start

Samah Jaber

Dayane Araujo

Branka Franicevic

Allyson Lister

Diana Pilvar

Anne-Françoise Adam-Blondon

The terms4FAIRskills project has created a formalised terminology that describes the competencies, skills and knowledge associated with making and keeping data FAIR.

Data steward, data manager, data scientist, researcher	wants competency in	data discovery, knowledge to choose FAIR data handling approaches appropriate to the research phenomena, understanding persistent identifiers, knowledge of theories underlying FAIR implementation, [data sharing](http://purl.obolibrary.org/obo/T4FS_0000482)
Online documentation	confers competency about	data discovery, knowledge to choose FAIR data handling approaches appropriate to the research phenomena, understanding persistent identifiers, knowledge of theories underlying FAIR implementation, data sharing
Online documentation	confers knowledge about	metadata standard persistent identifier repository citable data access