
Data Repositories and FAIR

Topic, definition and scope

  • How do repositories support FAIR?
  • The topic concerns the capacity of digital repositories to ensure the findability, accessibility, interoperability and reusability of research data. This content is a summarised version, suitable as a recap for those with pre-existing knowledge of data science.
  • Data repositories are key in putting the FAIR principles into practice. They not only enable findability and accessibility but also provide persistent identifiers, documentation, and metadata, thus fostering reusability for humans and machines.
  • A “form” needs to be filled in, so metadata are captured by default.
  • The form complies with a specific metadata standard.
  • Metadata will then become machine-actionable and searchable in an online resource.
  • A persistent identifier for the data is automatically generated.
  • References to other data or metadata can be included.
  • Authentication and authorization procedures are in place.
  • Access can be regulated from closed to open.
  • The provision of machine-readable licences enhances the reusability of the data.
  • The use of standards and controlled vocabularies is enforced.
  • Interfaces for external services like OAI-PMH allow harvesting of metadata for stored records.
  • Background:
    • A number of previous projects and working groups have discussed which common set of attributes repositories should expose to enable FAIR data and to allow repository stakeholders to make their own decisions about which repository is best for them. Details of these previous efforts are summarised in the case statement of one existing cross-domain, worldwide effort under the auspices of the RDA: the RDA Data Repository Attributes Working Group. These efforts therefore show how FAIR is implemented in a repository and how each FAIR principle aligns with a particular data attribute.
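
The OAI-PMH harvesting mentioned in the list above can be sketched as follows. This is a minimal illustration, not the interface of any particular repository: the base URL `https://repository.example.org/oai`, the helper names `list_records_url` and `parse_records`, and the sample response are all assumptions made for the example. Only the `ListRecords` verb, the `oai_dc` metadata prefix and the two XML namespaces come from the OAI-PMH 2.0 and Dublin Core specifications. No network request is made; the sketch only builds the request URL and parses a response that has already been retrieved.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written sample response (hypothetical repository and DOI).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1234</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example dataset</dc:title>
          <dc:identifier>https://doi.org/10.1234/example</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def parse_records(xml_text):
    """Extract OAI identifier, title and PID from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "oai_identifier": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": rec.findtext(".//dc:title", namespaces=NS),
            "pid": rec.findtext(".//dc:identifier", namespaces=NS),
        })
    return records


if __name__ == "__main__":
    print(list_records_url("https://repository.example.org/oai"))
    for record in parse_records(SAMPLE_RESPONSE):
        print(record)
```

Because the harvested record carries both a title and the persistent identifier, an external index can list the dataset and link back to it, which is how such an interface contributes to findability.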

Summary of Tasks / Actions

  • Checklists
  • Is your repository suitable for data FAIRification?
  • Do you know of any challenges and how to remedy them?
  • Assess data FAIRness using F-UJI in different repositories and data resources and explain the differences among them.
  • Go through the FAIR_principles_translation_SNSF_logo (snf.ch) sheet to get familiar with the FAIR requirements that can be fulfilled by the repository (keep in mind that not all requirements are manageable by repositories; some are the researcher’s responsibility!).
  • Repositories and their implemented data standards (as well as the data policies that recommend their use) can all be discovered in FAIRsharing (documentation on searching within FAIRsharing).
  • Use Case:
    • Here is a link to the FAIRsharing documentation page that was created specifically for this lesson plan. It covers navigating FAIRsharing to discover suitable resources for a particular researcher’s use case (a user story of a multi-omics RA who is using the library services to help them figure out how to implement FAIR according to a particular funder’s data policy).

    https://fairsharing.gitbook.io/fairsharing/how-to/unsure-where-to-start

  • Match the following requirements to their corresponding FAIR principle/sub-principles:
    • A “form” needs to be filled in, so metadata are captured by default.
    • A persistent identifier for the data is automatically generated.
    • References to other data or metadata can be included.
    • Access can be regulated from closed to open.
    • The use of standards and controlled vocabularies is enforced.
    • A DOI is issued to every published record.
    • The form complies with a specific metadata standard (DataCite).
    • Metadata contain the PID.
    • Create a user account in a repository.
| FAIR Principle | FAIR Sub-Principle | FAIR implementation in a Repository |
|---|---|---|
| Findable | F1: (meta)data are assigned a globally unique and persistent identifier | |
| | F2: data are described with rich metadata (defined by R1 below) | |
| | F3: metadata clearly and explicitly include the identifier of the data they describe | |
| | F4: (meta)data are registered or indexed in a searchable resource | |
| Accessible | A1: (meta)data are retrievable by their identifier using a standardised communications protocol | |
| | A1.1: the protocol is open, free, and universally implementable | |
| | A1.2: the protocol allows for an authentication and authorization procedure, where necessary | |
| | A2: metadata are accessible, even when the data are no longer available | |
| Interoperable | I1: (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation | |
| | I2: (meta)data use vocabularies that follow FAIR principles | |
| | I3: (meta)data include qualified references to other (meta)data | |
| Reusable | R1: (meta)data are richly described with a plurality of accurate and relevant attributes | |
| | R1.1: (meta)data are released with a clear and accessible data usage licence | |
| | R1.2: (meta)data are associated with detailed provenance | |
| | R1.3: (meta)data meet domain-relevant community standards | |
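
As a complement to the handout above, the following sketch shows how a repository could automate a few of these checks against a single metadata record. It is an illustration under stated assumptions, not the logic of F-UJI or of any real repository: the flat record layout, the field names (`identifier`, `related_identifiers`, `licence_url` and so on), the `REQUIRED_FIELDS` list and the `check_record` helper are all hypothetical, and the example DOI is a placeholder. Only four sub-principles are covered; most of the others need human judgement or repository-level evidence.

```python
# Descriptive fields this sketch treats as the minimum for "rich" metadata
# (an assumption for illustration, not a definition from the FAIR principles).
REQUIRED_FIELDS = ("title", "creators", "publisher", "publication_year")

# Hypothetical record; every value is a placeholder, and the licence is
# deliberately left out so that one check fails.
EXAMPLE_RECORD = {
    "identifier": "https://doi.org/10.1234/example",
    "title": "Example dataset",
    "creators": ["Doe, Jane"],
    "publisher": "Example Repository",
    "publication_year": 2024,
    "related_identifiers": [
        {"relation_type": "IsSupplementTo",
         "identifier": "https://doi.org/10.1234/paper"},
    ],
}


def check_record(record):
    """Return {sub-principle: passed?} for a few machine-checkable items."""
    related = record.get("related_identifiers", [])
    return {
        # F1: a persistent identifier is present (here: a DOI given as a URL).
        "F1": str(record.get("identifier", "")).startswith("https://doi.org/"),
        # F2: all required descriptive fields are filled in.
        "F2": all(record.get(field) for field in REQUIRED_FIELDS),
        # I3: references exist and each one states its relation type,
        # which is what makes a reference "qualified".
        "I3": bool(related) and all(
            ref.get("relation_type") and ref.get("identifier") for ref in related
        ),
        # R1.1: a licence is given as a resolvable URL.
        "R1.1": str(record.get("licence_url", "")).startswith("https://"),
    }


if __name__ == "__main__":
    for principle, passed in check_record(EXAMPLE_RECORD).items():
        print(f"{principle}: {'pass' if passed else 'fail'}")
```

In this example the record passes F1, F2 and I3 but fails R1.1 because no licence is recorded, which is the kind of gap a repository can close by making the licence field mandatory in its deposit form.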

Materials / Equipment


References


Take home tasks/preparation

  • Test your repository by FAIRifying one dataset using the above handout.
  • Think about an example, similar to the use case explained above, of how to find what a particular role (e.g. Data Steward) needs in FAIRsharing.

    For example, start with a requirement they have, e.g. a funder data policy, and move them step-by-step from that data policy to a shortlist of standards and/or databases that they will need to align with and/or submit to. This example has now been written here: https://fairsharing.gitbook.io/fairsharing/how-to/unsure-where-to-start