FAIR vs open data/science

Status

Ready for review

FAIR elements

Findability
Accessibility
Reusability

For this lesson plan, participants should have a foundational understanding of:

The FAIR Principles (basic knowledge)

After completing this lesson plan, the participants are capable of:

Beginner and Intermediate

Compare FAIR and open data

Beginner Level

Identify official university support channels and online repositories for open science inquiries.

Beginner Level

Describe the benefits and challenges of open science

Beginner Level

Explain the challenges of making data open

Intermediate

Choose a suitable repository to make data open

Topic, definition and scope

This lesson plan explores the topics of Open Science and FAIR.

Open Science (OS) is the movement to make scientific research, data and their dissemination available to any member of an inquiring society, from professionals to citizens. It builds on principles of scientific growth and public access, and includes practices such as publishing open research and campaigning for open access, with the ultimate aim of making it easier to publish and communicate scientific knowledge. Several concepts, from the development of knowledge to its dissemination, fall under the umbrella term of Open Science.

Open Science is the practice of science in such a way that others can collaborate and contribute, where research data, lab notes and other research processes are freely available, under terms that enable the reuse, redistribution and reproduction of the research and its underlying data and methods.

Topics to be covered in the lesson plan:

FAIR and open science definitions (beginner, intermediate, expert)
- Open, restricted, embargo and closed access (beginner)
- Open data and open research (methods, software, tools, code) (beginner/intermediate)
- Reproducibility problem (beginner)
- Data availability statement (expert)
- Berlin declaration (data stewards)
Benefits of open data (intermediate)
- Comparison with FAIR data
Reasons for not making data open (intermediate)
- Sensitive/patient data, anonymization
How can I adopt the principles and tooling of open science? (expert)
- Practical examples - link to unit on choosing a repository

FAIR element(s)

Findable - to participate in open science, you need to deposit your research where others can find it.

F4. (Meta)data are registered or indexed in a searchable resource

Accessible - in open science, the data should be fully available to everybody. However, that is not always possible. For this reason, authentication and authorisation procedures need to be in place, as specified in the FAIR Principles.

A1.1 The protocol is open, free, and universally implementable
A1.2 The protocol allows for an authentication and authorisation procedure, where necessary

Reusable - in Europe, others only have clear legal rights to reuse a resource if a license has been added to it.

R1.1 (Meta)data are released with a clear and accessible data usage license

Lesson content

Activity

Time

Type

Level

Before the lesson

Exercise:

Comparison between FAIR and Open Data

In groups of three, students compare the two scenarios below. At the end, each group discusses whether the data in each scenario are FAIR, open, both or neither.

Time 15 minutes

Scenario 1: Patient Trial

A highly structured, machine-readable dataset of clinical trial results. It uses standardized medical vocabulary, has a unique DOI (Digital Object Identifier), and features detailed metadata. However, because it contains private medical data, researchers must sign a strict privacy agreement to get an encrypted access token.

Correct Answer: FAIR but not open (restricted access)

Scenario 2: The GitHub Treasure

A genomics lab uploads a gene-sequencing dataset to a public repository. It has a unique DOI, uses standard FASTA file formatting, includes rich metadata explaining the methodology, and carries an open-use license.

Correct Answer: Open and FAIR

Plenary discussion:

Time 10 minutes

The following questions can be used in the Plenary discussion:

If Scenario 1 isn’t ‘Open,’ why is it still incredibly valuable for science? (This helps them realize that protecting privacy doesn’t mean data should be messy or unfindable).
What would we need to change in Scenario 2 to make it only Open, but no longer FAIR? (This tests whether they can reverse-engineer the concepts, e.g. by stripping the metadata and DOI and dumping the data as a raw, unlabeled text file).
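
For instructors who want to show the distinction on screen, the two scenarios can be sketched in a few lines of Python. This is a minimal classroom illustration, not an official FAIR assessment: the `Dataset` fields and the functions `is_fair`, `is_open` and `classify` are hypothetical simplifications, and a real evaluation checks many more criteria. It also assumes that the signed privacy agreement in Scenario 1 counts as clear usage terms.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical, simplified description of a dataset for classroom use."""
    name: str
    has_persistent_id: bool      # e.g. a DOI
    has_rich_metadata: bool
    uses_standard_format: bool   # e.g. FASTA or a standard vocabulary
    has_clear_usage_terms: bool  # a license or an access agreement (assumption)
    access: str                  # "open", "restricted", "embargoed" or "closed"


def is_fair(ds: Dataset) -> bool:
    # FAIR is about how well data are described and managed,
    # not about who is allowed to download them.
    return (ds.has_persistent_id and ds.has_rich_metadata
            and ds.uses_standard_format and ds.has_clear_usage_terms)


def is_open(ds: Dataset) -> bool:
    # Open means anyone can access the data without asking permission.
    return ds.access == "open"


def classify(ds: Dataset) -> str:
    fair = "FAIR" if is_fair(ds) else "not FAIR"
    openness = "open" if is_open(ds) else "not open"
    return f"{fair}, {openness}"


patient_trial = Dataset("Patient Trial", True, True, True, True, "restricted")
github_treasure = Dataset("GitHub Treasure", True, True, True, True, "open")
# The answer to the second plenary question: metadata and DOI stripped away.
raw_dump = Dataset("Unlabeled text dump", False, False, False, False, "open")

for ds in (patient_trial, github_treasure, raw_dump):
    print(f"{ds.name}: {classify(ds)}")
```

The two checks are deliberately independent of each other, which mirrors the point of the exercise: a dataset can pass one and fail the other.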

Open Discussion

During the lesson

Exercise:

Identify the right Help-Line

Time 15 minutes

Present several catastrophic scenarios for a researcher in Mentimeter. Then have the participants choose who the right contact person might be.

Scenario 1: “I am writing a grant proposal and the funder requires a 2-page Data Management Plan (DMP) by next Friday. I don’t know where to start.”
- Correct Channel: ➡️ Your Central Data Steward at the Faculty or University Hospital (They provide DMP templates and review drafts).
Scenario 2: “I have 600 gigabytes of human neuroimaging data. I know it needs to be FAIR, but I don’t know what metadata standard or file repository my specific faculty prefers.”
- Correct Channel: ➡️ Your Central Data Steward at the Faculty or University Hospital (They are embedded experts who know domain-specific standards).
Scenario 3: “I want to publish my article Open Access, but the journal is charging an 1800 euro fee. I need to know if our university has an agreement to cover this cost.”
- Correct Channel: ➡️ The Library Open Access Team (They manage journal publisher deals and funding pots).
Scenario 4: “My dataset contains highly encrypted personal identification keys. I need a secure server to store it while we analyze it.”
- Correct Channel: ➡️ The University or University Hospital Privacy Team (They handle secure infrastructure and GDPR storage compliance).

Open Discussion

Exercise

Identifying Benefits and Challenges of Open Science and FAIR

Time 10 minutes

Put students in pairs. Assign every pair one specific stakeholder from the research world (e.g., Pair A looks at the Individual Researcher, Pair B looks at The Public/Society, Pair C looks at The Scientific Community).
Instruct the pairs to write down two things on their sheet or digital board:

Describe the benefit: Describe one major reason why Open/FAIR science helps their stakeholder.
Describe the challenge: Describe one major roadblock or headache this stakeholder faces when trying to do it.

Call out each stakeholder group and have one pair rapidly read aloud their descriptions. The instructor notes them on the board to build a collective map.

10 minutes

Open Discussion

Exercise:

Explaining the Challenges of Making Data Open

Project a fake, high-risk research snippet on your Mentimeter screen. Tell the students, for example: “You want to share this raw study dataset openly on GitHub today. Look at the text: which details are a massive legal or ethical hazard if published?”

2. The Mentimeter Vote:

Time 2 - 5 minutes

Have students select from a multiple-choice list or use a word cloud to flag the dangerous data points. Ask them to think about GDPR, participant privacy, and intellectual property.

3. The Peer Explanation Debrief:

Time 5 - 10 minutes

Turn to the class and ask: “Why can’t we just use a simple ‘Find and Replace’ to delete the names and call it a day?” Have 2 or 3 students explain the hidden challenges of anonymization.
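
If the class needs a concrete prompt, one hidden challenge can be demonstrated with a small Python sketch. It assumes invented toy records and a hypothetical helper, `smallest_group`, which counts how many people share the same combination of indirect identifiers (the idea behind k-anonymity). It is an illustration only, not a complete anonymization check.

```python
from collections import Counter

# Invented toy records for teaching; none of these describe real people.
records = [
    {"name": "Alice Example", "age": 34, "postcode": "1011", "diagnosis": "asthma"},
    {"name": "Bob Example", "age": 34, "postcode": "1011", "diagnosis": "diabetes"},
    {"name": "Carol Example", "age": 71, "postcode": "9999", "diagnosis": "rare disease X"},
]


def strip_names(rows):
    """The naive 'Find and Replace' step: remove the direct identifier only."""
    return [{key: value for key, value in row.items() if key != "name"} for row in rows]


def smallest_group(rows, quasi_identifiers):
    """Return the size of the smallest group of rows that share the same
    values for the given quasi-identifiers. A result of 1 means at least
    one person is unique in the data and may be re-identifiable."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values())


without_names = strip_names(records)
k = smallest_group(without_names, ["age", "postcode"])
print(f"Smallest group size after removing names: {k}")
```

In this toy data the third record stays unique on age and postcode even though its name is gone, so anyone who knows a 71-year-old in that postcode can learn the diagnosis.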

Open Discussion

After the lesson

Exercise:

Time 15 minutes

Explain the Case Study:

Have the participants read the following case study and decide which repository they would choose to publish at least the study’s metadata.

Dr. Alcaraz is leading a qualitative research project exploring long-term health outcomes and access to medical care for children living in temporary immigration communities in the Philippines.

The Data: 45 hours of raw audio recordings and transcriptions from deep, semi-structured interviews with children (ages 6–12) and their legal caretakers. The transcripts mention specific medical conditions, illegal housing arrangements, and exact geographical locations.
The Dilemma: Dr. Alcaraz knows the raw data is far too sensitive to be completely public under data protection regulations (like GDPR or local privacy laws). However, her funding body requires her to make the project FAIR. She decides she will keep the raw transcripts securely locked away, but she wants to publish the metadata record openly so other global health researchers know the project exists and can request collaboration.
Give the participants the following list of repositories to choose from:

Repository Option	Key Features
Option A: Zenodo	A massive, free global generalist repository hosted by CERN. It assigns an automatic DOI and allows anyone to upload anything instantly. It has a “Restricted Access” feature where you can post the metadata openly but keep the files hidden unless you manually approve a user’s request.
Option B: DataverseNL	A secure, institutional/national repository network used by Dutch universities. It offers dedicated, long-term curation support, lets you publish metadata seamlessly, and provides built-in, highly structured “restricted data” workflows that comply strictly with European institutional guidelines.
Option C: Qualitative Data Repository (QDR)	A domain-specific repository explicitly designed for archiving qualitative and multi-method social science data. It specializes in digital security protocols for sensitive interview transcripts and human participant data, offering expert review of metadata before it goes live.

Debriefing choices:

When the groups make their choice and present their arguments, there isn’t one single “perfect” answer, but there are distinct advantages they should explain:

If they choose DataverseNL: This is an excellent choice if Dr. Alcaraz is based at a Dutch institution. It ensures local institutional compliance, provides institutional backing, and has reliable infrastructure for restricted-access metadata.
If they choose QDR (Domain-Specific): This is technically the Gold Standard according to the repository selection hierarchy (domain-specific first). Because QDR specializes in qualitative data, their curators will actually look at the metadata to ensure no identifying information about the children accidentally slipped into the abstract or project description.
If they choose Zenodo: While highly accessible and great for rapid DOI generation, it is a generalist repository. It lacks the human-curated oversight that a highly sensitive project involving minors might require to prevent accidental data leaks in the public text fields.
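
The reasoning in the debrief can also be sketched as a small ranking function. This is a hedged illustration, not a decision tool: only "domain-specific first" comes from the lesson, while the order of the remaining tiers, the `Repository` fields and the `rank` function are assumptions made for the example. The feature values paraphrase the table above and are not verified repository policies.

```python
from dataclasses import dataclass

# Lower number = preferred. Only "domain-specific first" comes from the lesson;
# the order of the other two tiers is an assumption for illustration.
TIER_PRIORITY = {"domain-specific": 0, "institutional": 1, "generalist": 2}


@dataclass
class Repository:
    name: str
    tier: str
    supports_restricted_access: bool
    curates_metadata: bool
    requires_dutch_affiliation: bool = False  # simplification of the table


def rank(repos, needs_restricted, needs_curation, dutch_affiliation):
    """Return the repositories that meet the project's needs, most preferred first."""
    eligible = [
        r for r in repos
        if (r.supports_restricted_access or not needs_restricted)
        and (r.curates_metadata or not needs_curation)
        and (dutch_affiliation or not r.requires_dutch_affiliation)
    ]
    return sorted(eligible, key=lambda r: TIER_PRIORITY[r.tier])


options = [
    Repository("Zenodo", "generalist", True, False),
    Repository("DataverseNL", "institutional", True, True, requires_dutch_affiliation=True),
    Repository("QDR", "domain-specific", True, True),
]

# Dr. Alcaraz needs restricted access and curated metadata; assume a Dutch affiliation.
for repo in rank(options, needs_restricted=True, needs_curation=True, dutch_affiliation=True):
    print(repo.name)
```

Changing the three arguments to `rank` lets participants see how the shortlist shifts, for example when the researcher has no Dutch affiliation or when human curation is not required.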

Open Discussion

Additional resources

Heleri Inno

Anna Swan

Pradeep Eranti

Diana Pilvar

The terms4FAIRskills project has created a formalised terminology that describes the competencies, skills and knowledge associated with making and keeping data FAIR.

Data steward, researcher, manager, trainer/teacher	wants competency in	flexibility in relating fair criteria to openness data sharing
Online documentation	confers competency about	flexibility in relating fair criteria to openness data sharing
Online documentation	confers knowledge about	open data access
Online documentation	supports implementation of	F4. (Meta)data are registered or indexed in a searchable resource; A1.1 The protocol is open, free, and universally implementable; A1.2 The protocol allows for an authentication and authorisation procedure, where necessary; R1.1 (Meta)data are released with a clear and accessible data usage license