Status

FAIR elements

Interoperability
Findability
Accessibility
Reusability

Before starting this lesson plan, participants should have:

A basic understanding of the FAIR principles (Findable, Accessible, Interoperable, Reusable).
Familiarity with the concept of metadata and how data are described in structured formats (e.g., spreadsheets or database tables).
Awareness that different datasets may use different terminologies to describe similar concepts (semantic heterogeneity).

After completing this lesson plan, participants will be able to:

Understand what ontologies and vocabularies are, explain the difference between the two, and describe their purpose

Understand how ontologies help to organise metadata and what part they play in making metadata schemas more meaningful

Understand and analyse the use of ontologies in a repository

Understand and apply ontology lookup services to find relevant ontologies and vocabularies

Understand, analyse and evaluate the steps, challenges and solutions involved in creating an ontology or vocabulary

Overview

Topic

This lesson plan helps participants understand what ontologies and vocabularies are, how they relate to the FAIR principles, and why they are important for research. At the end of this lesson, participants will be able to define and distinguish ontologies and vocabularies, understand how they are used in repositories, explain their role in metadata organisation, use ontology lookup services, and evaluate the process of creating an ontology or vocabulary.

Added value

Ontologies and vocabularies enhance standardisation and foster a shared understanding between people and machines, thereby increasing the semantic interoperability of digital resources. By providing a common framework for organising knowledge, they promote the discovery of relevant scientific information, improve data mining and analysis, and support the development of intelligent systems.

References

>800 terminologies listed within FAIRsharing.
FAIRsharing’s educational factsheet on standards, including terminologies
https://faircookbook.elixir-europe.org/content/recipes/interoperability/introduction-terminologies-ontologies.html
https://faircookbook.elixir-europe.org/content/recipes/interoperability/selecting-ontologies.html
https://faircookbook.elixir-europe.org/content/recipes/interoperability/ontology-new-term-request-recipe.html
https://faircookbook.elixir-europe.org/content/recipes/interoperability/creating-data-dictionary.html
https://rdmkit.elixir-europe.org/metadata_management#how-do-you-find-appropriate-vocabularies-or-ontologies

^{Content - Details & information for the activities of the lesson plan}

^{Introductory presentation}

^{Provide a clear overview of what ontologies and vocabularies are, how they differ, and why they matter for structuring and enriching metadata, using examples tailored to the audience’s domain (e.g., biology, social sciences, cultural heritage). Explain how ontologies support consistency, interoperability, and meaningful metadata. The lecture should explain semantic heterogeneity and show examples of how vocabularies and ontologies solve it.}

^{Information to include in the presentation:}

^{1. Why metadata needs semantic structure - short explanation of why metadata alone is not enough}

^{Metadata can be ambiguous without standardised terms (e.g., “mouse” the animal vs. “mouse” the device).}
^{Different disciplines may use different labels for the same concept (semantic heterogeneity)}
^{Machines cannot make assumptions - semantic structure makes data understandable, linkable, and interoperable.}
^{Controlled vocabularies and ontologies enable FAIR-aligned metadata (Interoperable & Reusable)}
Additional resources: FAIR Principles

^{2. Semantic heterogeneity}

Semantic heterogeneity arises when independently developed or independently used data systems represent the same real-world concept with different meanings, interpretations, or terminology.
Ontologies help solve the problem by (i) defining relations between terms, such as is-a, part-of, causes, and measured-by, (ii) enabling semantic search (e.g., find all participants who were born in the Netherlands), (iii) supporting data integration across disciplines and repositories, and (iv) allowing machines to infer new knowledge through automated reasoning.
Possible examples:
Synonyms: “physician” vs. “doctor”
Different levels of detail: “disease” vs. “infectious disease”
Domain-specific variation: “culture” (biology) vs. “culture” (anthropology)
Different coding systems: ICD-11 vs. SNOMED CT
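The synonym case above can be sketched in a few lines of Python. This is only a minimal illustration: the `LABEL_TO_CONCEPT` table and the `example.org` identifier are hypothetical placeholders standing in for terms from a real ontology.

```python
# Hypothetical mapping from free-text labels to a shared concept identifier.
# The example.org URI is a placeholder, not a term from a real ontology.
LABEL_TO_CONCEPT = {
    "physician": "https://example.org/concept/medical-practitioner",
    "doctor": "https://example.org/concept/medical-practitioner",
}


def normalise(label):
    """Return the shared identifier for a label, or None if it is unmapped."""
    return LABEL_TO_CONCEPT.get(label.strip().lower())


# Two datasets that describe the same concept with different words.
dataset_a = ["Physician", "doctor"]
dataset_b = ["Doctor ", "nurse"]

ids_a = {normalise(label) for label in dataset_a}
ids_b = {normalise(label) for label in dataset_b}

# After normalisation, both datasets share one machine-comparable identifier.
shared = (ids_a & ids_b) - {None}
print(shared)
```

Unmapped labels such as "nurse" come back as `None`, which makes gaps in the mapping visible instead of hiding them.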

^{3. From vocabulary to ontology - increasing structure increases machine interpretability of data}

Figure 1 – From Vocabulary to Ontology
Diagram illustrating: Flat list → Hierarchical taxonomy → Ontology with relationships
Key concepts:
A controlled vocabulary is a predetermined list of terms used consistently to describe data elements.
A taxonomy organises terms hierarchically (e.g., broader and narrower concepts), supporting classification and navigation.
An ontology formally describes concepts and their interrelationships. Ontologies facilitate machine-readable representations of knowledge and support automated reasoning, semantic querying, and data integration.
Together, these tools tackle semantic heterogeneity, the issue where different systems or communities use different terms to describe similar concepts.
Figure 2 – Semantic Interoperability Challenge
Illustration showing: Dataset A (uses free text), Dataset B (uses standard codes), and a mapping layer using an ontology
Additional resources: Ontology pipeline
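The progression from flat list to taxonomy to ontology can be made concrete with a small Python sketch. The terms and relation names below are illustrative only and are not taken from a published ontology; the `ancestors` helper is a hypothetical stand-in for what a real reasoner does.

```python
# Illustrative terms only; they are not taken from a published ontology.

# 1. Controlled vocabulary: a flat list of allowed terms.
vocabulary = {"disease", "infectious disease", "influenza"}

# 2. Taxonomy: each term points to its broader (parent) term.
broader = {
    "infectious disease": "disease",
    "influenza": "infectious disease",
}

# 3. Ontology-style statements: typed relations between concepts.
triples = [
    ("influenza", "is-a", "infectious disease"),
    ("infectious disease", "is-a", "disease"),
    ("influenza", "caused-by", "influenza virus"),
]


def ancestors(term):
    """Follow is-a links transitively to infer all broader concepts."""
    found = []
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for subject, relation, obj in triples:
            if subject == current and relation == "is-a" and obj not in found:
                found.append(obj)
                frontier.append(obj)
    return found


# The flat list can only answer "is this term allowed?".
print("influenza" in vocabulary)
# The taxonomy adds one level of navigation.
print(broader["influenza"])
# The typed relations let a program infer that influenza is also a disease.
print(ancestors("influenza"))
```

Each step adds structure that a program can use: membership checks, then navigation, then simple inference over typed relations.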

^{4. How to find and explore vocabularies/ontologies - prepare the learners for the practical activities}

Look-up tools include BioPortal, the Ontology Lookup Service (OLS), FAIRsharing, Linked Open Vocabularies, OBO Foundry, and Wikidata.
A good vocabulary or ontology follows established best practices.

Additional resources: W3C Best practices for publishing vocabularies, OBO Best practices

^{Workshop: Searching for Ontology Terms Using the BioPortal API}

^{Implementation Routes}

^{The workshop can be adapted to different audience levels. The following three routes vary in depth, technical complexity, and learning focus. Educators may select the route that best matches the participants’ experience level and available time.}

^{Route 1 – Beginner}

^{Target audience: Early career researchers, PhD students, research support staff}

^{Focus: Understanding how ontologies can be accessed programmatically and why persistent identifiers are important for semantic interoperability.}

^{Outcome emphasis: Primarily Understanding ★, light Application ★★}

^{Suggested duration: 60–90 minutes}

^Structure

^{Concept introduction (15 min, lecture)}

^{Ontology lookup services allow researchers to find standardised concepts and identifiers that describe data consistently across systems.}

^{Instead of manually browsing ontology portals, these services can also be accessed programmatically using an API (Application Programming Interface).}

^{An API enables software systems to communicate with each other in a structured way. The BioPortal API allows users to send search queries and retrieve ontology concepts, their definitions, and their persistent identifiers.}

A typical API interaction involves three components:

^{Request URI}

Example search query: http://data.bioontology.org/search?q=diabetes

^{The parameter q=diabetes instructs the API to search for ontology concepts related to the term diabetes.}

^Headers

^{The BioPortal API requires an authentication header containing the user’s API key:}

Authorization: apikey token=YOUR_API_KEY

^{The API key identifies the user and allows controlled access to the service.}

^{Response format}

^{The API returns results in JSON (JavaScript Object Notation), a structured data format commonly used for exchanging data between systems.}
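The three components can be assembled in Python without sending anything over the network. This is a sketch under stated assumptions: `build_search_request` is a hypothetical helper, the `apikey token=...` header form should be checked against the current BioPortal API documentation, and the `Accept` header is a generic HTTP way of asking for JSON, not a documented BioPortal requirement.

```python
from urllib.parse import urlencode
from urllib.request import Request

API_KEY = "YOUR_API_KEY"  # placeholder; use your own BioPortal key

BASE_URL = "http://data.bioontology.org/search"


def build_search_request(term, api_key):
    """Assemble (but do not send) a search request for one term."""
    # 1. Request URI: base URL plus the URL-encoded query parameter q.
    uri = f"{BASE_URL}?{urlencode({'q': term})}"
    headers = {
        # 2. Headers: the API key travels in the Authorization header.
        "Authorization": f"apikey token={api_key}",
        # 3. Response format: ask for JSON explicitly (assumption).
        "Accept": "application/json",
    }
    return Request(uri, headers=headers)


request = build_search_request("diabetes mellitus", API_KEY)
print(request.full_url)
print(request.get_header("Authorization"))
```

Encoding the query with `urlencode` keeps multi-word terms and special characters valid in the URI, which matters once participants search for terms other than single words.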

^{Example BioPortal API Response (simplified)}

^{Below is a simplified example of a result returned for the query “diabetes”:}

    {
      "collection": [
        {
          "prefLabel": "Diabetes mellitus",
          "@id": "http://purl.bioontology.org/ontology/SNOMEDCT/73211009",
          "definition": [
            "A metabolic disease characterised by hyperglycaemia resulting from defects in insulin secretion, insulin action, or both."
          ],
          "ontology": "SNOMEDCT"
        }
      ]
    }

^{Interpreting the Response}

^{Key elements in the JSON response include:}

Field	Meaning
prefLabel	Preferred human-readable name of the concept
@id	Persistent identifier (URI) for the ontology concept
definition	Formal description of the concept
ontology	Source ontology providing the concept

^{The URI is the most important element for semantic interoperability.}

^{Instead of storing free-text descriptions such as “diabetes”, datasets can store the URI:}

^{http://purl.bioontology.org/ontology/SNOMEDCT/73211009}

^{This ensures that the concept’s meaning remains consistent, unambiguous, and machine-interpretable across systems.}
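Reading these fields programmatically might look like the sketch below. It parses the simplified response shown earlier, so the field names are tied to that example; a live response carries more fields and may structure them differently. `summarise` is a hypothetical helper name.

```python
import json

# The simplified example response, as a JSON string.
RAW_RESPONSE = """
{
  "collection": [
    {
      "prefLabel": "Diabetes mellitus",
      "@id": "http://purl.bioontology.org/ontology/SNOMEDCT/73211009",
      "definition": ["A metabolic disease characterised by hyperglycaemia resulting from defects in insulin secretion, insulin action, or both."],
      "ontology": "SNOMEDCT"
    }
  ]
}
"""


def summarise(response_text):
    """Return (label, uri, source, first definition) for each concept."""
    data = json.loads(response_text)
    summary = []
    for concept in data.get("collection", []):
        # "definition" is a list and may be absent, so fall back safely.
        definitions = concept.get("definition") or [""]
        summary.append(
            (
                concept["prefLabel"],
                concept["@id"],
                concept.get("ontology", "unknown"),
                definitions[0],
            )
        )
    return summary


for label, uri, source, definition in summarise(RAW_RESPONSE):
    print(f"{label} [{source}]")
    print(f"  URI: {uri}")
    print(f"  Definition: {definition}")
```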

^{Figure X – BioPortal API Workflow}

^{Caption: Programmatic access to ontology terms enables scalable semantic annotation.}

^{API exploration (20 min, instructor demonstration)}

^{The instructor demonstrates a simple API request.}

^Example:

curl -X GET "http://data.bioontology.org/search?q=diabetes" -H "Authorization: apikey token=YOUR_API_KEY"

^{Participants observe the returned JSON response and identify key elements such as:}

^{preferred term label}
^{ontology source}
^{concept URI}

^{Discussion questions:}

^{What information does the API return?}
^{Which ontology defines the concept?}
^{Why is the URI important for interoperability?}

^{Guided exploration (25 min, hands-on activity)}

^{Participants perform ontology searches using one of the following tools:}

^{curl (command line)}
^Postman
^{Hoppscotch or another browser-based API client}

^{Participants search for several terms such as:}

^diabetes
^asthma
^hypertension

^{For each concept they identify:}

^{preferred label}
^{ontology source}
^{concept URI}

^{Participants record results in a table:}

| Term | Preferred Label | Ontology | URI |
| ---- | --------------- | -------- | --- |

^{Practical exercise – dataset annotation (20 min)}

^{Participants annotate a simple dataset using ontology identifiers.}

^{Example dataset:}

ID	Diagnosis
1	diabetes
2	asthma
3	hypertension

^Task:

^{Use the BioPortal API to search for each diagnosis.}
^{Retrieve the corresponding ontology URI.}
^{Add a new column Ontology_URI.}

^Result:

| ID | Diagnosis | Ontology_URI |
| -- | --------- | ------------ |
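The annotation task can be sketched in Python as follows. To stay runnable offline, the sketch replaces the API call with a small table of already reviewed results (`REVIEWED_URIS`, a hypothetical name). Only the diabetes URI comes from this lesson; the other diagnoses are deliberately left unmapped for participants to look up themselves.

```python
import csv
import io

# The example dataset from this exercise, as CSV text.
DATASET = """ID,Diagnosis
1,diabetes
2,asthma
3,hypertension
"""

# Stand-in for results already looked up and checked by a person.
# Only the diabetes URI comes from this lesson; the other terms are left
# unmapped on purpose rather than guessed.
REVIEWED_URIS = {
    "diabetes": "http://purl.bioontology.org/ontology/SNOMEDCT/73211009",
}


def annotate(csv_text, reviewed_uris):
    """Add an Ontology_URI column; unmapped rows get an empty value."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for row in rows:
        key = row["Diagnosis"].strip().lower()
        row["Ontology_URI"] = reviewed_uris.get(key, "")
    return rows


annotated = annotate(DATASET, REVIEWED_URIS)

for row in annotated:
    uri = row["Ontology_URI"] or "(needs manual lookup)"
    print(row["ID"], row["Diagnosis"], uri)

# Rows without a URI are listed explicitly so that nothing is silently skipped.
unmapped = [row["Diagnosis"] for row in annotated if not row["Ontology_URI"]]
print("Unmapped:", unmapped)
```

Keeping the original Diagnosis column next to the new Ontology_URI column preserves the source wording, which helps when a mapping later needs to be reviewed or corrected.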

^{Discussion questions:}

^{Why is storing URIs better than free-text labels?}
^{How does this improve FAIR interoperability?}
^{What risks arise when automatically selecting the first search result?}

^{Reflection discussion (10 min)}

^{Participants discuss:}

^{challenges encountered when selecting ontology terms}
^{how incorrect annotations may affect data integration}
^{when new ontology terms may need to be requested}

^{The discussion should emphasise validation, documentation, and governance of ontology mappings.}

^{Route 2 – Intermediate}

^{Target audience: Data stewards, research software engineers, and experienced researchers}

^{Focus: Applying ontology lookup workflows and analysing how semantic identifiers can be integrated into research data pipelines.}

^{Outcome emphasis: Application ★★, Analysis ★★, some Evaluation ★★★}

^{Suggested duration: 2–2.5 hours}

^Structure

^{Concept recap (10 min)}

^{Brief overview of semantic interoperability and the role of shared vocabularies.}

^{Explain how ontology APIs support:}

^{automated metadata annotation}
^{dataset harmonisation}
^{semantic data integration}

^{API exploration exercise (30 min)}

^{Participants experiment with ontology searches using the BioPortal API.}

^{Example query:}

curl -X GET "http://data.bioontology.org/search?q=diabetes" -H "Authorization: apikey token=YOUR_API_KEY"

^{Participants explore:}

^{different search terms}
^{restricting results to specific ontologies}

Example: http://data.bioontology.org/search?q=diabetes&ontologies=SNOMEDCT

^{Participants analyse differences between ontologies and discuss concept ambiguity.}

^{Python implementation exercise (45 min)}

^{Participants run a Python script to retrieve ontology identifiers.}

^Example:

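    import requests

    # Replace with your personal BioPortal API key.
    API_KEY = "YOUR_API_KEY"

    url = "http://data.bioontology.org/search"

    # Search for "diabetes", restricted to the SNOMEDCT ontology.
    params = {
        "q": "diabetes",
        "ontologies": "SNOMEDCT",
    }

    # The key is sent in the Authorization header as "apikey token=<key>".
    headers = {
        "Authorization": f"apikey token={API_KEY}",
    }

    response = requests.get(url, params=params, headers=headers, timeout=30)
    response.raise_for_status()  # stop early on HTTP errors such as 401

    data = response.json()

    # Print the preferred label and URI of each matching concept.
    for result in data["collection"]:
        print("Label:", result["prefLabel"])
        print("URI:", result["@id"])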

^{Participants modify the script to query different concepts.}

^{Dataset annotation exercise (40 min)}

^{Participants annotate a dataset using ontology URIs retrieved via the API.}

^{They compare results and discuss differences in ontology selection.}

^{Challenge discussion (20 min)}

^{Discussion topics:}

^{ambiguous concepts}
^{missing ontology terms}
^{ontology selection criteria}
^{documentation of mapping decisions}

^{Route 3 – Advanced}

^{Target audience: Senior data stewards, infrastructure architects, interoperability specialists}

^{Focus: Evaluating ontology lookup strategies and designing scalable semantic annotation workflows.}

^{Outcome emphasis: Evaluation and strategic reasoning ★★★}

^{Suggested duration: Half-day workshop}

^Structure

^{Rapid framing (10 min)}

^{Discuss challenges of implementing semantic interoperability at scale.}

^{Topics include:}

^{automated annotation}
^{ontology versioning}
^{governance of semantic mappings}

^{Advanced API exploration (45 min)}

^{Participants explore advanced BioPortal API usage:}

comparing ontology coverage
retrieving additional metadata
analysing concept hierarchies

^{Pipeline design exercise (60 min)}

^{Participants design a conceptual workflow integrating ontology lookup into a data pipeline.}

^{Example workflow:}

^{Dataset → Concept extraction → BioPortal API query → URI retrieval → Semantic annotation → Data integration}

^{Groups propose validation and documentation steps.}
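As a starting point for the group discussion, the example workflow can be written as a few small Python functions. This is a conceptual sketch, not a recommended design: `stub_lookup` is a hypothetical stand-in for a BioPortal query, the `example.org` URIs are placeholders, and the rule "accept only when exactly one candidate is returned" is an illustrative validation policy that groups may well reject.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MappingRecord:
    """One documented mapping decision (field names are illustrative)."""
    source_term: str
    uri: Optional[str]
    status: str  # "auto-accepted" or "needs-review"


def extract_concepts(rows, column):
    """Concept extraction: collect the distinct, normalised terms."""
    return sorted({row[column].strip().lower() for row in rows})


def map_concepts(terms, lookup: Callable[[str], list]):
    """Query, URI retrieval and validation for each term."""
    records = []
    for term in terms:
        candidates = lookup(term)
        if len(candidates) == 1:
            records.append(MappingRecord(term, candidates[0], "auto-accepted"))
        else:
            # Zero or several candidates: leave the decision to a person.
            records.append(MappingRecord(term, None, "needs-review"))
    return records


def annotate(rows, column, records):
    """Semantic annotation: attach accepted URIs to the original rows."""
    accepted = {r.source_term: r.uri for r in records if r.uri}
    return [
        {**row, "ontology_uri": accepted.get(row[column].strip().lower())}
        for row in rows
    ]


def stub_lookup(term):
    """Hypothetical stand-in for a BioPortal query; returns canned results."""
    canned = {
        "diabetes": ["http://purl.bioontology.org/ontology/SNOMEDCT/73211009"],
        # Placeholder URIs for an ambiguous term, not real identifiers.
        "culture": [
            "https://example.org/biology/culture",
            "https://example.org/anthropology/culture",
        ],
    }
    return canned.get(term, [])


rows = [
    {"id": 1, "term": "Diabetes"},
    {"id": 2, "term": "culture"},
    {"id": 3, "term": "diabetes "},
]
records = map_concepts(extract_concepts(rows, "term"), stub_lookup)
annotated = annotate(rows, "term", records)
for record in records:
    print(record)
```

The list of `MappingRecord` objects doubles as documentation: it records which terms were mapped automatically and which were deferred to a reviewer, which is the kind of evidence the governance questions later in this route ask for.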

^{Governance simulation (45 min)}

^{Participants discuss governance questions such as:}

^{who approves ontology mappings}
^{how mapping decisions are documented}
^{how ontology updates are managed}

^{Strategic reflection (30 min)}

^{Participants reflect on trade-offs between:}

^{automation and manual validation}
^{scalability and precision}
^{ontology reuse and extension}

Lesson content

Activity

Time

Type

Level

Before the lesson

Have participants read the FAIR Cookbook’s Introducing the FAIR Principles to get an idea of what the FAIR principles entail.

20 min

Individual exercise

^{There should also be an activity to ensure learners are familiar with metadata, since it is not addressed in this lesson plan.}

20 min

Individual exercise

During the lesson

^{Introductory presentation}

^{Present what ontologies and vocabularies are, how they differ, and why they matter for structuring and enriching metadata. Show how semantic heterogeneity arises and demonstrate how vocabularies and ontologies address it.}

15-30 min

Lecture

^{Demo of ontology search}

^{Demonstrate how to look up ontology terms using an ontology lookup platform (e.g., OLS, BioPortal), showing preferred labels, URIs, and ontology sources.}

15 min

Demo

^{Look up relevant ontologies}

^{In pairs or trios, have participants search for relevant ontology terms for their extracted concepts across multiple ontology platforms and compare the results.}

30-45 min

Group exercise

^{Build vocabulary}

^{Using selected concepts, groups identify essential terms in their domain, resolve duplicates, clarify definitions, and group terms into a structured vocabulary.}

30 min

Group exercise

^{Link the vocabulary to existing standards}

^{Ask participants to link their vocabulary terms to existing ontology concepts using URIs or identifiers where available.}

60 min

Group exercise

Harmonise terminologies

^{Provide small datasets with different terminologies and have groups harmonise them by mapping terms to shared ontology identifiers.}

30-40 min

Group exercise

After the lesson

^{Reflection on applying ontologies in practice}

^{Invite participants to reflect on how ontology use could influence their own research workflows or metadata practices.}

15-20 min

Group discussion

Closure and Q&A

10 min

Discussion

^{Reflection on ontology challenges}

^{Facilitate a discussion about challenges in selecting ontology terms, semantic ambiguity, and gaps in existing vocabularies.}

15-20 min

Group discussion

Additional resources

Aliya Aktau

Niek van Ulzen

Ana Konrad

The terms4FAIRskills project has created a formalised terminology that describes the competencies, skills and knowledge associated with making and keeping data FAIR.

Data steward, data librarian, software engineer, terminology manager, ontologist, researcher	wants competency in	choosing the appropriate terminology for your data
Online documentation	confers competency about	choosing the appropriate terminology for your data
Online documentation	confers knowledge about	semantic interoperability, controlled vocabulary
Online documentation	supports implementation of	findability of digital assets, interoperability of digital assets, reuse of digital assets