Overview
Topic
This lesson plan helps participants understand what ontologies and vocabularies are, how they relate to the FAIR principles, and why they matter for research. By the end of the lesson, participants will be able to define and distinguish ontologies and vocabularies, understand how they are used in repositories, explain their role in metadata organisation, use ontology lookup services, and evaluate the process of creating an ontology or vocabulary.
Added value
Ontologies and vocabularies enhance standardisation and foster a shared understanding between people and machines, thereby increasing the semantic interoperability of digital resources. By providing a common framework for organising knowledge, they promote the discovery of relevant scientific information, improve data mining and analysis, and support the development of intelligent systems.
References
- >800 terminologies listed within FAIRsharing.
- FAIRsharing’s educational factsheet on standards, including terminologies
- https://faircookbook.elixir-europe.org/content/recipes/interoperability/introduction-terminologies-ontologies.html
- https://faircookbook.elixir-europe.org/content/recipes/interoperability/selecting-ontologies.html
- https://faircookbook.elixir-europe.org/content/recipes/interoperability/ontology-new-term-request-recipe.html
- https://faircookbook.elixir-europe.org/content/recipes/interoperability/creating-data-dictionary.html
- https://rdmkit.elixir-europe.org/metadata_management#how-do-you-find-appropriate-vocabularies-or-ontologies
Content - Details & information for the activities of the lesson plan
Introductory presentation
Provide a clear overview of what ontologies and vocabularies are, how they differ, and why they matter for structuring and enriching metadata, using examples tailored to the audience’s domain (e.g., biology, social sciences, cultural heritage). Explain how ontologies support consistency, interoperability, and meaningful metadata. The lecture should explain semantic heterogeneity and show examples of how vocabularies and ontologies solve it.
Information to include in the presentation:
1. Why metadata needs semantic structure - short explanation of why metadata alone is not enough
- Metadata can be ambiguous without standardised terms (e.g., “mouse” the animal vs. “mouse” the device).
- Different disciplines may use different labels for the same concept (semantic heterogeneity)
- Machines cannot make assumptions - semantic structure makes data understandable, linkable, and interoperable.
- Controlled vocabularies and ontologies enable FAIR-aligned metadata (Interoperable & Reusable)
- Additional resources: FAIR Principles
2. Semantic heterogeneity
- Semantic heterogeneity arises when independently developed or independently used data systems represent the same real-world concept with different meanings, interpretations, or terminology.
- Ontologies help solve the problem by (i) defining relations between terms, such as is-a, part-of, causes, and measured-by; (ii) enabling semantic search (e.g., find all participants who were born in the Netherlands); (iii) supporting data integration across disciplines and repositories; and (iv) allowing machines to infer new knowledge through automated reasoning.
- Possible examples: synonyms (“physician” vs. “doctor”); different levels of detail (“disease” vs. “infectious disease”); domain-specific variation (“culture” in biology vs. “culture” in anthropology); different coding systems (ICD-11 vs. SNOMED CT).
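The is-a relation and the "semantic search" idea above can be illustrated with a very small sketch. The hierarchy below is a toy example invented for this lesson, not taken from any published ontology; it only shows how following is-a links lets a query for a broad concept also find records annotated with narrower ones.

```python
# Toy is-a hierarchy (narrower term -> broader term); illustrative only.
IS_A = {
    "infectious disease": "disease",
    "influenza": "infectious disease",
    "type 2 diabetes": "diabetes mellitus",
    "diabetes mellitus": "disease",
}


def ancestors(term):
    """Follow is-a links upwards and return every broader term in order."""
    found = []
    current = term.strip().lower()
    while current in IS_A:
        current = IS_A[current]
        found.append(current)
    return found


def is_a(term, broader):
    """True if `term` is `broader` itself or one of its descendants."""
    key = term.strip().lower()
    return key == broader or broader in ancestors(key)


# A search for "disease" also returns records annotated with narrower terms.
records = ["Influenza", "type 2 diabetes", "physician"]
diseases = [r for r in records if is_a(r, "disease")]
print(diseases)
```

A plain keyword search for "disease" would miss both matching records, because neither label contains that word; the hierarchy is what makes them findable.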
3. From vocabulary to ontology - increasing structure increases machine interpretability of data
- Figure 1 – From Vocabulary to Ontology: diagram illustrating flat list → hierarchical taxonomy → ontology with relationships.
- Controlled vocabulary: a predetermined list of terms used consistently to describe data elements.
- Taxonomy: organises terms hierarchically (e.g., broader and narrower concepts), supporting classification and navigation.
- Ontology: formally describes concepts and their interrelationships. Ontologies facilitate machine-readable representations of knowledge and support automated reasoning, semantic querying, and data integration.
- Together, these tools tackle semantic heterogeneity, the issue where different systems or communities use different terms to describe similar concepts.
- Figure 2 – Semantic Interoperability Challenge: illustration showing Dataset A (uses free text), Dataset B (uses standard codes), and a mapping layer using an ontology.
- Additional resources: Ontology pipeline
4. How to find and explore vocabularies/ontologies - prepare the learners for the practical activities
- Look-up tools include BioPortal, Ontology Lookup Service, FAIRsharing, Linked Open Vocabularies, OBO Foundry, and Wikidata.
- A good vocabulary or ontology follows established best practices.
- Additional resources: W3C Best practices for publishing vocabularies, OBO Best practices
Workshop: Searching for Ontology Terms Using the BioPortal API
Implementation Routes
The workshop can be adapted to different audience levels. The following three routes vary in depth, technical complexity, and learning focus. Educators may select the route that best matches the participants’ experience level and available time.
Route 1 – Beginner
Target audience: Early career researchers, PhD students, research support staff
Focus: Understanding how ontologies can be accessed programmatically and why persistent identifiers are important for semantic interoperability.
Outcome emphasis: Primarily Understanding ★, light Application ★★
Suggested duration: 60–90 minutes
Structure
Concept introduction (15 min, lecture)
Ontology lookup services allow researchers to find standardised concepts and identifiers that describe data consistently across systems.
Instead of manually browsing ontology portals, these services can also be accessed programmatically using an API (Application Programming Interface).
An API enables software systems to communicate with each other in a structured way. The BioPortal API allows users to send search queries and retrieve ontology concepts, their definitions, and their persistent identifiers.
A typical API request includes three components:
Request URI
Example search query: http://data.bioontology.org/search?q=diabetes
The parameter q=diabetes instructs the API to search for ontology concepts related to the term diabetes.
Headers
The BioPortal API requires an authentication header containing the user’s API key:
Authorization: apikey YOUR_API_KEY
The API key identifies the user and allows controlled access to the service.
Response format
The API returns results in JSON (JavaScript Object Notation), a structured data format commonly used for exchanging data between systems.
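The three components can be seen together in a short Python sketch. It uses only the standard library and builds the request without sending it, so it can be run without network access or a valid key. The helper name `build_search_request` is an assumption made for this example; the endpoint, the `q` parameter, and the header format are the ones described above.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://data.bioontology.org/search"
API_KEY = "YOUR_API_KEY"  # placeholder; replace with a personal key


def build_search_request(term, api_key=API_KEY):
    """Assemble the request URI and the authentication header."""
    query = urlencode({"q": term})  # encodes spaces and special characters
    return Request(
        f"{BASE_URL}?{query}",
        headers={"Authorization": f"apikey {api_key}"},
    )


request = build_search_request("diabetes")
print(request.get_method(), request.full_url)
print("Authorization:", request.get_header("Authorization"))
```

To actually send the request, the object could be passed to `urllib.request.urlopen` and the body decoded with `json.load`; that step is left out here so the sketch stays offline.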
Example BioPortal API Response (simplified)
Below is a simplified example of a result returned for the query “diabetes”:
{
  "collection": [
    {
      "prefLabel": "Diabetes mellitus",
      "@id": "http://purl.bioontology.org/ontology/SNOMEDCT/73211009",
      "definition": [
        "A metabolic disease characterised by hyperglycaemia resulting from defects in insulin secretion, insulin action, or both."
      ],
      "ontology": "SNOMEDCT"
    }
  ]
}
Interpreting the Response
Key elements in the JSON response include:
| Field | Meaning |
|---|---|
| prefLabel | Preferred human-readable name of the concept |
| @id | Persistent identifier (URI) for the ontology concept |
| definition | Formal description of the concept |
| ontology | Source ontology providing the concept |
The URI is the most important element for semantic interoperability.
Instead of storing free-text descriptions such as “diabetes”, datasets can store the URI:
http://purl.bioontology.org/ontology/SNOMEDCT/73211009
This ensures that the concept’s meaning remains consistent, unambiguous, and machine-interpretable across systems.
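As a minimal sketch of how these fields might be read in practice, the snippet below parses the simplified response shown earlier and keeps only the four fields from the table. The function name `summarise` is an assumption for illustration, and a real BioPortal response contains more fields and may structure some of them differently from this simplified example.

```python
import json

# The simplified example response from this lesson, as a JSON string.
RESPONSE_TEXT = """
{
  "collection": [
    {
      "prefLabel": "Diabetes mellitus",
      "@id": "http://purl.bioontology.org/ontology/SNOMEDCT/73211009",
      "definition": [
        "A metabolic disease characterised by hyperglycaemia resulting from defects in insulin secretion, insulin action, or both."
      ],
      "ontology": "SNOMEDCT"
    }
  ]
}
"""


def summarise(result):
    """Keep the label, URI, source ontology, and first definition."""
    definitions = result.get("definition") or []
    return {
        "label": result["prefLabel"],
        "uri": result["@id"],
        "ontology": result.get("ontology"),
        "definition": definitions[0] if definitions else None,
    }


results = [summarise(r) for r in json.loads(RESPONSE_TEXT)["collection"]]
for item in results:
    print(item["label"], "->", item["uri"])
```

Storing `item["uri"]` alongside the original free-text value keeps the data readable for people while giving machines an unambiguous identifier.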
Figure X – BioPortal API Workflow
Caption: Programmatic access to ontology terms enables scalable semantic annotation.
API exploration (20 min, instructor demonstration)
The instructor demonstrates a simple API request.
Example:
curl -X GET "http://data.bioontology.org/search?q=diabetes" -H "Authorization: apikey YOUR_API_KEY"
Participants observe the returned JSON response and identify key elements such as:
- preferred term label
- ontology source
- concept URI
Discussion questions:
- What information does the API return?
- Which ontology defines the concept?
- Why is the URI important for interoperability?
Guided exploration (25 min, hands-on activity)
Participants perform ontology searches using one of the following tools:
- curl (command line)
- Postman
- Hoppscotch or another browser-based API client
Participants search for several terms such as:
- diabetes
- asthma
- hypertension
For each concept they identify:
- preferred label
- ontology source
- concept URI
Participants record results in a table:
| Term | Preferred Label | Ontology | URI |
|---|---|---|---|
Practical exercise – dataset annotation (20 min)
Participants annotate a simple dataset using ontology identifiers.
Example dataset:
| ID | Diagnosis |
|---|---|
| 1 | diabetes |
| 2 | asthma |
| 3 | hypertension |
Task:
- Use the BioPortal API to search for each diagnosis.
- Retrieve the corresponding ontology URI.
- Add a new column Ontology_URI.
Result:
| ID | Diagnosis | Ontology_URI |
|---|---|---|
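For educators who want to show the annotation step in code, the following is one possible sketch. It assumes the dataset is available as CSV text and that URIs have already been looked up and collected in a dictionary. Only the URI that appears in this lesson is filled in; the other two diagnoses are deliberately left unmapped so that participants look them up themselves.

```python
import csv
import io

# Only the URI given in the lesson text is included; the rest is left
# for participants to retrieve through the API.
KNOWN_URIS = {
    "diabetes": "http://purl.bioontology.org/ontology/SNOMEDCT/73211009",
}

DATASET = """ID,Diagnosis
1,diabetes
2,asthma
3,hypertension
"""


def annotate(rows, uri_lookup):
    """Return copies of the rows with an Ontology_URI column added."""
    annotated = []
    for row in rows:
        key = row["Diagnosis"].strip().lower()
        annotated.append({**row, "Ontology_URI": uri_lookup.get(key, "")})
    return annotated


rows = list(csv.DictReader(io.StringIO(DATASET)))
annotated = annotate(rows, KNOWN_URIS)
for row in annotated:
    print(row["ID"], row["Diagnosis"], row["Ontology_URI"] or "(to be looked up)")
```

The original `Diagnosis` column is kept, so the free-text value and the identifier can be compared during validation.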
Discussion questions:
- Why is storing URIs better than free-text labels?
- How does this improve FAIR interoperability?
- What risks arise when automatically selecting the first search result?
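The last question can be made concrete with a small sketch. The selection rule below is only one possible approach and is an assumption of this example: accept a result automatically when exactly one preferred label matches the search term, and otherwise flag the choice for manual review. The candidate list and its `example.org` URIs are invented for illustration.

```python
def pick_candidate(term, candidates):
    """Return (candidate, needs_review) for a list of search results."""
    wanted = term.strip().lower()
    exact = [c for c in candidates if c["prefLabel"].strip().lower() == wanted]
    if len(exact) == 1:
        return exact[0], False      # unambiguous exact label match
    if candidates:
        return candidates[0], True  # fall back to first hit, but flag it
    return None, True               # nothing found: a curator must decide


# Hypothetical search results; labels and URIs are illustrative only.
candidates = [
    {"prefLabel": "Diabetes insipidus", "@id": "http://example.org/term/A"},
    {"prefLabel": "Diabetes mellitus", "@id": "http://example.org/term/B"},
]

chosen, review = pick_candidate("diabetes mellitus", candidates)
print(chosen["prefLabel"], "needs review:", review)

fallback, fallback_review = pick_candidate("diabetes", candidates)
print(fallback["prefLabel"], "needs review:", fallback_review)
```

In the second call the first hit is a different condition from the one most readers would expect, which is exactly the risk of accepting the top result without checking it.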
Reflection discussion (10 min)
Participants discuss:
- challenges encountered when selecting ontology terms
- how incorrect annotations may affect data integration
- when new ontology terms may need to be requested
The discussion should emphasise validation, documentation, and governance of ontology mappings.
Route 2 – Intermediate
Target audience: Data stewards, research software engineers, and experienced researchers
Focus: Applying ontology lookup workflows and analysing how semantic identifiers can be integrated into research data pipelines.
Outcome emphasis: Application ★★, Analysis ★★, some Evaluation ★★★
Suggested duration: 2–2.5 hours
Structure
Concept recap (10 min)
Brief overview of semantic interoperability and the role of shared vocabularies.
Explain how ontology APIs support:
- automated metadata annotation
- dataset harmonisation
- semantic data integration
API exploration exercise (30 min)
Participants experiment with ontology searches using the BioPortal API.
Example query:
curl -X GET "http://data.bioontology.org/search?q=diabetes" -H "Authorization: apikey YOUR_API_KEY"
Participants explore:
- different search terms
- restricting results to specific ontologies
Example: http://data.bioontology.org/search?q=diabetes&ontologies=SNOMEDCT
Participants analyse differences between ontologies and discuss concept ambiguity.
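One way to support this comparison is to group the returned labels by source ontology. The sketch below assumes each result carries an `ontology` field, as in the simplified response used in this lesson; real responses may expose the source ontology in a different place. The sample results and the name `ONTOLOGY_B` are placeholders, not real search output.

```python
from collections import defaultdict


def group_by_ontology(results):
    """Collect preferred labels per source ontology."""
    grouped = defaultdict(list)
    for result in results:
        grouped[result.get("ontology", "unknown")].append(result["prefLabel"])
    return dict(grouped)


# Placeholder results for illustration; not actual BioPortal output.
sample_results = [
    {"prefLabel": "Diabetes mellitus", "ontology": "SNOMEDCT"},
    {"prefLabel": "Diabetes", "ontology": "ONTOLOGY_B"},
    {"prefLabel": "Diabetes mellitus", "ontology": "ONTOLOGY_B"},
]

grouped = group_by_ontology(sample_results)
for ontology, labels in grouped.items():
    print(ontology, labels)
```

Seeing the labels side by side makes it easier to discuss which ontology offers the right level of detail for a given dataset.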
Python implementation exercise (45 min)
Participants run a Python script to retrieve ontology identifiers.
Example:
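import requests
# Replace with your personal BioPortal API key.
API_KEY = "YOUR_API_KEY"
url = "http://data.bioontology.org/search"
# Search for "diabetes", restricted to the SNOMED CT ontology.
params = {
    "q": "diabetes",
    "ontologies": "SNOMEDCT"
}
# BioPortal expects the API key in the Authorization header.
headers = {
    "Authorization": f"apikey {API_KEY}"
}
response = requests.get(url, params=params, headers=headers, timeout=30)
response.raise_for_status()  # stop early on HTTP errors such as an invalid key
data = response.json()
# Print the preferred label and persistent URI of each result.
for result in data["collection"]:
    print("Label:", result["prefLabel"])
    print("URI:", result["@id"])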
Participants modify the script to query different concepts.
Dataset annotation exercise (40 min)
Participants annotate a dataset using ontology URIs retrieved via the API.
They compare results and discuss differences in ontology selection.
Challenge discussion (20 min)
Discussion topics:
- ambiguous concepts
- missing ontology terms
- ontology selection criteria
- documentation of mapping decisions
Route 3 – Advanced
Target audience: Senior data stewards, infrastructure architects, interoperability specialists
Focus: Evaluating ontology lookup strategies and designing scalable semantic annotation workflows.
Outcome emphasis: Evaluation and strategic reasoning ★★★
Suggested duration: Half-day workshop
Structure
Rapid framing (10 min)
Discuss challenges of implementing semantic interoperability at scale.
Topics include:
- automated annotation
- ontology versioning
- governance of semantic mappings
Advanced API exploration (45 min)
Participants explore advanced BioPortal API usage:
- comparing ontology coverage
- retrieving additional metadata
- analysing concept hierarchies
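For the hierarchy analysis, the sketch below builds URLs for class hierarchy lookups. It assumes the BioPortal REST API exposes `parents`, `children`, `ancestors`, and `descendants` under `/ontologies/{acronym}/classes/{class id}` with the class URI percent-encoded; participants should confirm the exact paths against the current API documentation. The helper name `class_relation_url` is hypothetical, and no request is sent.

```python
from urllib.parse import quote

API_ROOT = "http://data.bioontology.org"
RELATIONS = {"parents", "children", "ancestors", "descendants"}


def class_relation_url(ontology, class_uri, relation):
    """Build the URL for one hierarchy lookup on a single class."""
    if relation not in RELATIONS:
        raise ValueError(f"unsupported relation: {relation}")
    # The class URI is itself a URL, so every reserved character is encoded.
    encoded = quote(class_uri, safe="")
    return f"{API_ROOT}/ontologies/{ontology}/classes/{encoded}/{relation}"


url = class_relation_url(
    "SNOMEDCT",
    "http://purl.bioontology.org/ontology/SNOMEDCT/73211009",
    "ancestors",
)
print(url)
```

The resulting URL would be requested with the same `Authorization` header as the search endpoint.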
Pipeline design exercise (60 min)
Participants design a conceptual workflow integrating ontology lookup into a data pipeline.
Example workflow:
Dataset → Concept extraction → BioPortal API query → URI retrieval → Semantic annotation → Data integration
Groups propose validation and documentation steps.
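The example workflow could be prototyped along the following lines. This is a conceptual sketch under stated assumptions, not a reference implementation: the `search` argument stands in for a BioPortal query, `stub_search` is an offline placeholder that only knows the URI used in this lesson, and taking the first hit is a simplification that a real pipeline would replace with a validated selection step.

```python
def extract_concepts(records, field):
    """Concept extraction: distinct, normalised values of one field."""
    return sorted({record[field].strip().lower() for record in records})


def resolve_concepts(concepts, search):
    """API query and URI retrieval: one lookup per distinct concept."""
    mapping = {}
    for concept in concepts:
        results = search(concept)
        # Simplification: first hit only; None marks an unresolved concept.
        mapping[concept] = results[0]["@id"] if results else None
    return mapping


def annotate(records, field, mapping):
    """Semantic annotation: add the resolved URI to every record."""
    return [
        {**record, "Ontology_URI": mapping.get(record[field].strip().lower())}
        for record in records
    ]


def stub_search(term):
    """Offline stand-in for an API call (hypothetical)."""
    known = {
        "diabetes": [
            {"@id": "http://purl.bioontology.org/ontology/SNOMEDCT/73211009"}
        ]
    }
    return known.get(term, [])


records = [
    {"ID": 1, "Diagnosis": "diabetes"},
    {"ID": 2, "Diagnosis": "Diabetes"},
    {"ID": 3, "Diagnosis": "asthma"},
]
concepts = extract_concepts(records, "Diagnosis")
mapping = resolve_concepts(concepts, stub_search)
annotated = annotate(records, "Diagnosis", mapping)
unresolved = [c for c, uri in mapping.items() if uri is None]
print("Unresolved concepts for manual review:", unresolved)
```

Resolving each distinct concept once, rather than once per record, keeps the number of API calls proportional to the vocabulary size instead of the dataset size, and the `unresolved` list gives groups a natural place to attach their validation step.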
Governance simulation (45 min)
Participants discuss governance questions such as:
- who approves ontology mappings
- how mapping decisions are documented
- how ontology updates are managed
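To ground the discussion, groups could sketch what a documented mapping decision might look like as a record. The field names below are assumptions chosen for this example rather than an established standard, and the version, names, and date are placeholders.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class MappingDecision:
    """One documented decision linking a local term to an ontology concept."""
    source_term: str
    concept_uri: str
    ontology: str
    ontology_version: str  # needed to revisit the mapping after updates
    decided_by: str
    approved_by: str
    decided_on: str        # ISO 8601 date
    rationale: str


record = MappingDecision(
    source_term="diabetes",
    concept_uri="http://purl.bioontology.org/ontology/SNOMEDCT/73211009",
    ontology="SNOMEDCT",
    ontology_version="VERSION_PLACEHOLDER",
    decided_by="curator@example.org",
    approved_by="steward@example.org",
    decided_on="2024-01-01",  # placeholder date
    rationale="Closest matching concept; definition checked manually.",
)

# One JSON object per line is easy to append to a log and to diff.
line = json.dumps(asdict(record), sort_keys=True)
print(line)
```

Each of the three governance questions maps to a field: `approved_by` for approval, `rationale` for documentation, and `ontology_version` for managing updates.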
Strategic reflection (30 min)
Participants reflect on trade-offs between:
- automation and manual validation
- scalability and precision
- ontology reuse and extension
Lesson content
Have participants read the FAIR Cookbook’s Introducing the FAIR Principles to get an idea of what the FAIR principles entail.
There should also be an activity to ensure learners are familiar with metadata, since it is not addressed in this lesson plan.
Introductory presentation
Present what ontologies and vocabularies are, how they differ, and why they matter for structuring and enriching metadata. Show how semantic heterogeneity arises and demonstrate how vocabularies and ontologies address it.
Demo of ontology search
Demonstrate how to look up ontology terms using an ontology lookup platform (e.g., OLS, BioPortal), showing preferred labels, URIs, and ontology sources.
Look up relevant ontologies
In pairs or trios, have participants search for relevant ontology terms for their extracted concepts across multiple ontology platforms and compare the results.
Build vocabulary
Using selected concepts, groups identify essential terms in their domain, resolve duplicates, clarify definitions, and group terms into a structured vocabulary.
Link the vocabulary to existing standards
Ask participants to link their vocabulary terms to existing ontology concepts using URIs or identifiers where available.
Harmonise terminologies
Provide small datasets with different terminologies and have groups harmonise them by mapping terms to shared ontology identifiers.
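A minimal sketch of such a harmonisation step is shown below. The two datasets, the term list, and the `example.org` identifiers are invented for illustration; in the actual activity, groups would map to identifiers retrieved from an ontology lookup service.

```python
from collections import Counter

# Hypothetical shared identifiers; real exercises would use ontology URIs.
TERM_TO_ID = {
    "physician": "http://example.org/concept/physician",
    "doctor": "http://example.org/concept/physician",
    "nurse": "http://example.org/concept/nurse",
}

dataset_a = ["Physician", "Nurse", "physician"]
dataset_b = ["doctor", "nurse", "midwife"]


def harmonise(labels, term_to_id):
    """Map labels to shared identifiers and report what could not be mapped."""
    mapped, unmapped = [], []
    for label in labels:
        identifier = term_to_id.get(label.strip().lower())
        if identifier is None:
            unmapped.append(label)
        else:
            mapped.append(identifier)
    return mapped, unmapped


mapped_a, unmapped_a = harmonise(dataset_a, TERM_TO_ID)
mapped_b, unmapped_b = harmonise(dataset_b, TERM_TO_ID)

# Once both datasets use the same identifiers, they can be combined.
combined = Counter(mapped_a + mapped_b)
print(combined)
print("Unmapped:", unmapped_a + unmapped_b)
```

The unmapped labels are a useful prompt for the closing discussion about gaps in existing vocabularies.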
Reflection on applying ontologies in practice
Invite participants to reflect on how ontology use could influence their own research workflows or metadata practices.
Closure and Q&A
Reflection on ontology challenges
Facilitate a discussion about challenges in selecting ontology terms, semantic ambiguity, and gaps in existing vocabularies.