Learning Objectives
This course introduces the foundational principles of Open Data, equipping undergraduate students in Information Science with the knowledge and skills to find, analyse, and effectively reuse open datasets. Through a combination of lectures, practical activities, and assignments, students will explore the benefits and challenges of Open Data and develop a deeper understanding of its potential applications and implications. It is designed with the following objectives:
- To gain an understanding of Open Data, its essential aspects, and the principles of opening data;
- to learn how to find, analyse, and reuse open datasets;
- to learn the processes involved in preparing and publishing open datasets.
These objectives provide the foundation for the course and guide the learning outcomes.
Content
The topics covered in this course include:
- Background on open data;
- benefits and risks of opening data;
- models related to using and publishing open data.
Methods
The course employs the following methods to support learning:
- Course presentations;
- practical examination and exploration of online services;
- assignments.
These methods are intended to foster both theoretical understanding and practical application of the course material.
Course Outline (2024-2025)
17.02.2025 | Course Overview, Characteristics of Open Data, Associated Movements, Exercise 1
24.02.2025 | Associated Principles, Exercise 2, Open Data Platforms and Organisations, Exercise 3
03.03.2025 | Assessment, Data Quality, and Best Practices, Techniques, Software, and Tools, Exercise 4
10.03.2025 | Showcases, Assignment Workshop
Exercises
Exercise 1
Find/Define your assignment!
Available options (each use case is allocated on a first-come, first-served basis, except UC15, which can be assigned to multiple students):
- The deletion of datasets from data.gov: since Donald Trump's re-election as President of the United States, health and climate information and datasets have disappeared from federal websites [see https://www.theverge.com/news/604484/donald-trumps-data-purge-has-begun and https://fediscience.org/@petersuber/113874200243958520].
  - UC01: The disappearance of health data from official US websites: an analysis.
  - UC02: The disappearance of environmental data from official US websites: an analysis.
- UC03: Linked Open Data Implementation: Comparing practices across European national libraries
- UC04: Comparing CKAN, uData and Piveau for managing data portals
- UC05: IIIF Implementation in Swiss libraries: a comparative analysis of digital collections accessibility across institutions
- UC06: Canton-level Open Data: Comparing data publishing practices across Swiss cantons
- UC07: Art Museum Licensing Practices: A comparative analysis
  - Case studies:
    - Rijksmuseum (Netherlands)
    - Le Louvre (France)
    - British Museum (UK)
    - Kunstmuseum Basel (Switzerland)
    - National Gallery of Art (USA)
- UC08: Swiss Healthcare Statistics: Accessibility and reusability of cantonal health data
- UC09: Swiss Public Transportation Data: Analysis of SBB’s open data usage in third-party applications
- UC10: Comparing MeteoSwiss open data practices with other European meteorological services
- UC11: IIIF for Education: analysing implementation patterns and pedagogical uses across academic digital libraries
- UC12: Swiss Energy Data: Examining transparency in power consumption data across municipalities
- UC13: Open Data Community Events in Switzerland (2019-2024): Analysis of event types (Open Data Beer, GeoBeer, Linked Data Meetups, Love Data Week, etc.), stakeholder participation (government, academia, GLAM institutions), and their potential impact
- UC14: Swiss Federal Linked Data Ecosystem: Analysis of technical infrastructure, vocabularies, and semantic interoperability across the Federal Administration
- UC15: Find it yourself! Select one or more datasets from one or several platforms discussed during the course (if several datasets are selected, there must be a common thread).
Exercise 2
Identify the Movements and Principles
Match each of these frameworks (OA, Open Science, FLOSS, FAIR, CARE, Collections as Data, LOUD) with exactly one of the following propositions. The frameworks overlap, but each proposition corresponds to one main framework:
- Code sharing practices
- Developer-centric data accessibility
- Unrestricted access to scholarly publications
- Persistent identifier assignment
- Digital object interoperability
- Ethical data stewardship
- Research transparency and reproducibility
Exercise 4
OpenRefine
Getting started
- Install the software (https://openrefine.org/docs)
- Run it locally (accessible at http://127.0.0.1:3333/)
- Have a look at the different pages and functionalities
- Create a new project by importing any supported file
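Before opening the browser, a short script can confirm that the local OpenRefine instance responds. This is a minimal sketch, assuming the default address listed above; the function name `is_reachable` is illustrative only and is not part of OpenRefine.

```python
import urllib.request

# Default local address given in the steps above.
OPENREFINE_URL = "http://127.0.0.1:3333/"


def is_reachable(url=OPENREFINE_URL, timeout=2.0):
    """Return True if an HTTP server answers at the given URL (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # Covers refused connections, timeouts, and HTTP error responses.
        return False


if __name__ == "__main__":
    if is_reachable():
        print("OpenRefine appears to be running at", OPENREFINE_URL)
    else:
        print("Nothing is answering at", OPENREFINE_URL, "- start OpenRefine first.")
```

If the check fails, start OpenRefine and run the script again before creating a project.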
Create a project with an extract from the CAS photographic archives
- Create a project by importing the data extract from the CAS photographic archives, ekws_extract.csv (which can be found on Cyberlearn)
- Review the dataset
- Clean the dataset by removing unnecessary columns
- Undertake some reconciliation with external services for agents (people and institutions).
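The review and column-removal steps above can also be sketched outside OpenRefine, which may help when deciding which columns are unnecessary. This is only an illustration under stated assumptions: the sample data and column names below are invented stand-ins, since the actual structure of ekws_extract.csv is not described here, and `fill_rate` and `drop_columns` are hypothetical helper names.

```python
import csv
import io

# Invented stand-in data; the real ekws_extract.csv columns may differ.
SAMPLE_CSV = """id,title,photographer,internal_note
1,Village street,A. Example,checked
2,Mountain pass,B. Example,
"""


def fill_rate(rows):
    """Share of non-empty values per column, as a quick review aid."""
    if not rows:
        return {}
    return {
        col: sum(1 for row in rows if row[col].strip()) / len(rows)
        for col in rows[0]
    }


def drop_columns(rows, unwanted):
    """Return copies of the rows without the unwanted columns."""
    return [
        {key: value for key, value in row.items() if key not in unwanted}
        for row in rows
    ]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(fill_rate(rows))           # sparsely filled columns are candidates for removal
cleaned = drop_columns(rows, {"internal_note"})
print(list(cleaned[0].keys()))   # columns kept after cleaning
```

To try it on the real extract, replace `io.StringIO(SAMPLE_CSV)` with an opened file handle. Reconciliation against external services is not covered by this sketch and is best done in OpenRefine itself.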
A step further…
- Create a new project by importing a dataset from one of the ORD/OGD portals
- Analyse and curate the dataset
Alternatively: go through this tutorial from Library Carpentry: https://librarycarpentry.org/lc-open-refine/
Course Assessment
- Analyse, describe and identify the use case or the (potential) uses of the dataset(s).
- Between 900 and 1,100 words (excluding references), in either PDF or .qmd format (use .qmd if you would like the paper to be published on this website). Please refer to the template.
- Short paper to be submitted either to the lecturer by email or via a pull request on the GitHub repository by Friday 14 March.
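The word limit can be checked with a short script before submission. This is a rough sketch, assuming the reference list starts at a line reading "References" (optionally preceded by `#` heading markers); the function names are hypothetical, and the count is approximate because markup such as links or code is not stripped.

```python
import re

MIN_WORDS, MAX_WORDS = 900, 1100  # limits stated in the assessment brief


def count_body_words(text, references_heading="References"):
    """Count words that appear before the references heading (assumed layout)."""
    body_lines = []
    for line in text.splitlines():
        # Treat "References" and "## References" alike.
        if line.strip().lstrip("#").strip().lower() == references_heading.lower():
            break
        body_lines.append(line)
    # A word is a run of word characters, allowing inner apostrophes and hyphens.
    return len(re.findall(r"\w+(?:['’-]\w+)*", " ".join(body_lines)))


def within_limit(text):
    """Return True if the body word count falls inside the allowed range."""
    return MIN_WORDS <= count_body_words(text) <= MAX_WORDS


if __name__ == "__main__":
    draft = "word " * 950 + "\n## References\nExample entry"
    print(count_body_words(draft), within_limit(draft))
```

For a real draft, read the .qmd file into a string and pass it to `within_limit`.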
This assignment is weighted at 20% of the 7C2-CT module.
Criteria
Introduction and Contextualisation | 5
Analysis and Argumentation | 20
Structure and Writing | 10
Presentation and Referencing | 5
Total | 40
Additional details will be discussed during the course.