Exercise 5: The Reuser’s Perspective (OGD)
The Reuser’s Perspective (OGD)
Browse, search and do what is described below. You can do this exercise alone or in pairs. The whole exercise should take between 20 and 30 minutes. Afterwards, you will share what you have discovered and/or created with the class.
A - Exploring and Reusing Swiss OGD
Go to opendata.swiss. What are the three cantons that provide the most datasets on opendata.swiss?
Go to the Ist-Daten (Target/Actual Comparison of SBB departure/arrive times) page (https://opendata.swiss/en/perma/ist-daten-sbb@schweizerische-bundesbahnen-sbb) and download the CSV file. Open it in a spreadsheet application (Excel, LibreOffice Calc, Google Sheets) or a text editor. The file contains data from the previous day depending on when you download it. Answer the following:
Most punctual station: Which train station (
Stop name) has the best on-time performance? Compare the scheduled times (Arrival time/Departure time) with the actual times (Arrival actual time/Departure actual time) to determine where delays are smallest on average.Most delayed station: Conversely, which station has the highest average delay?
Cancelled trains: How many trains were cancelled on that day? Look at the
Cancelled TFcolumn (wheretrueindicates a cancellation).
Now try to get transport information via an API instead of downloading a file. The Swiss public transport API at https://transport.opendata.ch/ is freely accessible without authentication. Try the following:
Look up the next departures from a station of your choice, e.g.:
https://transport.opendata.ch/v1/stationboard?station=Bern&limit=5Try changing the
stationparameter to another station and observe the results.Compare the experience: what are the advantages and limitations of using a live API versus analysing a downloaded CSV file?
How many mailboxes are there in your commune or city? Explore the application at https://wieviele-briefkaesten-gibt-es.streamlit.app/. Where do these data originally come from?
- Q1: Use the “Organisation” or “Canton” filter on the opendata.swiss search page to see the counts.
- Q2a/b: In a spreadsheet, calculate delays by subtracting the scheduled time from the actual time. Try using a pivot table or
AVERAGEIF-style formulas grouped byStop name. Ignore rows where times are missing or whereArrival forecast status/Departure forecast statusshow unusual values (e.g.UNBEKANNT). Stations with very few stops may appear artificially punctual or delayed — consider filtering for stations with a minimum number of stops. - Q2c: Filter or count rows where
Cancelled TFequalstrue. In Excel/LibreOffice, you can use=COUNTIF(column, "true"). - Q3a/b: The transport.opendata.ch API returns JSON. You can paste the URL directly in your browser to see the result. Try the
/v1/connectionsendpoint as well (e.g.?from=Lausanne&to=Genève). The API documentation at https://transport.opendata.ch/docs.html describes all available parameters. - Q3c: Think about: real-time vs. historical data, ease of use, volume of data retrievable, and what you can do with each approach.
- Q4: The application estimates the number of delivery addresses / mailboxes (Zustelladressen / Briefkästen) for a user-defined area. To investigate where the data come from, you can look at the source code on GitHub (https://github.com/davidoesch/wo-sind-briefkaesten) or use your browser’s developer tools (Network tab) while the app loads to see which data sources are being fetched. The underlying data come from the Federal Building and Housing Register (GWR) of the Federal Statistical Office (BFS) and business locations from the Overture Maps Foundation.
B - OGD as Linked Data (LINDAS)
LINDAS (Linked Data Service) is the Swiss federal platform for publishing and querying government data as Linked Data.
Visualize
Go to visualize.admin.ch and use the Verkehrsleistung auf den Flugplätzen dataset (published by the Bundesamt für Zivilluftfahrt BAZL). This dataset records the number of annual air traffic movements (1 movement = 1 take-off or 1 landing) at Swiss airports.
For the year 2023, were there more flight movements at Geneva or at Basel? Try to visualise this comparison on a map.
Look at the data for Zurich Airport across multiple years. By how many movements (or what percentage) did flight traffic drop in 2020 and 2021 compared to 2019?
SPARQL
Go to the LINDAS SPARQL Endpoint.
Query the same aviation dataset via SPARQL. The dataset is available as a Linked Data cube at
https://transport.ld.admin.ch/foca/FOCA_Air_traffic_movements/5. Try to retrieve the number of flight movements per airport for a given year. Start by exploring the structure of the data with a simple query, then refine.Now try a completely different dataset: the Archive database of the Swiss Federal Archives (
https://culture.ld.admin.ch/.well-known/dataset/ais). This dataset uses the Records in Context (RiC) ontology. Try the following query to find archival records related to “Bern”:PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX rico: <https://www.ica.org/standards/RiC/ontology#> SELECT ?record ?identifier ?title ?beginDate ?endDate ?expressedDate ?holder WHERE { GRAPH <https://lindas.admin.ch/sfa/ais> { ?record rdf:type rico:Record . ?record rico:title ?title . ?record rico:identifier ?identifier . OPTIONAL { ?record rico:hasOrHadHolder ?holder } OPTIONAL { ?record rico:isAssociatedWithDate ?date . OPTIONAL { ?date rico:beginningDate ?beginDate } OPTIONAL { ?date rico:endDate ?endDate } OPTIONAL { ?date rico:expressedDate ?expressedDate } } FILTER(CONTAINS(LCASE(?title), "bern")) } } ORDER BY ?beginDate LIMIT 100Run this query, then try modifying it to search for a different topic (e.g. a different city, a subject that interests you, or a specific date range).
- Q1: On visualize.admin.ch, search for “Verkehrsleistung” or “Flugplätzen” to find the dataset. Once selected, choose a map visualisation and filter by the year 2023. Compare the circle sizes or values for Geneva and Basel.
- Q2: Select the line chart view and filter for Zurich Airport only.
- Q3: To explore the aviation cube via SPARQL, start with a
DESCRIBEquery to understand the structure, or trySELECT * WHERE { ?s ?p ?o } LIMIT 20within the FOCA graph. The observations contain dimensions for airport, year, and a measure for the number of movements. You may need to experiment with the predicate URIs — look at what properties the observations have. - Q4: Records in Context (RiC) is an ontology developed by the International Council on Archives (ICA) for describing archival records as Linked Data. In the query,
rico:Recordrepresents an archival record,rico:titleits title, andrico:isAssociatedWithDatelinks to date information. Try replacing"bern"in theFILTERwith another term, or remove the filter entirely and addFILTER(?beginDate > "1900-01-01"^^xsd:date)to explore records from the 20th century onwards.