Hello everyone, Thank you for joining me today for my PhD defense. I’d like to extend my gratitude to my supervisors for their guidance and insights throughout this journey. My thesis examines how collaborative structures and standardized data can support meaningful exchange and reuse of information. It also reflects my interest for using web-based and community-driven technologies to transform the way we understand, share and value cultural heritage.
Let's start with a famous picture taken by Ernst Brunner in 1939 in Guarda showcasing people dancing in a circle. With the protocols and tools available, it is now relatively easy to zoom in on a digital image. What you are actually seeing is not one image, but several of them, or parts of them, known as tiles, delivered by a server and displayed by a client, here Leaflet. Both programs support the IIIF Image API, allowing compatible images to be displayed in different environments.
This image offers multiple avenues for exploration. It can be studied through various lenses: the date it was taken, when it was acquired by the photographic archives, or even when it was digitized. Notably, the photograph was also part of the iconic "Family of Man" exhibition, organized by Edward Steichen and first shown at the Museum of Modern Art in New York in 1955. Now, the question is: while we as humans understand the semantics and these different entities (object, people, exhibition), how can we ensure that machines do as well? How do we make catalogues of metadata easily shareable and interoperable across memory institutions?
Linked Open Usable Data - or LOUD - aims to bring the vision of the Semantic Web into practical, global use, especially within cultural heritage, by building on community-driven standards like the International Image Interoperability Framework (IIIF) and Linked Art. LOUD has five design principles as the baseline for usability and are targeted primarily to software developers. It is centred on JavaScript Object Notation for Linked Data (JSON-LD) as the preferred serialisation format. This snippet represents "the tip of the iceberg." Behind it lies the challenge of organizing these pieces of information in ways that different systems can interpret consistently. Achieving this level of interoperability requires consensus on metadata standards and structures.
The main research question of my thesis seeks to situate LOUD within the borader framework of cultural heritage and digital humanities. To some extent, I also explore its potential to reshape the perception and use of Linked Data. To answer this question, the thesis draws on a blend of theories that together provide a framework.
The theoretical framework of this thesis is primarily grounded in Actor-Network Theory (ANT), supported by theories like Situated Knowledges, Boundary Objects, and the Philosophy of Information. ANT was chosen as the central framework due to its strength in analyzing socio-technical networks, essential for understanding and map relationships between human and non-human actors. Complementary theories add further insight, helping to interpret specific dynamics. However, they are not always central to the analysis but rather serve to deepen understanding of contextual aspects.
As a Digital Humanist, I indeed combined quantitative and qualitative approaches. I collected data from web pages, meeting minutes, GitHub issues and conducted a survey as well as interviews. For analysis and report, I mainly used R or Python as wel as Gephi. The empirical studies follow a three-act structure, namely I studied the social fabrics of the IIIF and Linked Art communities, the PIA research project as a laboratory to implement LOUD standards, and investigated LUX, Yale Collections Discovery platform. While I may have sacrificed some depth in individual areas, it was crucial for covering LOUD’s wide-ranging applications.
With the exception of annual conferences or face-to-face workshops, the vast majority of meetings take place online. These are open, with a clear agenda written on Google Docs, and use GitHub as a central platform for collecting use cases and collaboratively writing specifications.
Consensus-building is indeed a sustaining factor, as demonstrated here with the release of the IIIF Image and Presentation APIs 3.0 in 2020. The diagram illustrates the process undertaken by the IIIF community to review, validate, and release both specifications. This structured approach begins with groups such as the Editors, who identify and propose changes. These changes are then rigorously discussed and reviewed by members of the Technical Review Committee (or TRC). We can see the distributed agency that defines this community-driven process.
This chart illustrates the patterns of community participation in Linked Art meetings held between January 2019 and March 2024. During this period, 130 individuals attended 115 meetings. The average attendance was 13.57 sessions per participant, but the median was only 2, suggesting a small core of consistently active members alongside a larger group with sporadic involvement. This pattern reflects the challenges and successes of maintaining sustained engagement, especially within a volunteer-driven environment. The challenge lies not only in attracting more participants but in encouraging those who are currently passive to engage more actively. A similar participation pattern is observed in the IIIF community.
The IIIF and Linked Art communities offer a model for transparent, open collaboration that fosters growth and innovation, though their future relevance will depend on integrating diverse perspectives to avoid perceptions of exclusivity and Anglo-Saxon bias.
PIA provided an opportunity to demonstrate how it could act as a testbed for the implementation of LOUD standards and to benefit from the best practices and tools created by the communities. PIA has developed interfaces that support indexing and interaction with digital resources, and are designed to foster reflective engagement.
This is a synoptic View of the PIA infrastructure and it showcases its Connection to the DaSCH Service Platform, where the data is preserved for the long-term, and the Cultural Anthroplogy Switzerland Photo Archive Website. The use of LOUD standards in this exploratory setting has shown how quickly these APIs can be implemented, while underscoring the importance of a sustainable, long-term strategy for their maintenance. For instance, the IIIF Image API service could be later managed uniquely through the DaSCH infrastructure, reducing backend maintenance demands. However, sustained operational support for the IIIF Presentation and Change Discovery APIs—potentially in a read-only format—would require managing hosting costs post-project.
This video recording illustrates a materiality use case within the PIA project, where the protective films of a photo album are modeled as a IIIF resource. You can also see a loose photograph alongside its original placement on the album page. In total, there are four photographs that can each be displayed as layers in the Mirador viewer. Modeling complex digital objects like this is typically handled on a case-by-case basis to faithfully represent the materiality of photographic albums. This approach, selectively applied, reflects practices at institutions like the Getty Institute, which similarly addresses unique cases by adapting standards to capture complex physical details.
PIA demonstrates how LOUD specifications, especially IIIF APIs, have the potential to support public engagement through proxy mechanisms and participatory practices, though, for now, this engagement relies primarily on reusing tools developed by community-driven processes; achieving scalability and lasting impact will require ongoing commitments to both data preservation and service maintenance beyond the project’s duration.
Transitioning from a project-specific implementation, we now look at LUX as an example of LOUD applied on a larger institutional scale, demonstrating its adaptability to meet diverse needs across multiple institutions. LUX integrates collections from the Yale University Library, Yale Center for British Art, Yale Peabody Museum, and Yale University Art Gallery.
As part of the research, I conducted interviews with people involved in the LUX initiative to gain insight into its collaborative impact and adaptability. Here is an excerpt from an interview with Heather Gendron, Director of the Robert B. Haas Family Arts Library at Yale University, who reflects on LUX as a model for collegial and productive collaboration within Yale and potentially beyond. The success of Yale’s collaboration can be attributed to a shared vision that values the connections between diverse collections across various domains. This vision has fostered active contributions over the years. This commitment shows how sustained engagement and a unified approach can improve the effectiveness and reach of institutional initiatives.
Beyond the collaborative vision, LUX’s effectiveness relies on a carefully constructed infrastructure that supports its diverse data connections. For instance, I examined the vocabulary consistency, which plays a critical role in aligning and connecting the metadata across Yale’s collections. This plot highlights the shared usage patterns of Getty Vocabulary terms between the Yale Center for British Art and the Art Gallery, revealing significant intersections in how both institutions apply controlled terms. Reconciliation through the LUX pipeline ensures that terms from controlled vocabularies align seamlessly, allowing the collections to be enriched with consistent metadata across units.
As a large-scale Linked Data initiative, LUX illustrates how socio-technical principles can align data and human contributors around common goals, achieved through iterative development and interdisciplinary collaboration. Going forward, LUX will need to prioritise user feedback, expand collections and refine metadata to improve representation and maintain its role as a flagship cultural heritage project that sets a benchmark for other institutions - although the full impact of this work is still unfolding.
In addition to the dissertation, I wrote multiple articles, often with PIA project members, and published most of the scripts and data models on GitHub to promote transparency and reusability. The three highlighted contributions focus on (1) machine-generated annotations using IIIF and the Web Annotation Data Model, (2) envisioning what a LOUD ecosystem could look like, and (3) demonstrating that LUX, through its data pipeline and architecture, enables the automated enrichment in the cultural heritage sector.
In conclusion, I emphasized LOUD’s role in improving accessibility and usability in cultural heritage through IIIF and Linked Art. The PIA project showed LOUD’s impact on participatory practices and data reuse, while JSON-LD allows for both technical rigor and accessibility, even if it doesn’t transform perceptions of Linked Data entirely. I acknowledge critiques about balancing community practices with semantic interoperability, reflecting current adoption challenges. Future research could address this by expanding engagement with the Global South, which would add cultural diversity and extend LOUD’s reach. Strengthening semantic web integration will also be important to improve interoperability without compromising accessibility.
Many thanks for your attention. I now look forward to your questions and am ready to engage in a thoughtful discussion.