Linked data approach for the description of heritage collections
Decentralized and pluralist description of heritage collections using "archives at", "documentation files at", etc. properties on Wikidata
Challenge: Linked data approach for the (decentralized and pluralist) description of heritage collections
Over the past few years, a methodology has been proposed, tested, and further elaborated that allows for a decentralized and pluralist description of heritage collections. It is “decentralized” in the sense that it does not rely on one specific authority (typically the holding institution) to describe a given collection. It is, however, “centralized” in the sense that the data are maintained on Wikidata, which facilitates their use. The approach relies on Wikidata properties such as “archives at”, “documentation files at”, and “object collection at”, as shown in the screenshot referenced below.
Source: https://commons.wikimedia.org/wiki/File:Referencing_archival_fonds_in_Wikidata.png
Authors: Bolliger, Stephanie; Brüderlin, Brigitte; Gasser, Michael; Lyskawa, Julia; Maier, Petra; Schmitt, Lothar
The approach can be applied in a straightforward, incremental manner, which allows the data to be interlinked progressively. It relies on existing infrastructures and technologies and is scalable: institutions and research communities around the world can start using it today, regardless of their resources, because the infrastructures and tools involved are accessible to all and are in line with the Digital Public Goods Standard[1]. Furthermore, the approach allows for a pluralist take on the description of heritage holdings, based on open collaboration and independent of the processes in place at individual institutions.
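As an illustration of how such statements can be read back, the following minimal sketch builds a SPARQL query for the Wikidata Query Service that lists the institutions linked from an item through “archives at” (property P485). The function name `build_holdings_query` and the item ID `Q42` are assumptions made for this example only; the IDs of the other properties mentioned above should be looked up on Wikidata before they are passed in.

```python
import re

# Public endpoint of the Wikidata Query Service (the query is only built here, not sent).
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"
ARCHIVES_AT = "P485"  # Wikidata property "archives at"


def build_holdings_query(item_qid, property_pid=ARCHIVES_AT, language="en"):
    """Return a SPARQL query listing the institutions that an item points to
    through a holdings property such as "archives at".

    Hypothetical helper for illustration; not part of any Wikidata library.
    """
    # Validate the inputs so that nothing unexpected is interpolated into the query.
    if not re.fullmatch(r"Q[1-9][0-9]*", item_qid):
        raise ValueError(f"not a Wikidata item ID: {item_qid!r}")
    if not re.fullmatch(r"P[1-9][0-9]*", property_pid):
        raise ValueError(f"not a Wikidata property ID: {property_pid!r}")
    if not re.fullmatch(r"[a-z]{2,3}(-[a-z0-9]+)*", language):
        raise ValueError(f"not a language code: {language!r}")
    return (
        "SELECT ?institution ?institutionLabel WHERE {\n"
        f"  wd:{item_qid} wdt:{property_pid} ?institution .\n"
        "  SERVICE wikibase:label "
        f'{{ bd:serviceParam wikibase:language "{language}" . }}\n'
        "}"
    )


if __name__ == "__main__":
    # Q42 serves only as a syntactically valid example identifier.
    print(build_holdings_query("Q42"))
```

The resulting string can be pasted into the query service's web interface or sent to `WDQS_ENDPOINT` with any HTTP client. Keeping query construction separate from the network call makes the sketch easy to test offline.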
State-of-the-Art Regarding the Description of Collections
For referencing and describing heritage collections on Wikidata, the approach has been worked out in detail and documented for the field of the performing arts[2], but it is extensible to any field and to all types of heritage collections. So far, most of the data entered on Wikidata concern archival holdings about persons. Further entities to link to may include organizations, places/venues, creative works, (cultural) events, etc. Data modeling practices have been studied, resulting in best-practice recommendations[3]. Approaches for adding statements based on existing collection metadata have been tested successfully[4].
The flexibility of Wikidata allows almost everything (or even “the sum of human knowledge”) to be described in its knowledge graph. Event data has so far been only sparsely covered in authority files: only about 9% of the data stored in the “Gemeinsame Normdatei” (GND) is about events[5]. At the same time, the materials stored and catalogued in heritage institutions are full of (hidden) relations to events, mentioned either in the finding aids/catalogues or in the content itself. Providing more precise event (authority) data would therefore make the retrieval of those materials much more effective. Such events may occur in any domain: political events (elections, polls), artistic events, sporting events, or any other occurrence that can be linked to a date or time period, to one or more places, and possibly to persons related to it[6].
State-of-the-Art Regarding Applications
In addition to ingesting the data into Wikidata, it is important to develop applications that make use of the data thus made available. Initial applications exist for referencing archival holdings about persons (e.g. infoboxes on French Wikipedia). Furthermore, a pilot application that integrates the approach into a library discovery system was developed in the course of a Hackathon Series among Swiss Research Libraries[7].
A dialogue with the ETH Library, SWITCH, and SLSP is currently taking place with a view to integrating such data into Swisscovery, the online discovery service of the Swiss libraries, as well as into the SWITCH Research Data Connectome[8].
A general search application that exploits both the “archives at”, “documentation files at”, etc. statements on Wikidata, in all their combinations with different types of entities, and the “depicts” statements on Wikimedia Commons is currently missing.
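To make the idea of such a search application more concrete, the sketch below shows one possible first step: grouping query results by described entity and by property, so that a single index entry lists every holding statement about an entity. It assumes results in the standard SPARQL JSON results format with the hypothetical variable names `subject`, `property`, and `institution`; the sample bindings use placeholder identifiers and do not represent real statements.

```python
from collections import defaultdict


def local_id(uri):
    """Return the last path segment of a Wikidata URI, e.g. 'Q42' or 'P485'."""
    return uri.rsplit("/", 1)[-1]


def group_holdings(sparql_json):
    """Group (subject, property, institution) bindings by subject and property.

    Hypothetical helper; the variable names are assumptions of this sketch.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for row in sparql_json["results"]["bindings"]:
        subject = local_id(row["subject"]["value"])
        prop = local_id(row["property"]["value"])
        institution = local_id(row["institution"]["value"])
        grouped[subject][prop].append(institution)
    # Convert the nested defaultdicts into plain dicts for easier inspection.
    return {s: dict(props) for s, props in grouped.items()}


def binding(subject, prop, institution):
    """Build one result row in SPARQL JSON results format (illustrative data only)."""
    return {
        "subject": {"type": "uri", "value": f"http://www.wikidata.org/entity/{subject}"},
        "property": {"type": "uri", "value": f"http://www.wikidata.org/prop/direct/{prop}"},
        "institution": {"type": "uri", "value": f"http://www.wikidata.org/entity/{institution}"},
    }


# Placeholder identifiers, not real statements.
SAMPLE = {
    "head": {"vars": ["subject", "property", "institution"]},
    "results": {"bindings": [
        binding("Q111", "P485", "Q222"),
        binding("Q111", "P485", "Q333"),
        binding("Q444", "P485", "Q222"),
    ]},
}

if __name__ == "__main__":
    print(group_holdings(SAMPLE))
```

A real tool would additionally need labels, entity types, and the “depicts” statements from Wikimedia Commons, which are queried separately and are left out of this sketch.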
Some initial ideas for what could be worked on:
- Carry out prototypical ingests of less common combinations of the “archives at”, “documentation files at”, etc. properties and different types of entities. Resolve data modeling issues. Submit additional property requests where needed. Document showcases.
- From a GLAM perspective: Ingest metadata from existing finding aids / catalogues (e.g. after interlinking named entities with Wikidata), either on a small scale (for complicated cases requiring data cleansing and reconciliation) or on a large scale (for relatively straightforward cases).
- From a researcher, volunteer contributor, etc. perspective: Add further descriptive data to entries in online finding aids / catalogues.
- Develop tools facilitating the above (e.g. a browser extension that facilitates the interlinking between entries in online catalogues and Wikidata).
- Develop a search & discovery tool making use of the data made available through Wikidata.
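For the ingest of metadata from existing finding aids, one possible route is a QuickStatements batch. The following minimal sketch turns already reconciled rows into tab-separated QuickStatements (version 1) commands that add “archives at” (P485) statements, each with an optional reference URL (S854). It assumes that the reconciliation with Wikidata has been done beforehand; the function name and the sample rows are hypothetical, with placeholder item IDs and example.org URLs.

```python
ARCHIVES_AT = "P485"    # Wikidata property "archives at"
REFERENCE_URL = "S854"  # source form of P854 "reference URL" in QuickStatements


def to_quickstatements(rows, property_pid=ARCHIVES_AT):
    """Convert (subject_qid, holder_qid, source_url) rows into QuickStatements
    V1 commands, one tab-separated command per line.

    Hypothetical helper; rows must already be reconciled to Wikidata IDs.
    """
    lines = []
    for subject_qid, holder_qid, source_url in rows:
        fields = [subject_qid, property_pid, holder_qid]
        if source_url:
            # String values, including URLs, are wrapped in double quotes.
            fields += [REFERENCE_URL, f'"{source_url}"']
        lines.append("\t".join(fields))
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder IDs and URLs for illustration only.
    sample_rows = [
        ("Q111", "Q222", "https://example.org/findingaid/fonds-1"),
        ("Q333", "Q222", None),
    ]
    print(to_quickstatements(sample_rows))
```

The generated batch should be reviewed before it is submitted, particularly for the complicated cases that require data cleansing and reconciliation.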
Author of the challenge:
- Beat Estermann (Opendata.ch / Bern Academy of the Arts)
[1] https://digitalpublicgoods.net/standard/
[2] https://www.wikidata.org/wiki/Wikidata:WikiProject_Performing_arts/Typologies#Artefacts_Documenting_Activities_Related_to_the_Performing_Arts
[3] "archives at" Statements - Towards a Best Practice
[4] Estermann, B. (2020): Wikidata for Libraries Hackday Series, This Month in GLAM, 10(1), January 2020.
[5] https://lobid.org/gnd/search?q= An overall search in lobid.org/gnd yields a filter list of all top-level entity types, from which it appears that about 880,000 entries are classified as “Veranstaltung” (event), compared to over 6,000,000 entries describing persons.
[6] While cataloging and structuring articles of the Lucerne journal “GasseZiitig”, ZHB Luzern is creating Wikidata items for several Swiss federal initiatives, such as the one on drug policy in 1998 (https://www.wikidata.org/wiki/Q119139925). Such items are typically “events” and can serve as a showcase for the flexibility of the Wikidata data schema in describing each event according to its own purpose. It is now possible to search for exactly that “event” in ZentralGut using the Wikidata QID: https://t.ly/_kkMS
[7] Estermann, B. (2020): Wikidata for Libraries Hackday Series, This Month in GLAM, 10(1), January 2020.
[8] https://www.switch.ch/connectome/