GLAMhack 2022 - taxonMAP

Challenge 3 “taxonMAP”

Alessia Guggisberg

ETH Zurich

Goal: Set up a pipeline based on a collection inventory, to find out where a given specimen is/could be filed.

Problem: When a herbarium receives a new donation, when loans are returned after several years, or when guests visit an institution to work on a set of specimens, staff invariably face the same recurring question: where are the objects in question filed, or where could they be filed? In large-scale digitization projects, specimens are revised, renamed and finally published in aggregators such as GBIF. Because big institutions cannot digitize, and hence revise, their immense collections all at once, discrepancies grow between the taxonomy used in the database and that of the physical collection, as well as between different parts of the physical collection.

Benefit: The proposed pipeline would (i) save herbarium staff considerable time when searching for or filing vouchers, while helping them avoid errors (misfiling, overlooking multiple hits or pre-existing folders), (ii) save guests considerable time when searching for their study objects, given that most institutions use a different classification system, and (iii) give herbarium curators an assessment of how up to date their naming systems are. The pipeline could also be used in other natural history collections.

Case study: The vascular plant collections of the United Herbaria Zurich Z+ZT of the University of Zurich (Z) and ETH Zurich (ZT) encompass about 2.5 million objects. Between 2019 and 2021, all species names currently used in the collections (ca. 125,000) were inventoried, along with their respective volumes measured as stack height in centimetres. About 15% of the collections, approximately 350,000 specimens, have been digitised so far and may be filed under the currently accepted names, but the remaining vouchers may follow deprecated species, genus or family concepts.
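An inventory of this kind (names plus stack height in centimetres) could feed a rough measure of how up to date the filing is. The sketch below is only an illustration under stated assumptions: the field names `stack_cm` and `status` and the function `stack_share_by_status` are hypothetical, and each name is presumed to have already received a taxonomic status from a prior check such as a GBIF mapping.

```python
from collections import defaultdict


def stack_share_by_status(inventory):
    """Return the percentage of total stack height filed under each status.

    `inventory` is a list of dicts with the assumed keys
    'stack_cm' (stack height in centimetres) and 'status'.
    """
    totals = defaultdict(float)
    for row in inventory:
        totals[row["status"]] += row["stack_cm"]

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}  # empty inventory: nothing to summarise

    return {
        status: round(100 * cm / grand_total, 1)
        for status, cm in totals.items()
    }
```

Weighting by stack height rather than by number of names gives more weight to large folders, which is closer to the amount of physical material affected by an outdated name.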

Method: (i) Import a list of taxa, including species (genus, species epithet and authority) and the corresponding family names as recorded in the institution, (ii) specify the publisher as registered in GBIF, (iii) map the taxa to GBIF, and (iv) report the following results when a given taxon name is typed in the search field:

- species, genus and family name(s) with authority as recorded in the institution (if relevant/available);

- species, genus and family name with authority as recorded in GBIF (if relevant/available), along with taxonomic status (ambiguous, synonym, accepted);

- number of vouchers from that taxon already publicly available from this institution in GBIF.
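Steps (iii) and (iv) could be sketched roughly as follows, using the public GBIF API (`/v1/species/match` for the name lookup and `/v1/occurrence/search` for the voucher count). This is a minimal sketch, not the challenge's actual implementation: the function names, the `institution_names` dictionary and the shape of the returned record are assumptions made for illustration. The status strings returned by GBIF (for example `ACCEPTED`, `SYNONYM`, `DOUBTFUL`) would still need to be mapped to the categories listed above.

```python
import json
import urllib.parse
import urllib.request

GBIF_API = "https://api.gbif.org/v1"


def fetch_json(url):
    """Download a URL and decode its JSON body (requires network access)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def lookup_taxon(name, publishing_org, institution_names, fetch=fetch_json):
    """Collect institution records, the GBIF match and the voucher count.

    `institution_names` is a hypothetical in-memory inventory mapping a
    searched name to the records filed in the institution.
    `publishing_org` is the institution's GBIF publisher key (step ii).
    """
    # Step (iii): match the name against the GBIF backbone taxonomy.
    query = urllib.parse.urlencode({"name": name})
    match = fetch(f"{GBIF_API}/species/match?{query}")

    result = {
        "query": name,
        "institution": institution_names.get(name, []),
        "gbif_name": match.get("scientificName"),
        "gbif_genus": match.get("genus"),
        "gbif_family": match.get("family"),
        "gbif_status": match.get("status"),
        "match_type": match.get("matchType"),
        "vouchers_in_gbif": None,  # stays None when GBIF finds no match
    }

    usage_key = match.get("usageKey")
    if usage_key is not None:
        # limit=0 requests only the total count, not the records themselves.
        params = urllib.parse.urlencode(
            {"taxonKey": usage_key, "publishingOrg": publishing_org, "limit": 0}
        )
        occurrences = fetch(f"{GBIF_API}/occurrence/search?{params}")
        result["vouchers_in_gbif"] = occurrences.get("count")
    return result
```

The `fetch` parameter is injectable so that the lookup can be exercised offline with canned responses; in real use the default `fetch_json` would query GBIF directly, and a production pipeline would likely add caching and error handling for the ca. 125,000 names in the inventory.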

The contents of this website, unless otherwise stated, are licensed under a Creative Commons Attribution 4.0 International License.