Identifying colonial traces in early modern travelogues

## Challenge

Zentralbibliothek Zürich provides a text corpus of printed travelogues from the 16th to the 19th century. Can you identify and extract colonial traces in these French and German texts? For instance, you could try to describe the gaze on the "other" by using NLP methods to track down mentions of geographic regions, different languages, certain ethnicities or one of the following semantic fields:

- the concept of the "noble savage"
- slavery
- representation of dominance
- hierarchies and structures of control

The last field, for example, could be examined from an economic, military, cultural or political perspective. Starting from a military perspective, word clusters like "Truppe – Fortifikation – Kriegszug – verfolgt – Beute – Treueeid – schwören – Vasall – kniend – Tribut" could be interesting to search for in the German texts.
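A word-cluster search of this kind can be sketched in a few lines of Python. This is only a minimal illustration, not the approach the team used: the function name `count_cluster_terms` and the sample sentence are invented, and the prefix matching is a crude stand-in for proper lemmatisation.

```python
import re
from collections import Counter

# Word cluster taken from the challenge text (military perspective).
MILITARY_TERMS = [
    "Truppe", "Fortifikation", "Kriegszug", "verfolgt", "Beute",
    "Treueeid", "schwören", "Vasall", "kniend", "Tribut",
]


def count_cluster_terms(text, terms=MILITARY_TERMS):
    """Count case-insensitive prefix matches of each term in a text.

    Hypothetical helper: a term matches any word that starts with it,
    so "Truppe" also counts "Truppen".
    """
    counts = Counter()
    for term in terms:
        # \b anchors the match at a word start; \w* absorbs inflection endings.
        pattern = re.compile(r"\b" + re.escape(term) + r"\w*", re.IGNORECASE)
        hits = pattern.findall(text)
        if hits:
            counts[term] = len(hits)
    return counts


# Invented sample sentence, not taken from the corpus.
sample = "Die Truppen zogen ab, und der Vasall zahlte Tribut."
print(count_cluster_terms(sample))
```

Early modern spelling variants and OCR errors will not be caught by exact prefixes like these, so the counts should be read as a lower bound.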

***Postcolonial reading of travelogue data***

What perspectives on colonialism that are not normally considered in travel narratives can be gained from the extracted entities?

**Listing names of persons**

- Shift the focus away from the positioning and the role of the author in the colonial project

- Undertake a schematic grouping: identify intermediaries, rank-and-file employees or marginalized voices

**Listing the names of organisations**

- Map the colonial infrastructure in an arena

**Listing the names of historical places**

- Compare how the names used for places differ over time
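As a rough sketch of how such listings could be produced, the snippet below tallies entity strings per entity type across page objects. The keys `"entities"`, `"text"` and `"type"`, the labels `PER`, `ORG` and `LOC`, and the sample pages are all assumptions made for illustration; the project's actual data structure may differ.

```python
from collections import Counter, defaultdict


def tally_entities(pages):
    """Count entity strings per entity type across all page objects.

    Each page is assumed to be a dict with a hypothetical "entities" list
    of {"text": ..., "type": ...} dicts.
    """
    tallies = defaultdict(Counter)
    for page in pages:
        for entity in page.get("entities", []):
            tallies[entity["type"]][entity["text"]] += 1
    return tallies


# Invented sample pages, not taken from the corpus.
pages = [
    {"page": 1, "entities": [{"text": "Batavia", "type": "LOC"},
                             {"text": "Johann Müller", "type": "PER"}]},
    {"page": 2, "entities": [{"text": "Batavia", "type": "LOC"}]},
    {"page": 3},  # a page without recognised entities
]

tallies = tally_entities(pages)
for entity_type, counter in sorted(tallies.items()):
    print(entity_type, counter.most_common())
```

Sorting each list by frequency is only one possible ordering; for comparing place names over time, the tallies would additionally need to be keyed by publication year.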

## Process

We used the SBB-NER-Tagger, developed by the Staatsbibliothek zu Berlin (SBB), for Named Entity Recognition in the travelogues. With it, we were able to extract person, organization and place entities from the OCR texts. The SBB-NER-Tagger contains a BERT-based model that has been trained on the SBB collections of early modern prints in German, French and English and, at first sight, seems to produce decent results.
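Token-level taggers typically label each token separately, so multi-word names have to be merged afterwards. The sketch below shows one way to do that, assuming the tagger output has been brought into a list of token dicts with BIO-style labels such as `B-PER`, `I-PER` and `O`. The keys `"word"` and `"prediction"`, the function name `group_entities` and the sample tokens are assumptions for illustration, not a documented interface of the SBB-NER-Tagger.

```python
def group_entities(tokens):
    """Merge token-level BIO tags into (entity_text, entity_type) pairs.

    `tokens` is assumed to be a list of dicts with the hypothetical keys
    "word" and "prediction", e.g. {"word": "Batavia", "prediction": "B-LOC"}.
    """
    entities = []
    current_words, current_type = [], None

    def flush():
        # Close the entity collected so far, if any, and reset the buffer.
        nonlocal current_words, current_type
        if current_words:
            entities.append((" ".join(current_words), current_type))
        current_words, current_type = [], None

    for token in tokens:
        tag = token["prediction"]
        if tag == "O":
            flush()
            continue
        prefix, _, entity_type = tag.partition("-")
        # A "B-" tag or a change of type starts a new entity.
        if prefix == "B" or entity_type != current_type:
            flush()
            current_type = entity_type
        current_words.append(token["word"])
    flush()
    return entities


# Invented sample tokens, not taken from the corpus.
tokens = [
    {"word": "Herr", "prediction": "O"},
    {"word": "Johann", "prediction": "B-PER"},
    {"word": "Müller", "prediction": "I-PER"},
    {"word": "reiste", "prediction": "O"},
    {"word": "nach", "prediction": "O"},
    {"word": "Batavia", "prediction": "B-LOC"},
]
print(group_entities(tokens))
```

An `I-` tag that appears without a preceding `B-` tag is treated as the start of a new entity here, which is a lenient choice for noisy OCR input.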

We created one JSON file per text page, starting from the original JSON file of the dataset, and enhancing each page object with the identified entity strings and types (person, place, organization).
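The following is a minimal sketch of this step, not the project's actual script. The page keys `"page"`, `"text"` and `"entities"`, the file naming scheme `page_0012.json`, and both function names are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def enrich_page(page, entities):
    """Return a copy of the page object with an added "entities" list.

    `entities` is assumed to be a list of (text, type) tuples.
    """
    enriched = dict(page)
    enriched["entities"] = [
        {"text": text, "type": entity_type} for text, entity_type in entities
    ]
    return enriched


def write_page_files(pages, out_dir):
    """Write one JSON file per page object and return the written paths."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for page in pages:
        # Hypothetical naming scheme: zero-padded page number.
        target = out_path / f"page_{page['page']:04d}.json"
        # ensure_ascii=False keeps umlauts and accents readable in the files.
        target.write_text(
            json.dumps(page, ensure_ascii=False, indent=2), encoding="utf-8"
        )
        written.append(target)
    return written


# Invented sample page, not taken from the dataset.
page = {"page": 12, "text": "Wir erreichten Batavia."}
enriched = enrich_page(page, [("Batavia", "LOC")])

with tempfile.TemporaryDirectory() as tmp:
    paths = write_page_files([enriched], tmp)
    reloaded = json.loads(paths[0].read_text(encoding="utf-8"))

print(reloaded["entities"])
```

Copying the page object before adding the entities leaves the original dataset record untouched, which makes it easier to rerun the tagger with different settings.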

## Result

We built a prototype frontend that displays the entities found on each book page: https://luminous-speculoos-e2e07f.netlify.app/

For a reference of the entity categories used, please see the BERT documentation.

A possible next step would be to use the Named Entity Linking Tool SBB-NED from SBB to link the entities found in the texts to Wikidata objects.

Frontend source code: https://github.com/Ibrahim-Halil-Kuray/Glam-Hack-2023.git

The contents of this website, unless otherwise stated, are licensed under a Creative Commons Attribution 4.0 International License.

GLAMhack 2023