Repository updated
Unveiling the Secrets of Zurich’s Nightly Visitors (1780-1818)
The Zentralbibliothek Zürich has digitized a collection of historical documents known as the "Zürcher Nachtzedel," spanning from 1780 to 1818. These nightly registers, meticulously recorded by the town’s "Nachtschreiber," capture the names of visitors staying in Zürich’s inns each evening, offering a fascinating glimpse into the past. Notable figures such as Goethe and Hölderlin are among those whose stays in Zürich are documented, making this collection an invaluable resource for historians, researchers, and enthusiasts alike. However, this rich data is currently locked in image format, without machine-readable text (OCR). To unlock its potential, we need your help!
Your Mission:
Join us in transforming this historical data into a valuable, searchable resource by applying OCR technology and verifying the results. Once we have high-quality text, we can visualize and analyze the data to uncover who stayed in Zürich and how often, revealing patterns of social interaction and stratification.
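As a rough illustration of the "who stayed and how often" question, here is a minimal sketch that counts repeated guests once names have been extracted. It assumes the verified text has already been reduced to one guest name per entry; the helper names (`normalize`, `count_stays`, `repeated_guests`) and the placeholder sample are hypothetical and are not taken from the repository.

```python
from collections import Counter


def normalize(name: str) -> str:
    """Collapse whitespace and ignore case so spelling variants of a name match."""
    return " ".join(name.split()).casefold()


def count_stays(names):
    """Count how often each normalized name occurs, skipping empty entries."""
    return Counter(normalize(n) for n in names if n.strip())


def repeated_guests(names, min_stays=2):
    """Return only the guests that appear at least `min_stays` times."""
    counts = count_stays(names)
    return {name: c for name, c in counts.items() if c >= min_stays}


# Placeholder entries, not real register data.
sample = ["Guest A", "guest  a", "Guest B", ""]
print(repeated_guests(sample))
```

Historical spellings vary far more than case and spacing, so a real pipeline would probably need fuzzy matching or manual review on top of this.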
What We're Looking For:
- OCR Experts: We need 2-3 participants with experience in OCR software to extract text from the digitized images with accuracy.
- Data Verifiers: We need several meticulous individuals to review the OCR-generated text, ensuring it correctly identifies names, dates, and locations. Your task will be to verify and correct any errors, ensuring the data's integrity.
- Named Entity Recognition and Linking Specialists: We want to automatically distinguish between persons, locations, and possibly occupations, and link as many of these entities as possible to authority resources such as Wikidata. Can you help us with best-practice scripts or AI prompts?
- Data Visualizers: Help us visualize the history of nightly visitors to Zürich: When did people visit Zürich, and from which places?
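For the entity-linking role in the list above, one possible starting point is the Wikidata `wbsearchentities` API, which returns candidate items for a search string. The sketch below only builds the request URL and does not send it. The function name `build_wikidata_search_url` and the defaults (German labels, five candidates) are assumptions made for illustration.

```python
from urllib.parse import urlencode, urlparse, parse_qs

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_wikidata_search_url(name: str, language: str = "de", limit: int = 5) -> str:
    """Build a wbsearchentities URL that looks up candidate items for a name."""
    params = {
        "action": "wbsearchentities",
        "search": name,
        "language": language,  # language of the labels to search in
        "type": "item",
        "limit": limit,
        "format": "json",
    }
    return f"{WIKIDATA_API}?{urlencode(params)}"


# Goethe is one of the visitors mentioned in the challenge description.
url = build_wikidata_search_url("Johann Wolfgang von Goethe")
print(url)
```

The candidates returned by such a request would still need disambiguation, for example by comparing life dates with the date of the register entry.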
Why Participate?
- Contribute to History: Your work will help make this unique dataset accessible to researchers worldwide, shedding new light on Zürich’s historical visitors.
- Sharpen Your Skills: Whether in OCR technology, data verification, named entity recognition and linking, or historical research, this challenge offers a chance to hone your expertise in a meaningful project.
- Collaborate and Network: Work alongside passionate individuals, share knowledge, and make connections that could last beyond the hackathon.
Join us in this exciting challenge to bring Zürich's past to life and make history accessible to all!
Link to GitHub repo:
If you want to handle the images, this repo might be helpful...
Link to our data set:
Challenge 14 "Unveiling the Secrets of Zurich’s Nightly Visitors (1780-1818)" /GLAMhack 2024
This repo contains the JPG images from the Zürcher Nachtzedel (1780-1784) and an OCR example using Claude 3.5 on Colaboratory.
Contents
- The "ideas" folder contains:
  - Python notebook for OCR with Claude 3.5
  - OCR results in txt format (in the folder "ocrmitclaude")
  - Text contents in TEI/XML format (in the folder "tei_beispiele")
- The "images" folder contains:
  - Images from the Zürcher Nachtzedel (1780-1784). The images here are not in the original format.
  - Images with odd numbers have been removed because they show the rear sides of the flyers, which carry no additional information.
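If you work from a fuller set of scans, the odd/even rule described above can be reproduced with a small filter. This is only a sketch: it assumes each filename ends in a running number before the extension (for example `page_0002.jpg`), which is a hypothetical naming scheme and may not match the actual files.

```python
import re
from pathlib import Path


def trailing_number(filename: str):
    """Return the number at the end of the file stem, or None if there is none."""
    match = re.search(r"(\d+)$", Path(filename).stem)
    return int(match.group(1)) if match else None


def keep_front_sides(filenames):
    """Keep even-numbered files; odd numbers are treated as rear sides."""
    kept = []
    for name in filenames:
        number = trailing_number(name)
        if number is not None and number % 2 == 0:
            kept.append(name)
    return kept


# Hypothetical filenames for illustration only.
names = ["page_0001.jpg", "page_0002.jpg", "page_0003.jpg", "page_0004.jpg", "notes.txt"]
print(keep_front_sides(names))
```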
Project
Event finish
Joined the team
Merge branch 'main' of https://github.com/NbtKmy/nachtzeddel (@wogsland)
presentation ready (@wogsland)
Project
added (@NbtKmy)
even more data (@Rouven-Schabinger)
Merge branch 'main' of https://github.com/NbtKmy/nachtzeddel (@wogsland)
better script (@wogsland)
Update of data with the new files from 1782 (@pablogit)
Update of short tables (@pablogit)
Update after adding 1782 txt files (@pablogit)
Added exports for lines with Dandelion API places (@pablogit)
geojson added (@NbtKmy)
Add files via upload (@sarahkiener)
Merge branch 'main' of https://github.com/NbtKmy/nachtzeddel (@wogsland)
wikipedia links (@wogsland)
use data of all the years (@Rouven-Schabinger)
added places from dandelion API (@pablogit)
find repeated guests in 5000 stays (@Rouven-Schabinger)
Add files via upload (@sarahkiener)
streamline heatmap (@Rouven-Schabinger)
add: use NER spots and look for wikipedia coordinates (@Rouven-Schabinger)
ai generated project poster (@Rouven-Schabinger)
added (@NbtKmy)
ocr notebook update (@pablogit)
Merge branch 'main' of https://github.com/NbtKmy/nachtzeddel (@pablogit)
dandelion api results (@pablogit)
add things (@NbtKmy)
add real data (@Rouven-Schabinger)
WIP cleanup (@wogsland)
Add files via upload (@sarahkiener)
add historical map (@Rouven-Schabinger)
weirdness (@wogsland)
WIP (@wogsland)
Add heatmap plotting code (@ficovaz)
Delete txt.txt file (@ficovaz)
bla (@NbtKmy)
everybody gets some links... (@wogsland)
txt from 1781 added (@NbtKmy)
Merge branch 'main' of https://github.com/NbtKmy/nachtzeddel (@pablogit)
1783-01 to 1784-12 txt extractions (@pablogit)
Generated map (@ficovaz)
Add script and data (@ficovaz)
Create directory (@ficovaz)
txts from 1781 (@NbtKmy)
some texts added (1781) (@NbtKmy)
Merge branch 'main' of https://github.com/NbtKmy/nachtzeddel (@wogsland)
hits, mostly garbage (@wogsland)
This data is quirky and tangled and reminds me of the SBB conductors and their surveys of GA-passengers. You've made it clear there are still many stories here to dig into and illustrate. Here's to the night sleuths! 🕵🏼