Sex and Crime und Kneipenschlägereien in Early Modern Zurich
Minutes recorded by pastors in Early Modern Zurich
Make the "Stillstandsprotokolle" searchable, georeferenced, and browsable, and display them on a map.
For more information, see our GitHub repository: https://github.com/mmznr/Staatsarchiv-GLAMhack
Access the documents: archives-quickaccess.ch/search/stazh/stpzh
Data
- Primary data
- Secondary data
  - Siedlungsverzeichnis des Kantons Zürich (settlement register of the Canton of Zurich): http://www.web.statistik.zh.ch/cms_siedlungsverzeichnis/daten.php
Team
- Ernst Rosser, ernst.rosser@gmail.com
- Tobias Hodel, tobias.hodel@ji.zh.ch
- Barbara Leimgruber, Barbara.Leimgruber@ji.zh.ch
- Rebekka Plüss, Rebekka.Pluess@ji.zh.ch
- Ismail Prada, ismail.prada@gmail.com
- Matthias Mazenauer, matthias.mazenauer@statistik.ji.zh.ch
#glamhack2018
Sex and Crime und Kneipenschlägereien in der Frühen Neuzeit
Goal
Make the data ("Stillstandsprotokolle des 17. Jahrhunderts", the 17th-century Stillstand minutes) easier to search, and georeference it for visualization.
Team
- Ernst Rosser, ernst.rosser@gmail.com
- Barbara Leimgruber, Barbara.Leimgruber@ji.zh.ch
- Rebekka Plüss, Rebekka.Pluess@ji.zh.ch
- Ismail Prada, ismail.prada@gmail.com
- Matthias Mazenauer, matthias.mazenauer@statistik.ji.zh.ch
- Tobias Hodel, tobias.hodel@ji.zh.ch
Data sources
- Primary data
- Secondary data
Steps taken
- Create lookup for normalized strings (https://github.com/mmznr/Staatsarchiv-GLAMhack/blob/master/woerterStillstand_Result.tsv)
- Annotate named entities (normalization) -> places (also add BFS data) -> persons (normalization to be used for auto-complete in search)
- Cluster words -> based on "Frequenztabelle Stillstandsprotokolle", see https://github.com/mmznr/Staatsarchiv-GLAMhack/blob/master/README.md#frequency-list-of-word-cluster -> to be used to refer to topic/concept
- Cluster documents -> to be used as keyword(s) in the TEI header. For the clustering scripts, see the folder "code".
- Create a script that writes this information as tags into the XML body (in progress)
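The lookup and tagging steps above can be sketched as follows. This is a minimal illustration, not the project's script: the column names `original` and `normalized`, the sample rows, and the `PLACES` set are assumptions, and the real layout of woerterStillstand_Result.tsv may differ. The sketch wraps normalized place names in the TEI element `placeName` and other normalized spellings in `choice`/`orig`/`reg`.

```python
import csv
import io
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical sample rows; the real TSV columns may be named differently.
SAMPLE_TSV = "original\tnormalized\nZürych\tZürich\nPfarer\tPfarrer\n"
# Assumed set of normalized place names (e.g. taken from an annotated gazetteer).
PLACES = {"Zürich"}


def load_lookup(tsv_text):
    """Read a tab-separated lookup and map each original spelling to its normalized form."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row["original"]: row["normalized"] for row in reader}


def tag_text(text, lookup, places):
    """Return text with TEI-style tags around normalized words and place names."""
    out = []
    # The capturing group keeps whitespace and punctuation between the words.
    for token in re.split(r"(\W+)", text):
        norm = lookup.get(token, token)
        if norm in places:
            out.append(f"<placeName key={quoteattr(norm)}>{escape(token)}</placeName>")
        elif norm != token:
            out.append(f"<choice><orig>{escape(token)}</orig><reg>{escape(norm)}</reg></choice>")
        else:
            out.append(escape(token))
    return "".join(out)


lookup = load_lookup(SAMPLE_TSV)
print(tag_text("Der Pfarer von Zürych", lookup, PLACES))
```

Keeping the original spelling inside the tag and the normalized form in an attribute or `reg` element means the search can use the normalized form while the transcription itself stays unchanged.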
Lemmatization/Normalization
Done: Wordlist and Frequencies
ToDo: POS tagging
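A word list with frequencies like the one mentioned above could be produced roughly as follows. This is a sketch under assumptions: the tokenization rule (lowercasing, splitting on non-word characters) and the sample sentence are illustrative and not taken from the project.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count lowercase word tokens; \\w+ is Unicode-aware, so umlauts stay intact."""
    tokens = re.findall(r"\w+", text.lower())
    return Counter(tokens)


# Illustrative sentence, not a quotation from the Stillstandsprotokolle.
freqs = word_frequencies("Der Pfarrer und der Stillstand und der Vogt")
for word, count in freqs.most_common(3):
    print(f"{word}\t{count}")
```

Writing the result as tab-separated lines keeps it compatible with the TSV files already used in the repository.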
Named Entities
Names of persons: done A-D
Names of places: done A-K
Visualization
Word-Cluster
Visualization
(using fasttext)
https://github.com/mmznr/Staatsarchiv-GLAMhack/tree/master/Visualisierungen/clusters.png
https://github.com/mmznr/Staatsarchiv-GLAMhack/tree/master/Visualisierungen/clusters2.png
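To illustrate the idea behind clustering words by their vectors, here is a small self-contained sketch. It does not use fasttext and is not the project's method: the two-dimensional vectors in `TOY_VECTORS` are invented for illustration, and the greedy threshold grouping is a stand-in for whatever clustering algorithm the scripts in the "code" folder apply to the trained embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def greedy_clusters(vectors, threshold=0.9):
    """Put each word into the first cluster whose first member is similar enough."""
    clusters = []
    for word, vec in vectors.items():
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(word)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append([word])
    return clusters


# Invented toy vectors; real embeddings would have many more dimensions.
TOY_VECTORS = {
    "wein": (0.9, 0.1),
    "trunkenheit": (0.8, 0.2),
    "ehebruch": (0.1, 0.9),
    "hurerei": (0.2, 0.8),
}

print(greedy_clusters(TOY_VECTORS))
```

Each resulting group can then be given a label and used to refer to a topic or concept, as described in the steps above.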
Frequency list of Word-Cluster
https://docs.google.com/spreadsheets/d/1rFo7p9YsQRwJufMuWGw2677acOsWevcmm-lN5RVBJv4/edit?usp=sharing
GIS Visualization
https://beta.observablehq.com/@mmznrstat/sex-and-crime-und-kneipenschlagereien-in-der-fruhen-neuzei
Done: Borders from swisstopo via Linked Data; matching of the settlements of the Canton of Zurich
ToDo: Get a list of old names of these settlements, match them, and show all documents relating to a settlement (or municipality)
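One possible way to approach the open matching task is fuzzy string matching between historical spellings and modern settlement names. The sketch below uses Python's standard `difflib`; the function name `match_settlement`, the cutoff value, and the example names are assumptions for illustration only, and a curated list of old names would still be needed for spellings that differ too much.

```python
import difflib


def match_settlement(old_name, modern_names, cutoff=0.75):
    """Return the closest modern name for an old spelling, or None if nothing is close enough."""
    matches = difflib.get_close_matches(old_name, modern_names, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Illustrative names; the real list would come from the Siedlungsverzeichnis.
MODERN = ["Winterthur", "Wädenswil", "Uster"]

print(match_settlement("Wintertur", MODERN))
print(match_settlement("Xyz", MODERN))
```

A match returned this way is only a candidate and should be checked by hand before documents are linked to a settlement or municipality.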
Event finish
Update README.md (@thodel)
Merge branch 'master' of https://github.com/mmznr/Staatsarchiv-GLAMhack (@raykyn)
Added README (@raykyn)
Update search_stories.md (@thodel)
Upload code for clustering. (@raykyn)
Update woerterStillstand_Result.tsv (@thodel)
Add files via upload (@thodel)
Update woerterStillstand_Result.tsv (@thodel)
Update README.md (@thodel)
Merge branch 'master' of https://github.com/mmznr/Staatsarchiv-GLAMhack (@raykyn)
Update README.md (@thodel)
Merge branch 'master' of https://github.com/mmznr/Staatsarchiv-GLAMhack (@raykyn)
Update README.md (@thodel)
Added doc clusters with avg method. (@raykyn)
Update README.md (@thodel)
Merge branch 'master' of https://github.com/mmznr/Staatsarchiv-GLAMhack (@raykyn)
more word clusters. (@raykyn)
Update woerterStillstand_Result.tsv (@thodel)
Update and rename searchstories.txt to searchstories.md (@thodel)
New cluster lists. (@raykyn)
Update README.md (@thodel)
Update woerterStillstand_Result.tsv
Correction of problematic cases (@thodel)
Update woerterStillstand_Result.tsv
correct spellings (@thodel)
Add files via upload (@thodel)
Added doc cluster list excluding named entities. (@raykyn)
Merge branch 'master' of https://github.com/mmznr/Staatsarchiv-GLAMhack (@raykyn)
Created cluster folder. Added document cluster list. (@raykyn)
Update README.md (@thodel)
Added link to frequency table. (@raykyn)
cluster list and visualization added. (@raykyn)
Upload annotated gazetteer (Ortsverzeichnis)
With the BFS number or the number from the Siedlungsverzeichnis (@thodel)
Update README.md (@thodel)
readMe (@mmznr)
Update search_stories.txt (@thodel)
Add wordlist of eStPZH with frequencies
To get an idea of which words/strings are used most frequently. (@thodel)
Create search_stories.txt
First draft, please add (@thodel)
upload txt and docx of Stillstandsprotokolle
to search in docs (i.e. for context) in antconc or similar software (@thodel)
Siedlungsverzeichnis added (@mmznr)