e-rara: Recognizing Mathematical Formulas and Tables

The ETH Library provides access to a large number of scientific titles on e-rara.ch, which come with OCR full text. However, these old prints often also contain mathematical formulas and tables. Such content is largely lost during OCR processing; often only individual numbers or letters are recognized. Special characters, systems of equations and tabular arrangements are typically missing from the full text.
The aim of this project is to develop a procedure for restoring this information for selected content.

Data
Images & Full texts from e-rara.ch
OAI: https://www.e-rara.ch/oai/?verb=Identify
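
As a starting point for working with the data, the following minimal sketch shows how OAI-PMH request URLs for the endpoint above could be built and how an `Identify` response could be parsed. The helper names `build_oai_url` and `parse_identify` are hypothetical, and `SAMPLE_IDENTIFY` is a hand-written illustration of the response structure defined by the OAI-PMH standard, not actual output from e-rara.ch. The `oai_dc` metadata prefix is the format the standard requires every repository to support; which other formats e-rara offers is not stated here.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Base URL taken from the OAI link in the Data section.
BASE_URL = "https://www.e-rara.ch/oai/"
# XML namespace defined by the OAI-PMH 2.0 standard.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_oai_url(verb, **params):
    """Build an OAI-PMH request URL (hypothetical helper)."""
    query = {"verb": verb, **params}
    return BASE_URL + "?" + urlencode(query)


def parse_identify(xml_text):
    """Extract a few fields from an OAI-PMH Identify response."""
    root = ET.fromstring(xml_text)
    identify = root.find("oai:Identify", OAI_NS)
    if identify is None:
        raise ValueError("response contains no Identify element")
    fields = ("repositoryName", "baseURL", "protocolVersion")
    return {
        name: identify.findtext("oai:" + name, namespaces=OAI_NS)
        for name in fields
    }


# Hand-written sample for illustration only; values are placeholders.
SAMPLE_IDENTIFY = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <Identify>
    <repositoryName>Example Repository</repositoryName>
    <baseURL>https://www.e-rara.ch/oai/</baseURL>
    <protocolVersion>2.0</protocolVersion>
  </Identify>
</OAI-PMH>"""

identify_url = build_oai_url("Identify")
records_url = build_oai_url("ListRecords", metadataPrefix="oai_dc")
info = parse_identify(SAMPLE_IDENTIFY)
print(identify_url)
print(records_url)
print(info)
```

The sketch deliberately stays offline. To query the live endpoint, the URL from `build_oai_url` could be passed to `urllib.request.urlopen` and the response body handed to `parse_identify`.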

Contact
Team Rare Books and Maps ETH Library
Melanie, Oliver, Sidney, Roman
ruk@library.ethz.ch

Event finished

17.04.2021 17:30

Training the model

17.04.2021 13:23 ~ ruk_ethbib

Event started

16.04.2021 08:30

Edited content

14.04.2021 05:25 ~ ruk_ethbib

Joined the team

01.04.2021 13:27 ~ ruk_ethbib

Challenge posted

01.04.2021 13:27 ~ ruk_ethbib
 
Contributed 2 years ago by ruk_ethbib for GLAMhack 2021

The contents of this website, unless otherwise stated, are licensed under a Creative Commons Attribution 4.0 International License.
