Text Analysis Challenge: Detect Looted Art.

Help Automate Analysis, Flagging and Ranking of Museum Art Provenance Texts by the Probability of a Hidden History

The Question: How to sift through the millions of objects in museums to identify top priorities for intensive research by humans?
The Goal: Automatically Classify and Rank 70,000 art provenance texts by probability that further research will turn up a deliberately concealed history of looting, forced sale, theft or forgery.
The Challenge: Analyse texts quickly for Red Flags, quantify, detect patterns, classify, rank, and learn: whatever it takes to produce a reliable list of top suspects.

Three datasets will be provided for this challenge.

1) DATASET: 70,000 art provenance texts for analysis
2) DATASET: 1,000 Red Flag Names
3) DATASET: 10 Key Words or Phrases
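
As a starting point, the Red Flag Names and Key Words or Phrases could be matched against each provenance text with a case-insensitive search. The sketch below is only one possible approach, not the method used in the project repository. The function names `compile_terms` and `find_flags` are hypothetical, and the entries in `RED_FLAG_NAMES` and `KEY_PHRASES` are placeholders, not values taken from the real datasets.

```python
import re

# Placeholder entries for illustration only; the real lists come from
# the Red Flag Names and Key Words or Phrases datasets.
RED_FLAG_NAMES = ["Example Dealer", "Sample Collector"]
KEY_PHRASES = ["forced sale", "confiscated"]


def compile_terms(terms):
    """Build one case-insensitive regex that matches any term as a whole word or phrase."""
    cleaned = [t.strip() for t in terms if t.strip()]
    if not cleaned:
        # A pattern that can never match, so an empty list yields no hits.
        return re.compile(r"(?!)")
    # Longest terms first so longer phrases win over their own prefixes.
    escaped = sorted((re.escape(t) for t in cleaned), key=len, reverse=True)
    return re.compile(r"(?<!\w)(?:" + "|".join(escaped) + r")(?!\w)", re.IGNORECASE)


def find_flags(text, name_pattern, phrase_pattern):
    """Return the distinct names and phrases found in one provenance text, lowercased."""
    names = sorted({m.group(0).lower() for m in name_pattern.finditer(text)})
    phrases = sorted({m.group(0).lower() for m in phrase_pattern.finditer(text)})
    return {"names": names, "phrases": phrases}


if __name__ == "__main__":
    sample = "Sold by Example Dealer, Paris; later a forced sale."
    print(find_flags(sample, compile_terms(RED_FLAG_NAMES), compile_terms(KEY_PHRASES)))
```

Compiling each list once and reusing the pattern keeps the scan over 70,000 texts cheap. Exact matching will miss spelling variants of names, so fuzzy matching may be worth exploring as a next step.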

TRIAGE: You're the doctor and the texts are your patients! Who's in good health and who's sick? How sick? With what disease? What kind of tests and measurements can we perform on the texts to help us reach a diagnosis? What kind of markers should we look for? How do we look for them?
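
One way to turn the triage idea into something measurable is to give each text a weighted score from a few markers and sort the texts by that score. The sketch below is a minimal illustration under stated assumptions: the marker list `UNCERTAINTY_WORDS`, the weights, and the function names `score_text` and `rank_texts` are all hypothetical and would need to be tuned against texts with known histories.

```python
# Illustrative markers of vague or hedged provenance wording.
# These are assumptions, not part of the provided datasets.
UNCERTAINTY_WORDS = ("probably", "possibly", "unknown", "?")


def score_text(text, red_flag_names, key_phrases,
               w_name=3.0, w_phrase=2.0, w_uncertain=1.0):
    """Return a simple weighted 'sickness' score for one provenance text."""
    lowered = text.lower()
    # Count how many distinct names and phrases appear at least once.
    name_hits = sum(1 for name in red_flag_names if name.lower() in lowered)
    phrase_hits = sum(1 for phrase in key_phrases if phrase.lower() in lowered)
    # Count every occurrence of an uncertainty marker.
    uncertain_hits = sum(lowered.count(word) for word in UNCERTAINTY_WORDS)
    return w_name * name_hits + w_phrase * phrase_hits + w_uncertain * uncertain_hits


def rank_texts(records, red_flag_names, key_phrases):
    """Sort (record_id, text) pairs into (score, record_id) pairs, highest score first."""
    scored = [(score_text(text, red_flag_names, key_phrases), record_id)
              for record_id, text in records]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    # Placeholder records and term lists for illustration only.
    records = [
        ("a", "Gift of the artist, 1950."),
        ("b", "Example Dealer; forced sale; possibly Paris?"),
    ]
    for score, record_id in rank_texts(records, ["Example Dealer"], ["forced sale"]):
        print(record_id, score)
```

A hand-weighted score like this is easy to inspect and explain to researchers, which matters when the output is a priority list for human review. It could later be replaced or complemented by a trained classifier if labelled examples become available.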

Download Provenance Texts Dataset: CSV

See code at https://github.com/parisdata/GLAMhack2020




The contents of this website, unless otherwise stated, are licensed under a Creative Commons Attribution 4.0 International License.

GLAMhack 2020