Audio Analysis Challenge

(07) Retrieve as much information as possible from an audio collection, through various Machine Learning/Natural Language Processing methods

Challenge

Retrieve as much information as possible from an audio collection, through various Machine Learning/Natural Language Processing methods:

  • speech-to-text
  • speech emotion recognition / sentiment analysis (on the transcribed text or directly on the audio, if feasible): classify and tag the sentiment of the speech or speakers by polarity (positive, negative, or neutral) or by finer-grained emotions
  • possibly, data visualizations based on the results (e.g., https://50-jahre-hitparade.ch/analysis/, the source of the chart above)
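
The polarity tagging step can be illustrated with a minimal sketch that works on text already produced by speech-to-text. It is not the method used by the team: the two word lists are a tiny hypothetical lexicon invented for this example, and the function names `polarity` and `polarity_counts` are likewise assumptions. A real solution would more likely rely on a trained French sentiment model.

```python
import re
from collections import Counter

# Toy French polarity lexicon: purely illustrative, NOT a real resource.
POSITIVE = {"bien", "bon", "bonne", "heureuse", "heureux", "merci", "super"}
NEGATIVE = {"mal", "mauvais", "mauvaise", "triste", "colère", "peur"}


def polarity(text):
    """Return 'positive', 'negative' or 'neutral' for one transcript segment."""
    words = re.findall(r"\w+", text.lower())
    # Score = number of positive words minus number of negative words.
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def polarity_counts(segments):
    """Count labels across segments, e.g. as input for a bar chart."""
    return Counter(polarity(segment) for segment in segments)
```

Calling `polarity_counts` on the list of transcribed segments of one broadcast gives per-label totals that could feed a simple visualization.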

Dataset

Collection “Radio pleine lune”: Radio Pleine Lune was a feminist radio program in the Geneva region that started with pirate broadcasts in 1979. The collection has been deposited in the Archives contestataires in Geneva, which collects, preserves, and promotes documents from social movements of the second half of the 20th century. The program ran from 1980 to 1999. It is of particular importance for the Archives contestataires because it documents the various media forms used by protest movements in the second half of the 20th century. The materials consist of broadcasts, i.e. direct studio recordings, as well as some rushes (unedited raw recordings), essentially interviews.

Information about the collection:

http://inventaires.archivescontestataires.ch/index.php/fonds-radio-pleine-lune
https://memobase.ch/fr/recordSet/acc-001

Metadata:

https://api.memobase.ch/record/advancedSearch?q=isPartOf:mbrs:acc-001

Metadata are in French. The most relevant fields are the title, the abstract, and the keywords (hasSubject).
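
As a starting point for working with the metadata, the sketch below downloads the search result and collects the three fields named above. It assumes that the endpoint returns JSON and that the fields appear under keys literally called `title`, `abstract`, and `hasSubject`. Since the exact layout of the response is not described here, the helper searches for those keys at any nesting depth rather than relying on a fixed path. The function names `fetch_json` and `collect_fields` are hypothetical.

```python
import json
from urllib.request import urlopen

# URL copied verbatim from the challenge description.
METADATA_URL = "https://api.memobase.ch/record/advancedSearch?q=isPartOf:mbrs:acc-001"

# Field names mentioned in the description; their position in the response
# is an assumption, so they are looked up at any depth.
RELEVANT_KEYS = ("title", "abstract", "hasSubject")


def fetch_json(url=METADATA_URL, timeout=30):
    """Download and decode a JSON document (assumes the API returns JSON)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def collect_fields(node, keys=RELEVANT_KEYS, found=None):
    """Recursively gather every value stored under one of `keys`."""
    if found is None:
        found = {key: [] for key in keys}
    if isinstance(node, dict):
        for key, value in node.items():
            if key in found:
                found[key].append(value)  # keep the value as-is
            else:
                collect_fields(value, keys, found)
    elif isinstance(node, list):
        for item in node:
            collect_fields(item, keys, found)
    return found
```

A typical call would be `collect_fields(fetch_json())`; the returned dictionary maps each field name to the list of values found, which can then be inspected to learn the actual response structure.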

Data: 443 audio recordings.

Possible issues:

  • not enough training data
  • chaotic corpus (multiple voices, live speaking)

Needs: developers with experience in audio analysis algorithms; possibly web designers.

Testing different solutions for speech-to-text

2 years ago ~ roberta_padlina

Contributed 2 years ago by roberta_padlina for GLAMhack 2022

The contents of this website, unless otherwise stated, are licensed under a Creative Commons Attribution 4.0 International License.