The scene lives!
Opening a dataset of underground multimedia art
Create an open dataset of demoscene productions, which could be filtered to individual countries, themes, or platforms, and help make the demoscene more accessible to people who may have never heard about it. This dataset could be of interest from an art-history perspective to complement our UNESCO digital heritage application - or just be used to introduce people to the history of the 'scene.
Outputs
- Project report (Readme above)
- Backend: Data Package & API
- Frontend: Demo app
Supporting the demoscene, one Data Package at a time
Data Package 🌐 json
- productions
Prior art
This project is closely related to the Swiss Video Games Directory from previous OpenGLAM events, and was quite inspired by this tweet:
1/ First and most important: Is the scene dead?
— pouët.net (@pouetdotnet) January 1, 2021
Judging by the number of prods released each year, no! Naturally the pandemic in 2020 didn't help matters, but there is a solid average of 1300 Pouët-worthy prods each year! pic.twitter.com/9f5nJl2ATI
title: The Scene lives!
tags: echtzeit
description: Echtzeit x GLAMhack 2021
slideOptions:
  theme: dark
<tt style="font-size:70%">
TL;DR
- Presenting (see intro) Echtzeit = Contributors to international demoscene
- GLAMhack / OpenGLAM = Community of international culture-data wranglers
- Our #GLAMhack 2021 project: https://hack.glam.opendata.ch/project/114
- Prototype web app: https://scene.rip/
- Source code app: https://github.com/we-art-o-nauts/the-scene-lives-app
- Prototype API: https://api.scene.rip/productions
- Data + API: https://github.com/we-art-o-nauts/the-scene-lives
- Slack channel: #team-26-echtzeit-x-openglam
</tt>
10'000 m goals
- Make the demoscene more accessible to people who may have never heard about it. (:wave: hello #GLAMhack!)
- Create an open dataset of demoscene productions, which could be filtered to individual themes or platforms.
- Support the UNESCO digital heritage application, or just explore the history of the 'scene.
..Or just tune into SceneSat :headphones: and enjoy electronic art at the hackathon!
Elevated by Urszula "Urssa" Kocol
Our GLAMhack roadmap
- [x] 10 Mine some data from demozoo
- [x] 20 Collect a few bytes from pouët
- [x] 30 Create an initial Data Package
- [x] 40 Document process of aggregation
- [x] 50 Set up a basic demo service
- [x] 60 Push dataset as open repository
- [x] 70 Propose a standard(TM) schema
- [x] 80 Create and demo the prototype
- [ ] ... :moneymouthface:
:point_down: The drilldown
Road by PG and R0ger
Pouet logo by tomaes
trumpets
The oldest and most well known central repository, pouët, makes daily data exports available at https://data.pouet.net/ with a JSON API endpoint at https://api.pouet.net/ and the open source code of it at https://github.com/pouetnet/pouet2.0-api
We downloaded the raw JSON and tried to parse it with a couple of tools, but did not get far: the convoluted structure and formatting errors were rather demotivating. Nevertheless, it influenced our thinking about a "demoscene data standard" and prompted us to brainstorm ways of improving overall data quality (for example, we immediately noticed mismatched dates and missing values).
This API was also recently used to do some terrific data analysis, and we reached out to the authors to find out if we can reuse their scripts. We have also reached out via the #pouet Discord for guru meditation.
demozoo
The Demozoo API is basic but usable: http://demozoo.org/api/v1/ As it's a paginated web service, aggregating it would require a bit of scraping code. So we took the 'nuclear option' of getting the database dump in raw SQL format. We imported this into a local SQLite database (inspired by all of Wikipedia in SQLite) and then re-exported the tables in CSV format. This should be done differently for automated data updates.
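For comparison, the "bit of scraping code" that a paginated service calls for could look roughly like the sketch below. It assumes each page is a JSON object with a `results` list and a `next` URL (or null on the last page); that page shape, the function names and the `max_pages` guard are assumptions for illustration, not something we verified against every Demozoo endpoint.

```python
import json
from typing import Callable, Iterator, Optional
from urllib.request import urlopen


def fetch_json(url: str) -> dict:
    """Fetch one page over HTTP and decode it as JSON."""
    with urlopen(url) as response:
        return json.load(response)


def iter_results(start_url: str,
                 fetch: Callable[[str], dict] = fetch_json,
                 max_pages: Optional[int] = None) -> Iterator[dict]:
    """Yield every record, following 'next' links page by page.

    Assumes each page looks like {"results": [...], "next": url-or-null}.
    """
    url = start_url
    pages = 0
    while url and (max_pages is None or pages < max_pages):
        page = fetch(url)
        # Hand out the records of this page one at a time.
        yield from page.get("results", [])
        # A missing or null 'next' ends the loop.
        url = page.get("next")
        pages += 1
```

Passing the fetcher in as an argument keeps the loop testable offline; for a polite first run against a real endpoint, a small `max_pages` value avoids hammering the service.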
We have reached out via the #demozoo Discord and GitHub for some further ideas.
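The re-export step of the dump route described above can be sketched with nothing but the Python standard library. This is a minimal illustration, assuming the dump has already been imported into a SQLite file; the function name and the output layout (one CSV per table) are our own choices here, not the exact script we ran.

```python
import csv
import sqlite3
from pathlib import Path


def export_tables_to_csv(db_path: str, out_dir: str) -> list:
    """Write every user table in the SQLite file to <out_dir>/<table>.csv."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master lists all tables; skip SQLite's internal ones.
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name")]
        for table in tables:
            # Quote the table name so unusual names do not break the query.
            quoted = '"{}"'.format(table.replace('"', '""'))
            cursor = conn.execute("SELECT * FROM " + quoted)
            target = out / (table + ".csv")
            with target.open("w", newline="", encoding="utf-8") as handle:
                writer = csv.writer(handle)
                # First row: column names taken from the cursor metadata.
                writer.writerow([col[0] for col in cursor.description])
                writer.writerows(cursor)
            written.append(str(target))
    finally:
        conn.close()
    return written
```

Rows are streamed from the cursor straight into the CSV writer, so large tables do not need to fit in memory.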
package
A popular way to crowdsource open data in a distributed fashion is the Data Package, the preferred format of the Frictionless Data project, which has a create tool to generate an initial datapackage.json.
An initial data package based on the Demozoo archive is on GitHub; it compiles and aggregates the data from several tables using the Python dataflows library.
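To give a feel for what a datapackage.json contains, here is a minimal sketch that builds a descriptor for a single CSV file by hand. It uses only the standard library rather than the dataflows pipeline from our repository, types every column as `string` (a deliberate simplification), and the function names are hypothetical.

```python
import csv
import json
from pathlib import Path


def describe_csv(csv_path: str, package_name: str) -> dict:
    """Build a minimal Data Package descriptor for one CSV file."""
    path = Path(csv_path)
    with path.open(newline="", encoding="utf-8") as handle:
        # Only the header row is needed to list the fields.
        header = next(csv.reader(handle), [])
    return {
        "name": package_name,
        "resources": [{
            "name": path.stem,
            "path": path.name,
            "format": "csv",
            "schema": {
                # Simplification: every column is declared as a string.
                "fields": [{"name": col, "type": "string"} for col in header],
            },
        }],
    }


def write_descriptor(descriptor: dict, target_dir: str) -> str:
    """Save the descriptor as datapackage.json next to the data."""
    target = Path(target_dir) / "datapackage.json"
    target.write_text(json.dumps(descriptor, indent=2), encoding="utf-8")
    return str(target)
```

In practice the field types would be refined afterwards (dates, integers and so on), either by editing the JSON or by letting a Frictionless tool infer them.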
schema
Exploring and transforming the data gives us a frame of reference for thinking about the commonalities and differences between the archives' approaches. We did some research to see what efforts had already been made in this direction, and reached out to the Demozoo and Pouet communities.
Each of the data sources has a schema of its own, and some attempts at consolidation have been made. We started with a simplified version of the Demozoo model and created a Table Schema, which can be used for validation or annotation as JSON Schema.
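As a rough illustration of what validating against a Table Schema means, the sketch below checks one row against a tiny schema. The field names (`id`, `title`, `release_date`, `platform`) are hypothetical and do not reproduce our published schema, and the hand-rolled checker only covers three types and the `required` constraint; a Frictionless library would do this far more thoroughly.

```python
from datetime import date

# Hypothetical, simplified schema in Table Schema notation.
SCHEMA = {
    "fields": [
        {"name": "id", "type": "integer", "constraints": {"required": True}},
        {"name": "title", "type": "string", "constraints": {"required": True}},
        {"name": "release_date", "type": "date"},
        {"name": "platform", "type": "string"},
    ]
}

# One parser per supported type; each raises ValueError on bad input.
CASTERS = {"integer": int, "string": str, "date": date.fromisoformat}


def validate_row(row: dict, schema: dict = SCHEMA) -> list:
    """Return a list of human-readable problems found in one row."""
    errors = []
    for field in schema["fields"]:
        name = field["name"]
        value = row.get(name, "")
        if value in ("", None):
            # Empty values are only a problem for required fields.
            if field.get("constraints", {}).get("required"):
                errors.append(f"{name}: missing required value")
            continue
        try:
            CASTERS[field["type"]](value)
        except (ValueError, TypeError):
            errors.append(f"{name}: {value!r} is not a valid {field['type']}")
    return errors
```

Running such a check over every row is one way to surface the kind of missing values and malformed dates we noticed in the raw exports.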
repository
Our dataset repository clearly explains its sources, but also points out there are many other places which could be future data acquisition targets. These notable scene repositories and data sources include:
- https://demozoo.org/ (as above)
- https://pouet.net/ (as above)
- https://csdb.dk/
- https://zxart.ee/
- https://ada.untergrund.net/
- https://files.scene.org/
- https://www.demoparty.net/
service
While the Data Package is nice to look at in its JSON glory, most people (and programmers) will want some kind of interface to it. We wrote a small server using the Falcon Framework to produce a barebones API. Since the data is loaded using a Frictionless Data wrapper for the Pandas library, it can incorporate various advanced sorting and filtering routines. Our proto-service is currently running on a private VPS hosted on Linode, but should also work on 'lambda function' hosts like Vercel or Heroku.
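The shape of such a barebones API can be sketched as a plain WSGI application. Note the substitutions: this uses the standard library's WSGI conventions instead of Falcon, and an in-memory list instead of Pandas, so that it runs without dependencies. The sample rows and the `platform` and `page` query parameters are hypothetical and do not describe the actual interface at api.scene.rip.

```python
import json
from urllib.parse import parse_qs

# Hypothetical sample rows; a real service would load the Data Package.
PRODUCTIONS = [
    {"id": 1, "title": "Example Demo A", "platform": "Amiga"},
    {"id": 2, "title": "Example Demo B", "platform": "C64"},
    {"id": 3, "title": "Example Demo C", "platform": "Windows"},
]


def filter_productions(rows, platform=None, page=1, per_page=10):
    """Filter by platform (case-insensitive) and return one page of rows."""
    if platform:
        rows = [r for r in rows if r["platform"].lower() == platform.lower()]
    start = (page - 1) * per_page
    return rows[start:start + per_page]


def app(environ, start_response):
    """WSGI entry point answering GET /productions with a JSON list."""
    if environ.get("PATH_INFO", "") != "/productions":
        start_response("404 Not Found", [("Content-Type", "application/json")])
        return [b'{"error": "not found"}']
    query = parse_qs(environ.get("QUERY_STRING", ""))
    platform = query.get("platform", [None])[0]
    try:
        page = max(1, int(query.get("page", ["1"])[0]))
    except ValueError:
        # Fall back to the first page on a malformed page number.
        page = 1
    body = json.dumps(filter_productions(PRODUCTIONS, platform, page))
    payload = body.encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]
```

For a local try-out, `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` serves it on port 8000; the page parameter is also what an infinite-scrolling frontend would increment as the user scrolls.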
Still from Traffic Jam by Chainsaw
prototype
Having made it this far, we wanted to provide a basic example of data usage. After all this data wrangling, we didn't have time to really explore the space of user interface possibilities. But we have a small application that demonstrates the API with an infinite-scrolling user interface showing productions.
See for yourself at https://scene.rip
Ideas to build upon
- Establish live feed graph view of demoscene productions
- Create an infographic that helps to explain the demoscene
- An interactive app for scrolling through prods (e.g. Netflix or Giphy clone)
- A cheatsheet to learn the most important terms and famous groups/prods
- A virtual reality exhibition (like other teams are working on)
More or less irrelevant links
- Thread showing various graphs of prod release stats from Pouët https://twitter.com/pouetdotnet/status/1345056403338231808
- Discussion of "greets graph" https://www.pouet.net/topic.php?which=12099&page=1
- Some debate about classification https://www.pouet.net/topic.php?which=12098
- Discussion of Demozoo API https://demozoo.org/forums/66/
- Oleg's blog explaining the demoscene, partially on the topic of "letting the data speak for itself" https://blog.datalets.ch/010/
- Teaser Revision 2017 seminar "Graph databases and the demoscene universe" https://2017.revision-party.net/events/seminars
- Ideas from demosceners on GitHub https://github.com/nesbox/TIC-80/discussions/1286
- Internet Archive gallery https://archive.org/details/softwarelibraryc64demos "To some, the heart of the Demoscene - the self-playing examples of programming and artistic prowess of the last 30 years on the underpowered but extremely flexible C64."
Demo or :skullandcrossbones: die!
:sheep: Thanks for watching!
:love_letter: seism@utou.ch :bird: @seismist
Pass it forward: hackmd.io/@oleg/the-scene-lives
<small>This presentation is shared under CC BY 4.0</small>