This Challenge was posted 4 months ago

Perceptual Plausibility

Build a game for classifying images as plausible (human-created) or implausible (machine-generated) that could also be "played" by an autonomous agent in order to automatically discern the provenance of content in a GLAM context, such as the images or other content shared on Wikimedia Commons. Make sure to include both the metadata (such as the Wikidata record) and the actual content in the assessment. A collection of synthetic and actual images from a Swiss municipality will be provided for this challenge.

To support investigation into the provenance of open cultural content, we propose in this challenge a form of gamified data collection to decide the extent (a non-binary rating) to which an image is synthetic or generative in origin. This could build on the Depiction Wikidata Game, which asks users to "Decide whether the image depicts the named item" or not. Such a game could also be prototyped with an open source framework used for citizen science and crowdsourcing, such as PyBossa.
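As a rough illustration of what non-binary ratings could look like once collected, here is a minimal Python sketch that groups ratings per image and reports their mean and spread. The `Rating` class, the 0.0 to 1.0 scale, the file names and the player names are all assumptions made for this example; they are not part of the Depiction game or of PyBossa.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class Rating:
    image_id: str     # hypothetical identifier, e.g. a Commons file name
    player: str       # a human player or an autonomous agent
    synthetic: float  # assumed scale: 0.0 = human-made, 1.0 = generated


def aggregate(ratings):
    """Group ratings by image and return mean, spread and vote count."""
    by_image = {}
    for r in ratings:
        if not 0.0 <= r.synthetic <= 1.0:
            raise ValueError(f"rating out of range: {r}")
        by_image.setdefault(r.image_id, []).append(r.synthetic)
    return {
        image_id: {
            "mean": mean(values),      # collective estimate of "syntheticness"
            "spread": pstdev(values),  # disagreement between players
            "votes": len(values),
        }
        for image_id, values in by_image.items()
    }


# Made-up example ratings from two humans and one agent.
ratings = [
    Rating("a.jpg", "alice", 0.0),
    Rating("a.jpg", "bob", 0.5),
    Rating("a.jpg", "agent-1", 1.0),
    Rating("b.jpg", "alice", 1.0),
    Rating("b.jpg", "bob", 1.0),
]
summary = aggregate(ratings)
for image_id, stats in summary.items():
    print(image_id, stats)
```

Under these assumptions, a high spread could be one way to flag images on which players disagree. A real game would likely also need to weigh players by reliability.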

As a dataset for this project, we propose to compile and curate additional media from artists in the municipality of Köniz, where a culture festival is being run in parallel to this year's GLAMhack. In addition to the 100+ existing Wikimedia Commons entries, we can source additional materials to upload. We can then generate an equal number of images using Stable Diffusion (with Community License) close to the subject matter of the Commons collection. Some of these images will be "uncanny", making it hard to decide their origins. The complete collection will be provided as a Data Package in open formats.
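To make the idea of a balanced collection concrete, the following Python sketch builds a small descriptor loosely modelled on the Frictionless Data Package layout and checks that actual and synthetic images are equal in number. The package name, field names, file paths and the `origin` labels are hypothetical, and the descriptor has not been validated against the specification.

```python
import json

ORIGINS = ("actual", "synthetic")  # assumed ground-truth labels


def build_descriptor(items):
    """Build a descriptor from dicts with 'path' and 'origin' keys."""
    counts = {origin: 0 for origin in ORIGINS}
    for item in items:
        if item["origin"] not in counts:
            raise ValueError(f"unknown origin: {item['origin']!r}")
        counts[item["origin"]] += 1
    if counts["actual"] != counts["synthetic"]:
        raise ValueError(f"collection is not balanced: {counts}")
    return {
        "name": "perceptual-plausibility-sample",  # hypothetical name
        "resources": [
            {
                "name": "images",
                "data": items,  # inline rows; a real package might point to a CSV
                "schema": {
                    "fields": [
                        {"name": "path", "type": "string"},
                        {
                            "name": "origin",
                            "type": "string",
                            "constraints": {"enum": list(ORIGINS)},
                        },
                    ]
                },
            }
        ],
    }


# Placeholder entries, not real files from the collection.
items = [
    {"path": "images/commons-001.jpg", "origin": "actual"},
    {"path": "images/generated-001.png", "origin": "synthetic"},
]
descriptor = build_descriptor(items)
print(json.dumps(descriptor, indent=2))
```

In a game setting, the `origin` column would act as the answer key and would need to be hidden from players, human or automated.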

The output of this challenge should be a concept sketch, a prototype, or even a playable game that can engage the public and contribute to the debate on Perceptual Plausibility outlined below.

Screenshot of the Depiction Wikidata game, Sigalov and Nachmias 2023

Discussion

The GLAM community is at the heart of academic and critical debate on the use of machine learning (ML) in creative domains. Whether generating text, images, sounds, video, or 3D metaverses, we explore the multimedia advances and augmentative potential of artificial intelligence (AI) with active interest and critical reflection on the impact on society.

With ML and AI now built into illustration tools, cameras, and other recording devices, it may be argued that the lines between what is and what is not "generative" are increasingly blurry. One aspect of this discussion that designers and critics alike might agree on is that Perceptual Plausibility - such as photo-realism, in the case of images - is an important differentiating factor. See the History of Text-to-image models and Uncanny Valley for further references on Wikipedia.

There are numerous tools and even ML-based benchmarks to test the generative provenance of uploaded content, with varying levels of accuracy and applicability. We even see increasing use of adversarial techniques like Nightshade. The Turing test raises the question of whether a human observer or moderator will be able to determine plausibility (realism) for much longer. See also: Stable Diffusion and Why It Matters and Evaluating Diffusion Models.

An excerpt of the recently posted official guidance at Commons:AI-generated media states:

Per the Commons project scope, only media that are realistically useful for an educational purpose should be hosted on Commons. Just because an AI image is interesting, pretty, or looks like a work of art, that doesn't mean that it is necessarily within the scope of Commons. While some AI-generated media fall within our scope, media that lack a realistic educational use may be nominated for deletion.

The legal discussion around this, a confusing cornucopia of legislative debates on AI and copyright around the world, was neatly summed up last October by Creative Commons in Understanding CC Licenses and Generative AI:

We encourage the use of CC0 for those works that do not involve a significant degree of human creativity, to clarify the intellectual property status of the work and to ensure the public domain grows and thrives. ... Neither copyright nor CC licenses can or should address all of the ways that AI might impact people. There are no easy solutions, but it is clear we need to step outside of copyright to work together on governance, regulatory frameworks, societal norms, and many other mechanisms to enable us to harness AI technologies and practices for good.

In terms of content labelling, there have been several initiatives in this community:

Recently the Vimeo video sharing platform, an early adopter of CC licenses, announced new AI guidelines, along with a revised Terms of Service.

Image: Vimeo Help Center

A project like this one could also be used to explore critical questions about the way online gigs and microtasking are used for building AI datasets: see Inside the AI Factory, and Are CAPTCHAs used to train AI?

This image, and the one in the header, were rendered using Stable Diffusion XL 1.0.

Please feel free to leave a Comment here with any further links.

Contributed 6 months ago by loleg for GLAMhack 2024

The contents of this website, unless otherwise stated, are licensed under a Creative Commons Attribution 4.0 International License.