In this active workshop, participants will explore 1) Wikidata as a tool for measuring under-representation and bias in Wikipedia and other reference sources, and 2) Wikidata’s own representation and how it could be improved.
Online spaces, to different extents, reflect the patriarchal context of wider society and especially the “brogrammer” ethos of the tech industry. A salient example is Wikipedia, whose role in both perpetuating and challenging the under-representation of women is well-documented (Ford and Wajcman (2017)). Wikipedia’s open decision-making process makes visible the role of patriarchal values and assumptions in questions of notability (Menger-Anderson (2018)). Other sources, such as library catalogues or Project Gutenberg, also have inequalities of representation, reflecting both historic injustice and the priorities of scholarship.
Wikidata is a sister project of Wikipedia that acts as an authority hub, combining identifiers from thousands of different systems, from library catalogues to social media. For people, it also holds demographic data such as nationality, gender, and occupation. Its notability requirement is much wider than Wikipedia’s. As such, it can be used as a tool to explore under-representation, not just in Wikipedia but in other digital spaces and reference sources (Konieczny and Klein (2018), White (2018)).
In this session, participants will explore under-representation by learning how to identify people from a target group who are not represented in a chosen platform, but could be. They will focus on groups and platforms that fit their own interests, for example Francophone women scientists who lack French Wikipedia articles, or living British Muslims who lack an image on Wikimedia Commons. Similar target lists are used in the Women In Red campaign to increase the representation of women in Wikipedia and could be used in educational assignments or public engagement campaigns. Naturally Wikidata has its own flaws, and the second exercise will identify under-representation in Wikidata itself.
This will be a one-hour interactive workshop session: you can bring a laptop and work alone or in pairs. You’ll learn about Wikidata, and how to write SPARQL queries to identify where people are missing or under-represented. We assume no prior experience with databases: we will share short-cuts and templates that help non-technical users pose their questions and get answers. All our materials will be released online under a free licence via Wikidata project pages in advance of the session, and the data are all freely reusable under CC0.
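As a rough illustration of the kind of template meant here, the sketch below builds a SPARQL query for people in a target group who have no article on a chosen Wikimedia site, such as women scientists without a French Wikipedia article. It is not taken from the workshop materials: the function name `missing_from_site`, its parameters and the template text are assumptions made for this example, and it only builds the query text, which would then be pasted into the Wikidata Query Service.

```python
from string import Template

# Illustrative template only (an assumption, not the workshop's own material).
# SPARQL variables start with "?", so "$" is free for Python placeholders.
QUERY_TEMPLATE = Template("""\
SELECT ?person ?personLabel WHERE {
  ?person wdt:P31 wd:Q5 ;            # instance of: human
          wdt:P21 wd:$gender ;       # sex or gender
          wdt:P106 wd:$occupation .  # occupation (exact match only)
  FILTER NOT EXISTS {                # keep people with no article on the site
    ?article schema:about ?person ;
             schema:isPartOf <$site> .
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "$label_langs" . }
}
LIMIT $limit
""")


def missing_from_site(gender, occupation, site, label_langs="fr,en", limit=100):
    """Return SPARQL text listing people in a group who lack an article on `site`.

    `gender` and `occupation` are Wikidata item IDs (for example "Q6581072"
    for female and "Q901" for scientist); `site` is the base URL of a
    Wikimedia project, for example "https://fr.wikipedia.org/".
    """
    return QUERY_TEMPLATE.substitute(
        gender=gender,
        occupation=occupation,
        site=site,
        label_langs=label_langs,
        limit=limit,
    )


if __name__ == "__main__":
    # Women scientists with no French Wikipedia article.
    print(missing_from_site("Q6581072", "Q901", "https://fr.wikipedia.org/"))
```

Swapping the item IDs or the site URL changes the target group or platform without touching the query structure. Note that the exact occupation match misses people recorded only under narrower occupations such as physicist; widening it would require following subclass links, which is left out here to keep the sketch short.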
Ford, H. and J. Wajcman (2017) “‘Anyone can edit’, not everyone does: Wikipedia’s infrastructure and the gender gap” Social Studies of Science. DOI 10.1177/0306312717692172
Konieczny, P. and M. Klein (2018) “Gender gap through time and space: A journey through Wikipedia biographies via the Wikidata Human Gender Indicator” New Media & Society. DOI 10.1177/1461444818779080
Menger-Anderson, K. (2018) “Who’s Important? A tale from Wikipedia” Medium.com
White, A. (2018) “The history of women in engineering on Wikipedia” Science Museum Group Journal. DOI 10.15180/181008