Some Things I Like About the Expanding Wikipedia Universe

Oh, the things we could know! A cartoon in Dr Seuss style, with concentric bright colors, and a smiling person in a fuzzy yellow hat. (By Hilda Bastian.)

About a week ago, I created an English Wikipedia page for a woman who had an extraordinary life, and left a phenomenal legacy. The Pasteur Institute honored her, her husband, and their son, as “pioneers of modern biology.” Her name was Elisabeth Wollman. She was born in 1888, and she died in Auschwitz along with her husband and scientific collaborator, Eugène, in 1943. Their work, picked up by colleagues – including their son – and continued after the War ended, led to a Nobel Prize and helped lay the foundations for understanding viruses, cancer, and, later, HIV.

I found out about her when I was trawling through the Women in Red list, looking for a female scientist who should have a Wikipedia page, but didn’t. Women in Red (WIR) is a wikiproject that just celebrated its 8th birthday. In that time, it’s helped push the share of women’s biographies on Wikipedia from a measly 16% of all bios to 20%.

Wollman made it onto the WIR list because she has an entry in one of my favorite Wikipedia-universe expanders, Wikidata. WIR runs hundreds of automated queries in Wikidata to find women without English Wikipedia pages, while another automated function, Listeria, generates the dynamic list, updated by a bot.
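To give a feel for what one of those automated queries might look like, here is a minimal sketch in Python that only builds the query text and a link to the Wikidata Query Service. It is not WIR's actual query, which I haven't seen. The function names, the choice of "scientist" (Q901) as the occupation, and the result limit are all assumptions for illustration.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def women_without_enwiki_query(occupation_qid: str = "Q901", limit: int = 50) -> str:
    """Build a SPARQL query for women with an occupation but no English Wikipedia page.

    Hypothetical helper: Q901 ("scientist") is an illustrative default. Many
    scientists are recorded under narrower occupations such as biologist, so
    a real list would need a broader pattern.
    """
    return f"""
SELECT ?person ?personLabel WHERE {{
  ?person wdt:P31 wd:Q5 ;                 # instance of: human
          wdt:P21 wd:Q6581072 ;           # sex or gender: female
          wdt:P106 wd:{occupation_qid} .  # occupation
  FILTER NOT EXISTS {{                     # no article on English Wikipedia
    ?article schema:about ?person ;
             schema:isPartOf <https://en.wikipedia.org/> .
  }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en,fr" . }}
}}
LIMIT {limit}
""".strip()


def query_url(sparql: str) -> str:
    """Return a URL that asks the endpoint to run the query and reply in JSON."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": sparql, "format": "json"})


if __name__ == "__main__":
    print(women_without_enwiki_query(limit=10))
```

The `FILTER NOT EXISTS` part does the work: it keeps only people for whom Wikidata records no linked English Wikipedia article. Someone like Wollman, who at the time had pages only in other language editions, would pass that filter if her occupation matched.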

Wikidata is structured information which volunteers can edit, just as we do other parts of Wikipedia content. I edited Elisabeth Wollman’s Wikidata when I created her page, adding detail on the date of her death and references, for example.

To get a sense of what Wikidata is like, you can check out Elisabeth Wollman’s entry. There are several data items about her – obvious things like her place and date of birth. And there are critical linkages, too. In her case, for example, there’s her relationship with other people in Wikidata (her husband and son), a public domain photo of her in Wikipedia’s image repository (Wikimedia Commons), and the Wikipedia pages about her. There were already small bios for her in the French and Belarusian Wikipedias – that’s how she came to be in WIR’s list. And when I created her English page, I linked it in as well.

There’s a system for running queries in Wikidata, with a query builder to help, so you don’t need to use a programming language. (Unfortunately, though, there’s still no simple way to run queries for having – or not having – associated Wikipedia pages.) You can check out some examples of queries in this post by Egon Willighagen, for using Wikidata to find accounts on Mastodon. One example is universities with Mastodon accounts, which listed 64 of them at the time I wrote this. (To run the query to get the current list, you press the blue square with the white arrow on the bottom left of the query box.)
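For anyone curious about what sits behind a query like the universities one, here is a rough sketch in Python that assembles similar query text. It is my own approximation, not the query from Egon Willighagen's post. The identifiers are ones I believe are right (P4033 for Mastodon address, Q3918 for university) but they are worth double-checking in Wikidata, and the function name and limit are made up for illustration.

```python
def universities_with_mastodon_query(limit: int = 100) -> str:
    """Build a SPARQL query for universities that have a Mastodon address.

    Hypothetical helper. Assumed identifiers: P31 (instance of),
    P279 (subclass of), Q3918 (university), P4033 (Mastodon address).
    """
    return f"""
SELECT ?university ?universityLabel ?mastodon WHERE {{
  ?university wdt:P31/wdt:P279* wd:Q3918 ;  # a university, or a subtype of one
              wdt:P4033 ?mastodon .         # has a Mastodon address recorded
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {limit}
""".strip()


if __name__ == "__main__":
    # Paste the printed text into the Wikidata Query Service to run it.
    print(universities_with_mastodon_query())
```

Pasting the printed text into the query box and pressing the run button should give a current list, although the count will differ from the 64 mentioned above as Wikidata changes.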

As of writing, there are over 11,100 categories for data – called properties – in Wikidata, and over 105 million entries (called items), for people and many other subjects. And from the little I’ve seen so far, although there are errors, of course, it’s pretty good quality. Wikidata is growing in scope – there have been over 1,000 properties added since this time last year – as well as size – with over 5 million entries added since then, too.

The data isn’t just coming from those of us who are editing it manually. It’s also being drawn out of structured elements of Wikipedia and its sister projects. And there are data donations. Some examples: biographical data from the national libraries of Australia and France, artist bio data from the Netherlands Institute for Art History, and World Heritage Site data from UNESCO.

One of my other favorite things about the Wikipedia universe these days is that you can keep up with all sorts of news via Mastodon. For example, I currently follow Wikidata, the Wikimedia Foundation, and my local Wikipedia group (Australia). Wikipedia and Mastodon are such a natural pairing! You can even link your Wikipedia user page to your Mastodon account, resulting in a verified link at Mastodon. (Here’s my explanation of how I did that.)

Since I’ve been seeing so much Wikipedia news, a couple of Wikipedia developments in particular caught my eye. One is a type of extension of Wikidata, called Abstract Wikipedia. It doesn’t mean abstract, as in scientific journal summaries – it’s abstract as in conceptual. The idea is that you could develop language-independent versions of Wikipedia articles, with code that could translate them into other languages. Gulp! You can read about it here.

The other news was why it’s a big deal that Wikipedia updated to a Creative Commons 4.0 licence. For example, it’s a single global standard, and critically, it means that other content created under 4.0 is now allowed in the Wikipedia universe.

Wikipedia turned 22 early this year. I’ve lost track of how many times I’ve seen people predict its imminent failure and death, but it’s still here, and its universe is still expanding. The New York Times recently had an in-depth article by Jon Gertner called “Wikipedia’s Moment of Truth.” It explores what the Large Language Model explosion means for Wikipedia, now and down the line. Could we end up with a situation where Wikipedia, “outflanked by A.I. that has cannibalized it, suffers from disuse and dereliction”?

Machines couldn’t produce anything as good as Wikipedia without Wikipedia to feed on. And several people argued that it’s possible the LLM developments “will help the organization improve rather than crash.” Fingers crossed we steer through this to that positive outcome. Reading this article made me realize we need to do more than just cross our fingers, though. I stopped keeping up with Wikipedia “politics” etc a few years ago, though I’ve kept editing. It turns out I’m one of only 40,000 people who edit enough to be rated as active – and 80% of them are male. I should carry some of the load on the sociopolitical side, too.

The first in-person Wikipedia meeting I ever went to launched me into a lot of activity for a few years. Here’s a photo from then, at the WikiWomen gathering at Wikimania in July 2012. (I’m in the second row, fourth from the left.) I remember several people there asking why I wasted so much time on Twitter, when it could be better spent on Wikipedia – including socially. It was a good question, and I didn’t really have a good answer. Now, a decade(-ish) later, I’m no longer spending any time on Twitter – and it’s the Wikipedia-compatible platform of Mastodon that’s pulling me back into the Wikipedia social orbit. Better late than never, eh?!

I’m on Mastodon. Interested in Mastodon? Check out my Shortcuts to Giving Mastodon a Try. You can check out my newsletter, Living With Evidence, here.

Correction on August 3: The first version had Belgian instead of Belarusian Wikipedia. I’m very grateful to Ihar Razanau for letting me know.

Disclosures: I’m an active Wikipedian – User:Hildabast. I maintain a list of financial disclosures in the About section of Absolutely Maybe.

  1. Came across this on the WikiWomeninRed on Twitter – nice read and yes Twitter is much less fun. Thanks for the nice and informative article and the Wikipedia additions.

