Tracking the Covid Vaccine Race in the Wild First 2 Years – A Process Story

By the time the pandemic was declared in March 2020, my first Covid vaccine post was online – a backgrounder and history on first-in-human vaccine trials that I kept updating for months with news of vaccine trials for Covid. It was meant to introduce the phases of trials to people who hadn’t paid that much attention to this before – and it was also a sort of live blog on the start of Covid vaccines.

Deciding to zero in on vaccines was quick and easy. My main clinical research skill set is keeping up with large bodies of trial data and shifting evidence – that’s the obsession I PhD’d in. But memories of the long haul for antivirals in HIV and a quick look at papers on monoclonal antibodies had me doubting those skills would be needed for therapeutic drugs in the near future.

Digging around in the old papers on vaccine development for SARS and MERS, on the other hand, convinced me the optimistic voices in the coronavirus vaccine research community had good reason for confidence. There was another big driver here. I’d been deeply engaged in debates about HPV vaccines for years, and I wanted to be very well-informed for the inevitable anti-vax war – we were going to need “all hands on deck”. I was very concerned, watching the ground already being prepared to discredit vaccines before they even existed.

In February and March 2020, though, I didn’t realize what trying to keep up with Covid vaccine development was going to require. I’d assumed others would do the real monitoring, and I’d help keep the Wikipedia pages accurate and updated, and concentrate on communication. I started off just hunting out the early trials that were getting registered on ClinicalTrials.gov, doing searches there and in major media sources internationally each day. It was happening very fast. The first dose was injected in the Moderna phase 1 trial in Seattle on March 17, and the Seattle group had fully recruited by March 19.

However, as I followed what others were doing, I got frustrated. There were several academic efforts that were slow and narrow, still doing things the way they’d been done pre-pandemic – and that was risky. Pretty quickly you could see this was escalating dramatically. People were developing siloed systems that didn’t really help get a handle on what was going on. For example, trial registry entries were being gathered, but not linked up with preclinical reports, sources like developers’ press releases or regulator news, or progress reports on trial recruitment. Even being a few days behind was too much when things were changing so quickly. And as the catastrophe galloped globally, I really needed a constant stream of hopeful signs.

What I didn’t need was hype, though, and we were being engulfed by it. On the media and expert commentary side, manufacturers and vaccine development teams were having way too much influence on coverage – their claims were generally accepted uncritically. Early on, I realized even simple things needed fact-checking. For example, I’d originally assumed that the claim the US trial was the first and only must be right. But when I started catching up on what was happening in China, it wasn’t actually the case: the phase 1 trial for the CanSino vaccine had started a shade earlier. By April, that vaccine was already heading into phase 2.

Although I’d known to be prepared for anti-vaccine activism, I hadn’t realized how influential – and distorting – parochialism and nationalism were going to be. It was sort of race-to-the-moon level propaganda meets drug company hype, laced with prejudice and condescension about the science world beyond EuroAmerica. And as I said in an interview with Julia Belluz for Vox in February 2021, I “started to think, ‘This is just racism,’ an old colonialist-thinking legacy”, a belief that EuroAmerica always rescues everybody.

By June, I had abandoned updating that unwieldy post, and couldn’t keep spending time on Wikipedia pages. I had developed a routine that I’d adapt as issues shifted, or recalibrate for efficiency. This wasn’t a funded exercise or part of a job, so being practical was always front and center. There’s always a trade-off between your willingness to accept you’re missing something, and your willingness/ability to invest time and effort.

I didn’t document the evolution of my searching process, but here’s where it ended up. I compensated a little for the searches being done by only one person by “doubling up” on places I searched, giving myself a second chance to pick something up. So I searched both PubMed and Europe PMC daily until September 2021, and less often since. (The search terms are at the foot of the post.) And I looked through all the new entries in the Covid preprint stream at medRxiv/bioRxiv – the back-up here was Europe PMC, which indexes both of those preprint servers, and a bunch of others as well.

Venn diagram of PubMed, Europe PMC, and the medRxiv/bioRxiv Covid preprint stream

That also helped compensate for differences in the sources. PubMed and Europe PMC have a lot of overlapping content, but their search engines aren’t identical, so I’d get different results for what was essentially the same search in each. The Covid stream at medRxiv/bioRxiv misses some too, and Europe PMC was the backup here.
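The “doubling up” idea can be sketched in a few lines of Python. This is only an illustration, not the process I actually used (my searching was done by hand): the record fields (`doi`, `title`), the function names, and the example DOIs are all made up for the sketch. It merges result lists from two sources, keeps one record per DOI, and notes which source found each one, so you can see what a single source would have missed.

```python
def normalize_doi(doi):
    """Lower-case a DOI and strip common prefixes so equal DOIs compare equal."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def merge_sources(*named_results):
    """Merge (source_name, records) pairs into one dict keyed by normalized DOI.

    Each value records the title and the list of sources that returned it.
    """
    merged = {}
    for source_name, records in named_results:
        for record in records:
            key = normalize_doi(record["doi"])
            entry = merged.setdefault(key, {"title": record["title"], "found_in": []})
            if source_name not in entry["found_in"]:
                entry["found_in"].append(source_name)
    return merged


# Hypothetical results for "essentially the same search" in two places.
pubmed = [
    {"doi": "10.1000/a", "title": "Vaccine A phase 1"},
    {"doi": "10.1000/b", "title": "Vaccine B preclinical"},
]
europe_pmc = [
    {"doi": "https://doi.org/10.1000/B", "title": "Vaccine B preclinical"},
    {"doi": "10.1000/c", "title": "Vaccine C preprint"},
]

merged = merge_sources(("PubMed", pubmed), ("Europe PMC", europe_pmc))

# Records that only one of the two sources picked up.
only_one = sorted(doi for doi, entry in merged.items() if len(entry["found_in"]) == 1)
```

Here `only_one` lists the records that would have been missed by relying on a single source, which is the whole point of giving yourself a second chance.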

Even though this captured the overwhelming bulk of publications, I knew I’d be missing some. So whenever I’d do an update post or write about a particular vaccine – generally because it had reached the phase 3 trial or roll-out stage – I’d do a deep Google and media dive on it.

Then there were 2 other major components. The first was searching clinical trials registries – ClinicalTrials.gov every day, and the registries of several other countries plus the EU register less often. (The online search for WHO’s platform wasn’t available early in the pandemic, but I check it sometimes now.)

Then there was the media. I’d search Chinese news sources a lot in the early days, and Google News in particular countries when there was a lot of activity there. This was critical to understanding what was going on internationally. Here’s an example of me chasing down a trial reported in Chinese media in July 2020 that raises a critical issue – searching in languages I don’t speak.

Zhiwei Xu found that trial for me – and in doing so, enabled me to get the hang of searching that trials registry. For media searches, I used Google News in whatever parts of the world/languages I knew there could be vital activity. I had to do that every day, though. For large stretches of time when there was so much vaccine news, anything other than results of a 24-hour search was impossible to grapple with.

The end result? For most of 2020 and 2021 I was spending several hours a day searching and compiling resources, 7 days a week. Now I have my baseline searches, and then deep dives on whatever vaccines/themes I write about to supplement those. So I guess that’s on average a couple of full days a week.

The start of July 2020 marks the point where I had clearly fully committed to this ride. It was when emergency use of vaccines had started in China, and I started trying to do a monthly roundup post. It was also when I made my Zotero collection of records public. There are notes on this at the foot of this post if you want to dig into it – it’s a very specific set for keeping up-to-date on vaccines with published results, and the trial registry entries for vaccines with published results (or that have rolled out without published results). It’s not useful for a lot of purposes – for example, I replace preprints with journal articles when I find them, so you can’t work out who published what first.

That first version of the collection I published had only 28 records in it. I’m glad I got started early. It wouldn’t be possible for one unpaid person to compile all that now, especially not when so much vital data is on government and company websites in a variety of languages. It won’t be long now till my collection has 1,500 entries, getting closer to 300 vaccine development groups with public results! For a virus that had its genome first sequenced in January 2020, that truly is wild.


Disclosures: My interest in Covid-19 vaccine trials is as a person worried about the virus, as my son is immunocompromised: I have no financial or professional interest in the vaccines. I have worked for an institute of the NIH in the past, but not the one working on vaccines (NIAID). It’s the one that produces PubMed and ClinicalTrials.gov. More about me.

The cartoons are my own (CC BY-NC-ND license). (More cartoons at Statistically Funny.)

Correction on March 2: Original version incorrectly stated that I was searching the Chinese clinical trials registry early on – it was only ClinicalTrials.gov to start.

Search terms

PubMed:

vaccine[ti] AND ("sars-cov-2"[ti] OR "covid-19"[ti])

Europe PMC:

(TITLE:"sars-cov-2" OR TITLE:"covid-19") AND (TITLE:"vaccine")
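If you would rather run these two searches from a script than in the browser, here is a minimal sketch that only builds the request URLs. It assumes the public NCBI E-utilities `esearch` endpoint and the Europe PMC REST search endpoint; the function names and the `days_back` and `page_size` arguments are my own inventions for the example, and nothing here fetches or parses results.

```python
from urllib.parse import urlencode

# Public search endpoints (assumed; check the current API documentation).
PUBMED_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
EUROPE_PMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"

# The two search strings from this post, with plain straight quotes.
PUBMED_QUERY = 'vaccine[ti] AND ("sars-cov-2"[ti] OR "covid-19"[ti])'
EUROPE_PMC_QUERY = '(TITLE:"sars-cov-2" OR TITLE:"covid-19") AND (TITLE:"vaccine")'


def pubmed_url(query, days_back=None):
    """Build an esearch URL; optionally limit to records added in the last N days."""
    params = {"db": "pubmed", "term": query, "retmode": "json"}
    if days_back is not None:
        # reldate/datetype restrict results by Entrez date (date added to PubMed).
        params["reldate"] = days_back
        params["datetype"] = "edat"
    return f"{PUBMED_ESEARCH}?{urlencode(params)}"


def europe_pmc_url(query, page_size=100):
    """Build a Europe PMC search URL that asks for JSON results."""
    params = {"query": query, "format": "json", "pageSize": page_size}
    return f"{EUROPE_PMC_SEARCH}?{urlencode(params)}"


daily_pubmed = pubmed_url(PUBMED_QUERY, days_back=1)
daily_europe_pmc = europe_pmc_url(EUROPE_PMC_QUERY)
```

The `days_back=1` option mirrors the daily routine described above: when there is a flood of new records, a 24-hour window keeps each batch small enough to read through.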

Notes on the Zotero collection 

You can dig into my collection of records for vaccines here.

This is a publicly accessible collection I update regularly. It includes any Covid-19 vaccine with published preclinical, clinical trial results, or trial protocols. If a phase 3 trial starts (or is about to) without any prior publications for the vaccine, the vaccine is added. Once a vaccine is in the collection, clinical trial register entries for that vaccine are also added.

When trials are registered in more than one clinical trials registry, the multiple records may or may not be in the collection: If I have located a record in ClinicalTrials.gov, I do not hunt for additional registrations. For my own convenience in keeping an overview of vaccine progress, when preprints appear later in journals, I replace the original record with the journal article. Preprints may also be uploaded to multiple preprint servers: rather than check if versions have differed, I keep the first preprint in the collection (sometimes over-written with updates in the same server), unless I see a version that seems markedly different.
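For anyone who thinks better in code, the rules in these notes can be written down roughly like this. It is a simplified sketch, not how the collection is really maintained (I do it by hand in Zotero): the field names, `study_id` keys, and server names are hypothetical, and the “markedly different version” exception is a judgment call that is deliberately left out.

```python
PUBLICATION_KINDS = ("preclinical_results", "clinical_results", "trial_protocol")


def qualifies(vaccine):
    """A vaccine is included if it has any publication, or a phase 3 trial starting."""
    published = any(vaccine.get(kind, False) for kind in PUBLICATION_KINDS)
    return published or vaccine.get("phase3_started_or_imminent", False)


def add_record(collection, study_id, record):
    """Add a record for a study, applying the preprint rules, and return what is kept.

    - Nothing there yet: add the record.
    - A preprint is there and a journal article arrives: replace the preprint.
    - Otherwise (for example a second preprint on another server): keep the first.
    """
    existing = collection.get(study_id)
    if existing is None:
        collection[study_id] = record
    elif existing["kind"] == "preprint" and record["kind"] == "journal":
        collection[study_id] = record
    return collection[study_id]


collection = {}
add_record(collection, "study-1", {"kind": "preprint", "source": "preprint server A"})
# A second preprint of the same study elsewhere does not displace the first.
kept = add_record(collection, "study-1", {"kind": "preprint", "source": "preprint server B"})
# The journal version replaces the preprint once it appears.
final = add_record(collection, "study-1", {"kind": "journal", "source": "journal X"})
```

Because the journal article overwrites the preprint, the collection loses the date the preprint first appeared, which is why it can’t be used to work out who published what first.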

