
PLOS BLOGS Absolutely Maybe

Peering Through the Smoke at a Duel Over Covid’s Infection Fatality Rate

Cartoon of dueling meta-analyst gangs

This is one of my older cartoons. Unfortunately, the problem it’s depicting hasn’t gone out of style. But now it strikes me as kind of out-of-date. Why? Because it doesn’t show a bunch of scientists across the street taking sides and shouting at each other.

That’s typically followed now by concern about “cancel culture” – a pointless debate which often just distracts attention from critical discussions we should be having. I think that happened with a recent pair of duelling systematic reviews about Covid’s infection fatality rate (IFR).

But before we get to the questions I think we need to address, we have to unpack the scientific issue in dispute. The IFR is a critical number in pandemic response models, for example. Even very small differences have a huge impact: half a percentage point is a million people dead out of every 200 million who get infected – including those who never even knew they’d gotten it.
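To make that arithmetic concrete, here is a minimal sketch in Python. The function name and the two example IFR values are my own illustrative choices, picked only because they sit half a percentage point apart; they are not estimates taken from either review.

```python
def expected_deaths(infections: int, ifr_percent: float) -> float:
    """Deaths implied by an IFR (expressed in percent) among a number of infections."""
    return infections * ifr_percent / 100.0


# Hypothetical inputs: 200 million infections, and two IFR values
# that differ by half a percentage point.
infections = 200_000_000
deaths_low = expected_deaths(infections, 0.25)
deaths_high = expected_deaths(infections, 0.75)

# The gap between the two scenarios is the "million people" in the text.
extra_deaths = deaths_high - deaths_low
print(f"Extra deaths from a 0.5 point higher IFR: {extra_deaths:,.0f}")
```

The same function shows why the choice of denominator matters so much: the infections figure includes people who never knew they were infected, so any study that over- or under-counts them shifts the IFR directly.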

Here’s the background of this particular gunfight at the Twitter corral:

  • March 2020: John Ioannidis nailed his colors to the Covid-is-no-worse-than-the-flu mast, suggesting that the US might suffer only 10,000 deaths. His reputation took a beating over it.
  • April 2020: A preprint of a study suggested so many people had asymptomatic infections in Santa Clara, California, that the infection fatality rate (IFR) for Covid-19 was very low: 0.12% to 0.2%. It was highly controversial for many reasons. It supported the Covid-is-no-worse-than-the-flu position – fast becoming the core of Covid denialism. Ioannidis was a co-author. His reputation took another beating.
  • May 6, 2020: Gideon Meyerowitz-Katz and Lea Merone’s preprint of a systematic review and meta-analysis went online, with an estimated global IFR of 0.75% (with a range of uncertainty from 0.49 to 1.01%) – several times higher than the Santa Clara study. The preprint was updated 3 times between then and July. In July, the CDC updated their Covid models using their estimate. The first version currently has an Altmetric score of 1,946, a measure of how much attention an article is getting – and that’s a very high score. That score for a journal article would get you close to halfway up Altmetric’s top 100 for 2020.
  • May 19, 2020: Ioannidis’ preprint of a sole-authored systematic review of studies inferring Covid’s IFR went online, including the Santa Clara study he co-authored. His conclusion again pegged Covid’s IFR very low – at roughly the same end of the spectrum as his Santa Clara study: 0.02% to 0.40%. The first version currently has an Altmetric score of 4,324. (Which would be nudging close to the top 20 if it was a journal article.)
  • May 20, 2020: Meyerowitz-Katz criticized Ioannidis’ review on Twitter, with over 2,000 likes and over 1,000 retweets.
  • October 12, 2020: Meyerowitz-Katz criticized another Ioannidis publication on Twitter, this time a Covid commentary – with reference back to the disputed IFR estimate.
  • October 14, 2020: The Ioannidis review was published in the journal, the Bulletin of the World Health Organization. According to Google Scholar as I’m writing, it has been cited 190 times.
  • October 15, 2020: Meyerowitz-Katz took aim at Ioannidis’ review on Twitter again.
  • December 2020: The Meyerowitz-Katz systematic review was published in the International Journal of Infectious Diseases. According to Google Scholar, it has been cited 194 times.
  • March 2021: Ioannidis published a sole-authored systematic review of the systematic reviews of Covid-19 IFR, including his own and Meyerowitz-Katz’s. He pretty much judges his own review to be reliable, and Meyerowitz-Katz’s particularly unreliable. In an appendix – which he has since withdrawn, thank heavens! – he made claims about Meyerowitz-Katz’s qualifications, his Twitter account, his Twitter bio, the photo on his Twitter account (including his T-shirt)… you get the picture. Heavily personal. This led to a full-blown Twitter storm. (My response to that extraordinary salvo was to write my Cartoon Guide to Criticism: Scientist Edition.)

I think this counts as a full-on feud between these 2 scientists, and it seems to be expanding beyond IFR. But I don’t want to discuss their behavior here. I want to discuss the science side of all this, and what issues this episode raises for the quality of science.

Major cheer and jeer squads formed around both reviews, often praising one review and heaping disdain on the other. It wasn’t just about a difference in interpretation of data: these were fundamental issues about what counts as reliable science in systematic reviewing – and that’s a highly specialized area. So what should we make of these respective claims? Is one, the other, or both of these systematic reviews excellent – or as diabolically bad as detractors say? And what are the implications of scientists’ conflicting claims if the answer is actually cut and dried?

It would take far too long to dig into all the detail about these 2 reviews, and every claim made about them. But there’s no need to. The picture gets very clear, very quickly. (Note: I criticized both these reviews heavily when they were in preprint, but never followed up to see what was in the published versions, and how much of the pre-publication critique the authors attended to.)

I’ve spent a few decades analyzing multiple systematic reviews on the same question, and I’ve studied reviews with conflicting conclusions, too. Over the years, I narrowed down to a list of 5 questions to save time by knocking out most of the worst and unreliable systematic reviews quickly. One isn’t relevant to this debate – it’s about whether the review is up-to-date. But let’s go through the other 4 questions for these 2 reviews. From here on, I call Ioannidis (October 2020) the “I” review, and the one by Meyerowitz-Katz and Merone (December 2020) the M&M review.

1. Are there clear, pre-specified, eligibility criteria for studies being chosen or rejected for the review?

This is key to being systematic – and the point of being systematic is to make a review’s results more reliable by minimizing the biases that lead to them. You want to know for sure that the goalposts aren’t moving around so people can include studies they want, and kick out those that are “inconvenient” for whatever reason. Ideally there is a pre-published protocol, so we can see if the goalposts shifted.

Now of course, if you already know some studies you want to keep out or allow in, you can set up criteria that operationalize your bias. So what we’re looking for here are justifiable criteria and methods that clearly aim to minimize bias.

And there are a lot of potential studies that could be included. In Ioannidis’ review of 6 reviews conducted within about a 3-month period, the largest number of studies included in a single review was 338; 2 had more than 80, and the other 3 each had fewer than 30. Clearly the scope for biased selection in a review on this question is pretty enormous.

The “I” review is very explicit about the criteria applied, but there is no protocol for this review. There were 3 versions of the preprint that preceded it, though. And it did not start off with the same explicit criteria as in the final version.

The scope is narrower for this review – only seroprevalence studies. That would tend towards lower estimates of IFR because of presumably larger denominators of people with asymptomatic infections. And there is a limitation in study size, which is a subject for debate.

It’s impossible, though, to get past the high risk of bias of a sole-authored systematic review conducted by a co-author of a primary study that caused him reputational damage. In his review of reviews, Ioannidis writes that he is also a co-investigator for a second of the included studies, for which he’s not a named co-author. In the “I” review, he declares being a co-author of one of the included studies. So for me, the “I” review passes this question, although not with flying colors. But it doesn’t get over the hurdle of the intent of this aspect of a systematic review: to give you confidence that the selection of studies was reasonably unbiased.

The M&M review gets a straight-up “no” to this first question. Again there was no protocol, and again there were several versions in preprint previously. The final has only 2 explicit eligibility criteria – and it’s explicitly stated if they were met, they were included. But it’s evident even within the paper that this is not so, as they list some studies excluded despite meeting the inclusion criteria – including one because the authors “explicitly warned against using its data to obtain an IFR”. That would be a really weird exclusion criterion – but it’s also mystifying: I can’t find any statement remotely like that in the publication cited.

Far more problematic, though, is the very clear evolution of the criteria as new studies emerged that the authors wanted to include: they changed the criteria to allow that. (By dropping the criterion of being published in English, for example.) Changing criteria along the way isn’t necessarily a bad thing, of course, but it does have implications: how you re-do your previous literature screening to accommodate the change, for example. Transparency is absolutely critical though, and the final paper is not transparent about the evolution of eligibility so that readers can assess the potential for bias in the iterative process.

And there’s a further problem for this review on the question of selection criteria to assemble as unbiased a study pool as possible. They did not ensure that population estimates were not multiple-counted. So the same groups of people can go into the study pool several times via different studies and thus get counted towards the totals multiple times (like the people on board the Diamond Princess cruise ship, for example).

2. Did they make a strong effort to find all the studies which could have been eligible?

Well, they made a lot of effort, but I don’t think either clears the “strong” bar. Neither has a librarian or information specialist involved, and so it’s not surprising that the quality of their search strategies is so low. (In January this year, a systematic review community standard for reporting on search strategies was published, called PRISMA-S.)

You can call the search terms “broad” as a technical matter, but I think they’re best described as vague. All searching is in English. Neither tells you how the records and de-duplication were managed. That may sound trivial, but it’s not. I want to know if this was done professionally or not, which minimizes the chances for records to fall through the cracks and optimizes quality control.

The descriptions of what they actually did are imprecise too. What does that mean? Well, for example, I tried to do what the “I” review says for one of the preprint servers (SSRN), and I couldn’t figure out exactly what had been done so I could be sure I had done the same thing – and nothing I tried came even remotely close to the results reported. ¯\_(ツ)_/¯

The entire searching and selection process for the “I” review was done by a single person, so there is no attempt to minimize error or bias in these processes. On the other hand, the narrow scope – seroprevalence studies – makes it more likely that the studies were findable, and fairly generally likely to be in the places searched.

The M&M search strategy is worse, although there is at least a second author in some of the later selection processes. The reporting of the searching only picks up after there had already been screening by a single author: there are already only 269 studies in consideration at that point – no reporting of the thousands of records that were discarded. Given government reports internationally are a major category of included studies, searching only in English is a far bigger problem for this review than the other.

Then there is the underlying problem of the changing inclusion criteria as this review had successive updates. There’s no explanation of how this was handled. Did they go back and start again from scratch each time? Had they kept records so complete – even for Google, Google Scholar, and Twitter searches – that they could go back and re-screen with the new eligibility criteria? Neither of those seems likely. Which leaves going back and doing a patch search for the particular new criteria: it’s not clear if they did that. They report the search as if it were a once-off. And they report the review as though the eligibility criteria were a constant, not iterative, expanding and contracting along the way. (This seems less of an issue for the “I” review, which had a broadly similar type of potentially eligible study at the beginning and end.)

3. Can you see a list of the studies that were excluded from the review?

This matters a lot for these reviews, given the problems in the 2 steps we’ve just looked at. If you really want to see if there was author bias in exclusions, you need to be able to see this – at least for the ones that were screened in full text.

The “I” review did not provide reasons for exclusions, or a list of the excluded studies at any stage. The M&M review did not provide reasons for exclusions. The flow diagram says 15 studies were excluded at full-text assessment stage, and there is a description in the text of reasons for excluding 14 studies: presumably that’s all but one of those. So the M&M review comes out ahead on this point, but not with flying colors.

4. Have they given you some indication of how good they think the studies they included are?

Yes, both reviews did this. So along with being about as up-to-date as you could expect at the time of this dispute, that’s a point in their favor. That’s not enough to save them, although it’s clear that one has far more problems than the other.

Although I’m not going to dig into all the other issues, claims, and counter-claims about these 2 reviews, there are 2 additional issues that I think are important to touch on about the M&M review. One is a major methodological issue, and the other is a really simple reporting quality issue.

The first is the issue of the meta-analyses – the statistical combination of data from multiple studies. I’ve written an explainer about understanding the data in meta-analyses here, if you want more context.

One of the key rules to keep in mind is just because you can throw a bunch of numbers into a statistical pot, it doesn’t mean you should. Here’s the image I use to bring that message home when I teach this basic principle of meta-analysis:

Photo of a town sign adding population, feet elevation and year established into a total
Photo by Mike Gogulski (Wikimedia Commons)

The “I” review doesn’t combine the various IFR estimates, arguing the IFR varies too much for that to make sense. The M&M review does, though. (Using a random effects model, for those who want to know that detail.)

There’s a test for whether there might be too much difference in a set of studies to pool the data: it’s called a test for heterogeneity. It’s not a perfect science, but if there’s a lot of heterogeneity, it really calls into question whether the data even belong together. Even 75% on that heterogeneity test is classed as “considerable”. The rate in the meta-analyses in the M&M was 99%. This is what the authors say about that:

The main finding of this research is that there is very high heterogeneity among estimates of IFR for COVID-19 and therefore, it is difficult to draw a single conclusion regarding the number. Aggregating the results together provides a point estimate of 0.68% (0.53%–0.82%), but there remains considerable uncertainty about whether this is a reasonable figure or simply a best guess.

That’s stressed in the abstract, too. But there isn’t a discussion of the validity of meta-analysis for this data at all – or of using a random effects meta-analysis. Neither of the authors is a statistician, and it raises the question of whether there was statistical peer review here, when it was obviously needed. (Note: I’m not a statistician either.)
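For readers who want to see what a heterogeneity percentage like that measures, here is a rough sketch. It assumes the figure is the I² statistic derived from Cochran’s Q, which is the usual way such a percentage is reported; the paper’s own software and settings are not described here. The study estimates and standard errors below are invented for illustration and are not data from either review.

```python
def i_squared(estimates, std_errors):
    """Return (Q, I2 in percent) for study estimates and their standard errors.

    Q is Cochran's Q: the inverse-variance weighted sum of squared deviations
    from the fixed-effect pooled estimate. I2 = max(0, (Q - df) / Q) * 100.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2


# Hypothetical IFR estimates (in percent) with hypothetical standard errors.
# The estimates disagree by far more than their standard errors allow.
ifr_estimates = [0.1, 0.4, 0.8, 1.2, 0.6]
ifr_std_errors = [0.02, 0.03, 0.05, 0.04, 0.03]

q, i2 = i_squared(ifr_estimates, ifr_std_errors)
print(f"Q = {q:.1f}, I2 = {i2:.1f}%")

# For contrast: identical estimates show no heterogeneity at all.
_, i2_same = i_squared([0.5, 0.5, 0.5], [0.05, 0.05, 0.05])
print(f"I2 for identical estimates = {i2_same:.1f}%")
```

When the studies disagree far beyond what their own uncertainty would explain, as in the invented numbers above, I² lands well past the 75% mark. That is the situation in which a single pooled number stops being a meaningful summary.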

The simple reporting quality issue? The abstract says there are 24 estimates; the flow diagram says there are 26 included studies; Table 1 lists 27 studies; the text after Table 1 says 40 papers were reviewed in full text (the flow diagram says it was 42) and 25 studies were included “in the qualitative analysis”; and the meta-analysis includes 26 studies (with the text above it describing it as including all 24 included studies). And I raise that because it’s such an obvious quality red flag: it’s easy to see something has gone terribly wrong in quality control for this review.

There really isn’t a lot of reason for confidence in the quality of peer review or editorial care for this review. (And it suggests, by the way, that peer reviewers can’t be relied on to look at the criticisms posted on preprints of the manuscript they are reviewing.)

The bottom line, though: whether they’re right on the particulars or not, each side of this dispute is right in claiming the other review has deep flaws – though one has more than the other. But what does it mean that there are so many scientists claiming one or the other is solid and has come to a reliable answer?

And what about the hundreds of citations? I don’t know what proportion of the papers are citing one of these as “the” estimate for Covid-19’s IFR, and how often the estimate is used in a way that has serious implications – the CDC was one clear example of that, though. Systematic reviews are regarded as a gold standard, so people are going to reach for one to cite. The trouble is, as I argued in my last post, they’re not a “neutral good”. They can be wildly misleading, so they can give unjustified heft to highly biased claims.

To me, the success of these 2 reviews, and the content of much of the dispute around them, lead to 2 depressing conclusions. Awareness of what makes a systematic review reliable is still shockingly low, including among scientists who see themselves as experts in judging the quality of scientific claims. And as a consequence, even after more than 50 years of development of rigorous methods for systematic reviewing, scientists are still too often building their work on a foundation of perilously shaky knowledge of the science that’s gone before.

~~~~

The cartoon is my own (CC BY-NC-ND license). (More cartoons at Statistically Funny.)

Disclosures: I studied the prevalence of some types of post-publication events in clinical trials and systematic reviews as part of my PhD. I was the lead scientist/editor for PubMed Commons, a commenting system in PubMed that ran from 2013 to early 2018 (archived here).

I have never met Gideon Meyerowitz-Katz (and haven’t cited or criticized his work prior to this episode). I have known John Ioannidis for decades (and often cited and praised his publications). I have also (less often) criticized work he’s done, before and during the pandemic (see for example, here, here, and here; and I wrote a post about EBM heroes and disillusion around one of those episodes.)

Discussion
  1. Thank you! Very well said and illustrated. I have a feeling this is like a bad case of recentism for Wikipedia articles: when the field you’re in is moving faster than you, any attempt to summarise it is essentially futile; any additional work will only make your situation worse. At some point, the only sensible option is to throw away everything you’ve done and start over. Often it’s too difficult for the individuals involved, due to the sunk cost fallacy, so someone else will end up doing it.

  2. So…….What is the IFR? It must be greater than the PFR (population fatality rate). Colorado (where I live) has a pretty good daily spreadsheet of all the data collected. I have been looking at that spreadsheet every day for almost a year. The calculated CFR (case fatality rate) is about 1.21% and the calculated PFR is about 0.11%. So the IFR has to be between these two values. If everybody in the state has been infected, the PFR and IFR would be the same. But we know that is not the case. Broken down by age the data shows starkly that this is an old age disease. 80 and up PFR is 1.81%, 70-79 PFR is 0.43%, 60-69 0.13%, 50-59 0.060%, 40-49 0.025%, 30-39 0.0097%, 20-29 0.0063%, 10-19 0.0018% and 0-9 the PFR is 0.0003%.

  3. Posting this on behalf of John Ioannidis

    Dear Hilda,

    Thank you for sharing this blog. It is very stimulating and exciting, as always.

    Here are a couple of points to consider:

    I think the first two bullets about March 2020 and April 2020 that supposedly initiate this “gunfight” (btw, what a brilliant cartoon!) are strawman arguments. For a response to the first point (“the 10,000 strawman”), please see what I published a year ago in the International Journal of Forecasting in section 2.2 (https://www.sciencedirect.com/science/article/pii/S0169207020301199?via%3Dihub). For a response to the second point (“the Santa Clara strawman”), the Santa Clara study was accepted last year in the International Journal of Epidemiology and it was published in February 2021 (https://academic.oup.com/ije/article/50/2/410/6146069). The final paper and the extensive supplements dissect carefully all the criticisms that were raised. The study was not perfect (no study is), and it is just a single study (no single study says much all alone). I am grateful for the criticisms received, but the results remain robust. Media and social media storms were raised also against other studies that happened to find low IFR (e.g. Gangelt). Should researchers be accused of denialism and have their reputations take a beating, if their studies happen to find low IFR values? With over a thousand seroprevalence estimates to-date, unavoidably some will find lower and some will find higher IFR estimates. Those who happen to find low IFRs are not denialists. I dread to imagine what would happen (or may have happened already) if the same media and social media fury extends to the scientific peer-review of papers depending on what IFR they have calculated. At a minimum, I hope that other scientists who found low IFRs did not have the lives of their family members threatened, as it happened in my case.

    From the very beginning of the pandemic, I have stated that this is a major threat. However, as Norman Doidge puts it: “Ioannidis was greeted with intense anger, pilloried nonstop, caricatured as implying COVID-19 is not severe (he actually said it was “the major threat the world is facing”) and generally demonized.” (https://www.tabletmag.com/sections/science/articles/plague-journal-herd-immunity-doidge). What I worry about is not my vilification, but the fact that people who “believe” in me (and it is not good to “believe” when it comes to science) may think that this is not a serious problem, specifically because my critics have caricatured my positions as such. I love being criticized, but critics may cause substantial collateral damage by using strawman arguments about what I wrote. I have created a folder with what I did write and publish on COVID-19 so that people can at least see the real documents (with all their flaws and limitations) and not their tweeted distortions. It is here: https://profiles.stanford.edu/john-ioannidis?tab=research-and-scholarship > Projects > Published COVID-19 work

    For the comparison of the two systematic reviews, the points that you raise are largely congruent with what I published in my March 2021 overview of overviews in EJCI (https://onlinelibrary.wiley.com/doi/10.1111/eci.13554). It would have been useful if people did read the paper instead of focusing entirely on the personal files of the supplementary appendix which are the equivalent of a Facebook page or Twitter account, neither of which I possess (for reasons that I explain in my Stanford webpage). I refer to the full paper for a full discussion on where we agree and where there may be some residual disagreements.

    At the heart of the problem, is the fact that these papers are not really (just) systematic reviews. They are probably best described as complex, multi-step analytical constructions that utilize many pieces of information and have many components of calculations. Some of the information is indeed procured with a systematic review process, but this is only part of the machinery. A routine protocol for something that was so complex and that had so many unknown or poorly known parts in April 2020 was close to impossible. You can take the first preprint from May 2020 as the equivalent of a protocol, which then had to be clarified/amended as the scientific community accrued more insights. In a way, the process is thus transparent through the multiple iterations of the preprints – probably more transparent than in the vast majority of traditional systematic reviews. Importantly, I think that Gideon and Lea should be applauded for their admirable effort to tackle urgently an extremely difficult problem.

    We all know that doing systematic reviews is not as easy as it seems and most systematic reviews are horrible. I have published an embarrassing number of systematic reviews and meta-analyses and along with my students in the courses that I teach and on various other occasions we have scrutinized many hundreds, if not thousands, of systematic reviews and several other more complex analytical studies; however, the IFR project was one of the most challenging I have come across, and “systematic review” was only a modest portion of it. Even the most experienced people in evidence synthesis might fail, perhaps in major ways. Consider it like the equivalent of performing a complex surgical operation that has very high fatality risk even in the hands of the most experienced surgeons. What are the chances of a poor outcome if the same operation is performed by the most brilliant and enthusiastic students or by the best economists or physicists in the world rather than a technically trained surgeon? Opinionated experts should not be confounded with relevant, field-specific technical expertise. Epidemiology, medicine, public health, and statistics would benefit from having fewer opinionated experts. Conversely, epidemiology, medicine, public health, and statistics would not exist without relevant, field-specific technical expertise.

    Unfortunately, during the pandemic, social media isolate single sentences or fragments of sentences and these are further distorted to create suitable narratives to promote agendas or to destroy “opponents”. The EJCI overview of the 6 overviews has an Altmetric score approaching 7000, the Santa Clara paper reached over 17000, and another paper where my team modeled the impact of less versus more restrictive measures hovers around 20000. However, I am concerned that even though 95% of these comments were favorable, the nature of the comments suggests that very few people actually read these papers. Most people read tweets, blogs, and media stories that had little or nothing to do with the papers themselves. The divergence between what I actually wrote and what eventually was distorted can be monumental.

    After all this work, I feel that my ignorance remains vast. I hope we will learn more in the future. We need nuance and some distance to understand the strong and weak points of the science that we and our colleagues produce. This takes time, patience and good will. In the meanwhile, I consider my critics to be my greatest benefactors. I am always grateful to them.

    Keep up the great work! I really miss you and I can’t wait to see you one day soon and give you a hug. Please do post my complete e-mail message in your blog, I hope it might help to clarify some of the existing confusion.

    Cheers!

    John

    1. Since Dr. Ioannidis mentioned me (Atomsk’s Sanakan) negatively in the initial appendix of his paper, I want to address his work, starting with a comment Dr. Bastian wrote elsewhere:

      “[Dr. Ioannidis] included studies that he acknowledges elsewhere in the paper aren’t population-based and don’t “approximate the general population”. […] I think there are even more included studies than that in the same boat, like a study among the students of a single high school in France and their parents, siblings, and teachers.”
      https://hildabastian.net/index.php/91

      That France sample was not representative of the department of Oise. But Dr. Ioannidis treated it as such, resulting in a seroprevalence estimate of ~26% for the department and a “corrected” IFR of 0.05% for early April 2020:

      http://web.archive.org/web/20210422111825/http://www.who.int/bulletin/volumes/99/1/20-265892.pdf
      https://www.medrxiv.org/content/10.1101/2020.04.18.20071134v1.full-text

      Yet a study with randomized representative sampling showed seroprevalence of ~3% for Oise in May 2020, which is over 7 times lower than Dr. Ioannidis claimed for April 2020. That study sampled more people from a wider geographic area than the study Dr. Ioannidis used, and with two independent antibody tests to confirm its results:

      https://drees.solidarites-sante.gouv.fr/sites/default/files/2021-01/er1167-en.pdf
      https://www.medrxiv.org/content/10.1101/2021.02.24.21252316v1.full-text

      Moreover, Dr. Ioannidis’ IFR of 0.05% is mathematically impossible, since >0.12% of Oise’s total population died of COVID-19:

      https://dashboard.covid19.data.gouv.fr/vue-d-ensemble?location=FRA

      Dr. Ioannidis cannot get around this by claiming IFR increased over time, since he lowered his overall IFR estimate from ~0.23% to ~0.15%, and has elsewhere argued IFR decreased over time. Nor can he claim this difference is explained by over-estimating of reported COVID-19 deaths, since he’s claiming to use reported COVID-19 deaths for his IFR calculation. This is not a one-off event either, since there are (to my count) at least 10 impossible IFRs in Dr. Ioannidis’ paper in the Bulletin of the WHO. And that’s not even touching on the other possible IFRs that are still implausible extrapolations from non-representative samples:

      https://sciencebasedmedicine.org/what-the-heck-happened-to-john-ioannidis/#comment-5367267118

      The central problem is, again, that Dr. Ioannidis uses non-randomized non-representative samples that over-estimate the total number of infections, and thus under-estimate IFR. This differs from what he said earlier in the pandemic, when he championed the use of randomized representative samples before it became clear they overall supported a higher IFR than he claimed. He could easily have published a sensitivity analysis looking at IFR from just randomized representative samples. But he didn’t. And he has yet to adequately address this central objection to his work.

      Dr. Ioannidis, in March 2020:
      “The most valuable piece of information for answering those questions would be to know the current prevalence of the infection in a random sample of a population and to repeat this exercise at regular time intervals to estimate the incidence of new infections.”
      https://www.statnews.com/2020/03/17/a-fiasco-in-the-making-as-the-coronavirus-pandemic-takes-hold-we-are-making-decisions-without-reliable-data/

      Dr. Ioannidis, in April 2020:
      “Data from Iceland and Denmark, which have done the best random sampling, also point in the same direction, Ioannidis said. “If I were to make an informed estimate based on the limited testing data we have, I would say that covid-19 will result in fewer than 40,000 deaths this season in the USA,” he told me.”
      https://archive.is/dT97F#selection-2211.202-2219.279

      Gideon Meyerowitz-Katz, April 2021:
      “[…] while Prof Ioannidis felt comfortable attacking myself and my co-authors personally, he hasn’t actually responded to any elements of my scientific critique […].
      […]
      While there are many areas in which we disagree, I think the biggest error remains including studies that are clearly inappropriate to determine population estimates of infection rates”
      https://twitter.com/GidMK/status/1381846727011954689

      Me, March 2021:
      “On Ioannidis’ bad reasons for including non-randomized samples:
      – non-randomization biases the sample
      – most folks go home eventually
      – non-randomized sampling is not more likely to reach folks
      – randomized sampling was done in less hard hit regions”
      https://twitter.com/AtomsksSanakan/status/1371131499861442565

  4. Very interesting commentary. Really appreciated learning some of the nuances of meta-analyses and seeing Dr. Ioannidis’ response.

  5. Thanks for this excellent article. I always had the impression that Prof Ioannidis thought that the M&M article was of poor quality, and speculated on the reasons for this in the appendix. Which doesn’t excuse making the appendix public, but the twitter piling on of people who assume without question that the M&M paper is correct has been unedifying.

    On the 10,000 point, I think Prof I is being disingenuous in his reply. Re-reading his statnews article from March 17 2020, the figure is based on an IFR of 0.3% and a total percentage of the US population infected of 1%. Later discussion has focused on his lowball estimates of IFR, but surely a 1% infection percentage was extraordinarily low at a time when the doubling rate was known to be around 3 days. Fair enough to say 10,000 was not a prediction, and the 0.3% was reasonable based on his analysis of the Diamond Princess, but surely he should have given more discussion as to whether the 1% was reasonable or the purest optimism.
