It’s a fundamental part of science – and yet peer review is not itself very science-based. It needs to be.
That turns very quickly into a long list. There are people pecking away at those questions, though. Here are the top 5 points that struck me last year. It’s part of a series on this blog.
1. A substantial proportion of journal editors may believe it’s ok to edit a peer review report without the peer reviewer’s permission (or presumably, knowledge).
[Survey of editors of journals in ecology, economics, medicine, physics and psychology]
I’ve been deeply engaged with this issue a very long time – since even before I wrote a chapter in a book about it in 2003 – and I had no idea this was a thing.
It’s one of the findings from a survey of editors, by Daniel Hamilton and colleagues. Surveys like this have notoriously low participation rates. This one had 322 editors, a response rate of about 20%. It was a survey on peer review policy, peer review practice, and publication ethics. And just shy of 20% of the 293 who answered the question believe it’s acceptable to edit a peer review they don’t agree with – some, even without the peer reviewer’s permission. Meanwhile, about a quarter wouldn’t even edit out inappropriate or offensive language or reference to authors’ ages or gender, unless the peer reviewer agreed. (Though I think that might usually be bad enough to be a “sacking” offense.) A third believe it’s ok to remove an author’s name without their permission when they type it in a system that’s meant to be blinded.
How widespread are the beliefs and practices? I’ve no idea if these editors are representative, or how often the more worrisome things happen. If actually doing it is as rare as hen’s teeth, then it’s not a burning issue. But it’s a reminder that there are a lot of surfaces in the peer review black box left to scratch beneath.
Here’s an example of what this can look like – one of the authors tweeting about her personal experience:
Daniel G. Hamilton and colleagues (2020). Meta-Research: Journal policies and editors’ opinions on peer review.
2. A study at a single journal suggests articles with published peer review trails might get more citations – although why isn’t clear – and that authors with conflicts of interest may be less likely to agree to published peer review.
[Propensity score matching study of publications with open and closed peer review reports at 1 journal]
Compared to last year, there was a dearth of empirical studies published on peer review – or if there were, they didn’t get a lot of attention and they’re not easy to find. So a propensity score matching study was a shoo-in for this year’s list!
Propensity score matching is a technique to try to reduce the impact of confounding factors in an observational study. The score is based on things known to influence the outcome, other than the factor you want to study. Each item you’re studying is scored on those potential confounders, and items with similar scores are then compared, to try to get a more balanced analysis.
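For readers who like to see the mechanics, here is a minimal sketch of the general idea in Python. It is not the method or the data from the study discussed here: the articles, the two confounders, the "open review" flag, and the citation counts are all simulated, and the function names (`fit_logistic`, `match_pairs`) and the caliper of 0.05 are my own assumptions for illustration. In the simulation, open review has no real effect on citations, so any difference in the naive comparison comes from confounding.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, treated, lr=0.5, epochs=500):
    """Fit P(treated | confounders) by gradient descent; returns [intercept, w1, ...]."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for x, t in zip(X, treated):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))) - t
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w

def match_pairs(scores, treated, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching on the score, without replacement."""
    controls = {i for i, t in enumerate(treated) if t == 0}
    pairs = []
    for i, t in enumerate(treated):
        if t == 1 and controls:
            j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
            if abs(scores[j] - scores[i]) <= caliper:  # skip poor matches
                pairs.append((i, j))
                controls.remove(j)
    return pairs

def mean(values):
    return sum(values) / len(values)

# Simulated data (hypothetical): two confounders per article. The first one
# drives both the choice of open review and the citation count.
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(400)]
treated = [1 if random.random() < sigmoid(x[0]) else 0 for x in X]
outcome = [5 + 3 * x[0] + random.gauss(0, 1) for x in X]  # no true effect of open review

w = fit_logistic(X, treated)
scores = [sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))) for x in X]
pairs = match_pairs(scores, treated)

naive_diff = (mean([y for y, t in zip(outcome, treated) if t == 1])
              - mean([y for y, t in zip(outcome, treated) if t == 0]))
matched_diff = mean([outcome[i] - outcome[j] for i, j in pairs])

print(f"matched pairs: {len(pairs)}")
print(f"naive difference in citations:   {naive_diff:.2f}")
print(f"matched difference in citations: {matched_diff:.2f}")
```

The matched difference should sit much closer to zero than the naive one. The catch is the same as in any real study: matching can only balance the confounders you measured and put into the score.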
This one was done by Qianjin Zong and colleagues, on 1,495 non-review research articles at PeerJ, a journal with a scope covering life, biology, medicine, and environmental sciences. Publishing peer review reports is optional at the journal. Peer review is single blind – the authors don’t know the names of peer reviewers – but when a manuscript is accepted for publication, reviewers may reveal themselves (40% do), and authors can choose to have the peer review reports and trail published (80% do).
There are many limitations here. The biggest is that the propensity score can’t account for the quality of the articles, or how much interest there is in their subjects. We don’t know how often the people who cite articles with published peer review trails look at them, or whether that increases their confidence in a paper. It seems more likely that the willingness to be open is a characteristic of authors who do more citable research. But this is an interesting paper to read, and putting the issue of conflicts of interest on the table for this discussion is important.
Qianjin Zong and colleagues (2020). Does open peer review improve citation count? Evidence from a propensity score matching analysis of PeerJ.
3. Peer review at PLOS ONE was supposed to radically change the paradigm but it didn’t – reports are conventional, still often comment on novelty, and usually don’t address reproducibility.
[Qualitative analysis of a random sample of peer review reports]
This one is an interesting short book, addressing the radical cultural change that PLOS ONE’s founders hoped for. Bottom line: it didn’t happen. What they found along the way is interesting, because this is a particularly excellent analysis. It’s on my list for another reason as well – the method it used to do this. It was hugely time-consuming, though. I think the taxonomy they developed for coding statements in peer review reports is a must-read for people doing research on peer review (Table 1).
Martin Paul Eve and team did an in-depth analysis of a random sample of 78 peer review reports around the median length for the journal. They were independently coded by 3 authors who then reached a consensus on 2,049 statements. No wonder it took them a year!
Here are some of the points I found interesting, alongside their core finding:
- The median length of a peer review is 500 words – and that’s similar to the results in studies of more traditional journals. The longest peer reviews were probably longer than the manuscripts they were commenting on – the longest was a major outlier at nearly 14,000 words.
- Second and third rounds of peer review can continue to raise new issues, instead of getting shorter and easier.
- Peer review can be “extremely direct (and likely emotionally bruising for authors)”. There’s not much “hedging” of criticisms or “sugar-coating” going on. However, positive reviews can go downhill – there’s a tendency for people to lead with the most positive thing they’re going to say.
- Peer review often crossed over with copy editing.
- Peer review reports often push language to conform to formal norms: “peer review is a process that sees itself as protecting the public perception of science, and the ways in which language registers may confer reputational advantage or damage”. In many ways, that’s not a bad thing, but, they said, for people who aren’t native speakers of English particularly, it contributes to a “coloniality of gatekeeping”.
Their core question was whether or not the journal’s radical change remit had succeeded. Eve & co summed up their conclusion:
[I]n terms of reviewer behaviour, our work shows that PLOS’s readers have not wholly gotten the message of change. For instance, one of the core goals of the new PLOS review model was to change appraisal from novelty and significance to technical soundness, reproducibility, and scientific accuracy. Yet it is clear from our analysis that reviewers still frequently mention novelty and significance (albeit not always as a discriminator for publication and with the caveats set out earlier in the section on topic modelling) but that they rarely remark upon reproducibility. In other words: changing the criteria of peer review to ask reviewers to appraise aspects of science different from those with which they are familiar appears a necessary but insufficient condition of change. The norms of gatekeeping creep back in, despite such changes.
Martin Paul Eve and colleagues (2020). Reading Peer Review: PLOS ONE and Institutional Change in Academia.
4. Opening peer review, even partially, represents an acknowledgement of the subjectivity of the process: as long as it is confidential advice to editors, it has an aura of objectivity that it does not deserve.
Trust and distrust: “peer review seems to exist in this strange balance” between them. It recognizes the possibility of betrayal of trust in what the authors of a manuscript say they have done, but it has “resolved on the side of trust”. It’s an interesting discussion to have, especially since in so many ways, the reproducibility movement is pushing that balance back more towards distrust, isn’t it? Or perhaps the journalistic model of “trust, but verify”?
Daniel Ucko’s dissertation begins with discussions around themes like objectivity and how that differs in peer review from its meaning in other parts of science. Like Eve and colleagues in the previous study, he sees peer review as a mechanism for pushing towards conformity.
The last chapter is about opening peer review’s black box. Open peer review, Ucko writes, takes us away from the traditional role of peer review being confidential advice to the editor. It also embraces the inherent subjectivity of peer review:
On a deeper, more philosophical level, what this does is to change the role of the referee from a vicarious representative of the relevant scientific community to an explicitly identified peer. In a sense, this is embracing subjectivity, even though the motivation is an increasingly objective process. The calculation here is that hidden bias, facilitated by anonymity, is a bigger threat to objectivity than any potential loss of candor from the referees as a result of open review. The loss of any claim to the “view from nowhere” objectivity that comes with a nebulous referee who stands in for an entire field, replacing them with a named colleague or even a competitor changes the dynamic from one of arbitration to one of scientific debate…If peer review is opened up, then the expert becomes a public figure, with certain consequences for how they act.
Like open access, Ucko concludes, open peer review is a potential step in achieving some of Habermas’ proposed solutions to problems in science: encouraging “communication between expert cultures” and “direct[ing] the cognitive potential of the expert cultures back into the lifeworld”.
Daniel Ucko (2020). Peer Review: Objectivity, Anonymity, Trust.
5. Covid-19 showed that peer reviewers can ignore even the highest-possible profile data that isn’t in the manuscript.
There’s no study for this one. And in many ways, of course, it’s not new. But I really thought if ever peer review of a clinical trial would be thorough enough, it would be the mRNA Covid vaccine trials society was counting on so desperately. But no.
Here’s what hammered that home. With the massive effort the US FDA was putting into evaluating the trials of the first 2 mRNA vaccines – 150 people reportedly – its assessments, like the ones from the European equivalent (EMA), were always going to be the most seriously peer reviewed literature on the evidence. And it’s a context where they have more access to raw data than a journal likely ever would.
Those reports landed in public view with a huge amount of attention. There’s no way, surely, you could be a peer reviewer of the manuscripts that followed at medical journals and not both know about the related FDA report and have access to it. You would think that would mean the medical journal would benefit from those reports’ analyses and revelations, wouldn’t you? Especially given the vaccine manufacturers’ role in the manuscripts and the trials they report.
Instead, you would think they existed in separate worlds. The Moderna example makes that clear. The FDA report went online 2 days before a committee meeting on December 17, and that timetable had been public for quite some time. The journal publication was epublished in the New England Journal of Medicine on December 30.
The journal version is the basis for the common claim that the vaccine has been shown to be 100% effective against severe Covid-19. Don’t get me wrong – the prior assumption here is that a vaccine that would work against symptomatic disease, would reduce the risk of severe disease, too. And it’s clear the mRNA vaccines do that remarkably well. I’m sure the Moderna vaccine reduces the risk. But the trial wasn’t powered to answer that question, so we have to be careful about this. The journal wasn’t. It states there were 0 people with severe Covid-19 in the vaccine group.
Whereas the FDA report points out there was a person who met those criteria and who was hospitalized, as well. The case hadn’t been sent to the team assessing potential Covid-19, though, because at the point in time cases were sent for adjudication by them, the positive test result wasn’t known. So it wasn’t officially in their statistics. I think it’s hard to see how you could justify not ensuring this was at least a footnote in the paper if you were aware of it – so I think it suggests no one assumed responsibility for checking the manuscript carefully against the FDA’s findings. Yet, it’s the NEJM version that is getting disseminated and spreading perceptions of the study, not the FDA’s far more rigorous assessment.
This story took a swift turn in January, when the EMA’s assessment went online. They pointed to something critical that neither the journal nor the FDA had done. If you pulled on the thread of this person’s Covid-19 illness that hadn’t been adjudicated and included in the statistics, you would have found, as EMA reported, that they weren’t the only one. It’s not clear how many potential Covid-19 illnesses haven’t been counted because they are awaiting adjudication. EMA doesn’t believe it would add “substantial bias” to the result, because the reports would be blinded at that stage. It does mean, though, that the confidence intervals would change, if not the estimate of the effect.
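To get a feel for why a single uncounted case matters when events are rare, here is a rough sketch in Python. The counts are hypothetical, chosen only for illustration, and are not taken from the trial, the FDA report, or the EMA assessment. It also assumes two equal-sized groups with equal follow-up, so that the share of all cases falling in the vaccine group is binomial, and it uses an exact (Clopper-Pearson style) upper limit found by bisection. The function names are my own.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def upper_bound(x, n, alpha=0.05):
    """Exact upper limit for p: the p at which P(X <= x) falls to alpha / 2."""
    if x == n:
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(60):  # bisection; the CDF decreases as p increases
        mid = (lo + hi) / 2
        if binom_cdf(x, n, mid) > alpha / 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def efficacy_lower_bound(vaccine_cases, placebo_cases):
    """Lower 95% limit for efficacy = 1 - relative risk, assuming equal-sized groups."""
    n = vaccine_cases + placebo_cases
    p_up = upper_bound(vaccine_cases, n)  # share of all cases in the vaccine group
    rr_up = p_up / (1 - p_up)             # converts that share to a relative risk
    return 1 - rr_up

# Hypothetical counts: 20 severe cases on placebo, and 0, 1, or 2 on vaccine.
for vaccine_cases in (0, 1, 2):
    point = 1 - vaccine_cases / 20
    lower = efficacy_lower_bound(vaccine_cases, 20)
    print(f"{vaccine_cases} vs 20 cases: estimate {point:.0%}, lower limit {lower:.0%}")
```

With so few events, even the "0 cases" row has a lower limit well short of 100%, and each added case pulls it down further. That is the sense in which a trial can be too small to pin down an effect on severe disease, whatever the headline number says.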
An excellent reminder, for anyone who needs it, that you can’t expect peer reviewers to have done their homework. When quality control is in large part delegated to a number of peer reviewers, then responsibility is distributed, too. It’s almost a recipe for ensuring that things will fall through the cracks. Even the most important things of all.
And something we’re going to learn: a randomized trial on community member peer review that may be past its halfway mark
If there was a randomized trial on peer review published in 2020, I didn’t find it. But here’s a 5-year one that presumably passed its halfway point in 2020.
It’s a registered trial of community members as peer reviewers for medical journals in the US. The plan? Train 24 people how to review a manuscript, and then randomize 568 articles submitted to 2 medical journals to them (or not). It’s a 5-year trial that started in June 2018.
Which 2 journals? It doesn’t say and I couldn’t see any other report about it. Part of the goal is to see if this multi-component intervention can increase consideration of disparities. The principal investigator is Ashwini Sehgal, who’s a Co-Director of a Center for Reducing Health Disparities. Definitely one to look forward to – but it’s a shame, with so many critical questions about journal peer review still unanswered, that there aren’t several important trials being published each year. I mean, it’s not like there’s a shortage of journals or manuscripts, is there?
This is the 4th post of a series on peer review research – that started with a couple of catch-ups on peer review research milestones from 1945 to 2018:
Disclosures: I’ve had a variety of editorial roles at multiple journals across the years, including being on the editorial board of PLOS Medicine for a time, and PLOS ONE’s human ethics advisory group. I do some research on post-publication peer review, subsequent to a previous role I had, as Editor-in-Chief of PubMed Commons (a discontinued post-publication commenting system for PubMed). I am currently on the editorial board of the Drugs and Therapeutics Bulletin.