This post is part of a series that I started in 2019, recapping the scientific results on peer review to that point. Since then, I’ve done an annual pick of the main areas where I think research has reached the stage of telling us something new, or tipped the balance on something that had been very unclear.
For something as critical to science as editorial peer review, it has remarkably little scientific foundation. This year, though, there were reports from 5 randomized trials – which I think must be an all-time record. The Peer Review Congress tipped the scales heavily with 3 trials – and that Congress only comes around once every 4 years. Still, we can’t usually count on there being even one in a year.
One of those trials didn’t make my list, though. It was a trial by Jürgen Huber and colleagues, with results released in September. It got a lot of attention, with the claim that their study proved peer reviewers’ bias towards well-known authors is so great that “double-anonymization is a minimum requirement for an unbiased review process.” I don’t think it proved much at all, and I wrote a blog post about it. Meanwhile, here’s my pick for the top 5 advances in knowledge this year – plus a short list of some other research that might interest you.
- A randomized trial showed carefully selected, trained, and supported community members can be good medical journal peer reviewers, and improve consideration of health disparity and sociocultural issues.
- Given the choice between having their names concealed or revealed in peer review at a journal, most authors will opt for revealing their names. The gender of corresponding authors doesn’t seem to affect the choice. It’s still not clear, though, if choosing to stay unnamed affects authors’ chances of being rejected.
- Prompting peer reviewers to look for particular quality issues is starting to look like a dead end – at least for substantial improvements in reports of biomedical research.
- A specialized review comparing study registrations and manuscripts could reduce undisclosed discrepancies between plans and study reports. It takes roughly half an hour to an hour to do, with additional peer review time needed from editors and authors.
- Although at least as many medical graduates have been women as men for over 20 years in the UK and US, the proportion of women peer reviewers at medical journals may not have been growing substantially. Women editors may be likely to recruit more women as peer reviewers, but they remain in a minority too. Could the software editors use to find potential peer reviewers be further entrenching the biased patterns of the past?
A randomized trial showed carefully selected, trained, and supported community members can be good medical journal peer reviewers, and improve consideration of health disparity and sociocultural issues.
This paper is behind a paywall, so I’m going to describe it in detail, given the subject. (The research was NIH-funded, so it should be available in full text at PubMed Central a year after publication – so you can check back here by October 2023.)
I’d listed this trial as on the horizon in my 2021 post in this series. Results arrived in September. It was registered with Ashwini Sehgal as principal investigator – a co-director of a Center for Reducing Health Disparities – with addressing health disparities mentioned in the registration. So I was hopeful about the potential profile of the community members recruited. The plan was for 568 manuscripts at 2 medical journals to be randomized to one of 24 trained people. They hit that recruitment mark and then a bit: 578 manuscripts and 28 people. And the 28 people were almost all women (only 2 men), with over half the group Black. The 2 journals were Annals of Internal Medicine and Annals of Family Medicine.
Participants were recruited via flyers at libraries and community centers in Cleveland, Ohio. They had to have at least a high school education, be proficient in English, and have access to a computer. In addition, they had to have experience as either a patient or carer of at least one of a list of common health conditions – and not to have a healthcare or research background. People were then given a test: 59 people did that test, and 45 of them did well enough to be invited to join the training on research types and peer review.
To enter the trial, you had to complete the training – 6 sessions of 90 minutes each – and write practice reviews the researchers judged to be effective. Of the 45 who started, 12 didn’t finish the 6-week course, and of those who did, 5 weren’t writing reviews considered good enough to join the trial. (The authors report they didn’t differ in age, educational status, race, or gender.) So out of the 59 eligible people who started out, nearly half didn’t join the trial. Of the 28 who started in the trial, 12 dropped out along the way because of competing demands on their time. Unfortunately, the demographics of the participants who provided the final reviews aren’t reported.
The randomized manuscripts were full-length primary research reports (not reviews) that weren’t focused on statistics (e.g. prediction models), medical practice or education, or a rare condition. They were randomized to the journal’s usual scientific review alone, or that plus a community reviewer. For the manuscripts randomized to the intervention, a reviewer with relevant interests was chosen by study staff. The deadlines for their reviews were a week early, and in that time, study staff reviewed their drafts and provided feedback – a process similar to the one used by PCORI. The participants were paid US $100 for each completed review. Editors knew which reviews came from the community reviewers – they were all marked, and the study authors say it was too obvious which they were to keep the editors masked.
Unfortunately, there are a couple of fundamental problems in the trial’s conduct and reporting. One of the journals (Annals of Internal Medicine) required author consent to participation in the trial, and they only seem to have requested that permission after randomization – and nearly a third refused. You can’t work out how this played out, and that leads to the second of the fundamental problems – key data aren’t reported. I wonder if the missing data were meant to be in a CONSORT flow diagram – a basic part of adequate trial reporting. It’s possibly an error at the journal: At one point, an appendix with data is mentioned, but there isn’t one. (The only additional information is a CONSORT checklist, and those don’t include data.) (I’ll chase this up.)
The community reviewers assessed the manuscripts more favorably than the professionals: Their mean scores for recommending publication were 2.7 compared to 2.2 by the professionals. The editors rated the professionals’ reviews a little more highly (3.1 for community reviews versus 3.3 for professional). Editors found the community perspectives the most useful part of the reviews (e.g. socioeconomic factors potentially affecting adherence).
Community reviewers were more likely than the professionals to address study design, participant selection, and importance of the research. There were several themes they addressed very often: diversity of the study participants, relevance of the research to patients and the community, cultural and social considerations, and implementation by patients and the community.
Community reviewers’ comments were integrated into 64 of the 66 papers published in the intervention group – 186 comments in all. (The number 66 is only mentioned in the abstract, so there’s some confusion on this when you get to the results.) The authors didn’t count listing additional demographic data in that tally unless there was also discussion, which seems to me to under-count their influence. Unfortunately, the study authors don’t report the same for the professionals’ reviews. For the 4 themes, they do report that the 64 papers with community reviewer feedback were substantially more likely to address these issues than the 54 published papers from the control group.
This is an important addition to the research base on consumer and community participation generally, as well as peer review. No word, though, as far as I’ve seen, on whether either of the journals involved are going to act on these results. (One of them published the article.)
Anne Huml and colleagues (2022). Community members as reviewers of medical journal manuscripts: A randomized controlled trial.
Given the choice between having their names concealed or revealed in peer review at a journal, most authors will opt for revealing their names. The gender of corresponding authors doesn’t seem to affect the choice. It’s still not clear, though, if choosing to stay unnamed affects authors’ chances of being rejected.
Many scientists firmly believe that concealing the names of authors reduces bias based on gender, race, geographical or institutional home, prestige and seniority, etc. Some journals that routinely conceal the names of reviewers (“single-blind”) offer authors the option to have their names concealed as well (“double-blind”). Some people think giving the authors the choice is the best of all worlds. Others are concerned that it’s risky, since we don’t know whether peer reviewers make negative judgments about authors who want their names concealed – and reviewer bias about which authors reveal their names may be an important factor.
I know of 2 previous studies on this. The first was a small study in 2 physics journals in 2017 [Harris 2018], where 20% chose to have their names concealed. The second, from 25 journals in the Nature group [McGillivray 2018], was large – more than 106,000 manuscripts – and only 12% of authors opted for being unnamed. Going by presumed gender of corresponding authors, gender wasn’t a factor in the choice, but coming from a less prestigious country or institution was – though a large majority of authors still chose to be named.
The new study has so far only been reported as a conference poster, and it’s not as strong a study as the one from the Nature group. And in this new study, the lack of data about the overall experience is frustrating. It’s from a single journal, Pediatrics, and the authors focus on US-based authors only, for reasons that aren’t explained. The way they went about this is perplexing. They were only interested in research studies, of which 2,720 were submitted in the year they were studying (from April 2020). The authors of 18.6% of those chose to have their names concealed from peer reviewers – a little over 500 articles.
To get a sample to compare single- to double-concealment, though, they pulled 150 of each from all the articles, not just the research ones. And then they excluded the non-research articles from the random samples, along with all those that had no US-based author. They don’t say how many were excluded for which reason, though. Out of 300, they ended up with 96 manuscripts with the authors’ names concealed versus 73 with the authors’ names revealed.
The rate of rejection was 95% for the group with the authors’ names concealed and 86% for the group with named authors. Though that sounds like a fairly big difference, the sample was relatively small, and this difference wasn’t statistically significant. Based on the gender and seniority of the corresponding authors, the groups were similar. They had expected women and less senior people to be more likely to choose concealed names.
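For a rough sense of why that 9-point gap doesn’t reach statistical significance at this sample size, here’s a quick two-sided Fisher exact test. The counts are my back-calculation from the reported rates (91 of 96 concealed-name manuscripts rejected, 63 of 73 named ones) – a reconstruction for illustration, not figures from the poster:

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test on a 2x2 table [[a, b], [c, d]]:
    sum the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def p_table(x):  # probability of a table with x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = p_table(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(p_table(x) for x in range(lo, hi + 1)
               if p_table(x) <= p_obs * (1 + 1e-9))

# Rejected vs not rejected, concealed names vs revealed names
# (counts inferred from the 95% and 86% rejection rates):
p = fisher_exact_two_sided(91, 5, 63, 10)
print(round(p, 3))  # comes out above the conventional 0.05 threshold
```

With roughly 170 manuscripts and only 15 non-rejections in total, even a 9-point difference in rejection rates sits on the wrong side of the conventional significance line – which is about the sample size, not proof that the choice makes no difference.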
From these 3 studies, it seems that in the context of a journal offering a choice, most authors choose to be named. That doesn’t take into account authors who choose journals that never reveal names, though. We don’t know if having already presented results at a conference or in a preprint influences these choices. And we still don’t know if acceptance of the manuscripts would have been more or less likely had the authors been named, in the context of a journal where it’s the authors’ choice – and there’s no insight here on how peer reviewers view authors choosing to have their names concealed.
In the trials of revealed or concealed author names, authors were randomized, and so the peer reviewers didn’t think the authors had chosen to withhold their names. In that context, I think the evidence suggests it might make no difference to the rejection rate. When I was writing my post on the Huber trial, I hunted for randomized trials of concealing or revealing names in peer reviews – including in 2 new systematic reviews published in 2022.
Counting the Huber trial, I found 9 published trials – and only 1 of those included peer review at more than a single journal. The larger trials tended to find no effect, while the small ones sometimes found a small effect, though generally only among people who hadn’t guessed who the unnamed people were – and a high proportion do guess, especially when the pool of people working on a particular subject isn’t large.
I think both of those new systematic reviews are problematic, as they place considerable weight on studies with major scientific problems that make them susceptible to bias. One review compared concealing just peer reviewers’ names with concealing both theirs and the authors’ [Ucci 2022]. They concluded that 52% of people guessed the names of the concealed people, and that concealing the names of both authors and peer reviewers might lower the acceptance rate.
The second review studied concealing both parties’ names and gender bias [Kern-Goldberger 2022]. Those authors include highly biased studies, and when they conclude “there is reasonable evidence that gender bias may exist in scientific publishing and that double-blinding can mitigate its impact,” they are leaning on the more biased studies.
Meredith Campbell Joseph and colleagues (2022). Preference and characteristics of US-based authors for single- and double-anonymous peer review.
Prompting peer reviewers to look for particular quality issues is starting to look like a dead end – at least for substantial improvements in reports of biomedical research.
[3 randomized trials]
Before we get to the 3 new trials, here are 3 previous trials of prompting peer reviewers I know of:
- 2 trials by the same group, at a single medical journal, found no improvement in manuscript quality after providing peer reviewers with reporting guidelines. [Cobo 2007 and 2011; Source: 2016 systematic review; a 2021 systematic review did not identify further prompting studies.]
- 1 trial of short instructions on reducing spin in abstracts at a single medical journal didn’t reduce abstract spin [Ghannad 2021; Source: my things we learned in 2021 post. I wrote more on this topic in 2022.]
Reporting guidelines are the intervention tested in 2 trials released in 2022, though this time the intervention focused on the 10 most important and most poorly reported items from the guidelines. Reporting guidelines have a straightforward goal – preventing gaps in critical information about how a study was done. (You can read more about them here, and how they evolved here.)
The first of the new pair of trials addressed protocols of clinical trials and the SPIRIT guidelines; the second, reports of clinical trial results and the CONSORT guidelines. In both trials, the completeness of research reporting wasn’t improved. Looking at the data they presented, I wouldn’t rule this line out completely. Perhaps narrowing in on fewer items and explaining them better would help? That said, people who are suitable peer reviewers for clinical trials should know all this. Sigh.
The third new trial was even more targeted, with the researchers even doing some of the heavy lifting for the peer reviewers. The trial studied clusters of eligible trials submitted to 13 medical journals in a year. There ended up being 243 control manuscripts and 176 in the intervention group. Roughly 40% were eventually published – 173 trials in all.
The goal was to reduce the discrepancy between the outcomes specified in trial registrations, and the outcomes the trialists chose to include in their manuscripts. Peer reviewers didn’t even have to look the trial registrations up – they were provided with the information for the specific trial they were asked to review. Still, no luck.
Meanwhile, a systematic review in 2022 showed that peer review isn’t making much difference to poorly reported research. Those authors compared the preprints and subsequently published papers of studies they had included in a living systematic review of Covid prediction models. The reporting in the preprints was lousy, and the published papers were only trivially less so.
Benjamin Speich and colleagues (2022) [abstract only]. Reminding peer reviewers of the most important reporting guideline items to improve completeness in published articles: Primary results of 2 randomized controlled trials.
Christopher Jones and colleagues (2022). Peer reviewed evaluation of registered end-points of randomised trials (the PRE-REPORT study): A stepped wedge, cluster-randomised trial.
A specialized review comparing study registrations and manuscripts could reduce undisclosed discrepancies between plans and study reports. It takes roughly half an hour to an hour to do, with additional peer review time needed from editors and authors.
One of the only types of peer review we can be sure increases the quality of manuscripts is a form of specialized review – a statistical review. We already saw in the previous section that simply prompting peer reviewers to check for discrepancies between study registration plans and what’s reported in a manuscript isn’t enough to improve reporting. Perhaps dedicated reviewers given this task could make a difference.
This feasibility study is an interesting read. The detail about the twists, turns, barriers, and rejections they had to navigate gives great insight into why it’s so hard to get studies of journal operations done – with an added layer of pain from trying to do it in the pandemic. For people who are in the business of research, it’s remarkable how many editors and authors sure don’t welcome being studied themselves!
The authors report on their process of refining their guidelines for doing the review, the inconsistencies in people’s reviews, and some perspectives of editors and authors. Five of the 13 editors involved in this feasibility study responded to a questionnaire. They all said it would be feasible to implement, and they’d be interested “if they were provided with discrepancy reviewers”.
That’s no small “if”. The intervention this group has developed is like pulling the work meta-scientists do forward into the peer review process, instead of methodically assessing the extent of problems in published papers – a form of activist meta-science, studying a problem and trying to fix it at the same time. An early-intervention version, of sorts, of Ben Goldacre’s COMPare project.
The authors concluded it’s worth putting this intervention to the test, and they describe the trial they recommend. That would be interesting to see, even though there are questions about how you scale this up and get it adopted at least at the journals that publish a lot of pre-registered clinical trials and other studies.
Robert Thibault and colleagues (2022). Discrepancy review: A feasibility study of a novel peer review intervention to reduce undisclosed discrepancies between registrations and publications.
Although at least as many medical graduates have been women as men for more than 20 years in the UK and US, the proportion of women peer reviewers at medical journals may not have been growing substantially. Women editors may be likely to recruit more women as peer reviewers, but they remain in a minority too. Could the software editors use to find potential peer reviewers be further entrenching the biased patterns of the past?
Women have been about half of the medical profession in the UK and US for more than 20 years. A new study of women as peer reviewers and editors at 47 BMJ journals in 2020 also looked at some time trend data for BMJ journals, and 2 US journals (JAMA and NEJM). There was some growth in the proportion of women peer reviewers, but that seems to have stalled at the BMJ journals and NEJM.
Women’s representation among peer reviewers across the board at BMJ was 30%, and tended to be lower at more prestigious journals. The highest rate was 50%. (Gender of peer reviewers was assessed as binary using software on names.)
The proportion of women editors was 33% at these journals. The number of editors ranged from 3 to 39, with the proportion of women ranging from 0 to 88%. The authors concluded that there was an association between women editors or a woman editor-in-chief and a higher proportion of women peer reviewers. They didn’t explore whether the medical specialty was a factor here, but looking at the table of data, it didn’t seem like a major one.
As if the rate being stagnant isn’t depressing enough, according to a second study at 21 BMJ journals [abstract only], women’s peer reviewing activity took a hit in the pandemic – though we don’t know if it’s still going backwards. And the authors of the first study also say we need to watch out for the impact of changes in publishing, like more open peer review, on diversity.
So what now? Women’s under-representation, argued the authors of the study of women as editors and peer reviewers, “may be both a symptom and a cause of broader under-representation in senior positions in academia”, and it could be having serious implications for medical research. They point to several issues that journals need to be acting on. For example, they write, many journals use software based on databases of authors to find matches for manuscripts. They suggest the algorithms used in that software by definition replicate, or perhaps even worsen, existing biases. Journals need policies on gender, and so they need adequate representation on editorial boards as well.
Ana-Catarina Pinho-Gomes and colleagues (2022). Cross-sectional study of the relationship between women’s representation among editors and peer reviewers in journals of the British Medical Journal Publishing Group.
Khaoula Ben Messaoud and colleagues (2022) [abstract only]. Women’s responses to peer review invitations by 21 biomedical journals prior to and during the COVID-19 pandemic.
Other interesting studies
- The journal research “Olympics” was in 2022: The Ninth International Congress on Peer Review and Scientific Publication. Abstracts for presentations and posters, as well as slides and videos for presentations, are online here.
- A trial that randomized group discussions to be kicked off by people who were positive or negative about a manuscript didn’t affect the outcome – there was, wrote the authors, no sign of a herding effect. The context: submissions for a computer science conference [abstract only].
- In neuroscience articles in PLOS ONE, author-suggested peer reviewers rated manuscripts more favorably.
- A study used machine-learning to study journal impact factor and 10,000 peer review reports in medical and life sciences journals. The authors concluded: “Differences were modest and variability high, indicating that the JIF is a bad predictor for the quality of peer review of an individual manuscript.” Another study used machine-learning to study over 1.3m peer review reports from 740 Elsevier journals.
- A group analyzed the content of the corpus of peer review reports from Royal Society journals, publishing a pair of papers in August and September.
- In March, Nature‘s editors announced that nearly half of their authors were agreeing to have anonymous versions of their papers’ peer review reports published. In October, they announced that all papers submitted from November 2022 would have peer review reports (signed or anonymous) and authors’ responses published.
You can keep up with my work via my free newsletter, Living With Evidence.
This is the 6th post in a series on peer review research, which started with a couple of catch-ups on peer review research milestones from 1945 to 2018:
Disclosures: I know at least one author of 2 of the papers or abstracts I selected this year (Thibault, Speich). I’ve had a variety of editorial roles at multiple journals across the years, including having been a member of the ethics committee of the BMJ, the editorial boards of PLOS Medicine and The Drugs and Therapeutics Bulletin, and PLOS ONE‘s human ethics advisory group. I wrote a chapter of the second edition of the BMJ‘s book, Peer Review in Health Sciences. I have done research on post-publication peer review, subsequent to a previous role I had, as Editor-in-Chief of PubMed Commons (a discontinued post-publication commenting system for PubMed). I have been advising on some controversial issues for The Cochrane Library, a systematic review journal which I helped establish, and for which I was an editor for several years. I was a longtime health consumer advocate in health research, and designed and led many 1- or 2-day training workshops internationally for potential consumer peer reviewers for Cochrane.