What’s Wrong With How We Talk About Preprints?

October 25, 2024 Hilda Bastian Science Communication

A steampunk publishing machine, called the Journal Spin-a-Tron, with a tagline, "Puffing out science hype since 1894." (Cartoon by Hilda Bastian.)

“When you see the term ‘preprint’ in a scientific news article, what do you interpret it to mean?”

That’s the key question Alice Fleerackers and colleagues posed to their study participants (2024). There were 864 of them, recruited from a US research service. There was also a group of university students, but I’m not discussing that sample in this post.

Around half of the participants from the general population had a degree, or some tertiary education. They had been randomized to read a news article about a piece of research from a preprint server. The article either disclosed that it was based on a preprint (with an explanation of what this meant), or did not mention the research’s status.

After coding the participants’ answers, the researchers concluded that only 10% of the answers were technically accurate. Those who had read the versions including an explanation that the research came from an un-peer-reviewed preprint that wasn’t published in a journal “were no more likely to provide a technically accurate” answer. (All they had to say to score a “technically accurate” rating was that the work hadn’t been peer-reviewed and/or that it wasn’t from a journal.)

I think the most worrisome part of their findings comes from the misunderstandings. The proportion of people who either didn’t answer, or answered that they didn’t know, was only around 10%. So the overwhelming majority of the group misunderstood the term, sometimes dramatically. Misunderstandings derived in part from the word “preprint” itself, in part from the short explanation, and in part from misconceptions about journals and peer review.

For example, one of the dominant themes among answers designated inaccurate was that the source was itself a news story (not a scientific manuscript). Another was that preprint meant it was “complete and credible” information, with an assumption that it means a journal article appearing before a physical version is printed.

In their paper’s introduction, Fleerackers and co-authors wrote that the evidence generally suggests that “the terms preprint and peer review have little meaning to public audiences as standalone terms.” Their study adds more weight to that conclusion.

That peer review is so poorly understood is clearly a major reason that short explanations about preprints are of limited use. I think that extends to the way even many academics talk about peer review and preprints – perhaps especially when the general public is the intended audience.

“Peer-reviewed” is used as an imprimatur for legitimizing and giving authority to scientific publication. Terms like “validated” and “confirmed” are used to describe peer-reviewed articles. Those words carry more weight than peer review can bear, though. Validation and confirmation of results come from other studies, not from the opinions of a couple of peer reviewers and an editor. medRxiv, the leading platform for medical preprints, describes peer-reviewed as “certified.” I don’t know if that implies something in the community that reflects what journals are really doing, but at least it’s not a term with another formal meaning in scientific work.

Here’s another typical example, where a professor of chemistry tells a general readership that journal peer review is “rigorous quality control by experts,” and publication is a “seal of approval.” Peer review is a quality assurance process, but it’s a highly variable one. The scientific basis for it – even at its best – is depressingly weak. (I have quite a few posts on this, tagged here.) The “seal of approval” framing is odd, too, given that peer reviewers don’t even have to have recommended publication for editors to publish a manuscript anyway.

Definitions of preprints tend to stress the peer review issue, too. For example, the US National Library of Medicine describes the publication type purely in that frame: “Scientific manuscript made available prior to peer review.” Preprints are preprints because they’re published on a platform for preprints: The term isn’t synonymous with un-peer-reviewed. And the screening at good preprint servers is better quality control than exists at the worst journals.

Authors can, and often do, upload new versions of a preprint after responses to the first version, or after journal peer review. There are publication models that formalize this structure – like F1000Research and Wellcome Open Research. There are journals that publish reviewed preprints, like eLife. There are also publication models like Peer Community In that explicitly de-couple peer review from journal publication.

Where does this leave us? I understand why people think it’s important to point out that a particular publication is a preprint, not a journal article. However, somewhere along the line, I stopped worrying about routinely differentiating preprints from other reports. Preprints became normal. The quality of so many of them is so high, and the quality of so many journal reports of research so low, that I don’t think publication status is a reliable distinction.

The Fleerackers paper includes several conclusions. However, they all relate to communicating preprint status in news stories, or to a discussion of a research agenda. The relative uselessness of a caveat about publication status leads me in a different direction. It underscores that the onus for not misleading the public lies in vetting a preprint before publishing a news story about it. Cautioning that it’s a preprint doesn’t really get anyone off the hook.

You can keep up with my work at my newsletter, Living With Evidence. And I’m active on Mastodon: @hildabast@mastodon.online and on Bluesky @hildabast.bsky.social

~~~~

Disclosures: My scientific publications include journal articles as well as preprints at bioRxiv and medRxiv. I have been an editor or board/committee member at several medical journals in the past. From 2011 to 2018, I worked on projects for PubMed, at the National Center for Biotechnology Information (NCBI) (part of the US National Library of Medicine at the National Institutes of Health).

The cartoon is my own (CC BY-NC-ND license). (More cartoons at Statistically Funny.)