Who You Gonna Search? Engine News for Systematic Reviewers

October 31, 2019 Hilda Bastian Science Communication

Isn’t it almost always the way? I had barely finished my recent post about alternatives to Google Scholar, it seems to me, before news for systematic reviewers landed about Google Scholar and Microsoft Academic – one of the alternatives I wrote about.

28 search systems and 27 criteria. That’s what Michael Gusenbauer and Neal Haddaway put through a grinder, to find out which search systems are suitable for systematic reviewing. Not many, was the answer, and above all, not Google Scholar. It is, they concluded, “inappropriate as principal search system”.

By the way, that’s the conclusion Wichor Bramer and colleagues came to a while ago in a couple of studies. Those studies are included in Bramer’s newly minted PhD thesis. (Congratulations, Dr Bramer!)

Back to the new one by Gusenbauer and Haddaway. These are the 28 systems they studied:

List of 28 search systems (click for the open access PDF that lists them). From Gusenbauer & Haddaway (2019).

Their criteria covered a lot of territory, putting the systems through their paces on many aspects of quality, like coverage, reproducibility of results at different times, whether they serve up bulk downloads, and whether or not they are open access. They wrote:

Each of these criteria assessed a single functionality of the search system. Jointly, these tests showed to what degree a search system was capable of searching effectively and efficiently…

If you suffer frequently from the pain of database idiosyncrasies and hitting download limits, this paper is for you – I felt seen! (Although also occasionally not: they concluded the Cochrane Library allows full download options. My pain this week is the download limits for trials in there, so that statement got a growl.)

Here’s an example of why Google Scholar came out so badly in this study:

In our sample of 28 academic search systems, all but two – Google Scholar, and WorldWideScience – were reproducible in terms of reporting identical results for repeated identical queries. While WorldWideScience failed to deliver replicable results at all times, Google Scholar failed to deliver them only during certain periods: sometimes search results were replicable with two consecutive queries; then with a third query or with queries after some queries in between, they were no longer replicable and the results set differed in a way not explainable by natural database growth…

If a system such as Google Scholar fails to deliver retrieval capabilities that allow a reviewer to search systematically with high levels of recall, precision, transparency and reproducibility, its coverage is irrelevant for query-based search.
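To make that reproducibility criterion concrete, here is a minimal sketch of the idea of "identical results for repeated identical queries". It is not the authors' test procedure. The function `is_reproducible` and the two stand-in search systems (`stable_search`, `DriftingSearch`) are hypothetical, invented purely for illustration, and no real search engine is queried.

```python
from typing import Callable, Sequence


def is_reproducible(search: Callable[[str], Sequence[str]],
                    query: str,
                    repeats: int = 3) -> bool:
    """Run the same query `repeats` times and report whether every run
    returned exactly the same records in the same order."""
    runs = [tuple(search(query)) for _ in range(repeats)]
    return all(run == runs[0] for run in runs)


# Hypothetical stand-in: always returns the same records.
def stable_search(query: str) -> Sequence[str]:
    return ["rec-1", "rec-2", "rec-3"]


# Hypothetical stand-in: consistent for two calls, then drifts.
class DriftingSearch:
    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, query: str) -> Sequence[str]:
        self.calls += 1
        if self.calls <= 2:
            return ["rec-1", "rec-2", "rec-3"]
        return ["rec-1", "rec-4"]


print(is_reproducible(stable_search, "systematic review"))               # True
print(is_reproducible(DriftingSearch(), "systematic review", repeats=2))  # True
print(is_reproducible(DriftingSearch(), "systematic review", repeats=3))  # False
```

The drifting stand-in passes with two repeats and fails with three, which is why a single repeated query is not enough to call a system reproducible. A real check would also need to allow for new records added between runs, which this sketch ignores.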

The ability to search for citations was one of the things Gusenbauer and Haddaway tested. And that brings us to the second item of breaking news this month. What if there was a really good way to find articles closely related to ones you’ve got, on a massive scale, and get an alert when new ones appear, without a lot of chaff to scratch through?

Here’s the hot-off-the-press abstract Ian Shemilt sent me. It’s a presentation by him and James Thomas, and he sent me the PowerPoints. (Thanks, Ian!) This is based on Microsoft Academic. Microsoft has created Microsoft Academic Graph (MAG), which links up the citations of the 200 million+ scientific publications indexed in the search engine. MAG is updated every 10 days, and available under a Creative Commons license. With Wellcome Trust funding, Shemilt and colleagues created a browser which is going to be released in beta version in EPPI-Reviewer. And that’s not free, unfortunately, although if you want to use it for a Cochrane or Campbell systematic review, you’re in luck – there will be access for that.
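For a rough feel of what "linking up the citations" makes possible, here is a toy sketch that finds the papers one citation step away from a set of articles you already have. It makes no claim about MAG's actual data format or about how the EPPI-Reviewer browser works. The function `citation_neighbours` and the paper IDs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def citation_neighbours(cites: Dict[str, Set[str]],
                        seeds: Iterable[str]) -> Set[str]:
    """Return papers that a seed cites, or that cite a seed,
    excluding the seeds themselves."""
    seed_set = set(seeds)

    # Invert the graph so we can also look up "who cites this paper?"
    cited_by: Dict[str, Set[str]] = defaultdict(set)
    for paper, references in cites.items():
        for ref in references:
            cited_by[ref].add(paper)

    related: Set[str] = set()
    for seed in seed_set:
        related |= cites.get(seed, set())     # backward: what the seed cites
        related |= cited_by.get(seed, set())  # forward: what cites the seed
    return related - seed_set


# Toy graph with made-up IDs: A cites B and C; D cites A; E cites F.
toy_graph = {"A": {"B", "C"}, "D": {"A"}, "E": {"F"}}
print(sorted(citation_neighbours(toy_graph, ["A"])))  # ['B', 'C', 'D']
```

Re-running a lookup like this whenever the graph is refreshed, and reporting only the neighbours not seen before, is one simple way an alert for new related articles could work in principle.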

It’s early days, but if this is easy to use and it keeps performing as it has in this first report, then this could make review updates, study registers, and more, much more efficient. In the simulation Shemilt and Thomas report, they cut the publication part of a 240-hour task down to 67 hours, although it left the searching of trials registers and conference abstracts still to do. So they’re trying to get Microsoft Academic’s coverage expanded.

If it were put through the Gusenbauer and Haddaway grinder, it would topple at the open access criterion. Here’s hoping it’s not going to stay that way.

In my recent post, I was surprised to see that Microsoft Academic has come a long way, with far better functionality than Google Scholar, and growing coverage. But there’s the same core problem, isn’t there, of long-term security and accessibility. Still, public resources aren’t guaranteed to last either. It’s easier to get excited about another alternative, though – the open source Lens. When the researchers have worked out how to enable us to do amazing things, we need those capabilities to spread to where we can all use, and safely rely on, them.

~~~~

The cartoons are my own (CC BY-NC-ND license). (More cartoons at Statistically Funny and on Tumblr.)
