Crowd-sourcing structured metadata to improve literature search efficiency

Session: 

Oral session: Searching and information retrieval (3)

Date: 

Tuesday 18 September 2018 - 14:00 to 14:20

All authors in correct order:

Kaiser K1, Sweeney H1, Brown A2
1 School of Public Health, University of Alabama at Birmingham, USA
2 School of Public Health-Bloomington, Indiana University-Bloomington, USA
Presenting author and contact person

Presenting author:

Kathryn Kaiser

Abstract text
Background:
Scientific literature search tools are often designed to present matches according to a relevance model of information retrieval. In MEDLINE, the Medical Subject Headings (MeSH) and keyword system was implemented in the 1960s, when the database held around 100,000 articles. Now more than 1 million articles are added each year, and the accumulated total exceeds 40 million. With relevance-based indexing, more than 97% of the articles retrieved for systematic reviews (SRs) are not relevant.
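To make the screening burden concrete, the 97% figure can be turned into a rough "number needed to screen". The short Python sketch below is illustrative only: the function name is hypothetical, and it assumes that the share of relevant records is simply one minus the non-relevant share reported above.

```python
def number_needed_to_screen(precision: float) -> float:
    """Average number of retrieved records screened per relevant record found.

    Hypothetical helper for illustration; precision must lie in (0, 1].
    """
    if not 0 < precision <= 1:
        raise ValueError("precision must be in the interval (0, 1]")
    return 1 / precision


# Assumption: 97% of retrieved records are non-relevant (figure from the text),
# so at most about 3% of retrieved records are relevant.
non_relevant_share = 0.97
search_precision = 1 - non_relevant_share

nns = number_needed_to_screen(search_precision)
print(round(nns, 1))  # about 33 records screened per relevant record
```

Because the abstract reports "more than 97%", this estimate of roughly 33 records per relevant article is a lower bound on the screening effort.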

Objectives:
We aimed to quantify the percentage improvement in study identification achieved by creating structured metadata for the two components that most commonly cause studies to be excluded from systematic reviews: population and study design.

Methods:
We created a web-based portal that allows structured metadata to be crowd-sourced. At least two coders answered twelve questions describing the study characteristics of each article published in one year of a single journal, Obesity (2016; N = 365). The articles covered a wide array of study types and designs, from human epidemiological studies to basic science articles focused on a variety of species. Once the metadata were created, we applied structured queries to identify articles with particular designs and populations, and compared the search precision with that of the standard methods available in PubMed.
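The comparison described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the portal's actual implementation: the record fields, category labels, function names, and the four toy articles are all hypothetical, and the "keyword" result set is invented to stand in for a conventional search.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArticleMetadata:
    """Hypothetical structured record for one coded article."""
    article_id: str
    population: str    # assumed labels, e.g. "human", "mouse"
    study_design: str  # assumed labels, e.g. "rct", "cohort", "basic_science"


def structured_query(records, population, study_design):
    """Return the IDs of articles whose coded fields match both criteria exactly."""
    return {
        r.article_id
        for r in records
        if r.population == population and r.study_design == study_design
    }


def precision(retrieved, relevant):
    """Fraction of retrieved articles that are relevant (0.0 if nothing retrieved)."""
    if not retrieved:
        return 0.0
    return len(retrieved & relevant) / len(retrieved)


# Toy, made-up data for illustration only.
records = [
    ArticleMetadata("a1", "human", "rct"),
    ArticleMetadata("a2", "human", "cohort"),
    ArticleMetadata("a3", "mouse", "basic_science"),
    ArticleMetadata("a4", "human", "rct"),
]
relevant = {"a1", "a4"}                  # articles a reviewer would include
keyword_hits = {"a1", "a2", "a3", "a4"}  # hypothetical conventional search result
structured_hits = structured_query(records, "human", "rct")

print(precision(keyword_hits, relevant))     # 0.5
print(precision(structured_hits, relevant))  # 1.0
```

Exact matching on coded fields is the simplest possible design; a real system would also need to handle disagreement between coders and articles that fit more than one category.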

Results:
Using this approach, people with little training or expertise could code many articles in less than 10 minutes. Across the study categories tested, the context-based method increased search precision by as much as 50% compared with the standard approach (which had 19% false negatives).

Conclusions:
We propose a transition to all articles having structured metadata descriptors available upon publication, with expansion to the full PICOS structure (Population, Intervention, Comparison, Outcomes, Study Design). Our tests show that full metadata can be provided in under 15 minutes. With this persistent metadata, all consumers of scientific literature could search with significantly higher precision and efficiency.

Patient/healthcare consumer involvement:
Lay people can help improve the efficiency of literature searches, and may also find it easier to curate personal databases on topics of interest when SRs are lacking or out of date.

Relevance to patients and consumers: 

This project reports the results of a pilot test of creating structured metadata to improve literature search efficiency. In this initial test, we aimed to quantify the percent reduction in the information retrieval burden for people who wish to identify journal articles on a specific study design or population of interest. With easy-to-use, web-based tools, patient and consumer groups can create custom databases of evidence to catalog treatment options and their relative strengths as reported in clinical studies.