Evaluation of risk of bias in non-randomized trial using the ROBINS-I tool: a pilot experience

Session: 

Oral session: Inclusion of non-randomized designs

Date: 

Sunday 16 September 2018 - 14:20 to 14:40

All authors in correct order:

Minozzi S1, Cinquini M2, Castellini G3, Gianola S4, Gerardi C5, Banzi R5
1 Department of Epidemiology, Lazio Regional Health Service, Rome, Italy
2 IRCCS-Istituto di Ricerche Farmacologiche Mario Negri, Milano, Italy
3 Department of Biomedical Sciences for Health, University of Milan, Milan, Italy
4 IRCCS Galeazzi Orthopedic Institute, Milan, Italy
5 IRCCS-Istituto di Ricerche Farmacologiche, Milano, Italy
Presenting author and contact person

Presenting author:

Silvia Minozzi

Abstract text
Background: The number of systematic reviews that include non-randomized studies (NRS) is increasing, so evaluating the validity of NRS is critical. The 'Risk Of Bias In Non-randomized Studies of Interventions' (ROBINS-I) tool, published in 2016, is gaining popularity. No studies have so far assessed its reliability.
Objectives: To measure the inter-rater reliability (IRR) of ROBINS-I and explore its applicability.
Methods: Taking a systematic review on influenza vaccination as a case model, we applied ROBINS-I stage 2 (definition of the target trial, confounding, co-morbidities and effect of interest). Five raters with low to medium expertise in risk of bias assessment of NRS independently read 14 cohort studies and applied ROBINS-I stage 3 to two outcomes: influenza-like illness (ILI, subjective) and laboratory-confirmed influenza (objective).
We calculated Fleiss' kappa for multiple raters for signalling questions, individual domains and overall risk of bias, after a round of discussion aimed at clarifying some critical aspects of the tool (e.g. conditional questions). We classified agreement as poor (≤ 0.00), slight (0.01 to 0.20), fair (0.21 to 0.40), moderate (0.41 to 0.60), substantial (0.61 to 0.80) or almost perfect (0.81 to 1.00). We calculated the time to complete the tool as the mean time, in minutes, that each rater spent on each study.
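As an illustration only, the agreement statistic and the classification described above can be sketched in Python. This is not the authors' analysis code: the function names `fleiss_kappa` and `classify_agreement` and the example judgements are hypothetical. The sketch assumes every study is rated by the same number of raters, and it applies the cut-offs above as upper bounds.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of subjects, each a list with one label per rater.

    Assumes every subject was rated by the same number of raters (at least two).
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(r) != n_raters for r in ratings):
        raise ValueError("each subject needs the same number of raters (>= 2)")

    counts = [Counter(r) for r in ratings]
    categories = sorted({label for r in ratings for label in r})

    # Overall proportion of all ratings that fall in each category.
    p_j = [
        sum(c[cat] for c in counts) / (n_subjects * n_raters)
        for cat in categories
    ]
    # Observed agreement among rater pairs for each subject.
    p_i = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_i) / n_subjects      # mean observed agreement
    p_e = sum(p * p for p in p_j)      # agreement expected by chance
    if p_e == 1:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_e) / (1 - p_e)


def classify_agreement(kappa):
    """Map a kappa value to the agreement labels listed in the Methods."""
    if kappa <= 0.00:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


if __name__ == "__main__":
    # Hypothetical overall judgements: 3 studies, 5 raters each (not study data).
    example = [
        ["serious", "serious", "moderate", "critical", "serious"],
        ["moderate", "serious", "moderate", "moderate", "low"],
        ["critical", "serious", "serious", "critical", "moderate"],
    ]
    kappa = fleiss_kappa(example)
    print(round(kappa, 2), classify_agreement(kappa))
```

The same function could be applied separately to each signalling question, each domain and the overall judgement, giving one kappa value per item.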
Results: Six studies evaluated ILI, four evaluated laboratory-confirmed influenza, and four evaluated both outcomes. Table 1 reports the IRR: agreement was poor to slight for all the individual domains. IRR for the overall risk of bias was slight for the subjective outcome, ILI (0.24, SD 0.07), and poor for the objective outcome, laboratory-confirmed influenza (-0.06, SD 0.08). The mean time to complete ROBINS-I was 36.2 minutes (SD 12.9).
Conclusions: Agreement ranged from poor to slight. We found the tool difficult to apply, mainly because of the ambiguity of conditional questions, i.e. when the answer to trigger questions is 'no information'. Unclear reporting in several studies contributed to the poor agreement. Our findings are limited by the small sample and by the use of studies that were not adequate for assessing bias due to post-intervention deviations. We will present results on 15 additional NRS in which adherence to the assigned interventions may introduce bias.
Patient or healthcare consumer involvement: The project focuses on methods to assess risk of bias, so we could not involve consumers.

Relevance to patients and consumers: 

Evaluating the validity of NRS is relevant for healthcare decision making, which is often informed by studies other than randomized trials. NRS are critical to many areas of healthcare evaluation because they can provide evidence about the effectiveness and safety of interventions delivered in settings and to populations typical of real-world practice. Evidence generated by NRS can be of critical importance for assessing patient-centred health outcomes (i.e. effectiveness and adverse effects) under clinical practice conditions.