The GRADE approach is reproducible in assessing the quality of evidence of quantitative evidence syntheses

Bibliographic Details
Published in Journal of clinical epidemiology Vol. 66; no. 7; pp. 736–742.e5
Main Authors Mustafa, Reem A., Santesso, Nancy, Brozek, Jan, Akl, Elie A., Walter, Stephen D., Norman, Geoff, Kulasegaram, Mahan, Christensen, Robin, Guyatt, Gordon H., Falck-Ytter, Yngve, Chang, Stephanie, Murad, Mohammad Hassan, Vist, Gunn E., Lasserson, Toby, Gartlehner, Gerald, Shukla, Vijay, Sun, Xin, Whittington, Craig, Post, Piet N., Lang, Eddy, Thaler, Kylie, Kunnamo, Ilkka, Alenius, Heidi, Meerpohl, Joerg J., Alba, Ana C., Nevis, Immaculate F., Gentles, Stephen, Ethier, Marie-Chantal, Carrasco-Labra, Alonso, Khatib, Rasha, Nesrallah, Gihad, Kroft, Jamie, Selk, Amanda, Brignardello-Petersen, Romina, Schünemann, Holger J.
Format Journal Article
Language English
Published New York, NY Elsevier Inc 01.07.2013
Elsevier
Elsevier Limited
ISSN 0895-4356
1878-5921
DOI 10.1016/j.jclinepi.2013.02.004

Summary: We evaluated the inter-rater reliability (IRR) of assessing the quality of evidence (QoE) using the Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) approach. After completing two training exercises, participants worked independently as individual raters to assess the QoE of 16 outcomes. Raters first recorded their initial impression as a global rating, then graded the QoE following the GRADE approach. Subsequently, randomly paired raters submitted a consensus rating. Without the GRADE approach, the IRR for two individual raters was 0.31 (95% confidence interval [95% CI] = 0.21–0.42) among Health Research Methodology students (n = 10) and 0.27 (95% CI = 0.19–0.37) among GRADE working group members (n = 15). The corresponding IRR with the GRADE approach was significantly higher: 0.66 (95% CI = 0.56–0.75) and 0.72 (95% CI = 0.61–0.79), respectively. The IRR increased further with three raters (0.80 [95% CI = 0.73–0.86] and 0.74 [95% CI = 0.65–0.81]) or four raters (0.84 [95% CI = 0.78–0.89] and 0.79 [95% CI = 0.71–0.85]). The IRR did not improve when QoE was assessed through a consensus rating. Our findings suggest that use of the GRADE approach by trained individuals improves reliability compared with intuitive judgments about the QoE, and that two individual raters can reliably assess the QoE using the GRADE system.