Performance of InSilicoVA for assigning causes of death to verbal autopsies: multisite validation study using clinical diagnostic gold standards

Background Recently, a new algorithm for automatic computer certification of verbal autopsy data named InSilicoVA was published. The authors presented their algorithm as a statistical method and assessed its performance using a single set of model predictors and one age group. Methods We perform a s...

Full description

Saved in:
Bibliographic Details
Published inBMC medicine Vol. 16; no. 1; pp. 56 - 11
Main Authors Flaxman, Abraham D., Joseph, Jonathan C., Murray, Christopher J. L., Riley, Ian Douglas, Lopez, Alan D.
Format Journal Article
LanguageEnglish
Published London BioMed Central 19.04.2018
BioMed Central Ltd
BMC
Subjects
Online AccessGet full text
ISSN1741-7015
1741-7015
DOI10.1186/s12916-018-1039-1

Cover

More Information
Summary:Background Recently, a new algorithm for automatic computer certification of verbal autopsy data named InSilicoVA was published. The authors presented their algorithm as a statistical method and assessed its performance using a single set of model predictors and one age group. Methods We perform a standard procedure for analyzing the predictive accuracy of verbal autopsy classification methods using the same data and the publicly available implementation of the algorithm released by the authors. We extend the original analysis to include children and neonates, instead of only adults, and test accuracy using different sets of predictors, including the set used in the original paper and a set that matches the released software. Results The population-level performance (i.e., predictive accuracy) of the algorithm varied from 2.1 to 37.6% when trained on data preprocessed similarly as in the original study. When trained on data that matched the software default format, the performance ranged from −11.5 to 17.5%. When using the default training data provided, the performance ranged from −59.4 to −38.5%. Overall, the InSilicoVA predictive accuracy was found to be 11.6–8.2 percentage points lower than that of an alternative algorithm. Additionally, the sensitivity for InSilicoVA was consistently lower than that for an alternative diagnostic algorithm (Tariff 2.0), although the specificity was comparable. Conclusions The default format and training data provided by the software lead to results that are at best suboptimal, with poor cause-of-death predictive performance. This method is likely to generate erroneous cause of death predictions and, even if properly configured, is not as accurate as alternative automated diagnostic methods.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
ObjectType-Undefined-3
ISSN:1741-7015
1741-7015
DOI:10.1186/s12916-018-1039-1