The efficient design of Nested Group Testing algorithms for disease identification in clustered data

Bibliographic Details
Published in	Journal of applied statistics Vol. 50; no. 10; pp. 2228 - 2245
Main Authors	Best, Ana F.; Malinovsky, Yaakov; Albert, Paul S.
Format	Journal Article
Language	English
Published	England: Taylor & Francis Ltd, 27.07.2023
Subjects	Algorithms; clustered data; Clustering; Disease; disease identification; Estimates; group testing; Heterogeneity; Medical screening; pooled sample analysis; prevalence heterogeneity; Statistical methods
ISSN	0266-4763, 1360-0532
DOI	10.1080/02664763.2022.2071419

Summary:	Group testing study designs have been used since the 1940s to reduce screening costs for uncommon diseases; for rare diseases, all cases are identifiable with substantially fewer tests than the population size. Substantial research has identified efficient designs under this paradigm. However, little work has focused on the important problem of disease screening among clustered data, such as geographic heterogeneity in HIV prevalence. We evaluated designs where we first estimate disease prevalence and then apply efficient group testing algorithms using these estimates. Specifically, we evaluate prevalence using individual testing on a fixed-size subset of each cluster and use these prevalence estimates to choose group sizes that minimize the corresponding estimated average number of tests per subject. We compare designs where we estimate cluster-specific prevalences as well as a common prevalence across clusters, use different group testing algorithms, construct groups from individuals within and in different clusters, and consider misclassification. For diseases with low prevalence, our results suggest that accounting for clustering is unnecessary. However, for diseases with higher prevalence and sizeable between-cluster heterogeneity, accounting for clustering in study design and implementation improves efficiency. We consider the practical aspects of our design recommendations with two examples with strong clustering effects: (1) Identification of HIV carriers in the US population and (2) Laboratory screening of anti-cancer compounds using cell lines.
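The two-step idea in the summary (estimate prevalence from individually tested subjects, then pick the group size that minimizes the estimated number of tests per subject) can be illustrated with a minimal sketch. It is not the authors' method: it assumes the classical two-stage Dorfman procedure with no misclassification, whose expected tests per subject at prevalence p and group size k is 1/k + 1 - (1 - p)^k, rather than the nested algorithms the article evaluates. The function names, the `max_k` cap, and the pilot data are hypothetical.

```python
from typing import Iterable, Tuple


def estimate_prevalence(pilot_results: Iterable[int]) -> float:
    """Prevalence estimate from individually tested pilot subjects (1 = positive)."""
    results = list(pilot_results)
    return sum(results) / len(results)


def dorfman_tests_per_subject(p: float, k: int) -> float:
    """Expected tests per subject under two-stage Dorfman pooling.

    Assumes a perfect assay (no misclassification) and independent subjects.
    k = 1 means individual testing, which always costs one test per subject.
    """
    if k == 1:
        return 1.0
    # One pooled test shared by k subjects, plus k retests if the pool is positive.
    return 1.0 / k + 1.0 - (1.0 - p) ** k


def best_group_size(p_hat: float, max_k: int = 100) -> Tuple[int, float]:
    """Group size in 1..max_k minimizing the estimated tests per subject.

    max_k is a hypothetical practical cap; with p_hat = 0 the cap is returned.
    """
    candidates = ((k, dorfman_tests_per_subject(p_hat, k)) for k in range(1, max_k + 1))
    return min(candidates, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical pilot samples for two clusters with different prevalence.
    pilots = {
        "cluster_a": [0] * 99 + [1],      # 1 positive in 100
        "cluster_b": [0] * 80 + [1] * 20  # 20 positives in 100
    }
    for name, results in pilots.items():
        p_hat = estimate_prevalence(results)
        k, tests = best_group_size(p_hat)
        print(f"{name}: p_hat={p_hat:.3f}, group size={k}, tests/subject={tests:.3f}")
```

Running the sketch with cluster-specific estimates versus one pooled estimate gives a rough feel for when between-cluster heterogeneity changes the chosen group size, which is the comparison the summary describes.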