Best Practices for QSAR Model Development, Validation, and Exploitation


Bibliographic Details
Published in Molecular Informatics Vol. 29; no. 6-7; pp. 476-488
Main Author Tropsha, Alexander
Format Journal Article
Language English
Published Weinheim WILEY-VCH Verlag 12.07.2010
ISSN 1868-1743; 1868-1751
DOI 10.1002/minf.201000061

Summary: After nearly five decades “in the making”, QSAR modeling has established itself as one of the major computational molecular modeling methodologies. Like any mature research discipline, QSAR modeling can be characterized by a collection of well-defined protocols and procedures that enable the expert application of the method for exploring and exploiting ever-growing collections of biologically active chemical compounds. This review examines the most critical QSAR modeling routines that we regard as best practices in the field. We discuss these procedures in the context of an integrative predictive QSAR modeling workflow that is focused on achieving models of the highest statistical rigor and external predictive power. Specific elements of the workflow consist of data preparation, including curation of chemical structures (and, when possible, the associated biological data), outlier detection, dataset balancing, and model validation. We especially emphasize procedures used to validate models, both internally and externally, as well as the need to define model applicability domains that should be used when models are employed for the prediction of external compounds or compound libraries. Finally, we present several examples of successful applications of QSAR models for virtual screening to identify experimentally confirmed hits.
Bibliography: National Cancer Institute NIH - No. R01GM066940
ark:/67375/WNG-CZV12101-9
ArticleID:MINF201000061
istex:BE7E9D66A7065C639D53B15D55499DC9B8E40E9F