Large-Scale Learning of Structure-Activity Relationships Using a Linear Support Vector Machine and Problem-Specific Metrics

Bibliographic Details
Published in	Journal of Chemical Information and Modeling, Vol. 51, no. 2, pp. 203-213
Main Authors	Hinselmann, Georg; Rosenbaum, Lars; Jahn, Andreas; Fechner, Nikolas; Ostermann, Claude; Zell, Andreas
Format	Journal Article
Language	English
Published	Washington, DC: American Chemical Society, 28.02.2011
Subjects	Algorithmics. Computability. Computer arithmetics Analytical chemistry Applied sciences Artificial Intelligence Benchmarks Biological and medical sciences Chemical Information Chemistry Cluster analysis Computational Biology - methods Computer science; control theory; systems Data processing. List processing. Character string processing Databases, Factual Drug Evaluation, Preclinical - methods Exact sciences and technology General and physical chemistry General pharmacology General. Nomenclature, chemical documentation, computer chemistry Medical sciences Memory organisation. Data processing Models, Molecular Molecular Conformation Pharmaceutical technology. Pharmaceutical industry Pharmacology. Drug treatments Reproducibility of Results Software Structure-Activity Relationship Studies Theoretical computing Theory of reactions, general kinetics. Catalysis. Nomenclature, chemical documentation, computer chemistry Time Factors User-Computer Interface Virtualization High throughput screening Bayes estimation Performance evaluation High performance Statistical analysis Tree(graph) Linear machine Large scale structure Virtual screening Very large databases Random decision forests System with n degrees of freedom Computational chemistry Structure activity relation Vector support machine Linear complexity Metric Large scale Non linear effect Bayes decision Artificial intelligence Comparative study Turbulence structure Binary classification
ISSN	1549-9596 (print); 1549-960X (electronic)
DOI	10.1021/ci100073w

Abstract	The goal of this study was to adapt a recently proposed linear large-scale support vector machine to large-scale binary cheminformatics classification problems and to assess its performance on various benchmarks using virtual screening performance measures. We extended the large-scale linear support vector machine library LIBLINEAR with state-of-the-art virtual high-throughput screening metrics to train classifiers on whole large and unbalanced data sets. The formulation of this linear support machine has an excellent performance if applied to high-dimensional sparse feature vectors. An additional advantage is the average linear complexity in the number of non-zero features of a prediction. Nevertheless, the approach assumes that a problem is linearly separable. Therefore, we conducted an extensive benchmarking to evaluate the performance on large-scale problems up to a size of 175000 samples. To examine the virtual screening performance, we determined the chemotype clusters using Feature Trees and integrated this information to compute weighted AUC-based performance measures and a leave-cluster-out cross-validation. We also considered the BEDROC score, a metric that was suggested to tackle the early enrichment problem. The performance on each problem was evaluated by a nested cross-validation and a nested leave-cluster-out cross-validation. We compared LIBLINEAR against a Naïve Bayes classifier, a random decision forest classifier, and a maximum similarity ranking approach. These reference approaches were outperformed in a direct comparison by LIBLINEAR. A comparison to literature results showed that the LIBLINEAR performance is competitive but without achieving results as good as the top-ranked nonlinear machines on these benchmarks. However, considering the overall convincing performance and computation time of the large-scale support vector machine, the approach provides an excellent alternative to established large-scale classification approaches.
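
Illustrative note (not part of the published record): the abstract mentions two ideas that a short sketch may make more concrete, namely that a linear model's prediction cost scales with the number of non-zero features, and that leave-cluster-out cross-validation holds out whole chemotype clusters. The Python below is a minimal sketch under stated assumptions. It is not the authors' LIBLINEAR extension; every function name and all toy data are hypothetical, and holding out exactly one cluster per fold is an assumption rather than a detail given in the abstract.

```python
# Illustrative sketch only; all names and the toy data are hypothetical.

def predict_sparse(weights, bias, features):
    """Linear decision value w.x + b for a sparse vector {index: value}.

    Only the non-zero features are visited, so the cost grows with the
    number of non-zero entries rather than with the full dimensionality.
    """
    return bias + sum(weights.get(i, 0.0) * v for i, v in features.items())


def leave_cluster_out_splits(cluster_ids):
    """Yield (held_out_cluster, train_idx, test_idx), one fold per cluster.

    Every sample of the held-out cluster (e.g. a chemotype) goes to the
    test fold, so the model is never tested on a cluster it has seen.
    """
    for held_out in sorted(set(cluster_ids)):
        train = [i for i, c in enumerate(cluster_ids) if c != held_out]
        test = [i for i, c in enumerate(cluster_ids) if c == held_out]
        yield held_out, train, test


# Toy example: a sparse weight vector and one sparse fingerprint.
w = {3: 0.5, 17: -1.0, 42: 2.0}
x = {3: 1.0, 42: 1.0, 99: 1.0}  # feature 99 has no weight, contributes 0
score = predict_sparse(w, bias=-1.0, features=x)

# Toy cluster assignment for five compounds.
clusters = ["A", "A", "B", "C", "B"]
folds = list(leave_cluster_out_splits(clusters))
```

With cluster labels taken from a chemotype clustering, each fold tests the model on compounds whose scaffold family was absent from training, which is the kind of evaluation the abstract describes.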
Copyright	Copyright © 2011 American Chemical Society; 2015 INIST-CNRS; Copyright American Chemical Society Feb 28, 2011
Discipline	Chemistry Applied Sciences
IsPeerReviewed	true
IsScholarly	true
License	CC BY 4.0
PMID	21207929