Large-Scale Learning of Structure−Activity Relationships Using a Linear Support Vector Machine and Problem-Specific Metrics

Bibliographic Details
Published in Journal of chemical information and modeling Vol. 51; no. 2; pp. 203-213
Main Authors Hinselmann, Georg; Rosenbaum, Lars; Jahn, Andreas; Fechner, Nikolas; Ostermann, Claude; Zell, Andreas
Format Journal Article
Language English
Published Washington, DC: American Chemical Society, 28.02.2011
ISSN 1549-9596
1549-960X
DOI 10.1021/ci100073w

Abstract The goal of this study was to adapt a recently proposed large-scale linear support vector machine to large-scale binary cheminformatics classification problems and to assess its performance on various benchmarks using virtual screening performance measures. We extended the large-scale linear support vector machine library LIBLINEAR with state-of-the-art virtual high-throughput screening metrics to train classifiers on whole large and unbalanced data sets. This linear support vector machine formulation has excellent performance when applied to high-dimensional sparse feature vectors. An additional advantage is that the cost of a prediction is, on average, linear in the number of non-zero features. Nevertheless, the approach assumes that a problem is linearly separable. Therefore, we conducted extensive benchmarking to evaluate the performance on large-scale problems with up to 175,000 samples. To examine the virtual screening performance, we determined the chemotype clusters using Feature Trees and integrated this information to compute weighted AUC-based performance measures and a leave-cluster-out cross-validation. We also considered the BEDROC score, a metric that was suggested to tackle the early enrichment problem. The performance on each problem was evaluated by a nested cross-validation and a nested leave-cluster-out cross-validation. We compared LIBLINEAR against a Naïve Bayes classifier, a random decision forest classifier, and a maximum similarity ranking approach. LIBLINEAR outperformed these reference approaches in a direct comparison. A comparison with literature results showed that the LIBLINEAR performance is competitive, although it does not match the top-ranked nonlinear machines on these benchmarks. However, considering the overall convincing performance and computation time of the large-scale support vector machine, the approach provides an excellent alternative to established large-scale classification approaches.
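The abstract names the BEDROC score as its early-enrichment metric but does not spell it out. As a rough illustration only, the sketch below computes BEDROC for a ranked list using the commonly cited definition by Truchon and Bayly, rescaling the robust initial enhancement (RIE) between its worst and best attainable values. The function name `bedroc`, the input format, and the default `alpha=20.0` are assumptions made for this example; they are not taken from the article or from LIBLINEAR.

```python
import math


def bedroc(ranked_labels, alpha=20.0):
    """BEDROC score for a ranked list, best-scored compound first.

    ranked_labels: sequence of truthy (active) / falsy (inactive) flags.
    alpha: early-recognition weight; 20.0 is an assumed default here,
           not a setting reported in the abstract.
    """
    n_total = len(ranked_labels)
    n_actives = sum(1 for label in ranked_labels if label)
    if n_actives == 0 or n_actives == n_total:
        raise ValueError("need at least one active and one inactive")

    # Sum of exponentially decaying weights over the 1-based ranks of actives.
    weight_sum = sum(
        math.exp(-alpha * rank / n_total)
        for rank, label in enumerate(ranked_labels, start=1)
        if label
    )

    ratio = n_actives / n_total

    # RIE: observed sum relative to the value expected when the actives
    # are spread uniformly over the ranking.
    expected = ratio * (1.0 - math.exp(-alpha)) / (math.exp(alpha / n_total) - 1.0)
    rie = weight_sum / expected

    # Rescale RIE between its minimum (all actives last) and maximum
    # (all actives first) so the result lies in [0, 1].
    rie_max = (1.0 - math.exp(-alpha * ratio)) / (ratio * (1.0 - math.exp(-alpha)))
    rie_min = (1.0 - math.exp(alpha * ratio)) / (ratio * (1.0 - math.exp(alpha)))
    return (rie - rie_min) / (rie_max - rie_min)


if __name__ == "__main__":
    # Hypothetical ranking: two actives among ten compounds.
    print(bedroc([1, 0, 0, 0, 0, 1, 0, 0, 0, 0]))
```

In a screening setting one would presumably sort the compounds by the classifier's decision value in descending order and pass the resulting activity flags to such a function; larger `alpha` values put more weight on the top of the list.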
Copyright © 2011 American Chemical Society
Discipline Chemistry
Applied Sciences
Keywords High throughput screening
Bayes estimation
Performance evaluation
High performance
Statistical analysis
Tree(graph)
Linear machine
Large scale structure
Virtual screening
Very large databases
Random decision forests
System with n degrees of freedom
Computational chemistry
Structure activity relation
Vector support machine
Linear complexity
Metric
Large scale
Non linear effect
Bayes decision
Artificial intelligence
Comparative study
Turbulence structure
Binary classification
License CC BY 4.0
PMID 21207929
SubjectTerms Algorithmics. Computability. Computer arithmetics
Analytical chemistry
Applied sciences
Artificial Intelligence
Benchmarks
Biological and medical sciences
Chemical Information
Chemistry
Cluster analysis
Computational Biology - methods
Computer science; control theory; systems
Data processing. List processing. Character string processing
Databases, Factual
Drug Evaluation, Preclinical - methods
Exact sciences and technology
General and physical chemistry
General pharmacology
General. Nomenclature, chemical documentation, computer chemistry
Medical sciences
Memory organisation. Data processing
Models, Molecular
Molecular Conformation
Pharmaceutical technology. Pharmaceutical industry
Pharmacology. Drug treatments
Reproducibility of Results
Software
Structure-Activity Relationship
Studies
Theoretical computing
Theory of reactions, general kinetics. Catalysis. Nomenclature, chemical documentation, computer chemistry
Time Factors
User-Computer Interface
Virtualization