Complexity curve: a graphical measure of data complexity and classifier performance

Bibliographic Details
Published in PeerJ. Computer science Vol. 2; p. e76
Main Authors Zubek, Julian, Plewczynski, Dariusz M.
Format Journal Article
Language English
Published San Diego PeerJ. Ltd 08.08.2016
PeerJ, Inc
PeerJ Inc
ISSN 2376-5992
DOI 10.7717/peerj-cs.76

Abstract We describe a method for assessing data set complexity based on the estimation of the underlying probability distribution and Hellinger distance. In contrast to some popular complexity measures, it focuses not on the shape of the decision boundary in a classification task but on the amount of available data with respect to the attribute structure. Complexity is expressed in terms of a graphical plot, which we call the complexity curve. It shows the relative increase of available information as the sample size grows. We examine the properties of the introduced complexity measure theoretically and experimentally and show its relation to the variance component of classification error. We then compare it with popular data complexity measures on 81 diverse data sets and show that it can contribute to explaining the performance of specific classifiers on these sets. We also apply our methodology to a panel of simple benchmark data sets, demonstrating how it can be used in practice to gain insights into data characteristics. Moreover, we show that the complexity curve is an effective tool for reducing the size of the training set (data pruning), which can significantly speed up the learning process without compromising classification accuracy. The associated code is available to download at: https://github.com/zubekj/complexity_curve (open source Python implementation).
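The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (their code is at the URL given in the abstract); it assumes a single discrete attribute, estimates distributions by relative frequencies, and uses hypothetical function names (`hellinger`, `empirical`, `complexity_curve`) chosen only for this example. For each subsample size it averages the Hellinger distance between the subsample's empirical distribution and that of the full data set.

```python
import math
import random
from collections import Counter


def hellinger(p, q):
    """Hellinger distance between two discrete distributions given as dicts."""
    keys = set(p) | set(q)
    total = sum(
        (math.sqrt(p.get(k, 0.0)) - math.sqrt(q.get(k, 0.0))) ** 2 for k in keys
    )
    return math.sqrt(total / 2.0)


def empirical(values):
    """Relative frequency of each distinct value in a sequence."""
    counts = Counter(values)
    n = len(values)
    return {v: c / n for v, c in counts.items()}


def complexity_curve(column, sizes, repeats=20, seed=0):
    """Mean Hellinger distance between random subsamples and the full column.

    Illustrative assumption: one discrete attribute, sampling without
    replacement, `repeats` random subsamples per size.
    """
    rng = random.Random(seed)
    full = empirical(column)
    curve = []
    for n in sizes:
        dists = [
            hellinger(empirical(rng.sample(column, n)), full)
            for _ in range(repeats)
        ]
        curve.append((n, sum(dists) / repeats))
    return curve


if __name__ == "__main__":
    data = [i % 5 for i in range(100)]  # toy attribute with five values
    for size, dist in complexity_curve(data, [1, 10, 50, 100]):
        print(size, round(dist, 3))
```

In this sketch a curve that drops to near zero at small subsample sizes would suggest that a fraction of the data already captures the attribute's distribution, which is the intuition behind the data pruning use mentioned in the abstract.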
ArticleNumber e76
Audience Academic
Author Zubek, Julian
Plewczynski, Dariusz M.
Author_xml – sequence: 1
  givenname: Julian
  surname: Zubek
  fullname: Zubek, Julian
  organization: Centre of New Technologies, University of Warsaw, Warsaw, Poland, Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland
– sequence: 2
  givenname: Dariusz M.
  surname: Plewczynski
  fullname: Plewczynski, Dariusz M.
  organization: Centre of New Technologies, University of Warsaw, Warsaw, Poland
ContentType Journal Article
Copyright COPYRIGHT 2016 PeerJ. Ltd.
2016 Zubek and Plewczynski. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Discipline Computer Science
EISSN 2376-5992
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Language English
License http://creativecommons.org/licenses/by/4.0
cc-by
SubjectTerms Algorithms
Analysis
Artificial intelligence
Automatic classification
Bias
Bias-variance decomposition
Classification
Classifiers
Complexity
Data complexity
Data processing
Data pruning
Datasets
Downloading
Graphic methods
Hellinger distance
Learning curves
Methods
Neural networks
Noise
Operations research
Pattern recognition
Performance measures
Probability distribution
Probability distributions
Pruning
Sparsity
Variables