Decision tree classification algorithm for non-equilibrium data set based on random forests

Bibliographic Details
Published in Journal of Intelligent & Fuzzy Systems, Vol. 39, no. 2, pp. 1639-1648
Main Authors Wang, Peng; Zhang, Ningchao
Format Journal Article
Language English
Published London, England SAGE Publications 01.01.2020
Sage Publications Ltd
ISSN 1064-1246 (print)
1875-8967 (electronic)
DOI 10.3233/JIFS-179937

Abstract To address the poor accuracy and high complexity of current classification algorithms for non-equilibrium (i.e., class-imbalanced) data sets, this paper proposes a decision tree classification algorithm for non-equilibrium data sets based on random forests. Wavelet packet decomposition is used to denoise the non-equilibrium data, and the SNM algorithm is combined with RFID to remove redundant data from the data sets. The preprocessed non-equilibrium data sets are then classified with the random forest method: a Bootstrap resampling scheme with certain constraints samples the majority and minority classes for each sample subset, CART is trained on each subset to construct a decision tree, and the final classification result is obtained by voting over the CART decision trees. Experimental results show that the proposed algorithm achieves high classification accuracy with low complexity, making it a feasible classification algorithm for non-equilibrium data sets.
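
The resampling, CART training, and voting steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which constraints the Bootstrap step uses, so the sketch assumes the simplest one (each subset draws the same number of rows from every class), and it omits the wavelet packet denoising and the SNM/RFID redundancy removal. All function names (`balanced_bootstrap`, `build_cart`, `fit_forest`, `predict_forest`) and the toy data are hypothetical.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(y)
    n = len(y)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_cart(X, y, depth=0, max_depth=3):
    """Grow a small CART-style tree; a leaf is the majority label (not a tuple)."""
    split = best_split(X, y) if depth < max_depth else None
    if split is None:
        return Counter(y).most_common(1)[0][0]
    f, t = split
    li = [i for i in range(len(y)) if X[i][f] <= t]
    ri = [i for i in range(len(y)) if X[i][f] > t]
    return (f, t,
            build_cart([X[i] for i in li], [y[i] for i in li], depth + 1, max_depth),
            build_cart([X[i] for i in ri], [y[i] for i in ri], depth + 1, max_depth))

def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def balanced_bootstrap(X, y, rng):
    """Assumed constraint: draw, with replacement, equally many rows per class."""
    by_class = {}
    for i, label in enumerate(y):
        by_class.setdefault(label, []).append(i)
    k = min(len(idx) for idx in by_class.values())
    chosen = [rng.choice(idx) for idx in by_class.values() for _ in range(k)]
    return [X[i] for i in chosen], [y[i] for i in chosen]

def fit_forest(X, y, n_trees=15, seed=0):
    """Train one CART tree per balanced bootstrap subset."""
    rng = random.Random(seed)
    return [build_cart(*balanced_bootstrap(X, y, rng)) for _ in range(n_trees)]

def predict_forest(forest, row):
    """Final label is the majority vote over all trees."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]

# Toy imbalanced data: 40 majority rows (class 0) and 5 minority rows (class 1).
X = [[i * 0.02, (i % 7) * 0.1] for i in range(40)] + [[5.0 + j * 0.1, 0.3] for j in range(5)]
y = [0] * 40 + [1] * 5
forest = fit_forest(X, y)
print(predict_forest(forest, [0.0, 0.0]), predict_forest(forest, [5.2, 0.3]))
```

Balancing each subset before training keeps every tree from being dominated by the majority class, which is the usual motivation for constrained resampling on imbalanced data. Whether the paper uses equal counts or some other ratio is not stated in the abstract.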
Author Affiliations
Wang, Peng: Department of Electronic Information Engineering
Zhang, Ningchao: Department of Electronic Information Engineering
Copyright 2020 IOS Press and the authors. All rights reserved
Peer Reviewed Yes
Keywords RFID; decision tree; random forest; classification; SNM algorithm; non-equilibrium data set
Subjects Algorithms; Classification; Complexity; Data processing; Datasets; Decision trees; Noise reduction; Radio frequency identification; Resampling
URI https://journals.sagepub.com/doi/full/10.3233/JIFS-179937
https://www.proquest.com/docview/2439041996