Decision tree classification algorithm for non-equilibrium data set based on random forests
| Published in | Journal of intelligent & fuzzy systems Vol. 39; no. 2; pp. 1639 - 1648 |
|---|---|
| Main Authors | Wang, Peng; Zhang, Ningchao |
| Format | Journal Article |
| Language | English |
| Published | London, England: SAGE Publications (Sage Publications Ltd), 01.01.2020 |
| ISSN | 1064-1246; EISSN 1875-8967 |
| DOI | 10.3233/JIFS-179937 |
| Abstract | To address the poor accuracy and high complexity of current classification algorithms for non-equilibrium data sets, this paper proposes a decision tree classification algorithm for non-equilibrium data sets based on random forest. Wavelet packet decomposition is used to denoise the non-equilibrium data, and the SNM algorithm and RFID are combined to remove redundant data from the data sets. Based on the results of this data processing, the non-equilibrium data sets are classified by the random forest method. Using a Bootstrap resampling method with certain constraints, the majority and minority samples of each sample subset are sampled, CART is used to train on the data set, and a decision tree is constructed. The final classification result is obtained by voting over the CART decision tree classifications. Experimental results show that the proposed algorithm achieves high classification accuracy with low complexity and is a feasible classification algorithm for non-equilibrium data sets. |
|---|---|
| Author | Wang, Peng (Department of Electronic Information Engineering); Zhang, Ningchao (Department of Electronic Information Engineering) |
| Copyright | 2020 – IOS Press and the authors. All rights reserved Copyright IOS Press BV 2020 |
| Discipline | Engineering |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | RFID; decision tree; Random forest classification; SNM algorithm; non-equilibrium data set |
| SubjectTerms | Algorithms; Classification; Complexity; Data processing; Datasets; Decision trees; Noise reduction; Radio frequency identification; Resampling |
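
The classification stage described in the abstract (constrained Bootstrap resampling of majority and minority samples, CART training, and voting) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not state the sampling constraints, so the sketch assumes each bootstrap subset draws the same number of samples from every class. The wavelet packet denoising and SNM/RFID redundancy removal steps are omitted, no per-split feature subsampling is done, and all function names and the toy data are hypothetical.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (the CART split criterion)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Return the (feature, threshold) pair with the lowest weighted Gini, or None."""
    best, best_score = None, gini(y)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [y[i] for i, row in enumerate(X) if row[f] <= t]
            right = [y[i] for i, row in enumerate(X) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best

def build_tree(X, y, depth=0, max_depth=5):
    """Grow a binary tree; internal nodes are tuples, leaves are class labels.

    Assumes labels are not tuples (e.g. ints or strings)."""
    split = best_split(X, y) if depth < max_depth else None
    if split is None:
        return Counter(y).most_common(1)[0][0]  # leaf: majority label
    f, t = split
    li = [i for i, row in enumerate(X) if row[f] <= t]
    ri = [i for i, row in enumerate(X) if row[f] > t]
    return (f, t,
            build_tree([X[i] for i in li], [y[i] for i in li], depth + 1, max_depth),
            build_tree([X[i] for i in ri], [y[i] for i in ri], depth + 1, max_depth))

def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def balanced_bootstrap(X, y, rng):
    """Assumed constraint: draw, with replacement, the same number of samples
    from every class (the size of the smallest class)."""
    by_class = {}
    for i, label in enumerate(y):
        by_class.setdefault(label, []).append(i)
    k = min(len(ix) for ix in by_class.values())
    idx = [rng.choice(ix) for ix in by_class.values() for _ in range(k)]
    return [X[i] for i in idx], [y[i] for i in idx]

def fit_forest(X, y, n_trees=15, seed=0):
    """Train one tree per balanced bootstrap subset."""
    rng = random.Random(seed)
    return [build_tree(*balanced_bootstrap(X, y, rng)) for _ in range(n_trees)]

def predict_forest(forest, row):
    """Final label is the majority vote over all trees."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]

# Toy imbalanced data: 40 majority samples (label 0), 4 minority samples (label 1).
X = [[i * 0.1] for i in range(40)] + [[10.0 + i] for i in range(4)]
y = [0] * 40 + [1] * 4
forest = fit_forest(X, y)
```

Calling `predict_forest(forest, [12.0])` on this toy data returns the minority label, because every balanced subset contains minority samples and every tree therefore learns a split that separates the two value ranges.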