A Survey of Distributed Classification Based Ensemble Data Mining Methods
Distributed classification is a distributed data mining task that predicts whether a data instance belongs to a predefined class. It serves two different objectives: the first is to scale up algorithms to large data sets where the data are distributed in order to i...
| Published in | Journal of applied sciences (Asian Network for Scientific Information) Vol. 9; no. 20; pp. 3739 - 3745 |
|---|---|
| Main Authors | Mokeddem, D.; Belbachir, H. |
| Format | Journal Article |
| Language | English |
| Published | 01.10.2009 |
| ISSN | 1812-5654 |
| DOI | 10.3923/jas.2009.3739.3745 |
| Abstract | Distributed classification is a distributed data mining task that predicts whether a data instance belongs to a predefined class. It serves two different objectives: the first is to scale up algorithms to large data sets by distributing the data in order to increase overall efficiency; the second is to process data that are inherently distributed and autonomous. Ensemble learning methods, which are very promising in terms of accuracy and are distributed by nature, can be adapted to distributed data mining. This study surveys approaches that use ensemble learning methods for distributed classification, with decision tree algorithms as base classifiers. In line with the two objectives above, most work reported in the literature addresses the problem using one of two techniques: adapting ensemble learning methods to disjoint data sets, for mining inherently distributed data, or parallelizing ensemble learning methods, for scalability. The survey suggests that work done from either perspective (scaling up data mining algorithms or mining inherently distributed data) could be complementary. Some open questions, current debates and future directions are also discussed. |
|---|---|
| Author | Mokeddem, D.; Belbachir, H. |
| Discipline | Sciences (General) |