A Survey of Distributed Classification Based Ensemble Data Mining Methods

Bibliographic Details
Published in Journal of applied sciences (Asian Network for Scientific Information), Vol. 9, no. 20, pp. 3739-3745
Main Authors Mokeddem, D., Belbachir, H.
Format Journal Article
Language English
Published 01.10.2009
ISSN 1812-5654
DOI 10.3923/jas.2009.3739.3745


Abstract Distributed classification is a distributed data mining task that predicts whether a data instance is a member of a predefined class. It serves two different objectives: the first is to scale algorithms up to large data sets, where the data are distributed in order to increase overall efficiency; the second is to process data that are inherently distributed and autonomous. Ensemble learning methods are very promising in terms of accuracy and also have a naturally distributed aspect, so they can be adapted to distributed data mining. This study surveys approaches that use ensemble learning methods for distributed classification, with decision tree algorithms as the base classifier. In line with the two objectives mentioned above, most of the work reported in the literature addresses the problem using one of two techniques: the adaptation of ensemble learning methods to disjoint data sets, in the context of mining inherently distributed data, or the parallelization of ensemble learning methods, in a scalability context. The survey suggests that the work done from either perspective (scaling up data mining algorithms or mining inherently distributed data) could be complementary. Some open questions, current debates and future directions are also discussed.
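
The first technique named in the abstract, adapting an ensemble to disjoint data sets, can be illustrated with a minimal sketch. It is not taken from the surveyed papers: it assumes each site trains a local model on its own partition and that predictions are combined by simple majority vote, and it uses a one-level decision tree (a decision stump) as a stand-in for a full decision tree learner. The function names and the toy site data are hypothetical.

```python
from collections import Counter


def train_stump(rows):
    """Fit a one-level decision tree on rows of (features, label).

    Assumption: a stump stands in for the decision tree base classifier.
    """
    majority = Counter(label for _, label in rows).most_common(1)[0][0]
    best = None  # (errors, feature index, threshold, left label, right label)
    for f in range(len(rows[0][0])):
        for threshold in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= threshold]
            right = [y for x, y in rows if x[f] > threshold]
            if not left or not right:
                continue  # a split must send rows to both sides
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    if best is None:
        return lambda x: majority  # no usable split: predict the majority class
    _, f, threshold, left_label, right_label = best
    return lambda x: left_label if x[f] <= threshold else right_label


def train_ensemble(partitions):
    """Train one local model per disjoint partition; no raw data is shared."""
    return [train_stump(rows) for rows in partitions]


def predict(ensemble, x):
    """Combine the local models' predictions by majority vote."""
    votes = Counter(model(x) for model in ensemble)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: three sites, one numeric feature, two classes.
site_a = [((1.0,), 0), ((2.0,), 0), ((8.0,), 1), ((9.0,), 1)]
site_b = [((0.5,), 0), ((3.0,), 0), ((7.0,), 1), ((6.5,), 1)]
site_c = [((2.5,), 0), ((4.0,), 0), ((6.0,), 1), ((9.5,), 1)]

ensemble = train_ensemble([site_a, site_b, site_c])
print(predict(ensemble, (2.5,)))
```

In this toy setup the three local stumps learn different thresholds because each site sees different data, so the vote can correct an individual site's mistake near the class boundary. Only the trained models need to leave their sites, which is the property that makes this style of ensemble a natural fit for inherently distributed data.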