Tree Structure Based Parallel Frequent Pattern Mining on PC Cluster

Bibliographic Details
Published in Database and Expert Systems Applications, Vol. 2736, pp. 537-547
Main Authors Pramudiono, Iko; Kitsuregawa, Masaru
Format Book Chapter; Conference Proceeding
Language English; Japanese
Published Germany, Springer Berlin / Heidelberg, 2003
Series Lecture Notes in Computer Science
ISBN 9783540408062; 3540408061
ISSN 0302-9743; 1611-3349
DOI 10.1007/978-3-540-45227-0_53

Abstract Frequent pattern mining has become a fundamental technique for many data mining tasks. Many modern frequent pattern mining algorithms, such as FP-growth, adopt a tree structure to compress the database into a compact in-memory data structure. Recent studies show that the tree structure can be mined efficiently using the frequent pattern growth methodology. A further performance improvement can be expected from parallel execution. In particular, PC clusters are gaining popularity as a cost-effective parallel platform for data-intensive tasks such as data mining. However, efficiently mining frequent patterns from a tree structure in a shared-nothing environment requires addressing many issues, such as space distribution on each node and skew handling. We develop a framework that addresses those issues using a novel granularity control mechanism and tree remerging. The common framework can be enhanced with temporal constraints to mine web access patterns. We devise an improved support counting procedure to reduce the additional communication overhead. A real implementation using up to 32 nodes confirms that a good speedup ratio can be achieved even in a skewed environment.
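Illustration The abstract mentions that FP-growth-style algorithms compress the database into a compact in-memory tree. The following is a minimal single-machine sketch of that compression step only, not the parallel framework of the chapter. The names `FPNode`, `build_fp_tree`, and `count_nodes`, the sample transactions, and the tie-breaking rule are assumptions made for illustration.

```python
from collections import Counter


class FPNode:
    """One node of a prefix tree; `count` is the number of transactions sharing this prefix."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    # First pass: count how many transactions contain each item.
    support = Counter(item for t in transactions for item in set(t))
    frequent = {item for item, c in support.items() if c >= min_support}

    root = FPNode(None)
    # Second pass: insert each transaction as a path of frequent items.
    for t in transactions:
        # Sort by descending support (ties broken by item name, an assumption)
        # so that transactions with common items share prefixes.
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-support[i], i))
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, support


def count_nodes(node):
    # Number of item nodes below `node` (the root itself is not counted).
    return sum(1 + count_nodes(c) for c in node.children.values())


# Hypothetical toy database: 10 item occurrences in 5 transactions.
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a"]]
tree, support = build_fp_tree(transactions, min_support=2)
print(count_nodes(tree))  # fewer nodes than item occurrences, due to shared prefixes
```

Shared prefixes are what make the tree smaller than the raw database; in a shared-nothing setting, each node would hold only part of such a structure, which is where the distribution and skew issues named in the abstract arise.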
Author Kitsuregawa, Masaru
Pramudiono, Iko
Author_xml – sequence: 1
  givenname: Iko
  surname: Pramudiono
  fullname: Pramudiono, Iko
– sequence: 2
  givenname: Masaru
  surname: Kitsuregawa
  fullname: Kitsuregawa, Masaru
BackLink http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=15551584 (record in Pascal Francis)
ContentType Book Chapter
Conference Proceeding
Copyright Springer-Verlag Berlin Heidelberg 2003
2004 INIST-CNRS
Copyright_xml – notice: Springer-Verlag Berlin Heidelberg 2003
– notice: 2004 INIST-CNRS
DBID FFUUA
IQODW
DEWEY 5.74
DOI 10.1007/978-3-540-45227-0_53
DatabaseName ProQuest Ebook Central - Book Chapters - Demo use only
Pascal-Francis
DatabaseTitleList
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
Applied Sciences
EISBN 3540452273
9783540452270
EISSN 1611-3349
Editor Retschitzegger, Werner
Stepankova, Olga
Marik, Vladimir
Editor_xml – sequence: 1
  fullname: Retschitzegger, Werner
– sequence: 2
  fullname: Stepankova, Olga
– sequence: 3
  fullname: Marik, Vladimir
EndPage 547
ExternalDocumentID 15551584
EBC3089008_59_555
ISBN 9783540408062
3540408061
ISSN 0302-9743
IngestDate Wed Apr 02 07:25:09 EDT 2025
Wed Sep 17 04:56:25 EDT 2025
Thu May 29 00:45:14 EDT 2025
IsPeerReviewed true
IsScholarly true
Keywords Grain size
Parallel algorithm
Data analysis
Tree structured method
World wide web
Database
Information extraction
Internet
Distributed system
Data mining
Data structure
LCCallNum QA76.9.D35
Language English
Japanese
License CC BY 4.0
LinkModel OpenURL
MeetingName DEXA 2003 : database and expert systems applications (Prague, 1-5 September 2003)
OCLC 166468678
PQID EBC3089008_59_555
PageCount 11
ParticipantIDs pascalfrancis_primary_15551584
springer_books_10_1007_978_3_540_45227_0_53
proquest_ebookcentralchapters_3089008_59_555
PublicationCentury 2000
PublicationDate 2003
PublicationDateYYYYMMDD 2003-01-01
PublicationDate_xml – year: 2003
  text: 2003
PublicationDecade 2000
PublicationPlace Germany
PublicationPlace_xml – name: Germany
– name: Berlin, Heidelberg
– name: Berlin
PublicationSeriesTitle Lecture Notes in Computer Science
PublicationSubtitle 14th International Conference, DEXA 2003, Prague, Czech Republic, September 1-5, 2003, Proceedings
PublicationTitle Database and Expert Systems Applications
PublicationYear 2003
Publisher Springer Berlin / Heidelberg
Springer Berlin Heidelberg
Springer
Publisher_xml – name: Springer Berlin / Heidelberg
– name: Springer Berlin Heidelberg
– name: Springer
RelatedPersons Hartmanis, Juris
Goos, Gerhard
van Leeuwen, Jan
RelatedPersons_xml – sequence: 1
  givenname: Gerhard
  surname: Goos
  fullname: Goos, Gerhard
– sequence: 2
  givenname: Juris
  surname: Hartmanis
  fullname: Hartmanis, Juris
– sequence: 3
  givenname: Jan
  surname: van Leeuwen
  fullname: van Leeuwen, Jan
SSID ssj0000322443
ssj0002792
Score 1.787104
SourceID pascalfrancis
springer
proquest
SourceType Index Database
Publisher
StartPage 537
SubjectTerms Applied sciences
Computer science; control theory; systems
Computer systems and distributed systems. User interface
Exact sciences and technology
Frequent Pattern
Minimum Support
Parallel Execution
Processing Node
Software
Speedup Ratio
Title Tree Structure Based Parallel Frequent Pattern Mining on PC Cluster
URI http://ebookcentral.proquest.com/lib/SITE_ID/reader.action?docID=3089008&ppg=555
http://link.springer.com/10.1007/978-3-540-45227-0_53
Volume 2736