Tree Structure Based Parallel Frequent Pattern Mining on PC Cluster
Frequent pattern mining has become a fundamental technique for many data mining tasks. Many modern frequent pattern mining algorithms, such as FP-growth, adopt a tree structure to compress the database into a compact in-memory data structure. Recent studies show that the tree structure can be efficiently mine...
| Published in | Database and Expert Systems Applications Vol. 2736; pp. 537 - 547 |
|---|---|
| Main Authors | Pramudiono, Iko; Kitsuregawa, Masaru |
| Format | Book Chapter Conference Proceeding |
| Language | English Japanese |
| Published | Germany, Springer Berlin / Heidelberg, 2003 |
| Series | Lecture Notes in Computer Science |
| Subjects | |
| ISBN | 9783540408062 3540408061 |
| ISSN | 0302-9743 1611-3349 |
| DOI | 10.1007/978-3-540-45227-0_53 |
| Abstract | Frequent pattern mining has become a fundamental technique for many data mining tasks. Many modern frequent pattern mining algorithms, such as FP-growth, adopt a tree structure to compress the database into a compact in-memory data structure. Recent studies show that the tree structure can be mined efficiently using the frequent pattern growth methodology. Further performance improvement can be expected from parallel execution. In particular, PC clusters are gaining popularity as a cost-effective parallel platform for data-intensive tasks such as data mining. However, mining frequent patterns efficiently from a tree structure in a shared-nothing environment requires addressing many issues, such as space distribution across nodes and skew handling. We develop a framework that addresses those issues using a novel granularity control mechanism and tree remerging. The common framework can be enhanced with temporal constraints to mine web access patterns. We devise an improved support counting procedure to reduce the additional communication overhead. A real implementation using up to 32 nodes confirms that a good speedup ratio can be achieved even in a skewed environment. |
|---|---|
| Author | Kitsuregawa, Masaru Pramudiono, Iko |
| Author_xml | – sequence: 1 givenname: Iko surname: Pramudiono fullname: Pramudiono, Iko – sequence: 2 givenname: Masaru surname: Kitsuregawa fullname: Kitsuregawa, Masaru |
| BackLink | http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=15551584$$DView record in Pascal Francis |
| ContentType | Book Chapter Conference Proceeding |
| Copyright | Springer-Verlag Berlin Heidelberg 2003 2004 INIST-CNRS |
| Copyright_xml | – notice: Springer-Verlag Berlin Heidelberg 2003 – notice: 2004 INIST-CNRS |
| DBID | FFUUA IQODW |
| DEWEY | 5.74 |
| DOI | 10.1007/978-3-540-45227-0_53 |
| DatabaseName | ProQuest Ebook Central - Book Chapters - Demo use only Pascal-Francis |
| DatabaseTitleList | |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science Applied Sciences |
| EISBN | 3540452273 9783540452270 |
| EISSN | 1611-3349 |
| Editor | Retschitzegger, Werner Stepankova, Olga Marik, Vladimir |
| Editor_xml | – sequence: 1 fullname: Retschitzegger, Werner – sequence: 2 fullname: Stepankova, Olga – sequence: 3 fullname: Marik, Vladimir |
| EndPage | 547 |
| ExternalDocumentID | 15551584 EBC3089008_59_555 |
| ISBN | 9783540408062 3540408061 |
| ISSN | 0302-9743 |
| IngestDate | Wed Apr 02 07:25:09 EDT 2025 Wed Sep 17 04:56:25 EDT 2025 Thu May 29 00:45:14 EDT 2025 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | Grain size; Parallel algorithm; Data analysis; Tree structured method; World wide web; Database; Information extraction; Internet; Distributed system; Data mining; Data structure |
| LCCallNum | QA76.9.D35 |
| Language | English Japanese |
| License | CC BY 4.0 |
| LinkModel | OpenURL |
| MeetingName | DEXA 2003 : database and expert systems applications (Prague, 1-5 September 2003) |
| OCLC | 166468678 |
| PQID | EBC3089008_59_555 |
| PageCount | 11 |
| ParticipantIDs | pascalfrancis_primary_15551584 springer_books_10_1007_978_3_540_45227_0_53 proquest_ebookcentralchapters_3089008_59_555 |
| PublicationCentury | 2000 |
| PublicationDate | 2003 |
| PublicationDateYYYYMMDD | 2003-01-01 |
| PublicationDate_xml | – year: 2003 text: 2003 |
| PublicationDecade | 2000 |
| PublicationPlace | Germany |
| PublicationPlace_xml | – name: Germany – name: Berlin, Heidelberg – name: Berlin |
| PublicationSeriesTitle | Lecture Notes in Computer Science |
| PublicationSubtitle | 14th International Conference, DEXA 2003, Prague, Czech Republic, September 1-5, 2003, Proceedings |
| PublicationTitle | Database and Expert Systems Applications |
| PublicationYear | 2003 |
| Publisher | Springer Berlin / Heidelberg Springer Berlin Heidelberg Springer |
| Publisher_xml | – name: Springer Berlin / Heidelberg – name: Springer Berlin Heidelberg – name: Springer |
| RelatedPersons | Hartmanis, Juris Goos, Gerhard van Leeuwen, Jan |
| RelatedPersons_xml | – sequence: 1 givenname: Gerhard surname: Goos fullname: Goos, Gerhard – sequence: 2 givenname: Juris surname: Hartmanis fullname: Hartmanis, Juris – sequence: 3 givenname: Jan surname: van Leeuwen fullname: van Leeuwen, Jan |
| SSID | ssj0000322443 ssj0002792 |
| Score | 1.787104 |
| SourceID | pascalfrancis springer proquest |
| SourceType | Index Database Publisher |
| StartPage | 537 |
| SubjectTerms | Applied sciences; Computer science, control theory, systems; Computer systems and distributed systems. User interface; Exact sciences and technology; Frequent Pattern; Minimum Support; Parallel Execution; Processing Node; Software; Speedup Ratio |
| Title | Tree Structure Based Parallel Frequent Pattern Mining on PC Cluster |
| URI | http://ebookcentral.proquest.com/lib/SITE_ID/reader.action?docID=3089008&ppg=555 http://link.springer.com/10.1007/978-3-540-45227-0_53 |
| Volume | 2736 |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| linkProvider | IEEE |