A Co-Training Strategy for Multiple View Clustering in Process Mining
| Published in | IEEE Transactions on Services Computing, Vol. 9, no. 6, pp. 832-845 |
|---|---|
| Main Authors | Appice, Annalisa; Malerba, Donato |
| Format | Journal Article |
| Language | English |
| Published | Piscataway: IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE); 01.11.2016 |
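| Subjects | Algorithms; Business process management; Clustering; Clustering algorithms; Clusters; co-training; Computational modeling; Data mining; Machine learning; multiple view learning; Partitioning algorithms; process mining; Training |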
| ISSN | 1939-1374; 2372-0204 (EISSN) |
| DOI | 10.1109/TSC.2015.2430327 |
| Abstract | Process mining refers to the discovery, conformance, and enhancement of process models from event logs currently produced by several information systems (e.g., workflow management systems). By tightly coupling event logs and process models, process mining makes it possible to detect deviations, predict delays, support decision making, and recommend process redesigns. Event logs are data sets containing the executions (called traces) of a business process. Several process mining algorithms have been defined to mine event logs and deliver valuable models (e.g., Petri nets) of how logged processes are being executed. However, they often generate spaghetti-like process models, which can be hard to understand. This is caused by the inherent complexity of real-life processes, which tend to be less structured and more flexible than what the stakeholders typically expect. In particular, spaghetti-like process models are discovered when all possible behaviors are shown in a single model as a result of considering the set of traces in the event log all at once. To minimize this problem, trace clustering can be used as a preprocessing step. It splits up an event log into clusters of similar traces, so as to handle variability in the recorded behavior and facilitate process model discovery. In this paper, we investigate a multiple-view-aware approach to trace clustering, based on a co-training strategy. In an assessment using benchmark event logs, we show that the presented algorithm is able to discover a clustering pattern of the log such that related traces are appropriately clustered. We evaluate the significance of the formed clusters using established machine learning and process mining metrics. |
| Author | Appice, Annalisa; Malerba, Donato |
| Author_xml | 1. Appice, Annalisa (annalisa.appice@uniba.it), Dipt. di Inf., Univ. degli Studi Aldo Moro di Bari, Bari, Italy; 2. Malerba, Donato (donato.malerba@uniba.it), Dipt. di Inf., Univ. degli Studi Aldo Moro di Bari, Bari, Italy |
| CODEN | ITSCAD |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2016 |
| Discipline | Engineering |
| EISSN | 2372-0204 |
| EndPage | 845 |
| Genre | orig-research |
| GrantInformation_xml | Mining Complex Patterns; Italian Ministry of University and Research (MIUR), funder ID 10.13039/501100003407; PON, grant ID 02_00563_3470993; ATENEO, grant ID 2012; VINCENTE: A Virtual collective INtelligenCe ENvironment to develop sustainable Technology Entrepreneurship ecosystems; University of Bari Aldo Moro |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 6 |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/OAPA.html |
| OpenAccessLink | https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/document/7102743 |
| PageCount | 14 |
| PublicationDate | 2016-11-01 |
| PublicationPlace | Piscataway |
| PublicationTitle | IEEE transactions on services computing |
| PublicationTitleAbbrev | TSC |
| PublicationYear | 2016 |
| Publisher | IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| StartPage | 832 |
| SubjectTerms | Algorithms Business process management Clustering Clustering algorithms Clusters co-training Computational modeling Data mining Machine learning multiple view learning Partitioning algorithms process mining Training |
| URI | https://ieeexplore.ieee.org/document/7102743 https://www.proquest.com/docview/1847633619 |
| Volume | 9 |
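
The abstract describes trace clustering as a preprocessing step that splits an event log into clusters of similar traces, using more than one view of each trace. The sketch below illustrates only that general idea and is not the co-training algorithm of the paper: it assumes two hypothetical views (the set of activities in a trace and the set of directly-follows pairs), averages a Jaccard distance over both views, and groups traces by simple single-linkage thresholding. The function names, the toy log, and the threshold value are all assumptions made for illustration.

```python
from itertools import combinations


def activity_view(trace):
    """Hypothetical view 1: the set of activities occurring in the trace."""
    return set(trace)


def transition_view(trace):
    """Hypothetical view 2: the set of directly-follows activity pairs."""
    return set(zip(trace, trace[1:]))


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def multi_view_distance(t1, t2):
    """Average the per-view Jaccard distances (an assumed combination rule)."""
    views = (activity_view, transition_view)
    return sum(jaccard_distance(v(t1), v(t2)) for v in views) / len(views)


def cluster_traces(log, threshold=0.6):
    """Single-linkage grouping: traces closer than the threshold are merged."""
    parent = list(range(len(log)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(log)), 2):
        if multi_view_distance(log[i], log[j]) <= threshold:
            parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(log)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy event log (invented for illustration): each trace is a list of activities.
log = [
    ["register", "check", "approve", "pay"],
    ["register", "check", "approve", "pay", "archive"],
    ["register", "reject", "notify"],
    ["register", "recheck", "reject", "notify"],
]

if __name__ == "__main__":
    # Prints lists of trace indices, one list per cluster.
    print(cluster_traces(log))
```

Each resulting cluster could then be passed separately to a process discovery algorithm, which is the motivation the abstract gives for clustering before mining. A co-training scheme would go further by letting the clustering found in one view inform the clustering in the other, which this sketch does not attempt.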