A Co-Training Strategy for Multiple View Clustering in Process Mining

Bibliographic Details
Published in IEEE Transactions on Services Computing, Vol. 9, no. 6, pp. 832-845
Main Authors Appice, Annalisa; Malerba, Donato
Format Journal Article
Language English
Published Piscataway: IEEE, 2016-11-01
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN 1939-1374; 2372-0204
DOI 10.1109/TSC.2015.2430327

Abstract Process mining refers to the discovery, conformance checking, and enhancement of process models from the event logs produced by many information systems (e.g., workflow management systems). By tightly coupling event logs and process models, process mining makes it possible to detect deviations, predict delays, support decision making, and recommend process redesigns. Event logs are data sets containing the executions (called traces) of a business process. Several process mining algorithms have been defined to mine event logs and deliver valuable models (e.g., Petri nets) of how logged processes are being executed. However, they often generate spaghetti-like process models, which can be hard to understand. This is caused by the inherent complexity of real-life processes, which tend to be less structured and more flexible than stakeholders typically expect. In particular, spaghetti-like process models are discovered when all possible behaviors are shown in a single model because the set of traces in the event log is considered all at once. To mitigate this problem, trace clustering can be used as a preprocessing step. It splits an event log into clusters of similar traces, so as to handle variability in the recorded behavior and facilitate process model discovery. In this paper, we investigate a multiple-view-aware approach to trace clustering, based on a co-training strategy. In an assessment using benchmark event logs, we show that the presented algorithm is able to discover a clustering pattern of the log in which related traces are appropriately grouped together. We evaluate the significance of the formed clusters using established machine learning and process mining metrics.
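As a rough illustration of the preprocessing step the abstract describes (splitting an event log into clusters of similar traces), the sketch below clusters a toy log with a plain k-medoids loop over a single "bag of activities" view. It is not the multiple-view co-training algorithm of the paper: the function names, the Manhattan distance on activity counts, the toy activities, and the hand-picked initial medoids are all assumptions made for this example.

```python
from collections import Counter


def activity_profile(trace):
    """Bag-of-activities view: how often each activity occurs in a trace."""
    return Counter(trace)


def profile_distance(p, q):
    """Manhattan distance between two activity-count profiles."""
    return sum(abs(p[a] - q[a]) for a in set(p) | set(q))


def cluster_traces(traces, medoid_ids, max_iter=10):
    """Illustrative single-view k-medoids; returns one cluster label per trace."""
    profiles = [activity_profile(t) for t in traces]
    medoids = list(medoid_ids)
    labels = []
    for _ in range(max_iter):
        # Assign every trace to the cluster of its nearest medoid.
        labels = [
            min(range(len(medoids)),
                key=lambda k: profile_distance(p, profiles[medoids[k]]))
            for p in profiles
        ]
        # Re-pick each medoid as the member closest to the rest of its cluster.
        new_medoids = []
        for k, old in enumerate(medoids):
            members = [i for i, lab in enumerate(labels) if lab == k]
            if not members:  # empty cluster: keep the previous medoid
                new_medoids.append(old)
                continue
            new_medoids.append(min(
                members,
                key=lambda i: sum(profile_distance(profiles[i], profiles[j])
                                  for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return labels


# Hypothetical toy event log: each trace is the activity sequence of one case.
traces = [
    ["register", "check", "approve", "pay"],
    ["register", "check", "check", "approve", "pay"],
    ["register", "reject", "notify"],
    ["register", "check", "reject", "notify"],
]
labels = cluster_traces(traces, medoid_ids=[0, 2])
print(labels)
```

A process discovery algorithm could then be run on each cluster separately instead of on the whole log, which is the motivation the abstract gives for trace clustering.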
Author Appice, Annalisa (annalisa.appice@uniba.it), Dipt. di Inf., Univ. degli Studi Aldo Moro di Bari, Bari, Italy
Author Malerba, Donato (donato.malerba@uniba.it), Dipt. di Inf., Univ. degli Studi Aldo Moro di Bari, Bari, Italy
CODEN ITSCAD
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2016
Discipline Engineering
EISSN 2372-0204
Genre orig-research
Funding
- Mining Complex Patterns
- Italian Ministry of University and Research (MIUR), funder ID 10.13039/501100003407
- PON, grant 02_00563_3470993
- ATENEO, grant 2012
- VINCENTE: A Virtual collective INtelligenCe ENvironment to develop sustainable Technology Entrepreneurship ecosystems
- University of Bari Aldo Moro
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/OAPA.html
SubjectTerms Algorithms
Business process management
Clustering
Clustering algorithms
Clusters
co-training
Computational modeling
Data mining
Machine learning
multiple view learning
Partitioning algorithms
process mining
Training