Applying agglomerative hierarchical clustering algorithms to component identification for legacy systems

Bibliographic Details
Published in Information and Software Technology, Vol. 53, no. 6, pp. 601-614
Main Authors Cui, Jian Feng; Chae, Heung Seok
Format Journal Article
Language English
Published Amsterdam, Elsevier B.V., 01.06.2011
Elsevier Science Ltd
ISSN 0950-5849
EISSN 1873-6025
DOI 10.1016/j.infsof.2011.01.006


Abstract
Context: Component identification, the process of evolving legacy systems into finely organized component-based software systems, is a critical part of software reengineering. Many component identification approaches have been developed based on agglomerative hierarchical clustering algorithms. However, there has been no thorough investigation of which algorithm is appropriate for component identification.
Objective: This paper analyzes agglomerative hierarchical clustering algorithms in software reengineering and identifies their respective strengths and weaknesses so that they can be applied effectively in practice.
Method: A series of experiments was conducted for 18 clustering strategies, formed by combining various similarity measures, weighting schemes and linkage methods. Eleven subject systems with different application domains and source code sizes were used in the experiments. The component identification results were evaluated by the proposed size, coupling and cohesion criteria.
Results: The experimental results suggest that the choice of similarity measure, weighting scheme and linkage method affects component identification results with respect to the proposed size, coupling and cohesion criteria, so the hierarchical clustering algorithms produced quite different clustering results.
Conclusions: According to the experimental results, it is difficult for any given clustering algorithm to produce perfectly satisfactory results. Nevertheless, the algorithms demonstrated varied capabilities to identify components with respect to the proposed size, coupling and cohesion criteria.
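Illustrative sketch (not taken from the paper): the abstract describes clustering strategies built from a similarity measure, a weighting scheme and a linkage method, but this record does not say which ones were used. The Python sketch below shows the general shape of agglomerative hierarchical clustering under stated assumptions: Jaccard similarity over unweighted feature sets, three common linkage rules, a stopping threshold, and hypothetical class names and features. None of these choices should be read as the authors' actual configuration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two feature sets (assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_similarity(c1, c2, features, linkage):
    """Similarity between two clusters under the chosen linkage rule."""
    sims = [jaccard(features[x], features[y]) for x in c1 for y in c2]
    if linkage == "single":      # most similar pair of members
        return max(sims)
    if linkage == "complete":    # least similar pair of members
        return min(sims)
    if linkage == "average":     # mean over all member pairs
        return sum(sims) / len(sims)
    raise ValueError(f"unknown linkage: {linkage}")


def agglomerate(features, linkage="complete", threshold=0.3):
    """Merge the two most similar clusters until similarity drops below threshold."""
    # Start with every entity (e.g. a class) in its own cluster.
    clusters = [frozenset([name]) for name in sorted(features)]
    while len(clusters) > 1:
        best_sim, (a, b) = max(
            ((cluster_similarity(a, b, features, linkage), (a, b))
             for a, b in combinations(clusters, 2)),
            key=lambda item: item[0],
        )
        if best_sim < threshold:
            break  # remaining clusters are too dissimilar to merge
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters


# Hypothetical entities and features, for illustration only.
example_features = {
    "OrderForm": {"order_id", "customer", "render"},
    "OrderView": {"order_id", "customer", "render", "print"},
    "TaxRule": {"rate", "region"},
    "TaxTable": {"rate", "region", "year"},
}

components = agglomerate(example_features, linkage="complete", threshold=0.3)
for component in components:
    print(sorted(component))
```

Changing the `linkage` argument or the threshold changes where merging stops, which is the kind of sensitivity the abstract reports; a weighting scheme would enter by replacing the plain sets with weighted feature vectors.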
Authors
Cui, Jian Feng: Department of Computer Science and Technology, Xiamen University of Technology, 600 LiGong Rd., Xiamen 361024, China
Chae, Heung Seok (hschae@pusan.ac.kr): Department of Science and Engineering, Pusan National University, 30 Changjeon-dong, Keumjeong-gu, Busan 609-735, South Korea
ContentType Journal Article
Copyright 2011 Elsevier B.V.
Copyright Elsevier Science Ltd. Jun 2011
IsPeerReviewed true
IsScholarly true
Issue 6
Keywords Weighting scheme
Legacy systems
Similarity measure
Software reengineering
Agglomerative hierarchical clustering algorithm
Component identification
SubjectTerms Agglomeration
Agglomerative hierarchical clustering algorithm
Algorithms
Cluster analysis
Clustering
Cohesion
Component identification
Computer programs
Criteria
Joining
Legacy systems
Similarity
Similarity measure
Similarity measures
Software
Software engineering
Software reengineering
Studies
Weighting scheme
Title Applying agglomerative hierarchical clustering algorithms to component identification for legacy systems
URI https://dx.doi.org/10.1016/j.infsof.2011.01.006
https://www.proquest.com/docview/865742077
https://www.proquest.com/docview/1671247309
Volume 53