A network clustering algorithm for detection of protein families

Bibliographic Details
Published in 2012 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Vol. 2012, pp. 6329-6332
Main Authors Xie, Jiang; Wang, Minchao; Dai, Dongbo; Zhang, Huiran; Zhang, Wu
Format Conference Proceeding, Journal Article
Language English
Published United States, IEEE, 01.01.2012
ISBN 1424441196, 9781424441198
ISSN 1094-687X, 1557-170X
DOI 10.1109/EMBC.2012.6347441


Abstract Detecting protein families in large-scale databases is a difficult but important biological problem, and computational clustering methods can address it effectively. Many clustering algorithms exist, but most of them rely on a simple threshold. Their performance depends heavily on the weight distribution, so they are valid only for certain kinds of networks. This paper proposes a new network clustering algorithm, Markov Finding and Clustering (MFC), to cluster proteins accurately into their functionally specific families. The MFC algorithm improves the random walk process and reduces the effect of noise on the clustering result. It performs well on networks that existing noise-sensitive algorithms do not handle well. Experiments on protein sequence datasets demonstrate that the algorithm is effective at detecting protein families and outperforms current algorithms.
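The abstract does not describe how MFC itself works, so the following is only a minimal sketch of the general family it refers to: random-walk (Markov) clustering on a weighted similarity network. It follows the standard Markov Cluster (MCL) scheme of alternating expansion and inflation, not the authors' MFC algorithm. The function names, the `inflation` and `iterations` parameters, and the toy network are all assumptions made for illustration.

```python
def normalize_columns(m):
    """Scale each column to sum to 1 (column-stochastic transition matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def mat_square(m):
    """Expansion step: two steps of the random walk (matrix squared)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(weights, inflation=2.0, iterations=50, eps=1e-6):
    """Standard MCL on a symmetric weight matrix (illustrative, not MFC)."""
    n = len(weights)
    # Add self-loops so every node keeps some probability of staying put.
    m = [[weights[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        m = mat_square(m)
        # Inflation step: raise entries to a power and renormalize, which
        # strengthens strong transitions and suppresses weak (noisy) ones.
        m = normalize_columns([[x ** inflation for x in row] for row in m])
    # Each row that still holds probability mass is an attractor; the
    # columns it attracts form one cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > eps)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy network: two tight triangles joined by one weak edge.
toy = [[0.0] * 6 for _ in range(6)]
for a, b, w in [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
                (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0),
                (2, 3, 0.1)]:
    toy[a][b] = toy[b][a] = w

print(markov_cluster(toy))
```

On this toy input the weak bridge between nodes 2 and 3 is suppressed by inflation, so the two triangles come out as separate groups. No fixed similarity cutoff is applied to the edge weights, which is the contrast the abstract draws with threshold-based methods.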
Author Affiliation Xie, Jiang: School of Computer Engineering and Science, Shanghai University, Shanghai 200072, China
EISBN 9781457717871, 1457717875
PMID 23367376
Subject Terms Accuracy; Algorithms; Bioinformatics; Cluster Analysis; Clustering algorithms; Educational institutions; Legged locomotion; Markov Chains; Noise; Proteins; Proteins - analysis; Proteins - chemistry
URI https://ieeexplore.ieee.org/document/6347441
https://www.ncbi.nlm.nih.gov/pubmed/23367376