Quantitative synteny scoring improves homology inference and partitioning of gene families


Bibliographic Details
Published in: BMC Bioinformatics, Vol. 14, no. Suppl 15, p. S12
Main Authors: Ali, Raja Hashim; Muhammad, Sayyed Auwn; Khan, Mehmood Alam; Arvestad, Lars
Format: Journal Article
Language: English
Published: London: BioMed Central, 2013; Springer Nature B.V.
ISSN: 1471-2105
DOI: 10.1186/1471-2105-14-S15-S12

Summary:

Background: Clustering sequences into families has long been an important step in the characterization of genes and proteins. Many algorithms have been developed for this purpose, most of which are based either on direct similarity between gene pairs or on some form of network structure in which the edge weights of the constructed graphs are based on similarity. However, conserved synteny is an important signal that can help distinguish homology, and it has not been utilized to its fullest potential.

Results: Here, we present GenFamClust, a pipeline that combines the network properties of sequence similarity and synteny to assess homology relationships and merge known homologs into groups of gene families. GenFamClust identifies homologs in a more informed and accurate manner than similarity-based approaches. We tested our method against the Neighborhood Correlation method on two diverse datasets consisting of fully sequenced genomes of eukaryotes and synthetic data.

Conclusions: The results obtained from both datasets confirm that synteny helps determine homology and that GenFamClust improves on the Neighborhood Correlation method. The accuracy as well as the definition of synteny scores is the most valuable contribution of GenFamClust.
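To make the idea of a quantitative synteny signal concrete, the sketch below scores a gene pair by the best similarity found among the genes flanking each of them. This is a toy illustration only and is not the GenFamClust definition: the function names, the window size `k`, the max-over-neighbors rule, and the example similarity values are all assumptions introduced here.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def neighbors(order: List[str], gene: str, k: int) -> List[str]:
    """Return genes within k positions of `gene` in a chromosome's gene order."""
    i = order.index(gene)
    return order[max(0, i - k):i] + order[i + 1:i + 1 + k]


def similarity(sim: Dict[Pair, float], a: str, b: str) -> float:
    """Look up a symmetric pairwise similarity; missing pairs count as 0.0."""
    return sim.get((a, b), sim.get((b, a), 0.0))


def toy_synteny_score(
    order_a: List[str],
    order_b: List[str],
    sim: Dict[Pair, float],
    a: str,
    b: str,
    k: int = 2,
) -> float:
    """Hypothetical synteny score: best similarity between any neighbor of
    `a` and any neighbor of `b` (assumed rule, not the published one)."""
    best = 0.0
    for x in neighbors(order_a, a, k):
        for y in neighbors(order_b, b, k):
            best = max(best, similarity(sim, x, y))
    return best


# Made-up example: two chromosomes with three genes each.
chrom1 = ["a1", "a2", "a3"]
chrom2 = ["b1", "b2", "b3"]
sim = {("a1", "b1"): 0.9, ("a2", "b2"): 0.4, ("a3", "b3"): 0.8}

score = toy_synteny_score(chrom1, chrom2, sim, "a2", "b2", k=1)
```

In this made-up example the pair (a2, b2) has only modest direct similarity, yet its flanking genes are highly similar, which is the kind of evidence a synteny-aware method can use in addition to the direct score.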