Identification of DNA–protein Binding Sites through Multi-Scale Local Average Blocks on Sequence Information

Bibliographic Details
Published in	Molecules Vol. 22; no. 12; p. 2079
Main Authors	Shen, Cong; Ding, Yijie; Tang, Jijun; Song, Jian; Guo, Fei
Format	Journal Article
Language	English
Published	Switzerland: MDPI AG, 28.11.2017
Subjects	Binding sites; Deoxyribonucleic acid; DNA; DNA–protein binding sites; ensemble classifier; feature extraction; Proteins; random sub-sampling; sparse representation model
ISSN	1420-3049; 1433-1373
DOI	10.3390/molecules22122079

Summary:	DNA–protein interactions play pivotal roles in diverse biological processes and are essential for cell metabolism, and identifying them computationally is a sensible way to reduce the cost of in vitro and in vivo experiments. A variety of state-of-the-art methods have been developed to improve the accuracy of DNA–protein binding site prediction. Nevertheless, structure-based approaches are of limited use when 3D information is unavailable, and predictive performance can still be improved. In this paper, we propose a competitive method called the Multi-scale Local Average Blocks (MLAB) algorithm to address this issue. Unlike structure-based approaches, MLAB not only extracts local evolutionary information from primary sequences but also uses predicted solvent accessibility. Moreover, the predictor of DNA–protein binding sites is built with an ensemble weighted sparse representation model with random under-sampling. To evaluate the performance of MLAB, we conduct comprehensive experiments on DNA–protein binding site prediction. MLAB achieves a Matthews correlation coefficient (MCC) of 0.392, 0.315, 0.439 and 0.245 on the PDNA-543, PDNA-41, PDNA-316 and PDNA-52 datasets, respectively. These results indicate that MLAB compares favorably with other leading methods: its MCC is higher by at least 0.053, 0.015 and 0.064 on the PDNA-543, PDNA-41 and PDNA-316 datasets, respectively.
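Note:	The summary reports results as MCC. For readers unfamiliar with the metric, the following is a minimal sketch of how MCC is computed from confusion-matrix counts. The function name `mcc` and the example counts are illustrative assumptions, not taken from the article; returning 0.0 when the denominator is zero is a common convention rather than something the article specifies.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    tp/tn/fp/fn are the numbers of true positives, true negatives,
    false positives and false negatives for a binary classifier.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention (assumption): define MCC as 0 when any marginal is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts for an imbalanced problem with few positive
    # (binding) residues; these numbers are NOT from the article.
    print(round(mcc(tp=40, tn=900, fp=60, fn=50), 3))
```

MCC ranges from -1 to 1 and takes all four confusion-matrix cells into account, which is why it is often preferred over plain accuracy when the classes are heavily imbalanced, as is typical when binding residues are much rarer than non-binding ones.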