A simple method for predicting the secondary structure of globular proteins: implications and accuracy

A method is presented for predicting the secondary structure of globular proteins from their amino acid sequence. It is based on a rigorous statistical exploitation of the well-known biological fact that the amino acid compositions of each secondary structure are different. We also propose an evalua...

Bibliographic Details
Published in Bioinformatics Vol. 4; no. 3; pp. 357-365
Main Authors Gascuel, O., Golmard, J. L.
Format Journal Article
Language English
Published Washington, DC Oxford University Press 01.08.1988
Oxford
ISSN 1367-4803
0266-7061
1460-2059
DOI 10.1093/bioinformatics/4.3.357

Abstract A method is presented for predicting the secondary structure of globular proteins from their amino acid sequence. It is based on a rigorous statistical exploitation of the well-known biological fact that the amino acid compositions of each secondary structure are different. We also propose an evaluation process that allows us to estimate the capacity of a method to predict the secondary structure of a new protein which does not have any homologous proteins whose structure is already known. This evaluation process shows that our method has a prediction accuracy of 58.7% over three states for the 62 proteins of the Kabsch and Sander (1983a) data bank. This result is better than that obtained by the most widely used methods—Lim (1974), Chou and Fasman (1978) and Garnier et al. (1978)—and also than that obtained by a recent method based on local homologies (Levin et al., 1986). Our prediction method is very simple and may be implemented on any microcomputer and even on programmable pocket calculators. A simple Pascal implementation of the method prediction algorithm is given. The interpretation of our results in terms of protein folding and directions for further work are discussed.
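
The abstract reports accuracy "over three states". As a minimal sketch of how such a figure is commonly computed, the snippet below scores a predicted state string against an observed one, residue by residue. The one-letter alphabet (H, E, C), the function names, and the choice to pool residues across proteins are illustrative assumptions; the record does not specify the authors' exact scoring or their Pascal implementation.

```python
# Assumed three-state alphabet: H = helix, E = strand, C = coil.
# These symbols are an illustration, not taken from the paper.
STATES = ("H", "E", "C")


def count_matches(observed: str, predicted: str) -> int:
    """Count residues whose predicted state equals the observed state."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted strings must have equal length")
    unknown = (set(observed) | set(predicted)) - set(STATES)
    if unknown:
        raise ValueError(f"unexpected state symbols: {sorted(unknown)}")
    return sum(o == p for o, p in zip(observed, predicted))


def three_state_accuracy(observed: str, predicted: str) -> float:
    """Percentage of correctly predicted residues for one protein."""
    if not observed:
        raise ValueError("cannot score an empty sequence")
    return 100.0 * count_matches(observed, predicted) / len(observed)


def pooled_accuracy(pairs) -> float:
    """Percentage of correct residues pooled over (observed, predicted) pairs."""
    correct = 0
    total = 0
    for observed, predicted in pairs:
        correct += count_matches(observed, predicted)
        total += len(observed)
    if total == 0:
        raise ValueError("no residues to score")
    return 100.0 * correct / total


if __name__ == "__main__":
    # Toy strings, not data from the paper.
    print(three_state_accuracy("HHHEECCC", "HHHECCCC"))
    print(pooled_accuracy([("HHHH", "HHCC"), ("EECC", "EECC")]))
```

Pooling weights long proteins more heavily than averaging per-protein percentages would; which convention a given study uses has to be checked against its methods section.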
Author Golmard, J. L.
Gascuel, O.
Author_xml – sequence: 1
  givenname: O.
  surname: Gascuel
  fullname: Gascuel, O.
  organization: Unité 194 de l'INSERM, 91 Bd. de l'Hôpital, 75013 Paris, France
– sequence: 2
  givenname: J. L.
  surname: Golmard
  fullname: Golmard, J. L.
  organization: Unité 194 de l'INSERM, 91 Bd. de l'Hôpital, 75013 Paris, France
BackLink http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=7341572 (View record in Pascal Francis)
https://www.ncbi.nlm.nih.gov/pubmed/3416198 (View this record in MEDLINE/PubMed)
CODEN COABER
CitedBy_id crossref_primary_10_1101_gad_12_17_2770
crossref_primary_10_1093_nar_27_15_3120
crossref_primary_10_1101_gad_6_9_1770
ContentType Journal Article
Copyright 1989 INIST-CNRS
Copyright_xml – notice: 1989 INIST-CNRS
DOI 10.1093/bioinformatics/4.3.357
DatabaseName Istex
CrossRef
Pascal-Francis
Medline
MEDLINE
MEDLINE (Ovid)
PubMed
MEDLINE - Academic
DatabaseTitle CrossRef
MEDLINE
Medline Complete
MEDLINE with Full Text
PubMed
MEDLINE (Ovid)
MEDLINE - Academic
Discipline Biology
EISSN 1460-2059
EndPage 365
ExternalDocumentID 3416198
7341572
10_1093_bioinformatics_4_3_357
ark_67375_HXZ_SB55XKD7_5
Genre Journal Article
ISSN 1367-4803
0266-7061
IsPeerReviewed true
IsScholarly true
Issue 3
Keywords PASCAL language
Prediction
Software
Mathematical model
Secondary structure
Algorithm
Amino acid sequence
Microcomputer
Globular protein
Language English
License CC BY 4.0
Notes ark:/67375/HXZ-SB55XKD7-5
istex:7DB13403DD023023688F4F22B5B4A815DC96E093
ArticleID:4.3.357
1 Present address: Centre de Recherche en Informatique de Montpellier, 860 rue de Saint Priest, 34100 Montpellier, France
PMID 3416198
PQID 78403374
PQPubID 23479
PageCount 9
PublicationCentury 1900
PublicationDate 1988-08-01
PublicationDateYYYYMMDD 1988-08-01
PublicationDate_xml – month: 08
  year: 1988
  text: 1988-08-01
  day: 01
PublicationDecade 1980
PublicationPlace Washington, DC
Oxford
PublicationPlace_xml – name: Oxford
– name: Washington, DC
– name: England
PublicationTitle Bioinformatics
PublicationTitleAlternate Comput Appl Biosci
PublicationYear 1988
Publisher Oxford University Press
Publisher_xml – name: Oxford University Press
StartPage 357
SubjectTerms Algorithms
Amino Acid Sequence
Analytical, structural and metabolic biochemistry
Biological and medical sciences
Fundamental and applied biological sciences. Psychology
General aspects, investigation methods
Mathematical Computing
Protein Conformation
Proteins
Software
Title A simple method for predicting the secondary structure of globular proteins: implications and accuracy
URI https://api.istex.fr/ark:/67375/HXZ-SB55XKD7-5/fulltext.pdf
https://www.ncbi.nlm.nih.gov/pubmed/3416198
https://www.proquest.com/docview/78403374
Volume 4