A new Fourier transform approach for protein coding measure based on the format of the Z curve

Bibliographic Details
Published in Bioinformatics (Oxford, England) Vol. 14; no. 8; pp. 685-690
Main Authors Yan, M, Lin, Z.S, Zhang, C.T
Format Journal Article
Language English
Published Oxford Oxford University Press 1998
ISSN 1367-4803
EISSN 1367-4811
DOI 10.1093/bioinformatics/14.8.685

Abstract MOTIVATION: At the core of most protein gene-finding algorithms are the coding measures used to make a decision on coding/non-coding. Of the protein coding measures, the Fourier measure is one of the most important. However, due to the limited length of the windows usually used, the accuracy of the measure is not satisfactory. This paper is devoted to improving the accuracy by lengthening the sequence to amplify the periodicity of 3 in the coding regions. RESULTS: A new algorithm is presented called the lengthen-shuffle Fourier transform algorithm. For the same window length, the percentage accuracy of the new algorithm is 6-7% higher than that of the ordinary Fourier transform algorithm. The resulting percentage accuracy (average of specificity and sensitivity) of the new measure is 84.9% for the window length 162 bp. AVAILABILITY: The program is available on request from C.-T. Zhang. Contact: ctzhang@tju.edu.cn
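As an illustration of the "ordinary" Fourier measure the abstract refers to, the sketch below scores a DNA window by its spectral power at frequency 1/3, which tends to be elevated in coding regions because of the periodicity of 3. This is not the lengthen-shuffle algorithm, whose details the abstract does not give. The function names `period3_power` and `accuracy`, the use of four base-indicator sequences (rather than Z-curve components), and the normalisation by window length are all assumptions made for this example.

```python
import cmath


def period3_power(seq):
    """Spectral power of a DNA window at frequency 1/3.

    For each base, the DFT coefficient of its 0/1 indicator sequence is
    evaluated at frequency 1/3; the squared magnitudes are summed.
    """
    seq = seq.upper()
    n = len(seq)
    total = 0.0
    for base in "ACGT":
        # Sum of exp(-2*pi*i*k/3) over the positions k where this base occurs.
        coeff = sum(
            cmath.exp(-2j * cmath.pi * k / 3)
            for k, ch in enumerate(seq)
            if ch == base
        )
        total += abs(coeff) ** 2
    # Dividing by the window length is an assumed normalisation.
    return total / n


def accuracy(sensitivity, specificity):
    """Accuracy as defined in the abstract: mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2


if __name__ == "__main__":
    periodic = "ATG" * 54   # 162 bp window with perfect period-3 structure
    uniform = "A" * 162     # 162 bp window with no period-3 structure
    print(period3_power(periodic), period3_power(uniform))
```

A window would be called coding when the score exceeds a threshold chosen on training data; that threshold is not stated in the abstract and is left out here.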
Keywords Performance evaluation
Fourier transformation
Gene
Nucleotide sequence
Computerized processing
Software
Method
Algorithm
Protein
Recognition
PMID 9789094
SubjectTerms Algorithms
amino acid sequences
Biological and medical sciences
chemistry
computer aided gene recognition
computer analysis
computer software
DNA
DNA - chemistry
DNA - genetics
Evaluation Studies as Topic
Fourier Analysis
Fundamental and applied biological sciences. Psychology
General aspects
genetic code
genetics
lengthen shuffle fourier transform algorithm
Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects)
Nucleic Acid Conformation
proteins
Proteins - genetics
sequence periodicity