Locating Protein-Coding Regions in Human DNA Sequences by a Multiple Sensor-Neural Network Approach

Bibliographic Details
Published in	Proceedings of the National Academy of Sciences - PNAS Vol. 88; no. 24; pp. 11261 - 11265
Main Authors	Uberbacher, E C, Mural, R J
Format	Journal Article
Language	English
Published	Washington, DC National Academy of Sciences of the United States of America 15.12.1991 National Acad Sciences
Subjects	550200 - Biochemistry; Algorithms; ALKALINE PHOSPHATASE; ANIMALS; Base Sequence; BASIC BIOLOGICAL SCIENCES; Biological and medical sciences; BIOLOGICAL MARKERS; BLOOD COAGULATION FACTORS; Chromosome Mapping; Chromosomes, Human; COAGULANTS; coding; Databases, Factual; DNA; DNA - genetics; DNA SEQUENCING; DRUGS; ENZYMES; Enzymes - genetics; ESTERASES; Exons; False positive errors; Fundamental and applied biological sciences. Psychology; GENES; Genes, ras; genetics; HEMATOLOGIC AGENTS; Hominidae - genetics; Humans; HYDROLASES; MAMMALS; MAN; Models, Genetic; Molecular and cellular biology; Molecular genetics; Molecular Sequence Data; MOLECULAR STRUCTURE; Murals; neural networks; Neural Networks (Computer); Nucleic acids; Nucleotide sequences; Open reading frames; ORGANIC COMPOUNDS; PATTERN RECOGNITION; PHOSPHATASES; PHOSPHORUS-GROUP TRANSFERASES; PHOSPHOTRANSFERASES; PRIMATES; PROTEINS; Proteins - genetics; PROTHROMBIN; Reading frames; Sensors; STRUCTURAL CHEMICAL ANALYSIS; TRANSFERASES; VERTEBRATES; Human; Exon; Nucleotide sequence; Localization; DNA; Recognition
ISSN	0027-8424 1091-6490
DOI	10.1073/pnas.88.24.11261

Abstract	Genes in higher eukaryotes may span tens or hundreds of kilobases with the protein-coding regions accounting for only a few percent of the total sequence. Identifying genes within large regions of uncharacterized DNA is a difficult undertaking and is currently the focus of many research efforts. We describe a reliable computational approach for locating protein-coding portions of genes in anonymous DNA sequence. Using a concept suggested by robotic environmental sensing, our method combines a set of sensor algorithms and a neural network to localize the coding regions. Several algorithms that report local characteristics of the DNA sequence, and therefore act as sensors, are also described. In its current configuration the "coding recognition module" identifies 90% of coding exons of length 100 bases or greater with less than one false positive coding exon indicated per five coding exons indicated. This is a significantly lower false positive rate than any method of which we are aware. This module demonstrates a method with general applicability to sequence-pattern recognition problems and is available for current research efforts.
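The combination described in the abstract (several sensor algorithms whose readings are merged by a neural network) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the sensors, so `gc_content`, `frame_bias`, and `stop_free_frame` are hypothetical stand-ins, the window and step sizes are arbitrary, and the network weights are untrained placeholders rather than values from the paper.

```python
import math
from typing import List, Sequence, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_content(window: str) -> float:
    """Hypothetical sensor: fraction of G or C bases in the window."""
    return sum(base in "GC" for base in window) / len(window)


def frame_bias(window: str) -> float:
    """Hypothetical sensor: unevenness of G across the three codon positions."""
    counts = [0, 0, 0]
    for i, base in enumerate(window):
        if base == "G":
            counts[i % 3] += 1
    total = sum(counts)
    return 0.0 if total == 0 else (max(counts) - min(counts)) / total


def stop_free_frame(window: str) -> float:
    """Hypothetical sensor: 1.0 if some forward reading frame has no stop codon."""
    for frame in range(3):
        codons = (window[i:i + 3] for i in range(frame, len(window) - 2, 3))
        if not any(codon in STOP_CODONS for codon in codons):
            return 1.0
    return 0.0


SENSORS = (gc_content, frame_bias, stop_free_frame)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


# Placeholder weights for a 3-input, 2-hidden-unit, 1-output network.
# They are NOT trained values; a real system would learn them from labeled exons.
HIDDEN_WEIGHTS = [[1.5, 2.0, 1.0], [-1.0, 1.0, 2.0]]
HIDDEN_BIASES = [-2.0, -1.0]
OUTPUT_WEIGHTS = [2.0, 1.5]
OUTPUT_BIAS = -2.0


def network_score(inputs: Sequence[float]) -> float:
    """Combine sensor readings into a single coding score in (0, 1)."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(HIDDEN_WEIGHTS, HIDDEN_BIASES)
    ]
    return sigmoid(sum(w * h for w, h in zip(OUTPUT_WEIGHTS, hidden)) + OUTPUT_BIAS)


def scan(sequence: str, window: int = 99, step: int = 9) -> List[Tuple[int, float]]:
    """Slide a window along the sequence and score each window start position."""
    sequence = sequence.upper()
    results = []
    for start in range(0, len(sequence) - window + 1, step):
        readings = [sensor(sequence[start:start + window]) for sensor in SENSORS]
        results.append((start, network_score(readings)))
    return results
```

Calling `scan(dna)` on a sequence string returns `(start, score)` pairs. In a trained system, runs of high-scoring windows would be reported as candidate coding regions, but with the placeholder weights above the scores carry no biological meaning.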
Author	Edward C. Uberbacher Richard J. Mural
AuthorAffiliation	Biology Division, Oak Ridge National Laboratory, TN
BackLink	http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=5225985 (Pascal Francis); https://www.ncbi.nlm.nih.gov/pubmed/1763041 (MEDLINE/PubMed); https://www.osti.gov/biblio/5604872 (Osti.gov)
CODEN	PNASA6
ContentType	Journal Article
Copyright	Copyright 1991 The National Academy of Sciences of the United States of America 1992 INIST-CNRS
DatabaseName	CrossRef Pascal-Francis Medline MEDLINE MEDLINE (Ovid) MEDLINE MEDLINE PubMed Human Genome Abstracts Technology Research Database Engineering Research Database Biotechnology and BioEngineering Abstracts Computer and Information Systems Abstracts ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional MEDLINE - Academic OSTI.GOV PubMed Central (Full Participant titles)
DatabaseTitle	CrossRef MEDLINE Medline Complete MEDLINE with Full Text PubMed MEDLINE (Ovid) Human Genome Abstracts Engineering Research Database Technology Research Database Biotechnology and BioEngineering Abstracts Computer and Information Systems Abstracts Computer and Information Systems Abstracts – Academic Advanced Technologies Database with Aerospace ProQuest Computer Science Collection Computer and Information Systems Abstracts Professional MEDLINE - Academic
DatabaseTitleList	MEDLINE - Academic Human Genome Abstracts MEDLINE CrossRef Computer and Information Systems Abstracts
Discipline	Sciences (General)
EISSN	1091-6490
EndPage	11265
ExternalDocumentID	PMC53114 5604872 1763041 5225985 10_1073_pnas_88_24_11261 88_24_11261 2359224
Genre	Research Support, U.S. Gov't, Non-P.H.S Research Support, Non-U.S. Gov't Journal Article
IsDoiOpenAccess	false
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
Issue	24
Keywords	Human; Exon; Nucleotide sequence; Localization; DNA; Recognition
Language	English
License	CC BY 4.0
Notes	AC05-84OR21400
PMID	1763041
PageCount	5
PublicationCentury	1900
PublicationDate	19911215
PublicationDateYYYYMMDD	1991-12-15
PublicationDecade	1990
PublicationPlace	Washington, DC
PublicationTitle	Proceedings of the National Academy of Sciences - PNAS
PublicationTitleAlternate	Proc Natl Acad Sci U S A
PublicationYear	1991
Publisher	National Academy of Sciences of the United States of America National Acad Sciences
StartPage	11261
URI	https://www.jstor.org/stable/2359224 http://www.pnas.org/content/88/24/11261.abstract https://www.ncbi.nlm.nih.gov/pubmed/1763041 https://www.proquest.com/docview/16063989 https://www.proquest.com/docview/25246314 https://www.proquest.com/docview/72590574 https://www.osti.gov/biblio/5604872 https://pubmed.ncbi.nlm.nih.gov/PMC53114
Volume	88