Do “Newly Born” orphan proteins resemble “Never Born” proteins? A study using three deep learning algorithms
| Published in | Proteins, structure, function, and bioinformatics Vol. 91; no. 8; pp. 1097 - 1115 |
|---|---|
| Main Authors | Liu, Jing; Yuan, Rongqing; Shao, Wei; Wang, Jitong; Silman, Israel; Sussman, Joel L. |
| Format | Journal Article |
| Language | English |
| Published | Hoboken, USA: John Wiley & Sons, Inc; Wiley Subscription Services, Inc; 01.08.2023 |
| Subjects | |
| ISSN | 0887-3585 (print); 1097-0134 (electronic) |
| DOI | 10.1002/prot.26496 |
| Abstract | “Newly Born” proteins, devoid of detectable homology to any other proteins, known as orphan proteins, occur in a single species or within a taxonomically restricted gene family. They are generated by the expression of novel open reading frames, and appear throughout evolution. We were curious if three recently developed programs for predicting protein structures, namely, AlphaFold2, RoseTTAFold, and ESMFold, might be of value for comparison of such “Newly Born” proteins to random polypeptides with amino acid content similar to that of native proteins, which have been called “Never Born” proteins. The programs were used to compare the structures of two sets of “Never Born” proteins that had been expressed: Group 1, which had been shown experimentally to possess substantial secondary structure, and Group 3, which had been shown to be intrinsically disordered. Overall, although the models generated were scored as being of low quality, they nevertheless revealed some general principles. Specifically, all four members of Group 1 were predicted to be compact by all three algorithms, in agreement with the experimental data, whereas the members of Group 3 were predicted to be very extended, as would be expected for intrinsically disordered proteins, again consistent with the experimental data. These predicted differences were shown to be statistically significant by comparing their accessible surface areas. The three programs were then used to predict the structures of three orphan proteins whose crystal structures had been solved, two of which display novel folds. Surprisingly, only for the protein which did not have a novel fold, and was taxonomically restricted, rather than being a true orphan, did all three algorithms predict very similar, high-quality structures, closely resembling the crystal structure. Finally, they were used to predict the structures of seven orphan proteins with well-identified biological functions, whose 3D structures are not known. Two proteins, which were predicted to be disordered based on their sequences, are predicted by all three structure algorithms to be extended structures. The other five were predicted to be compact structures with only two exceptions in the case of AlphaFold2. All three prediction algorithms make remarkably similar and high-quality predictions for one large protein, HCO_11565, from a nematode. It is conjectured that this is due to many homologs in the taxonomically restricted family of which it is a member, and to the fact that the Dali server revealed several nonrelated proteins with similar folds. An animated Interactive 3D Complement (I3DC) is available in Proteopedia at http://proteopedia.org/w/Journal:Proteins:3. |
|---|---|
| Author | Sussman, Joel L.; Shao, Wei; Silman, Israel; Liu, Jing; Yuan, Rongqing; Wang, Jitong |
| Author_xml | 1. Liu, Jing (Faculty of Biotechnology and Food Engineering, Technion-Israel Institute of Technology); 2. Yuan, Rongqing (Tsinghua University); 3. Shao, Wei (Shanghai Jiao Tong University); 4. Wang, Jitong (Tsinghua University); 5. Silman, Israel (The Weizmann Institute of Science; ORCID 0000-0003-1923-0829; israel.silman@weizmann.ac.il); 6. Sussman, Joel L. (The Weizmann Institute of Science; ORCID 0000-0003-0306-3878; joel.sussman@weizmann.ac.il) |
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/37092778 (View this record in MEDLINE/PubMed) |
| CitedBy_id | crossref_primary_10_1007_s00239_024_10174_z crossref_primary_10_1093_gbe_evae107 crossref_primary_10_1093_gbe_evae069 crossref_primary_10_3390_plants13243601 crossref_primary_10_3389_fpls_2025_1532449 crossref_primary_10_1093_gbe_evae175 crossref_primary_10_1002_prot_26652 crossref_primary_10_1093_gbe_evae176 crossref_primary_10_1038_s41559_023_02252_0 crossref_primary_10_1016_j_ijpara_2024_08_003 |
| Cites_doi | 10.1021/bi047993o 10.1093/nar/26.1.316 10.7554/eLife.53500 10.1038/s41467-021-21511-x 10.1093/nar/gkac387 10.1016/j.biochi.2011.07.014 10.3389/fphy.2019.00010 10.1002/prot.10471 10.1093/bioinformatics/btt473 10.3389/fphar.2022.1014804 10.1038/s41598-018-25867-x 10.1093/gbe/evaa194 10.1038/nrg3053 10.1002/prot.10559 10.1038/nsb0696-488 10.1126/science.abj8754 10.7554/eLife.03523 10.1002/prot.20148 10.1002/prot.10011 10.1371/journal.pgen.1002942 10.1038/s41586-021-04184-w 10.1038/s41467-021-24773-7 10.1016/j.celrep.2022.111808 10.1107/S2052252520000986 10.1101/gr.095026.109 10.1016/j.sbi.2020.11.010 10.1038/s41592-021-01117-3 10.1002/cbdv.200690088 10.1371/journal.pone.0056162 10.1371/journal.pone.0036634 10.1110/ps.04690804 10.1093/bioinformatics/bti537 10.1002/cbdv.200690087 10.1371/journal.pone.0031673 10.1038/s41587-022-01432-w 10.1016/j.sbi.2014.05.006 10.1371/journal.pgen.1003996 10.1038/s41592-022-01488-1 10.1038/s41592-021-01362-6 10.7554/eLife.44392 10.1371/journal.pgen.1003860 10.1098/rstb.2014.0332 10.1021/bi400502c 10.1038/nature11184 10.1093/bioinformatics/btac474 10.1093/bioadv/vbab043 10.1016/S0959-440X(02)00337-8 10.1093/nar/28.1.235 10.1038/nbt.2419 10.1371/journal.pgen.1002379 10.1371/journal.pcbi.1000734 10.1101/2022.02.18.481080 10.1016/j.jmb.2021.167208 10.1021/bi00163a039 10.1093/bioinformatics/btp660 10.1038/s41586-021-03819-2 10.1016/S0065-3233(00)53005-8 10.1093/molbev/msn281 10.1038/nsb1095-856 10.1002/pro.3749 10.1101/gr.098376.109 10.1107/S0907444998009378 10.1093/nar/gkab1238 10.1126/science.860134 10.1002/prot.26237 10.1038/s41598-017-15635-8 10.1002/prot.20138 10.1093/database/bas003 10.1021/acs.jpcb.2c05508 10.1038/35070613 10.1016/j.sbi.2008.10.002 10.12688/f1000research.10079.1 10.1002/prot.10018 10.1016/j.tig.2009.07.006 10.1002/1097-0134(20001115)41:3<415::AID-PROT130>3.0.CO;2-7 |
| ContentType | Journal Article |
| Copyright | 2023 The Authors. Proteins: Structure, Function, and Bioinformatics published by Wiley Periodicals LLC. This article is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
| DOI | 10.1002/prot.26496 |
| DatabaseName | Wiley Online Library Open Access CrossRef PubMed Bacteriology Abstracts (Microbiology B) Biotechnology Research Abstracts Calcium & Calcified Tissue Abstracts Chemoreception Abstracts Neurosciences Abstracts Nucleic Acids Abstracts Virology and AIDS Abstracts Technology Research Database Environmental Sciences and Pollution Management Engineering Research Database AIDS and Cancer Research Abstracts ProQuest Health & Medical Complete (Alumni) Algology Mycology and Protozoology Abstracts (Microbiology C) Biotechnology and BioEngineering Abstracts Genetics Abstracts MEDLINE - Academic Unpaywall for CDI: Periodical Content Unpaywall |
| DatabaseTitle | CrossRef PubMed Virology and AIDS Abstracts Technology Research Database Nucleic Acids Abstracts ProQuest Health & Medical Complete (Alumni) Neurosciences Abstracts Biotechnology and BioEngineering Abstracts Environmental Sciences and Pollution Management Genetics Abstracts Biotechnology Research Abstracts Bacteriology Abstracts (Microbiology B) Algology Mycology and Protozoology Abstracts (Microbiology C) AIDS and Cancer Research Abstracts Chemoreception Abstracts Engineering Research Database Calcium & Calcified Tissue Abstracts MEDLINE - Academic |
| DatabaseTitleList | PubMed Virology and AIDS Abstracts CrossRef MEDLINE - Academic |
| Database_xml | – sequence: 1 dbid: 24P name: Wiley Online Library Open Access url: https://authorservices.wiley.com/open-science/open-access/browse-journals.html sourceTypes: Publisher – sequence: 2 dbid: NPM name: PubMed url: https://proxy.k.utb.cz/login?url=http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed sourceTypes: Index Database – sequence: 3 dbid: UNPAY name: Unpaywall url: https://proxy.k.utb.cz/login?url=https://unpaywall.org/ sourceTypes: Open Access Repository |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Anatomy & Physiology Chemistry Biology |
| EISSN | 1097-0134 |
| EndPage | 1115 |
| ExternalDocumentID | 10.1002/prot.26496 37092778 10_1002_prot_26496 PROT26496 |
| Genre | article Journal Article |
| GrantInformation_xml | – fundername: Center for Scientific Excellence at the Weizmann Institute of Science |
| IEDL.DBID | 24P |
| ISSN | 0887-3585 1097-0134 |
| IngestDate | Wed Oct 01 15:58:01 EDT 2025 Wed Oct 01 14:28:23 EDT 2025 Tue Oct 07 06:05:35 EDT 2025 Mon Jul 21 06:00:40 EDT 2025 Thu Apr 24 23:02:07 EDT 2025 Wed Oct 01 01:19:21 EDT 2025 Wed Jan 22 16:19:15 EST 2025 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 8 |
| Keywords | orphan protein; molten globule; taxonomically restricted; intrinsically disordered protein; protein structure prediction |
| Language | English |
| License | Attribution 2023 The Authors. Proteins: Structure, Function, and Bioinformatics published by Wiley Periodicals LLC. cc-by |
| LinkModel | DirectLink |
| Notes | Jing Liu and Rongqing Yuan contributed equally to this work. |
| ORCID | 0000-0003-0306-3878 0000-0003-1923-0829 |
| OpenAccessLink | https://onlinelibrary.wiley.com/doi/abs/10.1002%2Fprot.26496 |
| PMID | 37092778 |
| PQID | 2835517588 |
| PQPubID | 1016441 |
| PageCount | 19 |
| PublicationCentury | 2000 |
| PublicationDate | August 2023 2023-08-00 2023-Aug 20230801 |
| PublicationDateYYYYMMDD | 2023-08-01 |
| PublicationDate_xml | – month: 08 year: 2023 text: August 2023 |
| PublicationDecade | 2020 |
| PublicationPlace | Hoboken, USA |
| PublicationPlace_xml | – name: Hoboken, USA – name: United States – name: Hoboken |
| PublicationTitle | Proteins, structure, function, and bioinformatics |
| PublicationTitleAlternate | Proteins |
| PublicationYear | 2023 |
| Publisher | John Wiley & Sons, Inc Wiley Subscription Services, Inc |
| Publisher_xml | – name: John Wiley & Sons, Inc – name: Wiley Subscription Services, Inc |
year: 2021 end-page: 876 article-title: Accurate prediction of protein structures and interactions using a three‐track neural network publication-title: Science – volume: 54 start-page: 20 issue: 1 year: 2004 end-page: 40 article-title: Proteomic signatures: amino acid and oligopeptide compositions differentiate among phyla publication-title: Proteins – volume: 13 start-page: 1711 issue: 7 year: 2004 end-page: 1723 article-title: De novo proteins from designed combinatorial libraries publication-title: Protein Sci – volume: 9 start-page: e53500 year: 2020 article-title: Synteny‐based analyses indicate that sequence divergence is not the main source of orphan genes publication-title: Elife – volume: 410 start-page: 715 issue: 6829 year: 2001 end-page: 718 article-title: Functional proteins from a random‐sequence library publication-title: Nature – volume: 7 issue: 2 year: 2012 article-title: Structural view of a non Pfam singleton and crystal packing analysis publication-title: PLoS One – volume: 10 issue: 1 year: 2014 article-title: NCYM, a cis‐antisense gene of MYCN, encodes a de novo evolved protein that inhibits GSK3beta resulting in the stabilization of MYCN in human neuroblastomas publication-title: PLoS Genet – volume: 487 start-page: 370 issue: 7407 year: 2012 end-page: 374 article-title: Proto‐genes and de novo gene birth publication-title: Nature – volume: 19 start-page: 1752 issue: 10 year: 2009 end-page: 1759 article-title: Recent de novo origin of human protein‐coding genes publication-title: Genome Res – volume: 41 start-page: 415 issue: 3 year: 2000 end-page: 427 article-title: Why are “natively unfolded” proteins unstructured under physiologic conditions? 
publication-title: Proteins – volume: 19 start-page: 679 issue: 6 year: 2022 end-page: 682 article-title: ColabFold: making protein folding accessible to all publication-title: Nat Methods – volume: 44 start-page: 1989 issue: 6 year: 2005 end-page: 2000 article-title: Comparing and combining predictors of mostly disordered proteins publication-title: Biochemistry – volume: 26 start-page: 603 issue: 3 year: 2009 end-page: 612 article-title: Origin of primate orphan genes: a comparative genomics approach publication-title: Mol Biol Evol – volume: 600 start-page: 547 issue: 7889 year: 2021 end-page: 552 article-title: De novo protein design by deep network hallucination publication-title: Nature – volume: 52 start-page: 5167 issue: 31 year: 2013 end-page: 5175 article-title: Cooperative unfolding of compact conformations of the intrinsically disordered protein osteopontin publication-title: Biochemistry – volume: 40 start-page: 1617 issue: 11 year: 2022 end-page: 1623 article-title: Single‐sequence protein structure prediction using a language model and deep learning publication-title: Nat Biotechnol – volume: 29 start-page: 2722 issue: 21 year: 2013 end-page: 2728 article-title: lDDT: a local superposition‐free score for comparing protein structures and models using distance difference tests publication-title: Bioinformatics – volume: 26 start-page: 316 issue: 1 year: 1998 end-page: 319 article-title: Touring protein fold space with Dali/FSSP publication-title: Nucleic Acids Res – volume: 12 start-page: 692 issue: 10 year: 2011 end-page: 702 article-title: The evolutionary origin of orphan genes publication-title: Nat Rev Genet – volume: 8 year: 2019 article-title: A de novo evolved gene in the house mouse regulates female pregnancy cycles publication-title: Elife – volume: 53 start-page: 758 issue: 3 year: 2003 end-page: 767 article-title: The intracellular domain of the cholinesterase‐like neural adhesion protein, gliotactin, is natively unfolded publication-title: 
Proteins – volume: 21 start-page: 3435 issue: 16 year: 2005 end-page: 3438 article-title: FoldIndex: a simple tool to predict whether a given protein sequence is intrinsically unfolded publication-title: Bioinformatics – volume: 370 issue: 1678 year: 2015 article-title: New genes from non‐coding sequence: the role of de novo protein‐coding genes in eukaryotic evolutionary innovation publication-title: Philos Trans R Soc Lond B Biol Sci – volume: 50 start-page: W210 issue: W1 year: 2022 end-page: W215 article-title: Dali server: structural unification of protein families publication-title: Nucleic Acids Res – volume: 3 year: 2014 article-title: Long non‐coding RNAs as a source of new peptides publication-title: Elife – volume: 30 start-page: 1072 issue: 11 year: 2012 end-page: 1080 article-title: Protein structure prediction from sequence variation publication-title: Nat Biotechnol – volume: 19 start-page: 1693 issue: 10 year: 2009 end-page: 1695 article-title: Darwinian alchemy: human genes from noncoding DNA publication-title: Genome Res – volume: 7 start-page: 287 issue: Pt 2 year: 2020 end-page: 293 article-title: Structure and mechanism of copper‐carbonic anhydrase II: a nitrite reductase publication-title: IUCrJ – volume: 13 year: 2022 article-title: Thermal proteome profiling reveals orphan protein HCO_011565 as a target of the nematocidal small molecule UMW‐868 publication-title: Front Pharmacol – volume: 53 start-page: 209 year: 2000 end-page: 282 article-title: Role of the molten globule state in protein folding publication-title: Adv Protein Chem – volume: 41 issue: 12 year: 2022 article-title: De novo birth of functional microproteins in the human lineage publication-title: Cell Rep – volume: 3 start-page: 840 issue: 8 year: 2006 end-page: 859 article-title: Investigation of de novo totally random biosequences, part II: on the folding frequency in a totally random library of de novo proteins obtained by phage display publication-title: Chem Biodivers – 
volume: 8 issue: 2 year: 2013 article-title: PBOV1 is a human de novo gene with tumor‐specific expression that is associated with a positive clinical outcome of cancer publication-title: PLoS One – volume: 31 start-page: 12248 issue: 48 year: 1992 end-page: 12254 article-title: Chemical modification of acetylcholinesterase by disulfides: appearance of a "molten globule" state publication-title: Biochemistry – volume: 169 start-page: 2895 issue: 4 year: 2015 end-page: 2906 article-title: encodes a orphan protein that interacts with SnRK1 and enhances resistance to the Mycotoxigenic fungus publication-title: Plant Physiol – volume: 2012 year: 2012 article-title: AntiFam: a tool to help identify spurious ORFs in protein annotation publication-title: Database – volume: 54 start-page: 1078 issue: Pt 6 Pt 1 year: 1998 end-page: 1084 article-title: Protein data bank (PDB): a database of 3D structural information of biological macromolecules publication-title: Acta Crystallogr D Biol Crystallogr – volume: 50 issue: 7 year: 2022 article-title: Foster thy young: enhanced prediction of orphan genes in assembled genomes publication-title: Nucleic Acids Res – volume: 6 issue: 3 year: 2010 article-title: A human‐specific de novo protein‐coding gene associated with human brain functions publication-title: PLoS Comput Biol – volume: 3 start-page: 488 issue: 6 year: 1996 end-page: 490 article-title: How molten is the molten globule? 
publication-title: Nat Struct Biol – volume: 38 start-page: ii95 issue: Supplement_2 year: 2022 end-page: ii98 article-title: DistilProtBert: a distilled protein language model used to distinguish between real proteins and their randomly shuffled counterparts publication-title: Bioinformatics – volume: 18 start-page: 472 issue: 5 year: 2021 end-page: 481 article-title: Critical assessment of protein intrinsic disorder prediction publication-title: Nat Methods – volume: 12 start-page: 409 issue: 3 year: 2002 end-page: 416 article-title: Did evolution leap to create the protein universe? publication-title: Curr Opin Struct Biol – volume: 46 start-page: 1 issue: 1 year: 2002 end-page: 7 article-title: Intrinsic structural disorder and sequence features of the cell cycle inhibitor p57Kip2 publication-title: Proteins – volume: 596 start-page: 583 issue: 7873 year: 2021 end-page: 589 article-title: Highly accurate protein structure prediction with AlphaFold publication-title: Nature – year: 1999 – volume: 12 issue: 1 year: 2021 article-title: flDPnn: accurate intrinsic disorder prediction with putative propensities of disorder functions publication-title: Nat Commun – volume-title: Introduction to Protein Structure year: 1999 ident: e_1_2_9_2_1 – ident: e_1_2_9_61_1 doi: 10.1021/bi047993o – ident: e_1_2_9_55_1 doi: 10.1093/nar/26.1.316 – ident: e_1_2_9_20_1 doi: 10.7554/eLife.53500 – ident: e_1_2_9_54_1 doi: 10.1038/s41467-021-21511-x – ident: e_1_2_9_66_1 doi: 10.1093/nar/gkac387 – ident: e_1_2_9_23_1 doi: 10.1016/j.biochi.2011.07.014 – ident: e_1_2_9_59_1 doi: 10.3389/fphy.2019.00010 – ident: e_1_2_9_39_1 doi: 10.1002/prot.10471 – ident: e_1_2_9_53_1 doi: 10.1093/bioinformatics/btt473 – ident: e_1_2_9_45_1 doi: 10.3389/fphar.2022.1014804 – ident: e_1_2_9_51_1 doi: 10.1038/s41598-018-25867-x – volume: 11 start-page: 161 year: 2000 ident: e_1_2_9_60_1 article-title: Intrinsic protein disorder in complete genomes publication-title: Genome Inform – ident: e_1_2_9_30_1 doi: 
10.1093/gbe/evaa194 – ident: e_1_2_9_15_1 doi: 10.1038/nrg3053 – ident: e_1_2_9_3_1 doi: 10.1002/prot.10559 – ident: e_1_2_9_75_1 doi: 10.1038/nsb0696-488 – ident: e_1_2_9_13_1 doi: 10.1126/science.abj8754 – ident: e_1_2_9_31_1 doi: 10.7554/eLife.03523 – ident: e_1_2_9_43_1 doi: 10.1002/prot.20148 – ident: e_1_2_9_71_1 doi: 10.1002/prot.10011 – ident: e_1_2_9_29_1 doi: 10.1371/journal.pgen.1002942 – ident: e_1_2_9_14_1 doi: 10.1038/s41586-021-04184-w – ident: e_1_2_9_58_1 doi: 10.1038/s41467-021-24773-7 – ident: e_1_2_9_65_1 doi: 10.1016/j.celrep.2022.111808 – ident: e_1_2_9_38_1 doi: 10.1107/S2052252520000986 – ident: e_1_2_9_25_1 doi: 10.1101/gr.095026.109 – ident: e_1_2_9_33_1 – ident: e_1_2_9_73_1 doi: 10.1016/j.sbi.2020.11.010 – ident: e_1_2_9_67_1 doi: 10.1038/s41592-021-01117-3 – ident: e_1_2_9_8_1 doi: 10.1002/cbdv.200690088 – ident: e_1_2_9_46_1 doi: 10.1371/journal.pone.0056162 – ident: e_1_2_9_5_1 doi: 10.1371/journal.pone.0036634 – ident: e_1_2_9_12_1 doi: 10.1110/ps.04690804 – ident: e_1_2_9_57_1 doi: 10.1093/bioinformatics/bti537 – ident: e_1_2_9_7_1 doi: 10.1002/cbdv.200690087 – ident: e_1_2_9_44_1 doi: 10.1371/journal.pone.0031673 – ident: e_1_2_9_80_1 doi: 10.1038/s41587-022-01432-w – ident: e_1_2_9_72_1 doi: 10.1016/j.sbi.2014.05.006 – ident: e_1_2_9_48_1 doi: 10.1371/journal.pgen.1003996 – ident: e_1_2_9_52_1 doi: 10.1038/s41592-022-01488-1 – ident: e_1_2_9_78_1 doi: 10.1038/s41592-021-01362-6 – ident: e_1_2_9_49_1 doi: 10.7554/eLife.44392 – ident: e_1_2_9_17_1 doi: 10.1371/journal.pgen.1003860 – ident: e_1_2_9_18_1 doi: 10.1098/rstb.2014.0332 – ident: e_1_2_9_41_1 doi: 10.1021/bi400502c – ident: e_1_2_9_16_1 doi: 10.1038/nature11184 – ident: e_1_2_9_6_1 doi: 10.1093/bioinformatics/btac474 – ident: e_1_2_9_68_1 doi: 10.1093/bioadv/vbab043 – ident: e_1_2_9_70_1 doi: 10.1016/S0959-440X(02)00337-8 – ident: e_1_2_9_35_1 doi: 10.1093/nar/28.1.235 – volume: 169 start-page: 2895 issue: 4 year: 2015 ident: e_1_2_9_50_1 article-title: TaFROG encodes a 
Pooideae orphan protein that interacts with SnRK1 and enhances resistance to the Mycotoxigenic fungus Fusarium graminearum publication-title: Plant Physiol – ident: e_1_2_9_79_1 doi: 10.1038/nbt.2419 – ident: e_1_2_9_28_1 doi: 10.1371/journal.pgen.1002379 – ident: e_1_2_9_47_1 doi: 10.1371/journal.pcbi.1000734 – ident: e_1_2_9_64_1 doi: 10.1101/2022.02.18.481080 – ident: e_1_2_9_63_1 doi: 10.1016/j.jmb.2021.167208 – ident: e_1_2_9_77_1 doi: 10.1021/bi00163a039 – ident: e_1_2_9_4_1 doi: 10.1093/bioinformatics/btp660 – ident: e_1_2_9_32_1 doi: 10.1038/s41586-021-03819-2 – ident: e_1_2_9_76_1 doi: 10.1016/S0065-3233(00)53005-8 – ident: e_1_2_9_27_1 doi: 10.1093/molbev/msn281 – ident: e_1_2_9_10_1 doi: 10.1038/nsb1095-856 – ident: e_1_2_9_56_1 doi: 10.1002/pro.3749 – ident: e_1_2_9_26_1 doi: 10.1101/gr.098376.109 – ident: e_1_2_9_34_1 doi: 10.1107/S0907444998009378 – ident: e_1_2_9_21_1 doi: 10.1093/nar/gkab1238 – ident: e_1_2_9_22_1 doi: 10.1126/science.860134 – ident: e_1_2_9_36_1 doi: 10.1002/prot.26237 – ident: e_1_2_9_9_1 doi: 10.1038/s41598-017-15635-8 – ident: e_1_2_9_42_1 doi: 10.1002/prot.20138 – ident: e_1_2_9_69_1 doi: 10.1093/database/bas003 – ident: e_1_2_9_62_1 doi: 10.1021/acs.jpcb.2c05508 – ident: e_1_2_9_11_1 doi: 10.1038/35070613 – ident: e_1_2_9_37_1 doi: 10.1016/j.sbi.2008.10.002 – ident: e_1_2_9_19_1 doi: 10.12688/f1000research.10079.1 – ident: e_1_2_9_40_1 doi: 10.1002/prot.10018 – ident: e_1_2_9_24_1 doi: 10.1016/j.tig.2009.07.006 – ident: e_1_2_9_74_1 doi: 10.1002/1097-0134(20001115)41:3<415::AID-PROT130>3.0.CO;2-7 |
| SourceID | unpaywall proquest pubmed crossref wiley |
| SourceType | Open Access Repository Aggregation Database Index Database Enrichment Source Publisher |
| StartPage | 1097 |
| SubjectTerms | Algorithms; Amino acids; Crystal structure; Deep learning; Experimental data; Homology; intrinsically disordered protein; Machine learning; molten globule; Open reading frames; orphan protein; Polypeptides; Protein folding; Protein structure; protein structure prediction; Proteins; Secondary structure; Statistical analysis; taxonomically restricted |
| Title | Do “Newly Born” orphan proteins resemble “Never Born” proteins? A study using three deep learning algorithms |
| URI | https://onlinelibrary.wiley.com/doi/abs/10.1002%2Fprot.26496 https://www.ncbi.nlm.nih.gov/pubmed/37092778 https://www.proquest.com/docview/2835517588 https://www.proquest.com/docview/2805514419 https://onlinelibrary.wiley.com/doi/pdfdirect/10.1002/prot.26496 |
| UnpaywallVersion | publishedVersion |
| Volume | 91 |