A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs
| Published in | Science (American Association for the Advancement of Science) Vol. 358; no. 6368 |
|---|---|
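| Main Authors | George, Dileep; Lehrach, Wolfgang; Kansky, Ken; Lázaro-Gredilla, Miguel; Laan, Christopher; Marthi, Bhaskara; Lou, Xinghua; Meng, Zhaoshi; Liu, Yi; Wang, Huayan; Lavin, Alex; Phoenix, D Scott |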
| Format | Journal Article |
| Language | English |
| Published | United States, 8 December 2017 |
| ISSN | 1095-9203 |
| DOI | 10.1126/science.aag2612 |
| Abstract | Learning from a few examples and generalizing to markedly different situations are capabilities of human visual intelligence that are yet to be matched by leading machine learning models. By drawing inspiration from systems neuroscience, we introduce a probabilistic generative model for vision in which message-passing-based inference handles recognition, segmentation, and reasoning in a unified way. The model demonstrates excellent generalization and occlusion-reasoning capabilities and outperforms deep neural networks on a challenging scene text recognition benchmark while being 300-fold more data efficient. In addition, the model fundamentally breaks the defense of modern text-based CAPTCHAs (Completely Automated Public Turing test to tell Computers and Humans Apart) by generatively segmenting characters without CAPTCHA-specific heuristics. Our model emphasizes aspects such as data efficiency and compositionality that may be important in the path toward general artificial intelligence. |
| Author | Laan, Christopher; George, Dileep; Lázaro-Gredilla, Miguel; Kansky, Ken; Marthi, Bhaskara; Liu, Yi; Lou, Xinghua; Lavin, Alex; Wang, Huayan; Lehrach, Wolfgang; Meng, Zhaoshi; Phoenix, D Scott |
| Author_xml | – sequence: 1 givenname: Dileep orcidid: 0000-0002-4948-6297 surname: George fullname: George, Dileep email: dileep@vicarious.com, miguel@vicarious.com organization: Vicarious AI, 2 Union Square, Union City, CA 94587, USA – sequence: 2 givenname: Wolfgang orcidid: 0000-0001-8258-2961 surname: Lehrach fullname: Lehrach, Wolfgang organization: Vicarious AI, 2 Union Square, Union City, CA 94587, USA – sequence: 3 givenname: Ken orcidid: 0000-0003-1579-9882 surname: Kansky fullname: Kansky, Ken organization: Vicarious AI, 2 Union Square, Union City, CA 94587, USA – sequence: 4 givenname: Miguel orcidid: 0000-0002-4528-5084 surname: Lázaro-Gredilla fullname: Lázaro-Gredilla, Miguel email: dileep@vicarious.com, miguel@vicarious.com organization: Vicarious AI, 2 Union Square, Union City, CA 94587, USA – sequence: 5 givenname: Christopher surname: Laan fullname: Laan, Christopher organization: Vicarious AI, 2 Union Square, Union City, CA 94587, USA – sequence: 6 givenname: Bhaskara surname: Marthi fullname: Marthi, Bhaskara organization: Vicarious AI, 2 Union Square, Union City, CA 94587, USA – sequence: 7 givenname: Xinghua surname: Lou fullname: Lou, Xinghua organization: Vicarious AI, 2 Union Square, Union City, CA 94587, USA – sequence: 8 givenname: Zhaoshi surname: Meng fullname: Meng, Zhaoshi organization: Vicarious AI, 2 Union Square, Union City, CA 94587, USA – sequence: 9 givenname: Yi orcidid: 0000-0003-2745-6940 surname: Liu fullname: Liu, Yi organization: Vicarious AI, 2 Union Square, Union City, CA 94587, USA – sequence: 10 givenname: Huayan surname: Wang fullname: Wang, Huayan organization: Vicarious AI, 2 Union Square, Union City, CA 94587, USA – sequence: 11 givenname: Alex orcidid: 0000-0003-3422-7820 surname: Lavin fullname: Lavin, Alex organization: Vicarious AI, 2 Union Square, Union City, CA 94587, USA – sequence: 12 givenname: D Scott surname: Phoenix fullname: Phoenix, D Scott organization: Vicarious AI, 2 Union Square, Union City, CA 94587, USA |
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/29074582 (View this record in MEDLINE/PubMed) |
| ContentType | Journal Article |
| Copyright | Copyright © 2017 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. |
| DatabaseName | PubMed |
| Discipline | Sciences (General) Biology |
| EISSN | 1095-9203 |
| ExternalDocumentID | 29074582 |
| Genre | Journal Article |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 6368 |
| Language | English |
| License | Copyright © 2017 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. |
| LinkModel | OpenURL |
| ORCID | 0000-0002-4948-6297 0000-0002-4528-5084 0000-0003-2745-6940 0000-0003-3422-7820 0000-0003-1579-9882 0000-0001-8258-2961 |
| PMID | 29074582 |
| PublicationCentury | 2000 |
| PublicationDate | 2017-Dec-08 |
| PublicationDateYYYYMMDD | 2017-12-08 |
| PublicationDate_xml | – month: 12 year: 2017 text: 2017-Dec-08 day: 08 |
| PublicationDecade | 2010 |
| PublicationPlace | United States |
| PublicationPlace_xml | – name: United States |
| PublicationTitle | Science (American Association for the Advancement of Science) |
| PublicationTitleAlternate | Science |
| PublicationYear | 2017 |
| SourceID | pubmed |
| SourceType | Index Database |
| Title | A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs |
| URI | https://www.ncbi.nlm.nih.gov/pubmed/29074582 |
| Volume | 358 |
| linkProvider | National Library of Medicine |