GFTE: Graph-Based Financial Table Extraction
Tabular data is a crucial form of information expression that organizes data in a standard structure for easy retrieval and comparison. However, in the financial industry and many other fields, tables are often disclosed in unstructured digital files, e.g., Portable Document Format (PDF)...
| Published in | Pattern Recognition. ICPR International Workshops and Challenges Vol. 12662; pp. 644 - 658 |
|---|---|
| Main Authors | Li, Yiren; Huang, Zheng; Yan, Junchi; Zhou, Yi; Ye, Fan; Liu, Xianhui |
| Format | Book Chapter |
| Language | English |
| Published | Switzerland: Springer International Publishing AG / Springer International Publishing, 2021 |
| Series | Lecture Notes in Computer Science |
| Subjects | Deep learning; Document analysis; Document image processing |
| ISBN | 9783030687892, 3030687899 |
| ISSN | 0302-9743, 1611-3349 |
| DOI | 10.1007/978-3-030-68790-8_50 |
Cover
| Abstract | Tabular data is a crucial form of information expression that organizes data in a standard structure for easy retrieval and comparison. However, in the financial industry and many other fields, tables are often disclosed in unstructured digital files, e.g., Portable Document Format (PDF) files and images, from which they are difficult to extract directly. In this paper, to facilitate deep-learning-based table extraction from unstructured digital files, we publish a standard Chinese dataset named FinTab, which contains more than 1,600 financial tables of diverse kinds and their corresponding structure representations in JSON. In addition, we propose a novel graph-based convolutional neural network model named GFTE as a baseline for future comparison. GFTE integrates image, position, and textual features for precise edge prediction and achieves good overall results. See https://github.com/Irene323/GFTE. |
|---|---|
| Author | Yan, Junchi; Zhou, Yi; Li, Yiren; Huang, Zheng; Ye, Fan; Liu, Xianhui |
| Author_xml | 1. Li, Yiren (ORCID: 0000-0002-8684-628X; email: irene716@sjtu.edu.cn); 2. Huang, Zheng; 3. Yan, Junchi; 4. Zhou, Yi; 5. Ye, Fan; 6. Liu, Xianhui |
| ContentType | Book Chapter |
| Copyright | Springer Nature Switzerland AG 2021 |
| DBID | FFUUA |
| DatabaseName | ProQuest Ebook Central - Book Chapters - Demo use only |
| DatabaseTitleList | |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Applied Sciences Computer Science |
| EISBN | 9783030687908, 3030687902 |
| EISSN | 1611-3349 |
| Editor | Del Bimbo, Alberto; Bertini, Marco; Vezzani, Roberto; Sclaroff, Stan; Mei, Tao; Farinella, Giovanni Maria; Cucchiara, Rita; Escalante, Hugo Jair |
| EndPage | 658 |
| ExternalDocumentID | EBC6508429_679_659 |
| IngestDate | Wed Sep 17 04:49:35 EDT 2025 Tue Oct 21 01:39:40 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | false |
| LCCallNum | TA1634 |
| LinkModel | OpenURL |
| OCLC | 1241066755 |
| ORCID | 0000-0002-8684-628X |
| PQID | EBC6508429_679_659 |
| PageCount | 15 |
| ParticipantIDs | springer_books_10_1007_978_3_030_68790_8_50 proquest_ebookcentralchapters_6508429_679_659 |
| PublicationCentury | 2000 |
| PublicationDate | 2021 |
| PublicationDateYYYYMMDD | 2021-01-01 |
| PublicationDate_xml | – year: 2021 text: 2021 |
| PublicationDecade | 2020 |
| PublicationPlace | Switzerland |
| PublicationPlace_xml | – name: Switzerland – name: Cham |
| PublicationSeriesSubtitle | Image Processing, Computer Vision, Pattern Recognition, and Graphics |
| PublicationSeriesTitle | Lecture Notes in Computer Science |
| PublicationSeriesTitleAlternate | Lect.Notes Computer |
| PublicationSubtitle | Virtual Event, January 10-15, 2021, Proceedings, Part II |
| PublicationTitle | Pattern Recognition. ICPR International Workshops and Challenges |
| PublicationYear | 2021 |
| Publisher | Springer International Publishing AG; Springer International Publishing |
| RelatedPersons | Hartmanis, Juris; Gao, Wen; Bertino, Elisa; Woeginger, Gerhard; Goos, Gerhard; Steffen, Bernhard; Yung, Moti |
| RelatedPersons_xml | 1. Goos, Gerhard; 2. Hartmanis, Juris; 3. Bertino, Elisa; 4. Gao, Wen; 5. Steffen, Bernhard (ORCID: 0000-0001-9619-1558); 6. Woeginger, Gerhard (ORCID: 0000-0001-8816-2693); 7. Yung, Moti |
| SSID | ssj0002502353 ssj0002792 |
| Score | 1.825403 |
| SourceID | springer proquest |
| SourceType | Publisher |
| StartPage | 644 |
| SubjectTerms | Deep learning; Document analysis; Document image processing |
| Title | GFTE: Graph-Based Financial Table Extraction |
| URI | http://ebookcentral.proquest.com/lib/SITE_ID/reader.action?docID=6508429&ppg=659 http://link.springer.com/10.1007/978-3-030-68790-8_50 |
| Volume | 12662 |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |