Using standoff properties for marking-up historical documents in the humanities
Published in | Information technology (Munich, Germany) Vol. 58; no. 2; pp. 63 - 69 |
---|---|
Format | Journal Article |
Language | English |
Published | De Gruyter Oldenbourg, 01.03.2016 |
ISSN | 1611-2776 2196-7032 |
DOI | 10.1515/itit-2015-0030 |
Abstract | Markup in the form of tags is often embedded into documents to describe formatting structures and other features, as in HTML on the
Web. But in the humanities, the use of embedded markup for the transcription of historical documents leads to problems in the
representation of overlapping features, and subjective variation in the use of different markup tags for the same features compromises
interoperability of the transcriptions. “Standoff” techniques, in which the markup and the text it describes are stored separately, can
help alleviate these problems. “Standoff properties” is a technique for recording textual properties that do not conform to
a context-free grammar, and can freely overlap. This allows a divide-and-conquer approach to markup, whereby sets of markup properties
can record different aspects of a text, which can then be recombined as needed. Despite these advantages, standoff techniques are
usually considered impractical when both the underlying text and its markup are subject to change. To circumvent this problem, this
paper describes a practical algorithm for updating a set of standoff markup properties separately from the text. |
---|---|
Author | Schmidt, Desmond Allan |
Author affiliation | University of Queensland, Brisbane, QLD, Australia; email: d.schmidt1@uq.edu.au |
Discipline | Engineering |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
License | This content is free. |
SubjectTerms | Applied computing → Document management and text processing → Document management → Text editing; historical text editing; Information systems → World Wide Web → Web data description languages → Markup languages; overlapping hierarchies; standoff markup |
URI | https://www.degruyter.com/doi/10.1515/itit-2015-0030 http://www.degruyter.com/downloadpdf/j/itit.2016.58.issue-2/itit-2015-0030/itit-2015-0030.xml |
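
The abstract describes standoff properties as markup stored separately from the text, where properties may overlap freely. The following Python sketch illustrates that idea under stated assumptions: the `StandoffProperty` class, the offset/length representation, and the `insert_text` helper are hypothetical names chosen for this illustration. In particular, `insert_text` is only a naive offset adjustment for a single insertion and is not the updating algorithm described in the paper.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StandoffProperty:
    """A named range over the text, stored apart from the text itself."""
    name: str
    offset: int   # index of the first character covered
    length: int   # number of characters covered

    @property
    def end(self) -> int:
        return self.offset + self.length


def text_of(text: str, prop: StandoffProperty) -> str:
    """Return the substring that a property describes."""
    return text[prop.offset:prop.end]


def partially_overlap(a: StandoffProperty, b: StandoffProperty) -> bool:
    """True when the ranges cross without one containing the other."""
    return (a.offset < b.offset < a.end < b.end
            or b.offset < a.offset < b.end < a.end)


def insert_text(text, props, pos, inserted):
    """Naive illustration only: insert a string and adjust the properties."""
    n = len(inserted)
    new_text = text[:pos] + inserted + text[pos:]
    updated = []
    for p in props:
        if pos <= p.offset:
            # Insertion at or before the range: shift the range right.
            updated.append(replace(p, offset=p.offset + n))
        elif pos < p.end:
            # Insertion inside the range: grow the range.
            updated.append(replace(p, length=p.length + n))
        else:
            # Insertion after the range: nothing to change.
            updated.append(p)
    return new_text, updated


TEXT = "The quick brown fox"
ITALIC = StandoffProperty("italic", 4, 11)  # covers "quick brown"
LINE = StandoffProperty("line", 10, 9)      # covers "brown fox"
```

Here `ITALIC` and `LINE` share the word "brown" without either range containing the other, which a single tree of embedded tags could not express directly. Because the text carries no tags, each property set can be stored and combined independently.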