Using standoff properties for marking-up historical documents in the humanities

Bibliographic Details
Published in Information technology (Munich, Germany) Vol. 58; no. 2; pp. 63 - 69
Main Author Schmidt, Desmond Allan
Format Journal Article
Language English
Published De Gruyter Oldenbourg 01.03.2016
ISSN 1611-2776
EISSN 2196-7032
DOI 10.1515/itit-2015-0030

Abstract Markup in the form of tags is often embedded into documents to describe formatting structures and other features, as in HTML on the Web. But in the humanities, the use of embedded markup for the transcription of historical documents leads to problems in the representation of overlapping features, and subjective variation in the use of different markup tags for the same features compromises interoperability of the transcriptions. “Standoff” techniques, in which the markup and the text it describes are stored separately, can help alleviate these problems. “Standoff properties” is a technique for recording textual properties that do not conform to a context-free grammar, and can freely overlap. This allows a divide-and-conquer approach to markup, whereby sets of markup properties can record different aspects of a text, which can then be recombined as needed. Despite these advantages, standoff techniques are usually considered impractical when both the underlying text and its markup are subject to change. To circumvent this problem, this paper describes a practical algorithm for updating a set of standoff markup properties separately from the text.
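As a rough illustration of the idea in the abstract, the following minimal Python sketch stores properties apart from the text as (name, offset, length) records, so that two properties may overlap without nesting. The class `StandoffProperty`, the helper functions, and the sample text are hypothetical names invented for this illustration. The `apply_insertion` function is only a naive offset adjustment for a single insertion; it is an assumption made here and is not the update algorithm described in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StandoffProperty:
    """A named property covering text[offset : offset + length]."""
    name: str
    offset: int
    length: int

    @property
    def end(self) -> int:
        return self.offset + self.length


def covered(text: str, prop: StandoffProperty) -> str:
    """Return the stretch of text that a property describes."""
    return text[prop.offset:prop.end]


def properties_at(props, pos: int):
    """Names of all properties covering character position pos."""
    return [p.name for p in props if p.offset <= pos < p.end]


def apply_insertion(props, pos: int, n: int):
    """Naive adjustment after inserting n characters at pos (illustrative only)."""
    adjusted = []
    for p in props:
        if pos <= p.offset:
            # Insertion at or before the start: shift the whole property.
            adjusted.append(StandoffProperty(p.name, p.offset + n, p.length))
        elif pos < p.end:
            # Insertion inside the property: extend it.
            adjusted.append(StandoffProperty(p.name, p.offset, p.length + n))
        else:
            # Insertion after the property: nothing changes.
            adjusted.append(p)
    return adjusted


# The text carries no markup; the properties live in a separate list.
text = "the quick brown fox"
props = [
    StandoffProperty("italic", 4, 11),  # "quick brown"
    StandoffProperty("line", 10, 9),    # "brown fox", overlaps "italic"
]

# Edit the text, then adjust the properties separately.
new_text = text[:4] + "very " + text[4:]
new_props = apply_insertion(props, 4, 5)
```

In this sketch, `properties_at(props, 12)` returns both names because position 12 lies inside "brown", which the two properties share; embedded tags could express that overlap only by splitting one of the elements.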
Author Schmidt, Desmond Allan
  email: d.schmidt1@uq.edu.au
  organization: University of Queensland, Brisbane, QLD, Australia
Discipline Engineering
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
License This content is free.
OpenAccessLink https://proxy.k.utb.cz/login?url=http://www.degruyter.com/downloadpdf/j/itit.2016.58.issue-2/itit-2015-0030/itit-2015-0030.xml
PageCount 7
SubjectTerms Applied computing→Document management and text processing→Document management→Text Editing
historical text editing
Information systems→World Wide Web→Web data description languages→Markup languages
overlapping hierarchies
Standoff markup
URI https://www.degruyter.com/doi/10.1515/itit-2015-0030
http://www.degruyter.com/downloadpdf/j/itit.2016.58.issue-2/itit-2015-0030/itit-2015-0030.xml