Hello! @ephremishac and I would like to submit the following dataset for Syriac: 140 folios of Serto from the 16th century, transcribed as part of the Vienna HTR Winter School.
Here is our dataset YAML file:
schema: https://htr-united.github.io/schema/2023-06-27/schema.json
title: ÖNB Cod. Syr. 1, Ground Truth from HTR Winter School 2024
url: https://github.com/HTR-School-Vienna/2024--Syriac
authors:
  - name: Ephrem
    surname: Aboud Ishac
    orcid: 0000-0003-2943-6556
    roles:
      - project-manager
  - name: Christine
    surname: Roughan
    orcid: 0009-0004-5999-8749
    roles:
      - project-manager
  - name: Ammar
    surname: Awad
    roles:
      - transcriber
  - name: Carlo Biuzzi
    surname: Emilio
    orcid: 0000-0002-6108-3650
    roles:
      - transcriber
  - name: Saranya
    surname: Chandran
    roles:
      - transcriber
  - name: Jennifer
    surname: Griggs
    orcid: 0000-0002-7857-806X
    roles:
      - transcriber
  - name: Polina
    surname: Ivanova
    orcid: 0009-0002-6853-2129
    roles:
      - transcriber
  - name: Branko
    surname: Malešević
    orcid: 0009-0008-2419-6323
    roles:
      - transcriber
  - name: Stefan
    surname: Marić
    orcid: 0009-0008-5129-1932
    roles:
      - transcriber
  - name: Francesca
    surname: Nateri
    roles:
      - transcriber
  - name: Ivan
    surname: Petrov
    orcid: 0000-0003-4386-0097
    roles:
      - transcriber
  - name: Cristina
    surname: Tava
    roles:
      - transcriber
  - name: Maria S.
    surname: Thomas
    orcid: 0009-0008-1416-3499
    roles:
      - transcriber
institutions: []
description: >-
  Ground truth of 140 folios of ÖNB Cod. Syr. 1. This ground truth was produced
  by participants of the Vienna 2024 HTR Winter School, who used Transkribus to
  manually correct a preliminary automatic transcription that had been generated
  using Kraken/eScriptorium.
language:
  - syr
production-software: Transkribus
automatically-aligned: false
script:
  - iso: Syrj
script-type: only-manuscript
time:
  notBefore: '1545'
  notAfter: '1545'
hands:
  count: '1'
  precision: exact
license:
  name: CC-BY 4.0
  url: https://creativecommons.org/licenses/by/4.0/
format: Page-XML
volume:
  - metric: lines
    count: 2869
citation-file-link: https://github.com/HTR-School-Vienna/2024--Syriac/blob/main/CITATION.cff
transcription-guidelines: >-
  The segmentation of the folios followed the SegmOnto vocabulary for annotation
  of regions:

  - MainZone: the main column of text.

  - MainZone-gold: any sections of the main column where the text is written in
  gold block characters, as in the start of the text here. (The - character is a
  substitution for SegmOnto's recommended : character for declaring subtypes,
  since Transkribus did not allow for use of the colon character in the region
  name.)

  - MarginTextZone: any marginal words or phrases, including catchwords. Also
  used for interlinear glosses.

  - NumberingZone: any page or folio numbers.

  The transcription includes spaces, the Syriac letters, some diacritics,
  punctuation, and no vowel dots or markings.

  - Allowed diacritics:
    - Syome
    - Dots over feminine suffix heh
    - Dots in pronouns: above for demonstrative, below for personal
    - Dots in verbs: to distinguish participles and perfects
    - Dots to distinguish homographs
  - Excluded diacritics:
    - Vowel dots
    - Dots of hardening and softening (qushoyo and rukokho)

  Punctuation marks were not normalized, but rather transcribed as they appear
  in the manuscript (. ܆ ܇ : ܀).

  Transkribus's unclear tag was used when readings were uncertain or the text
  was damaged or unclear.
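
For anyone consuming the Page-XML, the `-` for `:` substitution described in the transcription guidelines can be reversed when region labels are read back. What follows is only a minimal sketch, not part of the dataset tooling: it assumes region names arrive as plain strings and that the only base zone types are the three named in the guidelines. The helper name `to_segmonto` is hypothetical.

```python
# Hypothetical helper (not from the dataset repository): map the
# Transkribus-safe region names used here back to SegmOnto-style labels.
# Assumption: the base zone types are exactly those named in the guidelines.
BASE_ZONES = ("MainZone", "MarginTextZone", "NumberingZone")


def to_segmonto(region_name: str) -> str:
    """Turn 'MainZone-gold' into 'MainZone:gold'; leave other names as they are."""
    for base in BASE_ZONES:
        prefix = base + "-"
        # Only rewrite when a non-empty subtype follows the hyphen.
        if region_name.startswith(prefix) and len(region_name) > len(prefix):
            return base + ":" + region_name[len(prefix):]
    return region_name


if __name__ == "__main__":
    for name in ("MainZone", "MainZone-gold", "MarginTextZone", "NumberingZone"):
        print(name, "->", to_segmonto(name))
```

Matching against a fixed list of base zones, rather than replacing every hyphen, keeps any hyphen inside a subtype name untouched.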
Following up: the dataset is now available on Zenodo in addition to the GitHub repository. Since the number and size of the image files seem to use up GitHub's monthly bandwidth rather quickly, I would like to update our submission by changing the value of url:
old version: url: https://github.com/HTR-School-Vienna/2024--Syriac
new version: url: https://doi.org/10.5281/zenodo.14714089