Mar 21, 2023 - Ground truth data for HTR on South Asian Scripts
Derrick, Tom; British Library, 2023, "Ground Truth transcriptions for training OCR of historical Bengali printed texts – Recognition of Early Indian Printed Documents competition - updated with improved XML coordinates", https://doi.org/10.11588/data/AIQSXL, heiDATA, V1
This dataset comprises 81 digitised images (TIFF files) drawn from a selection of early printed Bengali books (1713-1914) digitised through the Two Centuries of Indian Print project (https://www.bl.uk/projects/two-centuries-of-indian-print). Also contained are ground truth transc...
ZIP Archive - 1002.2 MB -
MD5: 2e97b3f935d9b834d057e9d423be1b30
Feb 24, 2023 - Ground truth data for HTR on South Asian Scripts
Tübingen University Library, 2023, "Ground Truth data for printed Malayalam", https://doi.org/10.11588/data/L2KRZO, heiDATA, V1
Ground Truth (GT) data (JPG, PAGE and ALTO XML files) that can be used to train OCR models that recognize printed text in Malayalam script. The training material is gathered from 19th- and 20th-century prints. The GT data was trained in Transkribus with the HTR+ and the PyLaia...
Feb 24, 2023 -
Ground Truth data for printed Malayalam
ZIP Archive - 6.6 MB -
MD5: a5eabde1cb44fb2ad2be83228e534b41
Feb 24, 2023 -
Ground Truth data for printed Malayalam
ZIP Archive - 11.2 MB -
MD5: a82c90b56669a1a829ad754bffb871cf
Feb 24, 2023 -
Ground Truth data for printed Malayalam
ZIP Archive - 12.3 MB -
MD5: 1d0c81551baa135228be4cf9b63f6648
Feb 24, 2023 -
Ground Truth data for printed Malayalam
ZIP Archive - 9.8 MB -
MD5: 887edc8349eb421a04fa71dacf4dfdf8
Feb 24, 2023 -
Ground Truth data for printed Malayalam
ZIP Archive - 16.9 MB -
MD5: 87c28600177975b0964ad9457147af51
Dec 8, 2022 - Ground truth data for HTR on South Asian Scripts
O'Neill, Alexander, 2022, "Ground Truth Model for Pracalit for Sanskrit and Newar MSS 16th to 19th C.", https://doi.org/10.11588/data/WI9184, heiDATA, V1
Ground truth data for an OCR model. Will be continually updated. Originally trained on Transkribus with a PyLaia model created from ground truth data based on transcripts into Pracalit Unicode of four Nepalese manuscripts. The manuscripts used to create this model are Staatsbib...
ZIP Archive - 479.7 MB -
MD5: 56e2cc32f0d0081fe109b596166f215f
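The MD5 values listed with each ZIP archive can be used to check that a download arrived intact. The following is a minimal sketch in Python using only the standard library; the function names `md5_of_file` and `matches_listed_md5`, and the local filename in the usage note, are assumptions made for illustration and are not part of the repository.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use low for large archives
    (some of the listed ZIP files are several hundred MB).
    """
    digest = hashlib.md5()
    with Path(path).open("rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed_md5):
    """Compare a file's MD5 with the value shown in the listing.

    The listed value is normalised (whitespace stripped, lowercased)
    before comparison, since hexdigest() returns lowercase hex.
    """
    return md5_of_file(path) == listed_md5.strip().lower()
```

For example, assuming an archive was saved locally under the hypothetical name `pracalit_gt.zip`, `matches_listed_md5("pracalit_gt.zip", "56e2cc32f0d0081fe109b596166f215f")` would return `True` only if the file's digest equals the value listed above. MD5 is adequate for detecting accidental corruption during transfer, but it is not a safeguard against deliberate tampering.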