HeiCuBeDa Hilprecht - Heidelberg Cuneiform Benchmark Dataset for the Hilprecht Collection (doi:10.11588/data/IE8CCN)


Document Description

Citation

Title:

HeiCuBeDa Hilprecht - Heidelberg Cuneiform Benchmark Dataset for the Hilprecht Collection

Identification Number:

doi:10.11588/data/IE8CCN

Distributor:

heiDATA

Date of Distribution:

2019-06-06

Version:

2

Bibliographic Citation:

Mara, Hubert, 2019, "HeiCuBeDa Hilprecht - Heidelberg Cuneiform Benchmark Dataset for the Hilprecht Collection", https://doi.org/10.11588/data/IE8CCN, heiDATA, V2

Study Description

Citation

Title:

HeiCuBeDa Hilprecht - Heidelberg Cuneiform Benchmark Dataset for the Hilprecht Collection

Identification Number:

doi:10.11588/data/IE8CCN

Authoring Entity:

Mara, Hubert (IWR, Heidelberg University)

Other identifications and acknowledgements:

Bayer, Paul Victor

Producer:

Hubert Mara

Bartosz Bogacz

Date of Production:

2019-03-11

Software used in Production:

GigaMesh Software Framework

Distributor:

heiDATA

Access Authority:

Mara, Hubert

Date of Deposit:

2019-02-25

Holdings Information:

https://doi.org/10.11588/data/IE8CCN

Study Scope

Keywords:

Arts and Humanities, Computer and Information Science

Abstract:

The number of known cuneiform tablets is assumed to be in the hundreds of thousands. A fraction has been published as printed photographs and manual tracings in books. The online Cuneiform Digital Library Initiative (CDLI) catalog collects these publications, includes some of the images, and provides metadata for more than 100,000 tablets. While 3D acquisition is the most modern way to document tablets, the number of 3D datasets is relatively small and they are often not openly accessible. However, the Hilprecht Archive Online (HAO) provides 1977 high-resolution 3D scans of tablets under an Open Access license. Although both the HAO and the CDLI are publicly accessible, large-scale machine learning and pattern recognition on cuneiform tablets remain elusive, because the data is only accessible by navigating web pages, the tablet identifiers are inconsistent between collections, and the 3D data is unprepared and challenging for automated processing. We enable large-scale analysis of cuneiform tablets with this HeiCuBeDa assembly for the Hilprecht Collection, a cross-referenced benchmark dataset of processed cuneiform tablets: (i) frontally aligned 3D tablets with pre-computed high-dimensional surface features, (ii) six-view raster images for off-the-shelf image processing, and (iii) metadata, transcriptions, and transliterations for a subset of 707 tablets, for learning alignment between 3D data, images, and linguistic expression. This is the first dataset of its kind, and of its size, in cuneiform research. The benchmark dataset is prepared for ease of use and immediate availability for computational research, lowering the barrier to experimenting with and applying standard methods of analysis. A Python script is provided to retrieve the CDLI's metadata and raster images and to compute an updated JSON database from them.
Up-to-date code and metadata are also available at https://gitlab.com/fcgl/releases/-/tree/master/mara_icdar_2019.

Date of Collection:

2018-07-24 to 2018-08-22; 2019-03-01 to 2019-03-11

Kind of Data:

Cuneiform tablets

Kind of Data:

3D Measurement data

Notes:

Further identifiers of the persons involved:

Hubert Mara: ORCID: https://orcid.org/0000-0002-2004-4153, Wikidata: https://www.wikidata.org/wiki/Q97924674

Bartosz Bogacz: ORCID: https://orcid.org/0000-0002-8323-5694, Wikidata: https://www.wikidata.org/wiki/Q102869220

Paul Victor Bayer: ORCID: https://orcid.org/0000-0003-1528-5531

Methodology and Processing

Sources Statement

Origins of Sources:

Hilprecht Sammlung, Jena, Germany, https://hilprecht.mpiwg-berlin.mpg.de/

Cuneiform Digital Library Initiative (CDLI), https://cdli.ucla.edu/

Data Access

Other Study Description Materials

Related Studies

Heidelberg Cuneiform 3D Database (HeiCu3Da) for the Hilprecht Collection: https://doi.org/10.11588/heidicon.hilprecht

Related Publications

Citation

Title:

H. Mara and B. Bogacz, "Breaking the Code on Broken Tablets: The Learning Challenge for Annotated Cuneiform Script in Normalized 2D and 3D Datasets," 2019 International Conference on Document Analysis and Recognition (ICDAR), Sydney, NSW, Australia, 2019, pp. 148-153.

Identification Number:

https://doi.org/10.1109/ICDAR.2019.00032

Bibliographic Citation:

H. Mara and B. Bogacz, "Breaking the Code on Broken Tablets: The Learning Challenge for Annotated Cuneiform Script in Normalized 2D and 3D Datasets," 2019 International Conference on Document Analysis and Recognition (ICDAR), Sydney, NSW, Australia, 2019, pp. 148-153.

Citation

Title:

Bartosz Bogacz and Hubert Mara: Period Classification of 3D Cuneiform Tablets with Geometric Neural Networks. In: Proceedings of the 17th International Conference on Frontiers of Handwriting Recognition (ICFHR). Dortmund, Germany 2020.

Identification Number:

https://doi.org/10.1109/ICFHR2020.2020.00053

Bibliographic Citation:

Bartosz Bogacz and Hubert Mara: Period Classification of 3D Cuneiform Tablets with Geometric Neural Networks. In: Proceedings of the 17th International Conference on Frontiers of Handwriting Recognition (ICFHR). Dortmund, Germany 2020.

Citation

Title:

GigaMesh and Gilgamesh - 3D Multiscale Integral Invariant Cuneiform Character Extraction

Identification Number:

https://doi.org/10.2312/VAST/VAST10/131-138

Bibliographic Citation:

GigaMesh and Gilgamesh - 3D Multiscale Integral Invariant Cuneiform Character Extraction

Citation

Title:

Multi-Scale Integral Invariants for Robust Character Extraction from Irregular Polygon Mesh Data

Identification Number:

10.11588/heidok.00013890

Bibliographic Citation:

Multi-Scale Integral Invariants for Robust Character Extraction from Irregular Polygon Mesh Data

Other Study-Related Materials

Label:

HeiCuBeDa_00_Supplementary_Documentation.pdf

Text:

Supplementary Documentation about the contents of the HeiCuBeDa and HeiCu3Da bundles.

Notes:

application/pdf

Other Study-Related Materials

Label:

HeiCuBeDa_01_Logo_1977.pdf

Text:

Logo for the HeiCuBeDa Hilprecht dataset consisting of 1977 cuneiform tablets.

Notes:

application/pdf

Other Study-Related Materials

Label:

HeiCuBeDa_A1_Images_Sideviews_MSII_Filter.zip

Text:

A complete set of six side views for each of the 1977 3D datasets, rendered with the MSII filter response to highlight surface details, i.e. cuneiform script and sealings. Recommended for learning tasks. The images are stored as PNGs.

Notes:

application/zip

Other Study-Related Materials

Label:

HeiCuBeDa_A2_Images_Sideviews_VirtualLight.zip

Text:

A complete set of six side views of the 3D models, rendered using a virtual light source and a metallic surface to mimic the illumination setup of photographs. The images are stored as PNGs.

Notes:

application/zip

Other Study-Related Materials

Label:

HeiCuBeDa_B_Hilprecht_Database_240121.json

Notes:

application/json

Other Study-Related Materials

Label:

HeiCuBeDa_B_Scrape_CDLI_240121.py

Notes:

text/x-python

Other Study-Related Materials

Label:

HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part01.zip

Text:

Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant.

Notes:

application/zip

Other Study-Related Materials

Label:

HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part02.zip

Text:

Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant.

Notes:

application/zip

Other Study-Related Materials

Label:

HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part03.zip

Text:

Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant.

Notes:

application/zip

Other Study-Related Materials

Label:

HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part04.zip

Text:

Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant.

Notes:

application/zip

Other Study-Related Materials

Label:

HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part05.zip

Text:

Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant.

Notes:

application/zip

Other Study-Related Materials

Label:

HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part06.zip

Text:

Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant.

Notes:

application/zip

Other Study-Related Materials

Label:

HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part07.zip

Text:

Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant.

Notes:

application/zip

Other Study-Related Materials

Label:

HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part08.zip

Text:

Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant.

Notes:

application/zip

Other Study-Related Materials

Label:

HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part09.zip

Text:

Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant.

Notes:

application/zip

Other Study-Related Materials

Label:

HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part10.zip

Text:

Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant.

Notes:

application/zip

Other Study-Related Materials

Label:

HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part11.zip

Text:

Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant.

Notes:

application/zip

Other Study-Related Materials

Label:

HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part12.zip

Text:

Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant.

Notes:

application/zip

Other Study-Related Materials

Label:

HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part13.zip

Text:

Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant.

Notes:

application/zip

Other Study-Related Materials

Label:

HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part14.zip

Text:

Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant.

Notes:

application/zip

Other Study-Related Materials

Label:

HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part15.zip

Text:

Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant.

Notes:

application/zip

Other Study-Related Materials

Label:

HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part16.zip

Text:

Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant.

Notes:

application/zip

Other Study-Related Materials

Label:

HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part17.zip

Text:

Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant.

Notes:

application/zip

Other Study-Related Materials

Label:

HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part18.zip

Text:

Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant.

Notes:

application/zip

Other Study-Related Materials

Label:

HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part19.zip

Text:

Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant.

Notes:

application/zip

Other Study-Related Materials

Label:

HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part20.zip

Text:

Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant.

Notes:

application/zip
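
Usage sketch (not part of the deposited files): the PLY files above start with a plain-text header that declares the elements and per-vertex properties stored in the file. The following minimal Python sketch, which assumes only the standard PLY header layout, lists those declarations so that the feature-vector properties can be located before loading the full mesh. The function name read_ply_header is hypothetical, and no specific property names of the HeiCuBeDa files are assumed.

```python
from typing import BinaryIO, Dict, Tuple


def read_ply_header(stream: BinaryIO) -> Tuple[str, Dict[str, dict]]:
    """Return the PLY format keyword and the declared elements with their properties."""
    if stream.readline().strip() != b"ply":
        raise ValueError("not a PLY file: missing 'ply' magic line")
    file_format = ""
    elements: Dict[str, dict] = {}
    current = None
    # The header is ASCII text even when the payload is binary.
    for raw in iter(stream.readline, b""):
        tokens = raw.decode("ascii", errors="replace").split()
        if not tokens:
            continue
        if tokens[0] == "end_header":
            break
        if tokens[0] == "format":
            file_format = tokens[1]
        elif tokens[0] == "element":
            current = tokens[1]
            elements[current] = {"count": int(tokens[2]), "properties": []}
        elif tokens[0] == "property" and current is not None:
            # The property name is always the last token, also for list properties.
            elements[current]["properties"].append(tokens[-1])
    else:
        # The loop ended without reaching 'end_header'.
        raise ValueError("truncated PLY header")
    return file_format, elements
```

Opening an extracted file in binary mode, for example with open(path, "rb"), and passing the handle to read_ply_header prints nothing itself; the returned dictionary can be inspected to see which vertex properties a given file declares.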

Other Study-Related Materials

Label:

HeiCuBeDa_D_MSII_filter_results_surface_integration_part01.tar

Text:

Multi-scale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index.

Notes:

application/x-tar

Other Study-Related Materials

Label:

HeiCuBeDa_D_MSII_filter_results_surface_integration_part02.tar

Text:

Multi-scale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index.

Notes:

application/x-tar

Other Study-Related Materials

Label:

HeiCuBeDa_D_MSII_filter_results_surface_integration_part03.tar

Text:

Multi-scale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index.

Notes:

application/x-tar

Other Study-Related Materials

Label:

HeiCuBeDa_D_MSII_filter_results_surface_integration_part04.tar

Text:

Multi-scale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index.

Notes:

application/x-tar

Other Study-Related Materials

Label:

HeiCuBeDa_D_MSII_filter_results_surface_integration_part05.tar

Text:

Multi-scale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index.

Notes:

application/x-tar

Other Study-Related Materials

Label:

HeiCuBeDa_D_MSII_filter_results_surface_integration_part06.tar

Text:

Multi-scale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index.

Notes:

application/x-tar

Other Study-Related Materials

Label:

HeiCuBeDa_D_MSII_filter_results_surface_integration_part07.tar

Text:

Multi-scale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index.

Notes:

application/x-tar

Other Study-Related Materials

Label:

HeiCuBeDa_D_MSII_filter_results_surface_integration_part08.tar

Text:

Multi-scale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index.

Notes:

application/x-tar

Other Study-Related Materials

Label:

HeiCuBeDa_D_MSII_filter_results_surface_integration_part09.tar

Text:

Multi-scale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index.

Notes:

application/x-tar

Other Study-Related Materials

Label:

HeiCuBeDa_D_MSII_filter_results_surface_integration_part10.tar

Text:

Multi-scale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index.

Notes:

application/x-tar

Other Study-Related Materials

Label:

HeiCuBeDa_D_MSII_filter_results_surface_integration_part11.tar

Text:

Multi-scale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index.

Notes:

application/x-tar

Other Study-Related Materials

Label:

HeiCuBeDa_D_MSII_filter_results_surface_integration_part12.tar

Text:

Multi-scale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index.

Notes:

application/x-tar

Other Study-Related Materials

Label:

HeiCuBeDa_D_MSII_filter_results_surface_integration_part13.tar

Text:

Multi-scale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index.

Notes:

application/x-tar

Other Study-Related Materials

Label:

HeiCuBeDa_D_MSII_filter_results_surface_integration_part14.tar

Text:

Multi-scale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index.

Notes:

application/x-tar

Other Study-Related Materials

Label:

HeiCuBeDa_D_MSII_filter_results_surface_integration_part15.tar

Text:

Multi-scale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index.

Notes:

application/x-tar

Other Study-Related Materials

Label:

HeiCuBeDa_D_MSII_filter_results_surface_integration_part16.tar

Text:

Multi-scale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index.

Notes:

application/x-tar

Other Study-Related Materials

Label:

HeiCuBeDa_D_MSII_filter_results_surface_integration_part17.tar

Text:

Multi-scale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index.

Notes:

application/x-tar

Other Study-Related Materials

Label:

HeiCuBeDa_D_MSII_filter_results_surface_integration_part18.tar

Text:

Multi-scale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index.

Notes:

application/x-tar

Other Study-Related Materials

Label:

HeiCuBeDa_D_MSII_filter_results_surface_integration_part19.tar

Text:

Multi-scale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index.

Notes:

application/x-tar

Other Study-Related Materials

Label:

HeiCuBeDa_D_MSII_filter_results_surface_integration_part20.tar

Text:

Multi-scale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index.

Notes:

application/x-tar
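
Usage sketch (not part of the deposited files): the MSII filter results above are described as bzip2-compressed text with one feature vector per line and the vertex index in the first column. Under that description, a minimal Python reader could look as follows. The function names are hypothetical, and the handling of lines starting with '#' is an assumption about header or comment lines in the Octave text format; the files inside the tar archives must be extracted first.

```python
import bz2
from typing import Dict, Iterable, List


def parse_msii_lines(lines: Iterable[str]) -> Dict[int, List[float]]:
    """Parse rows of the form '<vertex index> <f1> <f2> ...' into a dictionary."""
    features: Dict[int, List[float]] = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and '#' lines (assumed to be header or comment lines).
        if not line or line.startswith("#"):
            continue
        columns = line.split()
        # The index may be written as a float such as '12.0'; convert defensively.
        vertex_index = int(float(columns[0]))
        features[vertex_index] = [float(value) for value in columns[1:]]
    return features


def read_msii_file(path: str) -> Dict[int, List[float]]:
    """Read one bzip2-compressed feature file extracted from a part archive."""
    with bz2.open(path, mode="rt", encoding="ascii") as handle:
        return parse_msii_lines(handle)
```

A dictionary keyed by vertex index keeps the sketch dependency-free; for full-size tablets a numeric array library would likely be the more memory-efficient choice.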