HeiCuBeDa Hilprecht - Heidelberg Cuneiform Benchmark Dataset for the Hilprecht Collection
doi:10.11588/data/IE8CCN 
heiDATA 
2019-06-06
2 
Mara, Hubert, 2019, "HeiCuBeDa Hilprecht - Heidelberg Cuneiform Benchmark Dataset for the Hilprecht Collection", https://doi.org/10.11588/data/IE8CCN, heiDATA, V2
Mara, Hubert (IWR, Heidelberg University) 
Bayer, Paul Victor 
Hubert Mara 
Bartosz Bogacz 

2019-03-11
GigaMesh Software Framework 
heiDATA 
Mara, Hubert 
2019-02-25
https://doi.org/10.11588/data/IE8CCN 
Arts and Humanities, Computer and Information Science 
Abstract: 
The number of known cuneiform tablets is assumed to be in the hundreds of thousands. A fraction has been published by printing photographs and manual tracings in books; these are collected in the online catalog of the Cuneiform Digital Library Initiative (CDLI), which includes some of these images and provides metadata for more than 100,000 tablets. While 3D-acquisition of tablets is the most modern way of documenting them, the number of 3D-datasets is relatively small and they are often not openly accessible. However, the Hilprecht Archive Online (HAO) provides 1977 high-resolution 3D scans of tablets under an Open Access license. While both the HAO and the CDLI are publicly accessible, large-scale machine learning and pattern recognition on cuneiform tablets remains elusive, because the data is only accessible by navigating web pages, the tablet identifiers are inconsistent between collections, and the 3D data is unprepared and challenging for automated processing. We enable large-scale analysis of cuneiform tablets with this HeiCuBeDa assembly for the Hilprecht Collection, a cross-referenced benchmark dataset of processed cuneiform tablets: (i) frontally aligned 3D tablets with pre-computed high-dimensional surface features, (ii) six-view raster images for off-the-shelf image processing, and (iii) metadata, transcriptions, and transliterations for a subset of 707 tablets, for learning the alignment between 3D data, images, and linguistic expression. This is the first dataset of its kind, and of its size, in cuneiform research. The benchmark dataset is prepared for ease of use and immediate availability to computational researchers, lowering the barrier to experimenting with and applying standard methods of analysis. A Python script is provided to retrieve and compute an updated JSON database of the CDLI’s metadata and raster images.
Up-to-date code and metadata are also available at https://gitlab.com/fcgl/releases//tree/master/mara_icdar_2019.
2018-07-24, 2018-08-22, 2019-03-01, 2019-03-11
Cuneiform tablets 
3D Measurement data 
Further identifiers of the persons involved:
Hubert Mara: ORCID: https://orcid.org/0000-0002-2004-4153, Wikidata: https://www.wikidata.org/wiki/Q97924674
Bartosz Bogacz: ORCID: https://orcid.org/0000-0002-8323-5694, Wikidata: https://www.wikidata.org/wiki/Q102869220
Paul Victor Bayer: ORCID: https://orcid.org/0000-0003-1528-5531
Hilprecht Sammlung, Jena, Germany, https://hilprecht.mpiwg-berlin.mpg.de/
Cuneiform Digital Library Initiative (CDLI), https://cdli.ucla.edu/
Heidelberg Cuneiform 3D Database (HeiCu3Da) for the Hilprecht Collection: https://doi.org/10.11588/heidicon.hilprecht

H. Mara and B. Bogacz, "Breaking the Code on Broken Tablets: The Learning Challenge for Annotated Cuneiform Script in Normalized 2D and 3D Datasets," 2019 International Conference on Document Analysis and Recognition (ICDAR), Sydney, NSW, Australia, 2019, pp. 148-153.
https://doi.org/10.1109/ICDAR.2019.00032 
Bartosz Bogacz and Hubert Mara: Period Classification of 3D Cuneiform Tablets with Geometric Neural Networks. In: Proceedings of the 17th International Conference on Frontiers of Handwriting Recognition (ICFHR). Dortmund, Germany 2020. 
https://doi.org/10.1109/ICFHR2020.2020.00053 
GigaMesh and Gilgamesh - 3D Multiscale Integral Invariant Cuneiform Character Extraction
https://doi.org/10.2312/VAST/VAST10/131-138
Multi-Scale Integral Invariants for Robust Character Extraction from Irregular Polygon Mesh Data
10.11588/heidok.00013890 
HeiCuBeDa_00_Supplementary_Documentation.pdf 
Supplementary Documentation about the contents of the HeiCuBeDa and HeiCu3Da bundles. 
application/pdf 
HeiCuBeDa_01_Logo_1977.pdf 
Logo for the HeiCuBeDa Hilprecht dataset consisting of 1977 cuneiform tablets. 
application/pdf 
HeiCuBeDa_A1_Images_Sideviews_MSII_Filter.zip 
A complete set of six side views for each of the 1977 3D-datasets, using the MSII filter response to highlight surface details, i.e. cuneiform script and sealings. Recommended for learning tasks. The images are stored as PNGs.
application/zip 
HeiCuBeDa_A2_Images_Sideviews_VirtualLight.zip 
Complete set of eight side views of the 3D-models, rendered using a virtual light source and a metallic surface to mimic the illumination setup of photographs. The images are stored as PNGs.
application/zip 
HeiCuBeDa_B_Hilprecht_Database_240121.json 
application/json 
HeiCuBeDa_B_Scrape_CDLI_240121.py 
text/x-python
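This record does not describe the layout of the JSON database, so the following is only a minimal sketch for a first inspection of the downloaded file. It assumes the file sits in the working directory; the helper name `summarize_json` is hypothetical, and it makes no assumption about the keys inside the file.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_json(data):
    """Return (container type name, entry count, Counter of entry type names)."""
    if isinstance(data, dict):
        values = list(data.values())
    elif isinstance(data, list):
        values = data
    else:
        # Scalars have no entries to summarize.
        values = []
    return type(data).__name__, len(values), Counter(type(v).__name__ for v in values)


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the file was downloaded.
    path = Path("HeiCuBeDa_B_Hilprecht_Database_240121.json")
    if path.exists():
        with path.open(encoding="utf-8") as fh:
            print(summarize_json(json.load(fh)))
```

Looking at the container type and the types of its entries first is a cautious way to learn the schema before writing code that depends on specific field names.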
HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part01.zip 
Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant. 
application/zip 
HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part02.zip 
Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant. 
application/zip 
HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part03.zip 
Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant. 
application/zip 
HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part04.zip 
Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant. 
application/zip 
HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part05.zip 
Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant. 
application/zip 
HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part06.zip 
Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant. 
application/zip 
HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part07.zip 
Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant. 
application/zip 
HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part08.zip 
Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant. 
application/zip 
HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part09.zip 
Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant. 
application/zip 
HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part10.zip 
Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant. 
application/zip 
HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part11.zip 
Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant. 
application/zip 
HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part12.zip 
Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant. 
application/zip 
HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part13.zip 
Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant. 
application/zip 
HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part14.zip 
Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant. 
application/zip 
HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part15.zip 
Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant. 
application/zip 
HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part16.zip 
Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant. 
application/zip 
HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part17.zip 
Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant. 
application/zip 
HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part18.zip 
Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant. 
application/zip 
HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part19.zip 
Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant. 
application/zip 
HeiCuBeDa_C_3DData_with_MSII_and_FunctionValue_part20.zip 
Stanford Polygon (PLY) files including the feature vectors computed using the volume based integral invariant. 
application/zip 
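The PLY files above carry additional per-vertex values, but this record does not name the corresponding properties. The following sketch therefore only lists what a file's header declares, using the standard PLY header keywords (`format`, `element`, `property`, `end_header`). The function name `read_ply_header` is hypothetical, and the in-memory header in the demo is made up for illustration, not taken from the dataset.

```python
import io


def read_ply_header(stream):
    """Parse a PLY header from a binary stream.

    Returns (format_name, elements), where elements maps each element name
    to a dict with its 'count' and its list of 'properties' declarations.
    """
    if stream.readline().strip() != b"ply":
        raise ValueError("not a PLY file")
    fmt = None
    elements = {}
    current = None
    for raw in iter(stream.readline, b""):
        tokens = raw.decode("ascii", errors="replace").split()
        if not tokens or tokens[0] in ("comment", "obj_info"):
            continue
        if tokens[0] == "format":
            fmt = tokens[1]
        elif tokens[0] == "element":
            current = {"count": int(tokens[2]), "properties": []}
            elements[tokens[1]] = current
        elif tokens[0] == "property" and current is not None:
            # Keep the declaration as (type tokens..., name), e.g. ("float", "x").
            current["properties"].append(tuple(tokens[1:]))
        elif tokens[0] == "end_header":
            return fmt, elements
    raise ValueError("end_header not found")


if __name__ == "__main__":
    # Illustrative header only; real files are opened with open(path, "rb").
    demo = (b"ply\nformat ascii 1.0\nelement vertex 2\n"
            b"property float x\nproperty float y\nproperty float z\nend_header\n")
    print(read_ply_header(io.BytesIO(demo)))
```

Reading only the header is cheap even for large meshes and shows which vertex properties need to be handled before loading the full geometry.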
HeiCuBeDa_D_MSII_filter_results_surface_integration_part01.tar 
Multiscale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index. 
application/x-tar
HeiCuBeDa_D_MSII_filter_results_surface_integration_part02.tar 
Multiscale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index. 
application/x-tar
HeiCuBeDa_D_MSII_filter_results_surface_integration_part03.tar 
Multiscale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index. 
application/x-tar
HeiCuBeDa_D_MSII_filter_results_surface_integration_part04.tar 
Multiscale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index. 
application/x-tar
HeiCuBeDa_D_MSII_filter_results_surface_integration_part05.tar 
Multiscale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index. 
application/x-tar
HeiCuBeDa_D_MSII_filter_results_surface_integration_part06.tar 
Multiscale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index. 
application/x-tar
HeiCuBeDa_D_MSII_filter_results_surface_integration_part07.tar 
Multiscale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index. 
application/x-tar
HeiCuBeDa_D_MSII_filter_results_surface_integration_part08.tar 
Multiscale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index. 
application/x-tar
HeiCuBeDa_D_MSII_filter_results_surface_integration_part09.tar 
Multiscale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index. 
application/x-tar
HeiCuBeDa_D_MSII_filter_results_surface_integration_part10.tar 
Multiscale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index. 
application/x-tar
HeiCuBeDa_D_MSII_filter_results_surface_integration_part11.tar 
Multiscale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index. 
application/x-tar
HeiCuBeDa_D_MSII_filter_results_surface_integration_part12.tar 
Multiscale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index. 
application/x-tar
HeiCuBeDa_D_MSII_filter_results_surface_integration_part13.tar 
Multiscale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index. 
application/x-tar
HeiCuBeDa_D_MSII_filter_results_surface_integration_part14.tar 
Multiscale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index. 
application/x-tar
HeiCuBeDa_D_MSII_filter_results_surface_integration_part15.tar 
Multiscale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index. 
application/x-tar
HeiCuBeDa_D_MSII_filter_results_surface_integration_part16.tar 
Multiscale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index. 
application/x-tar
HeiCuBeDa_D_MSII_filter_results_surface_integration_part17.tar 
Multiscale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index. 
application/x-tar
HeiCuBeDa_D_MSII_filter_results_surface_integration_part18.tar 
Multiscale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index. 
application/x-tar
HeiCuBeDa_D_MSII_filter_results_surface_integration_part19.tar 
Multiscale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index. 
application/x-tar
HeiCuBeDa_D_MSII_filter_results_surface_integration_part20.tar 
Multiscale surface based integral invariants as GNU Octave MAT file compressed with bzip2. One vector per line. First column is the vertex index. 
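The description above (bzip2 compression, one vector per line, vertex index in the first column) can be read with a few lines of Python. This is a minimal sketch under assumptions not confirmed by this record: that the MAT files are whitespace-separated text, that any header lines start with `#`, and that each bzip2 file has already been extracted from its tar archive. The function names are hypothetical.

```python
import bz2
import io


def read_feature_vectors(stream):
    """Map vertex index -> list of feature values from a text stream."""
    vectors = {}
    for line in stream:
        line = line.strip()
        # Assumption: blank lines and '#' header lines carry no vector data.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # The first column is the vertex index; the rest is the feature vector.
        vectors[int(float(fields[0]))] = [float(v) for v in fields[1:]]
    return vectors


def read_bz2_feature_file(path):
    """Read one bzip2-compressed feature file extracted from a tar archive."""
    with bz2.open(path, "rt", encoding="ascii") as fh:
        return read_feature_vectors(fh)


if __name__ == "__main__":
    # Made-up sample values for illustration, not taken from the dataset.
    sample = "# header line\n0 0.5 0.25\n1 1.0 2.0\n"
    print(read_feature_vectors(io.StringIO(sample)))
```

Keying the vectors by vertex index allows them to be matched to the vertices of the corresponding 3D model regardless of line order.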
