heiDATA

Metrics

290,842 Downloads

Subject: Computer and Information Science
Subject: Arts and Humanities

21 to 30 of 64 Results


ErKon3D - Quantifying Deformation in Aegean Sealing Practices [Dataset]

Jun 7, 2023 - IWR Computer Graphics

Mara, Hubert, 2023, "ErKon3D - Quantifying Deformation in Aegean Sealing Practices [Dataset]", https://doi.org/10.11588/data/UMJXI0, heiDATA, V1

In Bronze Age Aegean society, seals played an important role by authenticating, securing and marking. The study of the seals and their engraved motifs provides valuable insight into the social and political organization and administration of Aegean societies. A key research question...

Ground Truth transcriptions for training OCR of historical Bengali printed texts – Recognition of Early Indian Printed Documents competition - updated with improved XML coordinates

Mar 21, 2023 - Ground truth data for HTR on South Asian Scripts

Derrick, Tom; British Library, 2023, "Ground Truth transcriptions for training OCR of historical Bengali printed texts – Recognition of Early Indian Printed Documents competition - updated with improved XML coordinates", https://doi.org/10.11588/data/AIQSXL, heiDATA, V1

This dataset comprises 81 digitised images (TIFF files) drawn from a selection of early printed Bengali books (1713-1914) digitised through the Two Centuries of Indian Print project (https://www.bl.uk/projects/two-centuries-of-indian-print). Also contained are ground truth transc...

Ground truth data for HTR on South Asian Scripts (FID4SA – Specialized Information Service South Asia)

Oct 26, 2022 - FID4SA@heiDATA

A collection of Ground Truth data for handwritten and printed text recognition for South Asian scripts provided by FID4SA - Specialized Information Service South Asia. Interested researchers can download the data archived here and use it as training data for their own text recogn...

FID4SA@heiDATA (Heidelberg University Library)

Oct 26, 2022

Data publications of the FID4SA – Specialized Information Service South Asia.

HeiCuBeDa Hilprecht - Heidelberg Cuneiform Benchmark Dataset for the Hilprecht Collection

Mar 26, 2021 - IWR Computer Graphics

Mara, Hubert, 2019, "HeiCuBeDa Hilprecht - Heidelberg Cuneiform Benchmark Dataset for the Hilprecht Collection", https://doi.org/10.11588/data/IE8CCN, heiDATA, V2

The number of known cuneiform tablets is assumed to be in the hundreds of thousands. A fraction has been published by printing photographs and manual tracings in books, which is collected by the online Cuneiform Digital Library Initiative (CDLI) catalog including some of these im...

X-SRL Dataset and mBERT Word Aligner

Feb 17, 2021 - Empirical Linguistics and Computational Language Modeling (LiMo)

Daza, Angel, 2021, "X-SRL Dataset and mBERT Word Aligner", https://doi.org/10.11588/data/HVXXIJ, heiDATA, V1

This code contains a method to automatically align words from parallel sentences by using multilingual BERT pre-trained embeddings. This can be used to transfer source annotations (for example labeled English sentences) into the target side (for example a German translation of th...

German Twitter Titling Corpus

Jan 20, 2021 - Empirical Linguistics and Computational Language Modeling (LiMo)

van den Berg, Esther; Korfhage, Katharina; Ruppenhofer, Josef; Wiegand, Michael; Markert, Katja, 2020, "German Twitter Titling Corpus", https://doi.org/10.11588/data/AOSUY6, heiDATA, V2, UNF:6:14BxjwJS7Q3mfI6ei7iBBw== [fileUNF]

The German Titling Twitter Corpus consists of 1904 stance-annotated tweets collected in June/July 2018 mentioning 24 German politicians with a doctoral degree. The Addendum contains an additional 296 stance-annotated tweets from each month of 2018 mentioning 10 politicians with a...

OwnReality API-only web application

Oct 26, 2020 - OwnReality. To Each His Own Reality

Schepp, Moritz, 2020, "OwnReality API-only web application", https://doi.org/10.11588/data/KZHLS8, heiDATA, V1

This dataset contains the data platform for the research project "OwnReality. To Each His Own Reality". During the course of the project, data was gathered and entered into a database. In general, this platform allows the integration of that data into web based systems such as co...

OwnReality. To Each His Own Reality (DFK-Paris)

Oct 26, 2020 - DFK-Paris

Research data from the project OwnReality

DFK-Paris (Deutsches Forum für Kunstgeschichte)

Oct 26, 2020 - arthistoricum.net@heiDATA

Open Research Data from the German Center for Art History (Deutsches Forum für Kunstgeschichte)
