51 to 60 of 93 Results
Aug 16, 2023 - arthistoricum.net@heiDATA
Knaus, Gudrun; Kailus, Angela; Stein, Regine, 2022, "LIDO-Handbuch für die Erfassung und Publikation von Metadaten zu kulturellen Objekten - Band 2: Malerei und Skulptur [Anwendungsbeispiele]", https://doi.org/10.11588/data/CHEPS6, heiDATA, V3
LIDO (Lightweight Information Describing Objects) is an XML schema for the standards-compliant provision of metadata about cultural objects in a wide range of digital contexts. Building on this international standard, the "LIDO-Handbuch für die Erfassung u...
Jun 13, 2020 - Statistical Natural Language Processing Group
Beilharz, Benjamin; Sun, Xin, 2019, "LibriVoxDeEn - A Corpus for German-to-English Speech Translation and Speech Recognition", https://doi.org/10.11588/data/TMEDTX, heiDATA, V2
This dataset is a corpus of sentence-aligned triples of German audio, German text, and English translation, based on German audio books. The corpus consists of over 100 hours of audio material and over 50k parallel sentences. The speech data are low in disfluencies because of the...
Sep 2, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)
Wiegand, Michael, 2019, "Lexicon of Abusive Words (EN)", https://doi.org/10.11588/data/MKPEYV, heiDATA, V1
This gold standard contains a bootstrapped lexicon of abusive words. The lexicon comprises a large set of English negative polar expressions annotated as either abusive or not.
Apr 24, 2024 - AIPHES
Mihaylov, Todor, 2024, "Knowledge-Enhanced Neural Networks for Machine Reading Comprehension [Source Code and Additional Material]", https://doi.org/10.11588/data/HU3ARF, heiDATA, V1
Machine Reading Comprehension is a language understanding task where a system is expected to read a given passage of text and typically answer questions about it. When humans assess the task of reading comprehension, in addition to the presented text, they usually use the knowled...
Aug 19, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)
Kotnis, Bhushan, 2019, "KGE Algorithms", https://doi.org/10.11588/data/CSXYSS, heiDATA, V1
An updated method for link prediction that uses a regularization factor that models relation argument types. Abstract (Kotnis and Nastase, 2017): Learning relations based on evidence from knowledge repositories relies on processing the available relation instances. Knowledge repos...
Nov 2, 2023 - Heidelberg Centre for Transcultural Studies (HCTS)
Henke, Konstantin; Arnold, Matthias, 2023, "Jing bao ground truth – text block crops and annotations", https://doi.org/10.11588/data/PVYWKB, heiDATA, V1
This is the data set related to the paper "Language Model Assisted OCR Classification for Republican Chinese Newspaper Text", JDADH 11/2023. In this work, we present methods to obtain a neural optical character recognition (OCR) tool for article blocks in a Republican Chinese new...
Feb 26, 2024 - RATIO_EXPLAIN
Becker, Maria, 2024, "IKAT-EN", https://doi.org/10.11588/data/RUBM2E, heiDATA, V1, UNF:6:To3aHa8xO8P28fzpCz1Qvw== [fileUNF]
A corpus consisting of high-quality human annotations of missing and implied information in argumentative texts (English version). The data is further annotated with semantic clause types and commonsense knowledge relations.
Feb 26, 2024 - RATIO_EXPLAIN
Becker, Maria, 2024, "IKAT-DE", https://doi.org/10.11588/data/4BA5LY, heiDATA, V1
A corpus consisting of high-quality human annotations of missing and implied information in argumentative texts (German version). The data is further annotated with semantic clause types and commonsense knowledge relations.
Mar 26, 2021 - IWR Computer Graphics
Mara, Hubert, 2019, "HeiCuBeDa Hilprecht - Heidelberg Cuneiform Benchmark Dataset for the Hilprecht Collection", https://doi.org/10.11588/data/IE8CCN, heiDATA, V2
The number of known cuneiform tablets is assumed to be in the hundreds of thousands. A fraction has been published by printing photographs and manual tracings in books, which is collected by the online Cuneiform Digital Library Initiative (CDLI) catalog including some of these im...
Nov 13, 2023 - Neural Techniques for German Dependency Parsing
Do, Bich-Ngoc; Rehbein, Ines; Frank, Anette, 2023, "Head Selection Parsers and LSTM Labelers", https://doi.org/10.11588/data/BPWWJL, heiDATA, V1
This resource contains code, data and pre-trained models for various types of neural dependency parsers and LSTM labelers used in the papers: Do et al. (2017), "What Do We Need to Know About an Unknown Word When Parsing German"; Do and Rehbein (2017), "Evaluating LSTM Models for G...