heiDATA

Metrics

191,583 Downloads

Subject: Computer and Information Science

51 to 60 of 93 Results

LIDO-Handbuch für die Erfassung und Publikation von Metadaten zu kulturellen Objekten - Band 2: Malerei und Skulptur [Anwendungsbeispiele]

Aug 16, 2023 - arthistoricum.net@heiDATA

Knaus, Gudrun; Kailus, Angela; Stein, Regine, 2022, "LIDO-Handbuch für die Erfassung und Publikation von Metadaten zu kulturellen Objekten - Band 2: Malerei und Skulptur [Anwendungsbeispiele]", https://doi.org/10.11588/data/CHEPS6, heiDATA, V3

LIDO (Lightweight Information Describing Objects) is an XML schema for the standards-compliant provision of metadata about cultural objects in a wide variety of digital contexts. Based on this international standard, the "LIDO-Handbuch für die Erfassung u...

LibriVoxDeEn - A Corpus for German-to-English Speech Translation and Speech Recognition

Jun 13, 2020 - Statistical Natural Language Processing Group

Beilharz, Benjamin; Sun, Xin, 2019, "LibriVoxDeEn - A Corpus for German-to-English Speech Translation and Speech Recognition", https://doi.org/10.11588/data/TMEDTX, heiDATA, V2

This dataset is a corpus of sentence-aligned triples of German audio, German text, and English translation, based on German audio books. The corpus consists of over 100 hours of audio material and over 50k parallel sentences. The speech data are low in disfluencies because of the...

Lexicon of Abusive Words (EN)

Sep 2, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)

Wiegand, Michael, 2019, "Lexicon of Abusive Words (EN)", https://doi.org/10.11588/data/MKPEYV, heiDATA, V1

This gold standard contains a bootstrapped lexicon of abusive words. The lexicon comprises a large set of English negative polar expressions annotated as either abusive or not.

Knowledge-Enhanced Neural Networks for Machine Reading Comprehension [Source Code and Additional Material]

Apr 24, 2024 - AIPHES

Mihaylov, Todor, 2024, "Knowledge-Enhanced Neural Networks for Machine Reading Comprehension [Source Code and Additional Material]", https://doi.org/10.11588/data/HU3ARF, heiDATA, V1

Machine Reading Comprehension is a language understanding task where a system is expected to read a given passage of text and typically answer questions about it. When humans assess the task of reading comprehension, in addition to the presented text, they usually use the knowled...

KGE Algorithms

Aug 19, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)

Kotnis, Bhushan, 2019, "KGE Algorithms", https://doi.org/10.11588/data/CSXYSS, heiDATA, V1

An updated method for link prediction that uses a regularization factor that models relation argument types. Abstract (Kotnis and Nastase, 2017): Learning relations based on evidence from knowledge repositories relies on processing the available relation instances. Knowledge repos...

Jing bao ground truth – text block crops and annotations

Nov 2, 2023 - Heidelberg Centre for Transcultural Studies (HCTS)

Henke, Konstantin; Arnold, Matthias, 2023, "Jing bao ground truth – text block crops and annotations", https://doi.org/10.11588/data/PVYWKB, heiDATA, V1

This is the data set related to the paper "Language Model Assisted OCR Classification for Republican Chinese Newspaper Text", JDADH 11/2023. In this work, we present methods to obtain a neural optical character recognition (OCR) tool for article blocks in a Republican Chinese new...

IKAT-EN

Feb 26, 2024 - RATIO_EXPLAIN

Becker, Maria, 2024, "IKAT-EN", https://doi.org/10.11588/data/RUBM2E, heiDATA, V1, UNF:6:To3aHa8xO8P28fzpCz1Qvw== [fileUNF]

A corpus consisting of high-quality human annotations of missing and implied information in argumentative texts (English version). The data is further annotated with semantic clause types and commonsense knowledge relations.

IKAT-DE

Feb 26, 2024 - RATIO_EXPLAIN

Becker, Maria, 2024, "IKAT-DE", https://doi.org/10.11588/data/4BA5LY, heiDATA, V1

A corpus consisting of high-quality human annotations of missing and implied information in argumentative texts (German version). The data is further annotated with semantic clause types and commonsense knowledge relations.

HeiCuBeDa Hilprecht - Heidelberg Cuneiform Benchmark Dataset for the Hilprecht Collection

Mar 26, 2021 - IWR Computer Graphics

Mara, Hubert, 2019, "HeiCuBeDa Hilprecht - Heidelberg Cuneiform Benchmark Dataset for the Hilprecht Collection", https://doi.org/10.11588/data/IE8CCN, heiDATA, V2

The number of known cuneiform tablets is assumed to be in the hundreds of thousands. A fraction has been published by printing photographs and manual tracings in books, which is collected by the online Cuneiform Digital Library Initiative (CDLI) catalog including some of these im...

Head Selection Parsers and LSTM Labelers

Nov 13, 2023 - Neural Techniques for German Dependency Parsing

Do, Bich-Ngoc; Rehbein, Ines; Frank, Anette, 2023, "Head Selection Parsers and LSTM Labelers", https://doi.org/10.11588/data/BPWWJL, heiDATA, V1

This resource contains code, data and pre-trained models for various types of neural dependency parsers and LSTM labelers used in the papers: Do et al. (2017), "What Do We Need to Know About an Unknown Word When Parsing German"; Do and Rehbein (2017), "Evaluating LSTM Models for G...
