heiDATA

WikiWarsDE Corpus

Aug 13, 2014 - Database Systems Research Group

Strötgen, Jannik; Gertz, Michael, 2014, "WikiWarsDE Corpus", https://doi.org/10.11588/data/10026, heiDATA, V1

The WikiWarsDE corpus is a German corpus containing Wikipedia articles with annotations of temporal expressions. Its creation was motivated by the English WikiWars corpus (Mazur & Dale 2010). WikiWarsDE was developed to support research on temporal information extraction and norm...

Text und Data Mining an wissenschaftlichen Repositorien und Publikationsservern in Deutschland - Zusammenfassung der Ergebnisse einer Umfrage im Februar und März 2016

Nov 2, 2016 - Perspektive Bibliothek

Drees, Bastian, 2016, "Text und Data Mining an wissenschaftlichen Repositorien und Publikationsservern in Deutschland - Zusammenfassung der Ergebnisse einer Umfrage im Februar und März 2016", https://doi.org/10.11588/data/10090, heiDATA, V2

The contact persons listed on the homepages of scholarly repositories and publication servers in Germany were surveyed about their experiences with text and data mining. The survey was conducted by e-mail between 22 and 26 February 2016. Contact persons ...

Statistical Natural Language Processing Group (Heidelberg University - Department of Computational Linguistics)

May 21, 2014

The Statistical Natural Language Processing Group is part of the Department of Computational Linguistics. Our research addresses various aspects of the problem of the confusion of languages, by means of statistical learning techniques. Research topics include the following: Stati...

Source Code, Data and Additional Material for the Thesis: "Aspects of Coherence for Entity Analysis"

Feb 6, 2019 - AIPHES

Heinzerling, Benjamin, 2019, "Source Code, Data and Additional Material for the Thesis: 'Aspects of Coherence for Entity Analysis'", https://doi.org/10.11588/data/9JKAVW, heiDATA, V1

This dataset contains source code and system output used in the PhD thesis "Aspects of Coherence for Entity Analysis". This dataset is split into three parts corresponding to the chapters describing the three main contributions of the thesis: chapter3.tar.gz: Java source code for...

Selectional Preference Embeddings (EMNLP 2017)

Jan 31, 2019 - AIPHES

Heinzerling, Benjamin, 2019, "Selectional Preference Embeddings (EMNLP 2017)", https://doi.org/10.11588/data/FJQ4XL, heiDATA, V1

Joint embeddings of selectional preferences, words, and fine-grained entity types. The vocabulary consists of: verbs and their dependency relation separated by "@", e.g. "sink@nsubj" or "elect@dobj"; words and short noun phrases, e.g. "Titanic"; fine-grained entity types using the...

RATIO_EXPLAIN (Heidelberg University, Department of Computational Linguistics)

Feb 26, 2024

Open Research Data from the ExpLAIN project, a joint research project of the NLP Group at the Computational Linguistics Department of Heidelberg University and the Data and Web Science Group at the University of Mannheim.

PatTR: Patent Translation Resource

Jun 16, 2014 - Statistical Natural Language Processing Group

Wäschle, Katharina; Riezler, Stefan, 2014, "PatTR: Patent Translation Resource", https://doi.org/10.11588/data/10002, heiDATA, V3

PatTR is a sentence-parallel corpus extracted from the MAREC patent collection. The current version contains more than 22 million German-English and 18 million French-English parallel sentences collected from all patent text sections as well as 5 million German-French sentence pa...

Opinion role extractor

Sep 2, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)

Wiegand, Michael, 2019, "Opinion role extractor", https://doi.org/10.11588/data/3W7AQP, heiDATA, V1

System for the Extraction of Subjective Expressions, Sentiment Sources and Sentiment Targets from German Text

Natural Language Processing Group (Universität Heidelberg)

Jan 17, 2024

The main purpose of language is to encode and communicate information of all sorts. Our research focuses on semantics — the study of meaning — and how a machine can assign meaning to utterances: words, sentences and texts, as humans can do. Our work is linguistically informed and...

LibriVoxDeEn - A Corpus for German-to-English Speech Translation and Speech Recognition

Jun 13, 2020 - Statistical Natural Language Processing Group

Beilharz, Benjamin; Sun, Xin, 2019, "LibriVoxDeEn - A Corpus for German-to-English Speech Translation and Speech Recognition", https://doi.org/10.11588/data/TMEDTX, heiDATA, V2

This dataset is a corpus of sentence-aligned triples of German audio, German text, and English translation, based on German audio books. The corpus consists of over 100 hours of audio material and over 50k parallel sentences. The speech data are low in disfluencies because of the...
