heiDATA

Metrics

207,636 Downloads

1 to 7 of 7 Results

LibriVoxDeEn - A Corpus for German-to-English Speech Translation and Speech Recognition

Jun 13, 2020 - Statistical Natural Language Processing Group

Beilharz, Benjamin; Sun, Xin, 2019, "LibriVoxDeEn - A Corpus for German-to-English Speech Translation and Speech Recognition", https://doi.org/10.11588/data/TMEDTX, heiDATA, V2

This dataset is a corpus of sentence-aligned triples of German audio, German text, and English translation, based on German audio books. The corpus consists of over 100 hours of audio material and over 50k parallel sentences. The speech data are low in disfluencies because of the...

Lauchheim II.2. Katalog der Gräber 301–600

Jul 2, 2019 - Propylaeum@heiDATA

Höke, Benjamin; Gauß, Florian; Peek, Christina; Stelzner, Jörg, 2019, "Lauchheim II.2. Katalog der Gräber 301–600", https://doi.org/10.11588/data/HB97MY, heiDATA, V1

With around 1,300 graves dating from the late 5th to the late 7th century, the cemetery of Lauchheim 'Wasserfurche' (Ostalbkreis) remains the largest known Merovingian-period burial site in southern Germany. Between 1986 and 1996, the almost complete...

Pooled clone collections by multiplexed CRISPR-Cas12a-assisted gene tagging in yeast [Dataset]

May 28, 2019 - KnopLab

Buchmuller, Benjamin C; Herbst, Konrad; Meurer, Matthias; Kirrmaier, Daniel; Sass, Ehud; Levy, Emmanuel D; Knop, Michael, 2019, "Pooled clone collections by multiplexed CRISPR-Cas12a-assisted gene tagging in yeast [Dataset]", https://doi.org/10.11588/data/L45TRX, heiDATA, V2

Data accompanying the paper "Pooled clone collections by multiplexed CRISPR-Cas12a-assisted gene tagging in yeast" by Buchmuller and Herbst et al., 2019, Nature Communications. This contains raw NGS data for all genotyping analysis in the publication as well as the source code of the...

BPEmb: Pre-trained Subword Embeddings in 275 Languages (LREC 2018)

Feb 6, 2019 - AIPHES

Heinzerling, Benjamin, 2019, "BPEmb: Pre-trained Subword Embeddings in 275 Languages (LREC 2018)", https://doi.org/10.11588/data/V9CXPR, heiDATA, V1

BPEmb is a collection of pre-trained subword unit embeddings in 275 languages, based on Byte-Pair Encoding (BPE). In an evaluation using fine-grained entity typing as testbed, BPEmb performs competitively, and for some languages better than alternative subword approaches, while r...

Source Code, Data and Additional Material for the Thesis: "Aspects of Coherence for Entity Analysis"

Feb 6, 2019 - AIPHES

Heinzerling, Benjamin, 2019, "Source Code, Data and Additional Material for the Thesis: 'Aspects of Coherence for Entity Analysis'", https://doi.org/10.11588/data/9JKAVW, heiDATA, V1

This dataset contains source code and system output used in the PhD thesis "Aspects of Coherence for Entity Analysis". This dataset is split into three parts corresponding to the chapters describing the three main contributions of the thesis: chapter3.tar.gz: Java source code for...

Abstract Anaphora Resolution [Source Code]

Feb 4, 2019 - AIPHES

Marasovic, Ana, 2019, "Abstract Anaphora Resolution [Source Code]", https://doi.org/10.11588/data/UDMPY5, heiDATA, V1

Abstract Anaphora Resolution (AAR) aims to find the interpretation of nominal expressions (e.g., this result, those two actions) and pronominal expressions (e.g., this, that, it) that refer to abstract-object-antecedents such as facts, events, plans, actions, or situations. The f...

Selectional Preference Embeddings (EMNLP 2017)

Jan 31, 2019 - AIPHES

Heinzerling, Benjamin, 2019, "Selectional Preference Embeddings (EMNLP 2017)", https://doi.org/10.11588/data/FJQ4XL, heiDATA, V1

Joint embeddings of selectional preferences, words, and fine-grained entity types. The vocabulary consists of: verbs and their dependency relation separated by "@", e.g. "sink@nsubj" or "elect@dobj"; words and short noun phrases, e.g. "Titanic"; fine-grained entity types using the...
