Metrics
32,603 Downloads
1 to 10 of 32 Results
Aug 23, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)
van den Berg, Esther, 2019, "Twitter Titling Corpus", https://doi.org/10.11588/data/IOHXDF, heiDATA, V1, UNF:6:+F3lLKziwMvjy+xyktkilw== [fileUNF]
The Twitter Titling Corpus contains 4002 stance-annotated tweets collected between 20 June 2017 and 30 August 2017 mentioning 6 presidents. Each tweet is annotated for the naming form used to refer to the president, for the purpose of a study on the relation between naming variat...
Mar 26, 2020 - Empirical Linguistics and Computational Language Modeling (LiMo)
Rehbein, Ines; Ruppenhofer, Josef; Do, Bich-Ngoc, 2020, "tweeDe", https://doi.org/10.11588/data/S90S35, heiDATA, V1
A German UD Twitter treebank, with >12,000 tokens from 519 tweets, annotated in the Universal Dependencies framework
Oct 7, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)
Marasović, Ana; Zhou, Mengfei; Frank, Anette, 2019, "The MSC Data Set", https://doi.org/10.11588/data/JEESIQ, heiDATA, V1
From this page you can download resources we created for modal sense classification as reported in Zhou et al. (2015), Marasović et al. (2016) and Marasović and Frank (2015) (see "Related Publication" below): Heuristically sense-annotated training data acquired from EUROPARL and...
Sep 5, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)
Wiegand, Michael; Ruppenhofer, Josef; Schulder, Marc, 2019, "Sentiment View Lexicon (EN)", https://doi.org/10.11588/data/2JK48O, heiDATA, V1
This gold standard contains sentiment expressions (verbs, nouns and adjectives) that have been annotated according to their (prior) sentiment view. Each sentiment expression is labelled either as actor or speaker view.
Sep 5, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)
Wiegand, Michael; Bocionek, Christine; Ruppenhofer, Josef, 2019, "Sentiment Compound Data (DE)", https://doi.org/10.11588/data/LSTRK3, heiDATA, V1
This dataset contains gold standards that are required for building a classifier that automatically extracts opinion (noun) compounds.
Mar 26, 2020 - Empirical Linguistics and Computational Language Modeling (LiMo)
Rehbein, Ines; Ruppenhofer, Josef; Zimmermann, Victor, 2020, "Pre-trained POS tagging models for German social media", https://doi.org/10.11588/data/W3JBV4, heiDATA, V1
Pre-trained POS tagging models for the HunPos tagger (Halácsy et al. 2007), the biLSTM-char-CRF tagger (Reimers & Gurevych 2017), and Online-Flors (Yin et al. 2015). References: Halácsy, P., Kornai, A., and Oravecz, C. (2007). HunPos: An open source trigram tagger. In Proceedings of th...
Sep 2, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)
Wiegand, Michael, 2019, "Opinion role extractor", https://doi.org/10.11588/data/3W7AQP, heiDATA, V1
System for the Extraction of Subjective Expressions, Sentiment Sources and Sentiment Targets from German Text
Aug 19, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)
Kotnis, Bhushan, 2019, "Negative Sampling for Learning Knowledge Graph Embeddings", https://doi.org/10.11588/data/YYULL2, heiDATA, V1
Reimplementation of four KG factorization methods and six negative sampling methods. Abstract Knowledge graphs are large, useful, but incomplete knowledge repositories. They encode knowledge through entities and relations which define each other through the connective structure o...
Oct 7, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)
Marasović, Ana, 2019, "Multilingual Modal Sense Classification using a Convolutional Neural Network [Source Code]", https://doi.org/10.11588/data/ERDJDI, heiDATA, V1
Abstract Modal sense classification (MSC) is a special WSD task that depends on the meaning of the proposition in the modal’s scope. We explore a CNN architecture for classifying modal sense in English and German. We show that CNNs are superior to manually designed feature-based cl...
Mar 26, 2020 - Empirical Linguistics and Computational Language Modeling (LiMo)
Rehbein, Ines; Ruppenhofer, Josef, 2020, "MACE-AL-TREE", https://doi.org/10.11588/data/THPEBR, heiDATA, V1
A method for detecting noise in automatically annotated dependency parse trees, combining MACE (Hovy et al. 2013) with Active Learning.