1 to 10 of 11 Results
Aug 23, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)
van den Berg, Esther; Korfhage, Katharina; Ruppenhofer, Josef; Wiegand, Michael; Markert, Katja, 2019, "Twitter Titling Corpus", https://doi.org/10.11588/data/IOHXDF, heiDATA, V1, UNF:6:+F3lLKziwMvjy+xyktkilw== [fileUNF]
The Twitter Titling Corpus contains 4002 stance-annotated tweets collected between 20 June 2017 and 30 August 2017 mentioning 6 presidents. Each tweet is annotated for the naming form used to refer to the president, for the purpose of a study on the relation between naming variat... |
Mar 26, 2020 - Empirical Linguistics and Computational Language Modeling (LiMo)
Rehbein, Ines; Ruppenhofer, Josef; Do, Bich-Ngoc, 2020, "tweeDe", https://doi.org/10.11588/data/S90S35, heiDATA, V1
A German UD Twitter treebank, with >12,000 tokens from 519 tweets, annotated in the Universal Dependencies framework |
Sep 5, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)
Wiegand, Michael; Ruppenhofer, Josef; Schulder, Marc, 2019, "Sentiment View Lexicon (EN)", https://doi.org/10.11588/data/2JK48O, heiDATA, V1
This gold standard contains sentiment expressions (verbs, nouns and adjectives) that have been annotated according to their (prior) sentiment view. Each sentiment expression is labelled either as actor or speaker view. |
Sep 5, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)
Wiegand, Michael; Bocionek, Christine; Ruppenhofer, Josef, 2019, "Sentiment Compound Data (DE)", https://doi.org/10.11588/data/LSTRK3, heiDATA, V1
This dataset contains gold standards that are required for building a classifier that automatically extracts opinion (noun) compounds. |
Mar 26, 2020 - Empirical Linguistics and Computational Language Modeling (LiMo)
Rehbein, Ines; Ruppenhofer, Josef; Zimmermann, Victor, 2020, "Pre-trained POS tagging models for German social media", https://doi.org/10.11588/data/W3JBV4, heiDATA, V1
Pre-trained POS tagging models for the HunPos tagger (Halácsy et al. 2007) the biLSTM-char-CRF tagger (Reimers & Gurevych 2017) Online-Flors (Yin et al. 2015). References: Halácsy, P., Kornai, A., and Oravecz, C. (2007). HunPos: An open source trigram tagger. In Proceedings of th... |
Mar 26, 2020 - Empirical Linguistics and Computational Language Modeling (LiMo)
Rehbein, Ines; Ruppenhofer, Josef, 2020, "MACE-AL-TREE", https://doi.org/10.11588/data/THPEBR, heiDATA, V1
An method for detecting noise in automatically annotated dependency parse trees, combining MACE (Hovy et al. 2013) with Active Learning. |
Mar 26, 2020 - Empirical Linguistics and Computational Language Modeling (LiMo)
Rehbein, Ines; Ruppenhofer, Josef; Steen, Julius, 2020, "MACE-AL", https://doi.org/10.11588/data/C2OQN4, heiDATA, V1
A method for detecting noise in automatically annotated sequence-labelled data, combining MACE (Hovy et al. 2013) with Active Learning. |
Jan 20, 2021 - Empirical Linguistics and Computational Language Modeling (LiMo)
van den Berg, Esther; Korfhage, Katharina; Ruppenhofer, Josef; Wiegand, Michael; Markert, Katja, 2020, "German Twitter Titling Corpus", https://doi.org/10.11588/data/AOSUY6, heiDATA, V2, UNF:6:14BxjwJS7Q3mfI6ei7iBBw== [fileUNF]
The German Titling Twitter Corpus consists of 1904 stance-annotated tweets collected in June/July 2018 mentioning 24 German politicians with a doctoral degree. The Addendum contains an additional 296 stance-annotated tweets from each month of 2018 mentioning 10 politicians with a... |
Mar 26, 2020 - Empirical Linguistics and Computational Language Modeling (LiMo)
Rehbein, Ines; Ruppenhofer, Josef, 2020, "German causal language annotations and lexicon (verbs, nouns, prepositions) (DE)", https://doi.org/10.11588/data/ZHI94V, heiDATA, V1
Annotations of causal verbs, nouns and prepositions in context and lexicon file for causal verbs, nouns and prepositions. |
Oct 8, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)
Ruppenhofer, Josef, 2019, "Affixoid Dataset (DE)", https://doi.org/10.11588/data/QKF4LT, heiDATA, V1, UNF:6:+MGK9lTPTXx7Rclu1BpPnw== [fileUNF]
The dataset contains the manual annotations for the COLING 2018 submission "Distinguishing affixoid formations from compounds" by Josef Ruppenhofer, Michael Wiegand, Rebecca Wilm and Katja Markert. 1788 complex words containing one of 7 German suffixoid candidates (e.g. -hai, -go... |