Empirical Linguistics and Computational Language Modeling (LiMo) (Department of Computational Linguistics of Heidelberg University and Leibniz Institute for the German Language)

Data publications of the Leibniz ScienceCampus “Empirical Linguistics and Computational Language Modeling”

The Leibniz ScienceCampus “Empirical Linguistics and Computational Language Modeling” (LiMo) is a cooperative research project between the Leibniz Institute for the German Language (Leibniz-Institut für Deutsche Sprache, IDS) in Mannheim and the Department of Computational Linguistics at Heidelberg University (ICL). The project aims to develop new methods, models, and tools for automatically compiling and analysing large German text corpora that cover different domains, genres, and language varieties.

The project is supported by funds from the Baden-Württemberg Ministry of Science, Research and the Arts and the Leibniz Association together with funds provided by the Leibniz Institute for the German Language and Heidelberg University.

Funding Period: 2015 – 2020

11 to 20 of 27 Results
Mar 26, 2020
Rehbein, Ines; Ruppenhofer, Josef; Steen, Julius, 2020, "MACE-AL", https://doi.org/10.11588/data/C2OQN4, heiDATA, V1
A method for detecting noise in automatically annotated sequence-labelled data, combining MACE (Hovy et al. 2013) with Active Learning.
Sep 2, 2019
Wiegand, Michael, 2019, "Lexicon of Abusive Words (EN)", https://doi.org/10.11588/data/MKPEYV, heiDATA, V1
This gold standard contains a bootstrapped lexicon of abusive words. The lexicon comprises a large set of English negative polar expressions annotated as either abusive or not.
Aug 19, 2019
Kotnis, Bhushan, 2019, "KGE Algorithms", https://doi.org/10.11588/data/CSXYSS, heiDATA, V1
An updated method for link prediction that uses a regularization factor to model relation argument types. Abstract (Kotnis and Nastase, 2017): Learning relations based on evidence from knowledge repositories relies on processing the available relation instances. Knowledge repos...
Sep 2, 2019
Wiegand, Michael, 2019, "GermEval-2018 Corpus (DE)", https://doi.org/10.11588/data/0B5VML, heiDATA, V1
This dataset comprises the training and test data (German tweets) from the GermEval 2018 Shared Task on Offensive Language Detection.
Mar 6, 2020
van den Berg, Esther, 2020, "German Twitter Titling Corpus", https://doi.org/10.11588/data/AOSUY6, heiDATA, V1, UNF:6:xIy4tRguIiz8xpg52FlxOA== [fileUNF]
The German Titling Twitter Corpus consists of 1904 stance-annotated tweets collected in June/July 2018 mentioning 24 German politicians with a doctoral degree. The Addendum contains an additional 296 stance-annotated tweets from each month of 2018 mentioning 6 left-leaning and 4...
Mar 26, 2020
Rehbein, Ines; Ruppenhofer, Josef, 2020, "German causal language annotations and lexicon (verbs, nouns, prepositions) (DE)", https://doi.org/10.11588/data/ZHI94V, heiDATA, V1
Annotations of causal verbs, nouns and prepositions in context and lexicon file for causal verbs, nouns and prepositions.
Dec 10, 2019
Becker, Maria, 2019, "GER_SET: Situation Entity Type labelled corpus for German", https://doi.org/10.11588/data/BBQYD0, heiDATA, V1
Semantic clause types, also called Situation Entity (SE) types (Smith, 2003), are linguistic characterizations of aspectual properties shown to be useful for tasks like argumentation structure analysis (Becker et al., 2016), genre characterization (Palmer and Friedrich, 2014), and...
Oct 22, 2019
Becker, Maria, 2019, "Genre-sensitive Neural Situation Entity classifier (DE, EN)", https://doi.org/10.11588/data/XXKWU0, heiDATA, V1
This is a classifier for situation entity types as described in Becker et al., 2017. These clause types depend on a combination of syntactic-semantic and contextual features. We explore this task in a deep learning framework, where tuned word representations capture lexical, synta...
Jan 23, 2020
Daza, Angel, 2020, "Encoder-Decoder Model for Semantic Role Labeling", https://doi.org/10.11588/data/TOI9NQ, heiDATA, V1
Abstract (Daza & Frank 2019): We propose a Cross-lingual Encoder-Decoder model that simultaneously translates and generates sentences with Semantic Role Labeling annotations in a resource-poor target language. Unlike annotation projection techniques, our model does not need paral...
Jul 15, 2019
Nastase, Vivi; Fritz, Devon; Frank, Anette, 2019, "DeModify", https://doi.org/10.11588/data/KIWEMF, heiDATA, V1
deModify consists of 3631 instances, each with three annotations obtained through CrowdFlower. An instance is a short story in which a modifier is annotated with respect to its impact on the information in the story, assessed through its deletion from the context: crucial, not-cr...