Empirical Linguistics and Computational Language Modeling (LiMo) (Department of Computational Linguistics of Heidelberg University and Leibniz Institute for the German Language)

Data publications of the Leibniz ScienceCampus “Empirical Linguistics and Computational Language Modeling”

The Leibniz ScienceCampus “Empirical Linguistics and Computational Language Modeling” (LiMo) is a cooperative research project between the Leibniz Institute for the German Language (Leibniz-Institut für Deutsche Sprache, IDS) in Mannheim and the Department of Computational Linguistics at Heidelberg University (ICL). The project aims to develop new methods, models, and tools for automatically compiling and analysing large German text corpora that cover different domains, genres, and language varieties.

The project is supported by funds from the Baden-Württemberg Ministry of Science, Research and the Arts and the Leibniz Association together with funds provided by the Leibniz Institute for the German Language and Heidelberg University.

Funding Period: 2015 – 2020


21 to 30 of 82 Results
Mar 6, 2020
van den Berg, Esther, 2020, "German Twitter Titling Corpus", heiDATA, V1, UNF:6:xIy4tRguIiz8xpg52FlxOA== [fileUNF]
The German Twitter Titling Corpus consists of 1904 stance-annotated tweets collected in June/July 2018 mentioning 24 German politicians with a doctoral degree. The Addendum contains an additional 296 stance-annotated tweets from each month of 2018 mentioning 6 left-leaning and 4...
Tab-Delimited - 22.1 KB - MD5: 8b1b4d8169475e1a74aa8ff620b7a483
Tab-Delimited - 119.5 KB - MD5: 2e631a49b1fdd3ffe8a091bcb16482fa
Markdown Text - 1.3 KB - MD5: fba1140865e1ceb050c05897826e3410
Jan 23, 2020
Daza, Angel, 2020, "Encoder-Decoder Model for Semantic Role Labeling", heiDATA, V1
Abstract (Daza & Frank 2019): We propose a Cross-lingual Encoder-Decoder model that simultaneously translates and generates sentences with Semantic Role Labeling annotations in a resource-poor target language. Unlike annotation projection techniques, our model does not need paral...
Markdown Text - 8.4 KB - MD5: 835ab4a78a83f8bea4d55dd6caa51837
ZIP Archive - 42.5 MB - MD5: 5a525ee5066a01138845a1276c110956
Dec 10, 2019
Becker, Maria, 2019, "GER_SET: Situation Entity Type labelled corpus for German", heiDATA, V1
Semantic clause types, also called Situation Entity (SE) types (Smith, 2003), are linguistic characterizations of aspectual properties shown to be useful for tasks like argumentation structure analysis (Becker et al., 2016), genre characterization (Palmer and Friedrich, 2014), and...
ZIP Archive - 414.5 KB - MD5: e1733e5ce7ef02577239d5a9ada0d8ba
Oct 22, 2019
Becker, Maria, 2019, "Genre-sensitive Neural Situation Entity classifier (DE, EN)", heiDATA, V1
This is a classifier for situation entity types as described in Becker et al. (2017). These clause types depend on a combination of syntactic-semantic and contextual features. We explore this task in a deep learning framework, where tuned word representations capture lexical, synta...