Empirical Linguistics and Computational Language Modeling (LiMo) (Department of Computational Linguistics of Heidelberg University and Leibniz Institute for the German Language)

Data publications of the Leibniz ScienceCampus “Empirical Linguistics and Computational Language Modeling”

The Leibniz ScienceCampus “Empirical Linguistics and Computational Language Modeling” (LiMo) is a cooperative research project between the Leibniz Institute for the German Language (Leibniz-Institut für Deutsche Sprache, IDS) in Mannheim and the Department of Computational Linguistics at Heidelberg University (ICL). The project aims to develop new methods, models, and tools for automatically compiling and analysing large German text corpora that cover different domains, genres, and language varieties.

The project is supported by funds from the Baden-Württemberg Ministry of Science, Research and the Arts and the Leibniz Association together with funds provided by the Leibniz Institute for the German Language and Heidelberg University.

Funding Period: 2015 – 2020

11 to 20 of 58 Results
Oct 8, 2019 - Affixoid Dataset (DE)
Tab-Delimited - 61.6 KB - MD5: 8e2e107227a8ab7d59fb9a48dfa9f475
Oct 8, 2019 - Affixoid Dataset (DE)
Plain Text - 758 B - MD5: 017f60a9c77782cd97a45c4dd74e117c
Oct 7, 2019
Marasović, Ana; Zhou, Mengfei; Frank, Anette, 2019, "The MSC Data Set", heiDATA, V1
From this page you can download resources we created for modal sense classification as reported in Zhou et al. (2015), Marasović et al. (2016) and Marasović and Frank (2015) (see "Related Publication" below): Heuristically sense-annotated training data acquired from EUROPARL and...
Oct 7, 2019 - The MSC Data Set
ZIP Archive - 6.2 MB - MD5: 98dbe1d608c24c3dfd31f166daeee77b
Oct 7, 2019
Marasović, Ana, 2019, "Multilingual Modal Sense Classification using a Convolutional Neural Network [Source Code]", heiDATA, V1
Abstract: Modal sense classification (MSC) is a special WSD task that depends on the meaning of the proposition in the modal’s scope. We explore a CNN architecture for classifying modal sense in English and German. We show that CNNs are superior to manually designed feature-based cl...
Sep 5, 2019
Wiegand, Michael; Bocionek, Christine; Ruppenhofer, Josef, 2019, "Sentiment Compound Data (DE)", heiDATA, V1
This dataset contains gold standards that are required for building a classifier that automatically extracts opinion (noun) compounds.
Plain Text - 34.6 KB - MD5: 13ac9f60aa9ba2fbb42d0b9d2b9f6e2f
Plain Text - 18.2 KB - MD5: 4a17ffc27c9f3b240fbf4fe17783c89c
Plain Text - 3.4 KB - MD5: 5900f0947dba284902650ebd6b5fb2a6