Data publications of the Leibniz ScienceCampus “Empirical Linguistics and Computational Language Modeling”

The Leibniz ScienceCampus “Empirical Linguistics and Computational Language Modeling” (LiMo) is a cooperative research project between the Leibniz Institute for the German Language (Leibniz-Institut für Deutsche Sprache, IDS) in Mannheim and the Department of Computational Linguistics (ICL) at Heidelberg University. The project aims to develop new methods, models, and tools for automatically compiling and analysing large German text corpora that cover different domains, genres, and language varieties.

The project is supported by funds from the Baden-Württemberg Ministry of Science, Research and the Arts and from the Leibniz Association, together with funds provided by the Leibniz Institute for the German Language and Heidelberg University.

Funding Period: 2015 – 2020

11 to 20 of 185 Results
Nov 13, 2023 - Neural Techniques for German Dependency Parsing
Do, Bich-Ngoc; Rehbein, Ines, 2023, "Tool for Extracting PP Attachment Disambiguation Dataset", https://doi.org/10.11588/data/RHD3KS, heiDATA, V1
This resource contains code to extract a PP attachment disambiguation dataset as described in the paper: Do and Rehbein (2020). "Parsers Know Best: German PP Attachment Revisited". The input is in CoNLL format, and the output format is similar to the one described in de Kok et al...
ZIP Archive - 36.0 KB - MD5: 3e8f69e918c003c92700af524474ad31
Code
Oct 7, 2019
Marasović, Ana; Zhou, Mengfei; Frank, Anette, 2019, "The MSC Data Set", https://doi.org/10.11588/data/JEESIQ, heiDATA, V1
From this page you can download resources we created for modal sense classification as reported in Zhou et al. (2015), Marasović et al. (2016) and Marasović and Frank (2015) (see "Related Publication" below): Heuristically sense-annotated training data acquired from EUROPARL and...
Adobe PDF - 126.2 KB - MD5: 846a2849d5f0f4a119d504d79260c6fa
Documentation
ZIP Archive - 42.5 MB - MD5: 5a525ee5066a01138845a1276c110956
Code
Gzip Archive - 361.8 MB - MD5: c31097376ca9d2b08c013fcdbba10c6d
Data
Gzip Archive - 47.1 MB - MD5: 29fe91de0ec97d2282ebdac310e66e9d
Data
Gzip Archive - 32.4 MB - MD5: a7fa938e3000c7e0427ef3c2b3ec8d28
Data
Gzip Archive - 32.5 MB - MD5: 2d06256605fad989aa18b00e46922463
Data
Gzip Archive - 68.0 MB - MD5: a362283a76428bb053c6f59a928b3a8d
Data