Empirical Linguistics and Computational Language Modeling (LiMo)

Data publications of the Leibniz ScienceCampus “Empirical Linguistics and Computational Language Modeling”

The Leibniz ScienceCampus “Empirical Linguistics and Computational Language Modeling” (LiMo) is a cooperative research project between the Leibniz Institute for the German Language (Leibniz-Institut für Deutsche Sprache, IDS) in Mannheim and the Department of Computational Linguistics at Heidelberg University (ICL). The project aims to develop new methods, models, and tools for automatically compiling and analysing large German text corpora that cover different domains, genres, and language varieties.

The project is supported by funds from the Baden-Württemberg Ministry of Science, Research and the Arts and the Leibniz Association together with funds provided by the Leibniz Institute for the German Language and Heidelberg University.

Funding Period: 2015 – 2020

41 to 50 of 184 Results

sentimentViewsOfOpinionNouns.txt Sep 5, 2019 - Sentiment Compound Data (DE) Plain Text - 31.9 KB - MD5: bf369686743f258705fd6cc675cfcaf0 Data
supplementaryNotes.pdf Sep 5, 2019 - Sentiment Compound Data (DE) Adobe PDF - 126.2 KB - MD5: 846a2849d5f0f4a119d504d79260c6fa Documentation
Multilingual Modal Sense Classification using a Convolutional Neural Network [Source Code] Oct 7, 2019 Marasović, Ana, 2019, "Multilingual Modal Sense Classification using a Convolutional Neural Network [Source Code]", https://doi.org/10.11588/data/ERDJDI, heiDATA, V1 Abstract Modal sense classification (MSC) is a special WSD task that depends on the meaning of the proposition in the modal’s scope. We explore a CNN architecture for classifying modal sense in English and German. We show that CNNs are superior to manually designed feature-based cl...
modal-sense-classifcation.zip Oct 7, 2019 - Multilingual Modal Sense Classification using a Convolutional Neural Network [Source Code] ZIP Archive - 3.0 MB - MD5: 63c05670056bb1992a1e5ec370f0ccf3
The MSC Data Set Oct 7, 2019 Marasović, Ana; Zhou, Mengfei; Frank, Anette, 2019, "The MSC Data Set", https://doi.org/10.11588/data/JEESIQ, heiDATA, V1 From this page you can download resources we created for modal sense classification as reported in Zhou et al. (2015), Marasović et al. (2016) and Marasović and Frank (2015) (see "Related Publication" below): Heuristically sense-annotated training data acquired from EUROPARL and...
MSC Data Set.zip Oct 7, 2019 - The MSC Data Set ZIP Archive - 6.2 MB - MD5: 98dbe1d608c24c3dfd31f166daeee77b
Affixoid Dataset (DE) Oct 8, 2019 Ruppenhofer, Josef, 2019, "Affixoid Dataset (DE)", https://doi.org/10.11588/data/QKF4LT, heiDATA, V1, UNF:6:+MGK9lTPTXx7Rclu1BpPnw== [fileUNF] The dataset contains the manual annotations for the COLING 2018 submission "Distinguishing affixoid formations from compounds" by Josef Ruppenhofer, Michael Wiegand, Rebecca Wilm and Katja Markert. 1788 complex words containing one of 7 German suffixoid candidates (e.g. -hai, -go...
dataset_annotations.tab Oct 8, 2019 - Affixoid Dataset (DE) Tabular Data - 61.6 KB - 1 Variables, 1787 Observations - UNF:6:+MGK9lTPTXx7Rclu1BpPnw== Data
README.txt Oct 8, 2019 - Affixoid Dataset (DE) Plain Text - 758 B - MD5: 017f60a9c77782cd97a45c4dd74e117c Documentation
COREC – A neural multi-label COmmonsense RElation Classification system Oct 22, 2019 Becker, Maria, 2019, "COREC – A neural multi-label COmmonsense RElation Classification system", https://doi.org/10.11588/data/E5EHBV, heiDATA, V1 We examine the learnability of Commonsense knowledge relations as represented in CONCEPTNET. We develop a neural open world multi-label classification system that focuses on the evaluation of classification accuracy for individual relations. Based on an in-depth study of the spec...
