Empirical Linguistics and Computational Language Modeling (LiMo)

Data publications of the Leibniz ScienceCampus “Empirical Linguistics and Computational Language Modeling”

The Leibniz ScienceCampus “Empirical Linguistics and Computational Language Modeling” (LiMo) is a cooperative research project between the Leibniz Institute for the German Language (Leibniz-Institut für Deutsche Sprache, IDS) in Mannheim and the Department of Computational Linguistics at Heidelberg University (ICL). The project aims to develop new methods, models, and tools for automatically compiling and analysing large German text corpora covering different domains, genres and language varieties.

The project is supported by funds from the Baden-Württemberg Ministry of Science, Research and the Arts and the Leibniz Association together with funds provided by the Leibniz Institute for the German Language and Heidelberg University.

Funding Period: 2015 – 2020

21 to 30 of 37 Results

Multilingual Modal Sense Classification using a Convolutional Neural Network [Source Code]

Oct 7, 2019

Marasović, Ana, 2019, "Multilingual Modal Sense Classification using a Convolutional Neural Network [Source Code]", https://doi.org/10.11588/data/ERDJDI, heiDATA, V1

Abstract: Modal sense classification (MSC) is a special WSD task that depends on the meaning of the proposition in the modal’s scope. We explore a CNN architecture for classifying modal sense in English and German. We show that CNNs are superior to manually designed feature-based cl...

Negative Sampling for Learning Knowledge Graph Embeddings

Aug 19, 2019

Kotnis, Bhushan, 2019, "Negative Sampling for Learning Knowledge Graph Embeddings", https://doi.org/10.11588/data/YYULL2, heiDATA, V1

Reimplementation of four KG factorization methods and six negative sampling methods. Abstract: Knowledge graphs are large, useful, but incomplete knowledge repositories. They encode knowledge through entities and relations which define each other through the connective structure o...

Neural Dependency Parser with Biaffine Attention

Nov 13, 2023 - Neural Techniques for German Dependency Parsing

Fankhauser, Peter; Do, Bich-Ngoc; Kupietz, Marc, 2023, "Neural Dependency Parser with Biaffine Attention", https://doi.org/10.11588/data/DZ9MUS, heiDATA, V1

This resource contains the code of the dependency parser used in the paper: Fankhauser et al. (2020). "Evaluating a Dependency Parser on DeReKo". The parser is a re-implementation of the neural dependency parser from Dozat and Manning (2017). In addition, we include two pre-trai...

Neural Dependency Parser with Biaffine Attention and BERT Embeddings

Nov 13, 2023 - Neural Techniques for German Dependency Parsing

Do, Bich-Ngoc; Rehbein, Ines, 2023, "Neural Dependency Parser with Biaffine Attention and BERT Embeddings", https://doi.org/10.11588/data/0U6IWL, heiDATA, V1

This resource contains the code of the dependency parser used in the paper: Do and Rehbein (2020). "Parsers Know Best: German PP Attachment Revisited". The parser is a re-implementation of the neural dependency parser from Dozat and Manning (2017) and is extended to use the BERT...

Neural PP Attachment Disambiguation Systems

Nov 13, 2023 - Neural Techniques for German Dependency Parsing

Do, Bich-Ngoc; Rehbein, Ines, 2023, "Neural PP Attachment Disambiguation Systems", https://doi.org/10.11588/data/DKWKGJ, heiDATA, V1

This resource contains code for different types of neural PP attachment disambiguation systems: A disambiguation system inspired by de Kok et al. (2017) but with the ranking loss function. A disambiguation system with biaffine attention similar to the neural dependency parser in...

Neural Rerankers for Dependency Parsing

Nov 13, 2023 - Neural Techniques for German Dependency Parsing

Do, Bich-Ngoc; Rehbein, Ines, 2023, "Neural Rerankers for Dependency Parsing", https://doi.org/10.11588/data/NNGPQZ, heiDATA, V1

This resource contains code for different types of neural rerankers (RCNN, RCNN-shared and GCN) from the paper: Do and Rehbein (2020). "Neural Reranking for Dependency Parsing: An Evaluation". We also include in this resource the pre-trained models of different rerankers on 3 lan...

Opinion role extractor

Sep 2, 2019

Wiegand, Michael, 2019, "Opinion role extractor", https://doi.org/10.11588/data/3W7AQP, heiDATA, V1

System for the Extraction of Subjective Expressions, Sentiment Sources and Sentiment Targets from German Text

Pre-trained POS tagging models for German social media

Mar 26, 2020

Rehbein, Ines; Ruppenhofer, Josef; Zimmermann, Victor, 2020, "Pre-trained POS tagging models for German social media", https://doi.org/10.11588/data/W3JBV4, heiDATA, V1

Pre-trained POS tagging models for the HunPos tagger (Halácsy et al. 2007), the biLSTM-char-CRF tagger (Reimers & Gurevych 2017), and Online-Flors (Yin et al. 2015). References: Halácsy, P., Kornai, A., and Oravecz, C. (2007). HunPos: An open source trigram tagger. In Proceedings of th...

Real-World PP Attachment Disambiguation Dataset

Nov 13, 2023 - Neural Techniques for German Dependency Parsing

Do, Bich-Ngoc; Rehbein, Ines, 2023, "Real-World PP Attachment Disambiguation Dataset", https://doi.org/10.11588/data/NB46XR, heiDATA, V1

This resource contains a German dataset for real-world PP attachment disambiguation. The creation and analysis of the dataset, along with experimental results, are described in the paper: Do and Rehbein (2020). "Parsers Know Best: German PP Attachment Revisited".

Sentiment Compound Data (DE)

Sep 5, 2019

Wiegand, Michael; Bocionek, Christine; Ruppenhofer, Josef, 2019, "Sentiment Compound Data (DE)", https://doi.org/10.11588/data/LSTRK3, heiDATA, V1

This dataset contains gold standards that are required for building a classifier that automatically extracts opinion (noun) compounds.
