Empirical Linguistics and Computational Language Modeling (LiMo)

Data publications of the Leibniz ScienceCampus “Empirical Linguistics and Computational Language Modeling”

The Leibniz ScienceCampus “Empirical Linguistics and Computational Language Modeling” (LiMo) is a cooperative research project between the Leibniz Institute for the German Language (Leibniz-Institut für Deutsche Sprache, IDS) in Mannheim and the Department of Computational Linguistics at Heidelberg University (ICL). The project aims to develop new methods, models, and tools for automatically compiling and analysing large German text corpora that cover different domains, genres, and language varieties.

The project is supported by funds from the Baden-Württemberg Ministry of Science, Research and the Arts and the Leibniz Association together with funds provided by the Leibniz Institute for the German Language and Heidelberg University.

Funding Period: 2015 – 2020

51 to 60 of 185 Results

GTTC_addendum.tab
Jan 20, 2021 - German Twitter Titling Corpus
Tabular Data - 19.7 KB - 5 Variables, 296 Observations - UNF:6:e8JLFj0rmt8hCbrLS38QTg==
Data

Head Selection Parsers and LSTM Labelers
Nov 13, 2023 - Neural Techniques for German Dependency Parsing
Do, Bich-Ngoc; Rehbein, Ines; Frank, Anette, 2023, "Head Selection Parsers and LSTM Labelers", https://doi.org/10.11588/data/BPWWJL, heiDATA, V1
This resource contains code, data and pre-trained models for various types of neural dependency parsers and LSTM labelers used in the papers: Do et al. (2017). "What Do We Need to Know About an Unknown Word When Parsing German" Do and Rehbein (2017). "Evaluating LSTM Models for G...

holderVsTarget.gold.txt
Sep 5, 2019 - Sentiment Compound Data (DE)
Plain Text - 34.6 KB - MD5: 13ac9f60aa9ba2fbb42d0b9d2b9f6e2f
Data

hunpos-social-media.model.bz2
Mar 26, 2020 - Pre-trained POS tagging models for German social media
Bzip Archive - 6.0 MB - MD5: 130e09643a6ec5b26bcdf520571f261d
Data

KGE Algorithms
Aug 19, 2019
Kotnis, Bhushan, 2019, "KGE Algorithms", https://doi.org/10.11588/data/CSXYSS, heiDATA, V1
An updated method for link prediction that uses a regularization factor that models relation argument types. Abstract (Kotnis and Nastase, 2017): Learning relations based on evidence from knowledge repositories relies on processing the available relation instances. Knowledge repos...

kge-rl-master.zip
Aug 19, 2019 - KGE Algorithms
ZIP Archive - 19.4 KB - MD5: d2e8ac74e3f20d2cdec2225962c7e2f0
Code

kge-rl-master.zip
Aug 19, 2019 - Negative Sampling for Learning Knowledge Graph Embeddings
ZIP Archive - 19.4 KB - MD5: d2e8ac74e3f20d2cdec2225962c7e2f0
Code

Lexicon of Abusive Words (EN)
Sep 2, 2019
Wiegand, Michael, 2019, "Lexicon of Abusive Words (EN)", https://doi.org/10.11588/data/MKPEYV, heiDATA, V1
This gold standard contains a bootstrapped lexicon of abusive words. The lexicon comprises a large set of English negative polar expressions annotated as either abusive or not.

lexicon-of-abusive-words-master.zip
Sep 2, 2019 - Lexicon of Abusive Words (EN)
ZIP Archive - 738.4 KB - MD5: 46f33f5b7a9c866b1a2fb6dc956b945d

LICENSE
Sep 5, 2019 - Sentiment View Lexicon (EN)
Plain Text - 18.2 KB - MD5: 4a17ffc27c9f3b240fbf4fe17783c89c
Documentation

