heiDATA

Publication Year: 2019 Subject: Computer and Information Science Subject: Arts and Humanities

1 to 10 of 20 Results

Empirical Linguistics and Computational Language Modeling (LiMo) (Department of Computational Linguistics of Heidelberg University and Leibniz Institute for the German Language)

Jul 12, 2019

Data publications of the Leibniz ScienceCampus “Empirical Linguistics and Computational Language Modeling”. The Leibniz ScienceCampus “Empirical Linguistics and Computational Language Modeling” (LiMo) is a cooperative research project between the Leibniz Institute for the German L...

AMR parse quality prediction [Source Code]

Jul 12, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)

Opitz, Juri, 2019, "AMR parse quality prediction [Source Code]", https://doi.org/10.11588/data/STHBGW, heiDATA, V1

Accuracy prediction for AMR parsing: predicts 33 accuracy metrics for a given sentence and its (automatic) AMR parse. Abstract (Opitz and Frank, 2019): Semantic proto-role labeling (SPRL) is an alternative to semantic role labeling (SRL) that moves beyond a categorical definition o...

DeModify

Jul 15, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)

Nastase, Vivi; Fritz, Devon; Frank, Anette, 2019, "DeModify", https://doi.org/10.11588/data/KIWEMF, heiDATA, V1

deModify consists of 3631 instances, each with three annotations obtained through CrowdFlower. An instance is a short story in which a modifier is annotated with respect to its impact on the information in the story, assessed through its deletion from the context: crucial, not-cr...

ACL word segmentation correction

Jul 15, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)

Nastase, Vivi; Hitschler, Julian, 2019, "ACL word segmentation correction", https://doi.org/10.11588/data/VK99LU, heiDATA, V1

The data in this collection consists of two parallel directories, one ("raw") containing the raw text of 18850 articles from the ACL 2013/02 collection, the other ("re-segmented") the word-resegmented version of these articles, obtained using nematus, a seq2seq neural model used...

Abstract graphs, abstract paths, grounded paths for Freebase and NELL

Jul 15, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)

Nastase, Vivi; Kotnis, Bhushan, 2019, "Abstract graphs, abstract paths, grounded paths for Freebase and NELL", https://doi.org/10.11588/data/AVLFPZ, heiDATA, V1

We describe a method for representing knowledge graphs that captures an intensional representation of the original extensional information. This representation is very compact, and it abstracts away from individual links, allowing us to find better path candidates, as shown by the...

KGE Algorithms

Aug 19, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)

Kotnis, Bhushan, 2019, "KGE Algorithms", https://doi.org/10.11588/data/CSXYSS, heiDATA, V1

An updated method for link prediction that uses a regularization factor that models relation argument types. Abstract (Kotnis and Nastase, 2017): Learning relations based on evidence from knowledge repositories relies on processing the available relation instances. Knowledge repos...

Negative Sampling for Learning Knowledge Graph Embeddings

Aug 19, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)

Kotnis, Bhushan, 2019, "Negative Sampling for Learning Knowledge Graph Embeddings", https://doi.org/10.11588/data/YYULL2, heiDATA, V1

Reimplementation of four KG factorization methods and six negative sampling methods. Abstract: Knowledge graphs are large, useful, but incomplete knowledge repositories. They encode knowledge through entities and relations which define each other through the connective structure o...

Twitter Titling Corpus

Aug 23, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)

van den Berg, Esther; Korfhage, Katharina; Ruppenhofer, Josef; Wiegand, Michael; Markert, Katja, 2019, "Twitter Titling Corpus", https://doi.org/10.11588/data/IOHXDF, heiDATA, V1, UNF:6:+F3lLKziwMvjy+xyktkilw== [fileUNF]

The Twitter Titling Corpus contains 4002 stance-annotated tweets collected between 20 June 2017 and 30 August 2017 mentioning 6 presidents. Each tweet is annotated for the naming form used to refer to the president, for the purpose of a study on the relation between naming variat...

Opinion role extractor

Sep 2, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)

Wiegand, Michael, 2019, "Opinion role extractor", https://doi.org/10.11588/data/3W7AQP, heiDATA, V1

System for the Extraction of Subjective Expressions, Sentiment Sources and Sentiment Targets from German Text

Lexicon of Abusive Words (EN)

Sep 2, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)

Wiegand, Michael, 2019, "Lexicon of Abusive Words (EN)", https://doi.org/10.11588/data/MKPEYV, heiDATA, V1

This gold standard contains a bootstrapped lexicon of abusive words. The lexicon comprises a large set of English negative polar expressions annotated as either abusive or not.
