1 to 10 of 25 Results
Jan 31, 2019 - AIPHES
Heinzerling, Benjamin, 2019, "Selectional Preference Embeddings (EMNLP 2017)", heiDATA, V1
Joint embeddings of selectional preferences, words, and fine-grained entity types. The vocabulary consists of: verbs and their dependency relation separated by "@", e.g. "sink@nsubj" or "elect@dobj"; words and short noun phrases, e.g. "Titanic"; fine-grained entity types using the...
Feb 4, 2019 - AIPHES
Marasovic, Ana, 2019, "SRL4ORL: Improving Opinion Role Labeling Using Multi-Task Learning With Semantic Role Labeling [Source Code]", heiDATA, V1
This repository contains code for reproducing experiments done in Marasovic and Frank (2018). Paper abstract: For over a decade, machine learning has been used to extract opinion-holder-target structures from text to answer the question "Who expressed what kind of sentiment towar...
Feb 4, 2019 - AIPHES
Marasovic, Ana, 2019, "Abstract Anaphora Resolution [Source Code]", heiDATA, V1
Abstract Anaphora Resolution (AAR) aims to find the interpretation of nominal expressions (e.g., this result, those two actions) and pronominal expressions (e.g., this, that, it) that refer to abstract-object-antecedents such as facts, events, plans, actions, or situations. The f...
Feb 6, 2019 - AIPHES
Heinzerling, Benjamin, 2019, "Source Code, Data and Additional Material for the Thesis: 'Aspects of Coherence for Entity Analysis'", heiDATA, V1
This dataset contains source code and system output used in the PhD thesis "Aspects of Coherence for Entity Analysis". This dataset is split into three parts corresponding to the chapters describing the three main contributions of the thesis: chapter3.tar.gz: Java source code for...
Feb 6, 2019 - AIPHES
Heinzerling, Benjamin, 2019, "BPEmb: Pre-trained Subword Embeddings in 275 Languages (LREC 2018)", heiDATA, V1
BPEmb is a collection of pre-trained subword unit embeddings in 275 languages, based on Byte-Pair Encoding (BPE). In an evaluation using fine-grained entity typing as testbed, BPEmb performs competitively, and for some languages better than alternative subword approaches, while r...
Jun 6, 2019 - IWR Computer Graphics
Mara, Hubert, 2019, "HeiCuBeDa Hilprecht - Heidelberg Cuneiform Benchmark Dataset for the Hilprecht Collection", heiDATA, V1
The number of known cuneiform tablets is assumed to be in the hundreds of thousands. A fraction has been published by printing photographs and manual tracings in books, which is collected by the online Cuneiform Digital Library Initiative (CDLI) catalog including some of these im...
Jul 12, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)
Opitz, Juri, 2019, "AMR parse quality prediction [Source Code]", heiDATA, V1
Accuracy prediction for AMR parsing predicts 33 accuracy metrics for a given sentence and its (automatic) AMR parse. Abstract (Opitz and Frank, 2019): Semantic proto-role labeling (SPRL) is an alternative to semantic role labeling (SRL) that moves beyond a categorical definition o...
Jul 15, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)
Nastase, Vivi; Fritz, Devon; Frank, Anette, 2019, "DeModify", heiDATA, V1
deModify consists of 3631 instances, each with three annotations obtained through CrowdFlower. An instance is a short story in which a modifier is annotated with respect to its impact on the information in the story, assessed through its deletion from the context: crucial, not-cr...
Jul 15, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)
Nastase, Vivi; Hitschler, Julian, 2019, "ACL word segmentation correction", heiDATA, V1
The data in this collection consists of two parallel directories, one ("raw") containing the raw text of 18850 articles from the ACL 2013/02 collection, the other ("re-segmented") the word-resegmented version of these articles, obtained using nematus, a seq2seq neural model used...
Jul 15, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)
Nastase, Vivi; Kotnis, Bhushan, 2019, "Abstract graphs, abstract paths, grounded paths for Freebase and NELL", heiDATA, V1
We describe a method for representing knowledge graphs that capture an intensional representation of the original extensional information. This representation is very compact, and it abstracts away from individual links, allowing us to find better path candidates, as shown by the...