Feb 17, 2021 - X-SRL Dataset and mBERT Word Aligner
ZIP Archive - 37.7 KB - MD5: 6b35c476556dfdb2b9b25a7a1cdc755d
Feb 17, 2021
Daza, Angel, 2021, "X-SRL Dataset and mBERT Word Aligner", https://doi.org/10.11588/data/HVXXIJ, heiDATA, V1
This resource contains code for automatically aligning words in parallel sentences using pre-trained multilingual BERT embeddings. The alignments can be used to transfer source annotations (for example, labeled English sentences) to the target side (for example, a German translation of th...
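As a rough illustration of the idea described above, embedding-based word alignment followed by annotation transfer can be sketched as follows. This is a minimal sketch under stated assumptions, not the resource's actual procedure, which this listing does not describe: the two-dimensional vectors are hand-made stand-ins for multilingual BERT embeddings, the greedy best-match strategy is one simple choice among many, and the function names (`cosine`, `align`, `project_labels`) and the `ARG0`/`V` labels are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def align(src_vecs, tgt_vecs):
    """Greedily map each source index to its most similar target index.

    Assumes tgt_vecs is non-empty.
    """
    alignment = {}
    for i, u in enumerate(src_vecs):
        scores = [cosine(u, v) for v in tgt_vecs]
        alignment[i] = max(range(len(scores)), key=scores.__getitem__)
    return alignment


def project_labels(src_labels, alignment, tgt_len, default="O"):
    """Copy every non-default source label onto its aligned target token."""
    tgt_labels = [default] * tgt_len
    for i, j in alignment.items():
        if src_labels[i] != default:
            tgt_labels[j] = src_labels[i]
    return tgt_labels


# Toy data: hand-made vectors standing in for mBERT embeddings (assumption).
src_tokens = ["She", "sleeps"]
tgt_tokens = ["Sie", "schläft"]
src_vecs = [[1.0, 0.1], [0.1, 1.0]]
tgt_vecs = [[0.9, 0.2], [0.2, 0.8]]
src_labels = ["ARG0", "V"]  # illustrative labels on the English side

alignment = align(src_vecs, tgt_vecs)
tgt_labels = project_labels(src_labels, alignment, len(tgt_tokens))
print(list(zip(tgt_tokens, tgt_labels)))
```

In a real setting the vectors would come from a multilingual encoder, and a greedy one-directional match like this one can map several source words to the same target word, so practical aligners usually add further constraints.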
Aug 23, 2019 - Twitter Titling Corpus
Tabular Data - 219.0 KB - 5 Variables, 4002 Observations - UNF:6:+F3lLKziwMvjy+xyktkilw==
Aug 23, 2019
van den Berg, Esther; Korfhage, Katharina; Ruppenhofer, Josef; Wiegand, Michael; Markert, Katja, 2019, "Twitter Titling Corpus", https://doi.org/10.11588/data/IOHXDF, heiDATA, V1, UNF:6:+F3lLKziwMvjy+xyktkilw== [fileUNF]
The Twitter Titling Corpus contains 4002 stance-annotated tweets collected between 20 June 2017 and 30 August 2017 mentioning 6 presidents. Each tweet is annotated for the naming form used to refer to the president, for the purpose of a study on the relation between naming variat...
Mar 26, 2020 - tweeDe
Unknown - 945.9 KB - MD5: 32d20db78b577a921d9fd4bc3868770e
Mar 26, 2020
Rehbein, Ines; Ruppenhofer, Josef; Do, Bich-Ngoc, 2020, "tweeDe", https://doi.org/10.11588/data/S90S35, heiDATA, V1
A German Twitter treebank with more than 12,000 tokens from 519 tweets, annotated in the Universal Dependencies (UD) framework.
Nov 13, 2023 - Real-World PP Attachment Disambiguation Dataset
Gzip Archive - 1.7 MB - MD5: b2d04463fd249e1a19e641a99c65e70d
Nov 13, 2023 - Real-World PP Attachment Disambiguation Dataset
Gzip Archive - 4.3 MB - MD5: b37e0268b451b32e52948e47baf80603
Nov 13, 2023 - Topological Field Labeler for German
ZIP Archive - 32.4 KB - MD5: 3bf4fe4ba2daaade0ae9c765233145c3
Nov 13, 2023 - Neural Techniques for German Dependency Parsing
Do, Bich-Ngoc; Rehbein, Ines, 2023, "Topological Field Labeler for German", https://doi.org/10.11588/data/YYNQFF, heiDATA, V1
This resource contains the code of the topological field labeler used in the paper: Do and Rehbein (2020). "Parsers Know Best: German PP Attachment Revisited". For this tool, labeling topological fields is formulated as a sequence labeling task. We also include in this resource two pre-...