Part 1: Document Description
|
Citation |
|
---|---|
Title: |
X-SRL Dataset and mBERT Word Aligner |
Identification Number: |
doi:10.11588/data/HVXXIJ |
Distributor: |
heiDATA |
Date of Distribution: |
2021-02-17 |
Version: |
1 |
Bibliographic Citation: |
Daza, Angel, 2021, "X-SRL Dataset and mBERT Word Aligner", https://doi.org/10.11588/data/HVXXIJ, heiDATA, V1 |
Citation |
|
Title: |
X-SRL Dataset and mBERT Word Aligner |
Identification Number: |
doi:10.11588/data/HVXXIJ |
Authoring Entity: |
Daza, Angel (Leibniz Institute for the German Language / Department of Computational Linguistics, Heidelberg University) |
Date of Production: |
2020 |
Distributor: |
heiDATA |
Access Authority: |
Daza, Angel |
Holdings Information: |
https://doi.org/10.11588/data/HVXXIJ |
Study Scope |
|
Keywords: |
Arts and Humanities, Computer and Information Science, word alignment, annotation projection, multilingual semantic role labeling, SRL, multilingual BERT |
Topic Classification: |
Semantic Role Labeling |
Abstract: |
This code implements a method for automatically aligning words in parallel sentences using pre-trained multilingual BERT (mBERT) embeddings. The alignments can be used to project source annotations (for example, labeled English sentences) onto the target side (for example, a German translation of the sentence) by transferring each label to its best-aligned target word. The newly labeled data can then be used to train multilingual state-of-the-art (SOTA) models and improve their performance, especially for lower-resource languages. |
Kind of Data: |
program source code |
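The alignment-and-projection idea described in the abstract can be illustrated with a minimal sketch. It is not the code shipped in xsrl_mbert_aligner.zip: the function names `align_words` and `project_labels`, the toy 3-dimensional vectors (standing in for mBERT word embeddings), the cosine-similarity criterion, and the PropBank-style labels are all assumptions made for illustration. The released aligner may handle subwords, thresholds, and many-to-one alignments differently.

```python
import numpy as np


def align_words(src_vecs, tgt_vecs):
    """Return, for each source word, the index of the most similar target word.

    Similarity is cosine similarity between word vectors (an assumed criterion).
    """
    src = src_vecs / np.linalg.norm(src_vecs, axis=1, keepdims=True)
    tgt = tgt_vecs / np.linalg.norm(tgt_vecs, axis=1, keepdims=True)
    similarity = src @ tgt.T          # shape: (n_source_words, n_target_words)
    return similarity.argmax(axis=1)  # best-aligned target index per source word


def project_labels(src_labels, alignment, tgt_len, empty="_"):
    """Copy each non-empty source label to its best-aligned target position."""
    tgt_labels = [empty] * tgt_len
    for src_idx, label in enumerate(src_labels):
        if label != empty:
            tgt_labels[int(alignment[src_idx])] = label
    return tgt_labels


# Toy example: hand-made vectors used in place of real mBERT embeddings.
src_words = ["She", "bought", "books"]
tgt_words = ["Sie", "hat", "Bücher", "gekauft"]
src_vecs = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])
tgt_vecs = np.array([[0.9, 0.1, 0.0],
                     [0.3, 0.3, 0.3],
                     [0.0, 0.1, 0.9],
                     [0.1, 0.9, 0.0]])
src_labels = ["A0", "V", "A1"]  # hypothetical PropBank-style role labels

alignment = align_words(src_vecs, tgt_vecs)
tgt_labels = project_labels(src_labels, alignment, len(tgt_words))
for word, label in zip(tgt_words, tgt_labels):
    print(word, label)
```

In this sketch every labeled source word is forced onto some target word. A practical pipeline would likely also need a similarity threshold or a filtering step to avoid projecting labels across poor alignments.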
Methodology and Processing |
|
Sources Statement |
|
Data Access |
|
Other Study Description Materials |
|
Related Publications |
|
Citation |
|
Title: |
Daza, Angel and Frank, Anette (2020). X-SRL: A Parallel Cross-lingual Semantic Role Labeling Dataset. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, November 16-20, 2020, Online. |
Identification Number: |
arXiv:2010.01998 |
Bibliographic Citation: |
Daza, Angel and Frank, Anette (2020). X-SRL: A Parallel Cross-lingual Semantic Role Labeling Dataset. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, November 16-20, 2020, Online. |
Label: |
README.md |
Notes: |
text/markdown |
Label: |
xsrl_mbert_aligner.zip |
Notes: |
application/zip |