Part 1: Document Description
|
Citation |
|
---|---|
Title: |
A harmonised testsuite for social media POS tagging (DE) |
Identification Number: |
doi:10.11588/data/KXLMHN |
Distributor: |
heiDATA |
Date of Distribution: |
2020-03-26 |
Version: |
1 |
Bibliographic Citation: |
Rehbein, Ines; Ruppenhofer, Josef; Zimmermann, Victor, 2020, "A harmonised testsuite for social media POS tagging (DE)", https://doi.org/10.11588/data/KXLMHN, heiDATA, V1 |
Citation |
|
Title: |
A harmonised testsuite for social media POS tagging (DE) |
Identification Number: |
doi:10.11588/data/KXLMHN |
Authoring Entity: |
Rehbein, Ines (Leibniz Institute for the German Language) |
Ruppenhofer, Josef (Leibniz Institute for the German Language) |
|
Zimmermann, Victor (Department of Computational Linguistics, Heidelberg University) |
|
Date of Production: |
2018 |
Distributor: |
heiDATA |
Access Authority: |
Rehbein, Ines |
Holdings Information: |
https://doi.org/10.11588/data/KXLMHN |
Study Scope |
|
Keywords: |
Arts and Humanities, Computer and Information Science, POS tagging, German, Tweets, German web data |
Topic Classification: |
Social media data, POS tagging |
Abstract: |
A harmonised POS testsuite of web data, computer-mediated communication (CMC) and Twitter microtext, with word forms and STTS POS tags (plus some additional CMC-specific tags). The UD POS tags were derived automatically from the STTS POS tags. The data does not contain (manually corrected) lemma information. The original data comes from three sources: a Twitter dataset with 21,181 tokens, and two datasets from the Empirist shared task 2015: web data (12,718 tokens) and computer-mediated communication (10,505 tokens). |
Kind of Data: |
archived tab-separated format (CoNLL-U) |
Methodology and Processing |
|
Sources Statement |
|
Data Access |
|
Other Study Description Materials |
|
Related Publications |
|
Citation |
|
Title: |
Rehbein, Ines; Ruppenhofer, Josef; Zimmermann, Victor (2018): A harmonised testsuite for POS tagging of German social media data. In: Proceedings of the 27th International Conference on Computational Linguistics. September 19-21, 2018, Vienna, Austria. |
Bibliographic Citation: |
Rehbein, Ines; Ruppenhofer, Josef; Zimmermann, Victor (2018): A harmonised testsuite for POS tagging of German social media data. In: Proceedings of the 27th International Conference on Computational Linguistics. September 19-21, 2018, Vienna, Austria. |
Label: |
social-media-POS-testsuite.conllu |
Notes: |
application/octet-stream |
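
The record describes the file as tab-separated CoNLL-U with word forms, STTS POS tags and converted UD POS tags. The following is a minimal sketch of how such a file might be read, assuming the standard CoNLL-U column order (ID, FORM, LEMMA, UPOS, XPOS, ...) with the UD tag in the UPOS column and the STTS tag in the XPOS column. That layout is an assumption and has not been checked against the distributed file; the function names `read_conllu` and `tag_counts` and the sample sentences are invented for illustration.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple

Token = Tuple[str, str, str]  # (form, UD tag, STTS tag)


def read_conllu(lines: Iterable[str]) -> Iterator[List[Token]]:
    """Yield sentences as lists of (form, upos, xpos) tuples.

    Assumes standard CoNLL-U columns: FORM is column 2, UPOS column 4,
    XPOS column 5. Comment lines, multiword-token ranges and empty nodes
    are skipped.
    """
    sentence: List[Token] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        if len(cols) < 5:
            continue  # too few columns to hold FORM, UPOS and XPOS
        token_id = cols[0]
        if "-" in token_id or "." in token_id:
            continue  # range IDs (1-2) and empty nodes (1.1)
        sentence.append((cols[1], cols[3], cols[4]))
    if sentence:
        yield sentence


def tag_counts(sentences: Iterable[List[Token]]) -> Tuple[Counter, Counter]:
    """Count UD (UPOS) and STTS (XPOS) tags over all tokens."""
    upos: Counter = Counter()
    xpos: Counter = Counter()
    for sent in sentences:
        for _form, u, x in sent:
            upos[u] += 1
            xpos[x] += 1
    return upos, xpos


# Invented two-sentence sample, not taken from the dataset.
SAMPLE = (
    "# sent_id = example-1\n"
    "1\tDas\t_\tPRON\tPDS\t_\t_\t_\t_\t_\n"
    "2\tist\t_\tAUX\tVAFIN\t_\t_\t_\t_\t_\n"
    "3\tgut\t_\tADJ\tADJD\t_\t_\t_\t_\t_\n"
    "\n"
    "1\tHallo\t_\tINTJ\tITJ\t_\t_\t_\t_\t_\n"
)

if __name__ == "__main__":
    sents = list(read_conllu(SAMPLE.splitlines(keepends=True)))
    upos, xpos = tag_counts(sents)
    print(len(sents), sum(upos.values()), dict(xpos))
```

To apply the sketch to the distributed file, the same functions could be called on an open file handle, for example `read_conllu(open("social-media-POS-testsuite.conllu", encoding="utf-8"))`. The three source sizes given in the abstract sum to 44,404 tokens (21,181 + 12,718 + 10,505); the record does not state whether the harmonised file contains exactly that many.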