A harmonised testsuite for social media POS tagging (DE)
doi:10.11588/data/KXLMHN
heiDATA
2020-03-26
V1
Rehbein, Ines; Ruppenhofer, Josef; Zimmermann, Victor, 2020, "A harmonised testsuite for social media POS tagging (DE)", https://doi.org/10.11588/data/KXLMHN, heiDATA, V1
Rehbein, Ines; Ruppenhofer, Josef; Zimmermann, Victor
2018
Leibniz Institute for the German Language
heiDATA
Rehbein, Ines
Arts and Humanities; Computer and Information Science
POS tagging; German; Tweets; German web data; Social media data
<p>A harmonised POS testsuite of web data, computer-mediated communication (CMC) and Twitter microtext, with word forms and STTS POS tags (plus some additional CMC-specific tags). The UD POS tags were derived automatically from the STTS POS tags. The data does not contain manually corrected lemma information. The original data comes from three sources: a Twitter dataset with 21,181 tokens, and two datasets from the EmpiriST 2015 shared task: web data (12,718 tokens) and CMC data (10,505 tokens).</p>
archived
tab-separated format (CoNLL-U)
<p>Rehbein, Ines; Ruppenhofer, Josef; Zimmermann, Victor (2018): <em>A harmonised testsuite for POS tagging of German social media data</em>. In: Proceedings of the 27th International Conference on Computational Linguistics. September 19-21, 2018, Vienna, Austria.</p>
social-media-POS-testsuite.conllu (application/octet-stream)
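
Since the record describes the data as tab-separated CoNLL-U with word forms, STTS tags and converted UD tags, a reader may want to tally the tags per tagset. The following is a minimal sketch, not the authors' tooling. It assumes the standard ten-column CoNLL-U layout, with the UD tag in column 4 (UPOS) and the STTS tag in column 5 (XPOS); the record itself does not state which column holds which tagset. The names `Token`, `read_conllu_tokens` and `tag_counts` are hypothetical, and the three sample lines are made up for illustration rather than taken from the dataset.

```python
from collections import Counter
from typing import Iterable, Iterator, NamedTuple, Tuple


class Token(NamedTuple):
    form: str
    upos: str  # assumption: UD tag, CoNLL-U column 4
    xpos: str  # assumption: STTS tag, CoNLL-U column 5


def read_conllu_tokens(lines: Iterable[str]) -> Iterator[Token]:
    """Yield one Token per word line, skipping comments and blank lines."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip() or line.startswith("#"):
            continue  # sentence separator or comment line
        cols = line.split("\t")
        if len(cols) < 5:
            continue  # not a well-formed token line
        tok_id = cols[0]
        if "-" in tok_id or "." in tok_id:
            continue  # multiword-token range or empty node, not a word
        yield Token(form=cols[1], upos=cols[3], xpos=cols[4])


def tag_counts(lines: Iterable[str]) -> Tuple[int, Counter, Counter]:
    """Return (number of tokens, UPOS tag counts, XPOS tag counts)."""
    upos: Counter = Counter()
    xpos: Counter = Counter()
    total = 0
    for tok in read_conllu_tokens(lines):
        total += 1
        upos[tok.upos] += 1
        xpos[tok.xpos] += 1
    return total, upos, xpos


# Made-up sample in CoNLL-U layout (not taken from the dataset).
sample = [
    "# text = das ist gut",
    "1\tdas\t_\tPRON\tPDS\t_\t_\t_\t_\t_",
    "2\tist\t_\tAUX\tVAFIN\t_\t_\t_\t_\t_",
    "3\tgut\t_\tADJ\tADJD\t_\t_\t_\t_\t_",
    "",
]

total, upos_counts, xpos_counts = tag_counts(sample)
```

To run the same tally on the distributed file, one could pass an open file handle instead of `sample`, for example `tag_counts(open("social-media-POS-testsuite.conllu", encoding="utf-8"))`, assuming the file has been downloaded to the working directory.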