10.11588/data/QKF4LT
Ruppenhofer, Josef (Leibniz Institute for the German Language)
Affixoid Dataset (DE)
heiDATA
2019
doi:10.11588/data/QKF4LT/4CPCOZ
doi:10.11588/data/QKF4LT/ARVFSU
The dataset contains the manual annotations for the COLING 2018 submission "Distinguishing affixoid formations from compounds" by Josef Ruppenhofer, Michael Wiegand, Rebecca Wilm and Katja Markert. In total, 1788 complex words, each containing one of 7 German suffixoid candidates (e.g. -hai, -gott), were manually annotated as to whether they represent regular compounds or affixoid formations. The main experiments in the paper use automatically extracted features of the complex forms to make this distinction. Additionally, the words were labeled for five properties related to any intensifying and evaluative meaning potentially associated with the whole word and its components. These manual feature annotations were used to establish the upper-bound performance of a classifier trained to distinguish affixoid formations from regular compounds.
Ruppenhofer, Josef (Leibniz Institute for the German Language)