The MSC Data Set (ICPSR doi:10.11588/data/JEESIQ)

Document Description

Citation

Title:

The MSC Data Set

Identification Number:

doi:10.11588/data/JEESIQ

Distributor:

heiDATA

Date of Distribution:

2019-10-07

Version:

1

Bibliographic Citation:

Marasović, Ana; Zhou, Mengfei; Frank, Anette, 2019, "The MSC Data Set", https://doi.org/10.11588/data/JEESIQ, heiDATA, V1

Study Description

Citation

Title:

The MSC Data Set

Identification Number:

doi:10.11588/data/JEESIQ

Authoring Entity:

Marasović, Ana (Department of Computational Linguistics, Heidelberg University, Germany)

Zhou, Mengfei (Department of Computational Linguistics, Heidelberg University, Germany)

Frank, Anette (Department of Computational Linguistics, Heidelberg University, Germany)

Date of Production:

2015

Distributor:

heiDATA

Date of Distribution:

2019-10-07

Study Scope

Keywords:

Arts and Humanities, Computer and Information Science, modal sense classification, semantics, machine learning, annotation, modality

Topic Classification:

argument analysis, factuality recognition, sentiment detection

Abstract:

<p>From this page you can download the resources we created for <strong>modal sense classification</strong>, as reported in Zhou et al. (2015), Marasović et al. (2016) and Marasović and Frank (2015) (see "Related Publication" below):</p>
<ul>
<li>Heuristically sense-annotated training data acquired from EUROPARL and OpenSubtitles (<strong>EPOS_E</strong>, English). The dataset was used for:
<ul>
<li>the EMNLP 2015 Workshop submission "Semantically enriched models for modal sense classification" by Mengfei Zhou, Anette Frank, Annemarie Friedrich, and Alexis Palmer</li>
<li>the LiLT submission "Modal Sense Classification At Large: Paraphrase-Driven Sense Projection, Semantically Enriched Classification Models and Cross-Genre Evaluations" by Ana Marasović, Mengfei Zhou, Alexis Palmer, and Anette Frank</li>
<li>the RepL4NLP submission "Multilingual Modal Sense Classification using a Convolutional Neural Network" by Ana Marasović and Anette Frank.</li>
</ul>
</li>
<li>Composition of the training and test data used for the classification experiments. The dataset was used for:
<ul>
<li>the EMNLP 2015 Workshop submission "Semantically enriched models for modal sense classification" by Mengfei Zhou, Anette Frank, Annemarie Friedrich, and Alexis Palmer</li>
<li>the RepL4NLP submission "Multilingual Modal Sense Classification using a Convolutional Neural Network" by Ana Marasović and Anette Frank.</li>
</ul>
</li>
<li>Manually annotated subsection of <strong>MASC</strong> (English). The dataset was used for the LiLT submission "Modal Sense Classification At Large: Paraphrase-Driven Sense Projection, Semantically Enriched Classification Models and Cross-Genre Evaluations" by Ana Marasović, Mengfei Zhou, Alexis Palmer, and Anette Frank.</li>
<li>Heuristically modal-sense-annotated training data and manually annotated test data from EUROPARL and OpenSubtitles (<strong>EPOS_G</strong>, German). The dataset was used for the RepL4NLP submission "Multilingual Modal Sense Classification using a Convolutional Neural Network" by Ana Marasović and Anette Frank.</li>
</ul>

Kind of Data:

textual data

Methodology and Processing

Other Study-Related Materials

Label:

MSC Data Set.zip

Notes:

application/zip