The MSC Data Set (doi:10.11588/data/JEESIQ)

Document Description

Citation

Title:

The MSC Data Set

Identification Number:

doi:10.11588/data/JEESIQ

Distributor:

heiDATA

Date of Distribution:

2019-10-07

Version:

1

Bibliographic Citation:

Marasović, Ana; Zhou, Mengfei; Frank, Anette, 2019, "The MSC Data Set", https://doi.org/10.11588/data/JEESIQ, heiDATA, V1

Study Description

Citation

Title:

The MSC Data Set

Identification Number:

doi:10.11588/data/JEESIQ

Authoring Entity:

Marasović, Ana (Department of Computational Linguistics, Heidelberg University, Germany)

Zhou, Mengfei (Department of Computational Linguistics, Heidelberg University, Germany)

Frank, Anette (Department of Computational Linguistics, Heidelberg University, Germany)

Date of Production:

2015

Distributor:

heiDATA

Access Authority:

Marasović, Ana

Holdings Information:

https://doi.org/10.11588/data/JEESIQ

Study Scope

Keywords:

Arts and Humanities, Computer and Information Science, modal sense classification, semantics, machine learning, annotation, modality

Topic Classification:

argument analysis, factuality recognition, sentiment detection

Abstract:

<p>From this page you can download the resources we created for <strong>modal sense classification</strong>, as reported in Zhou et al. (2015), Marasović et al. (2016) and Marasović and Frank (2016) (see "Related Publications" below):</p> <ul> <li>Heuristically sense-annotated training data acquired from EUROPARL and OpenSubtitles (<strong>EPOS_E</strong>, English). The dataset was used for: <ul> <li>the EMNLP 2015 Workshop submission "Semantically enriched models for modal sense classification" by Mengfei Zhou, Anette Frank, Annemarie Friedrich, and Alexis Palmer</li> <li>the LiLT submission "Modal Sense Classification At Large: Paraphrase-Driven Sense Projection, Semantically Enriched Classification Models and Cross-Genre Evaluations" by Ana Marasović, Mengfei Zhou, Alexis Palmer, and Anette Frank</li> <li>the RepL4NLP submission "Multilingual Modal Sense Classification using a Convolutional Neural Network" by Ana Marasović and Anette Frank.</li> </ul> </li> <li>Composition of the training and test sets used for the classification experiments. The dataset was used for: <ul> <li>the EMNLP 2015 Workshop submission "Semantically enriched models for modal sense classification" by Mengfei Zhou, Anette Frank, Annemarie Friedrich, and Alexis Palmer</li> <li>the RepL4NLP submission "Multilingual Modal Sense Classification using a Convolutional Neural Network" by Ana Marasović and Anette Frank.</li> </ul> </li> <li>Manually annotated subsection of <strong>MASC</strong> (English). The dataset was used for the LiLT submission "Modal Sense Classification At Large: Paraphrase-Driven Sense Projection, Semantically Enriched Classification Models and Cross-Genre Evaluations" by Ana Marasović, Mengfei Zhou, Alexis Palmer, and Anette Frank.</li> <li>Heuristically modal sense annotated training data and manually annotated test data from EUROPARL and OpenSubtitles (<strong>EPOS_G</strong>, German). The dataset was used for the RepL4NLP submission "Multilingual Modal Sense Classification using a Convolutional Neural Network" by Ana Marasović and Anette Frank.</li> </ul>

Kind of Data:

textual data

Methodology and Processing

Sources Statement

Data Access

Other Study Description Materials

Related Materials

<p><strong>Europarl Parallel Corpus</strong></p> <p>Link to the corpus: <a href="https://www.statmt.org/europarl/">https://www.statmt.org/europarl/</a></p> <p><strong>OpenSubtitles corpus</strong></p> <p>Link to the corpus: <a href="http://opus.nlpl.eu/">http://opus.nlpl.eu/</a></p> <p><strong>Manually Annotated Sub-Corpus (MASC)</strong></p> <p>Link to the corpus: <a href="https://www.anc.org/data/masc/">https://www.anc.org/data/masc/</a></p>

Related Publications

Citation

Title:

<p>Zhou, M., Frank, A., Friedrich, A., and Palmer, A. (2015). Semantically enriched models for modal sense classification. In <em>Proceedings of the EMNLP 2015 Workshop on Linking Models of Lexical, Sentential and Discourse-level Semantics</em>, pages 44&ndash;53, 18 September 2015, Lisboa, Portugal.</p>

Identification Number:

https://www.aclweb.org/anthology/W15-2705

Bibliographic Citation:

<p>Zhou, M., Frank, A., Friedrich, A., and Palmer, A. (2015). Semantically enriched models for modal sense classification. In <em>Proceedings of the EMNLP 2015 Workshop on Linking Models of Lexical, Sentential and Discourse-level Semantics</em>, pages 44&ndash;53, 18 September 2015, Lisboa, Portugal.</p>

Citation

Title:

<p>Marasović, A., Zhou, M., Palmer, A., and Frank, A. (2016). Modal sense classification at large: Paraphrase-driven sense projection, semantically enriched classification models and cross-genre evaluations. In <em>Linguistic Issues in Language Technology, Special issue on Modality in Natural Language Understanding</em>, volume 14 (2), Stanford, CA. CSLI Publications.</p>

Identification Number:

http://csli-lilt.stanford.edu/ojs/index.php/LiLT/article/view/65/65

Bibliographic Citation:

<p>Marasović, A., Zhou, M., Palmer, A., and Frank, A. (2016). Modal sense classification at large: Paraphrase-driven sense projection, semantically enriched classification models and cross-genre evaluations. In <em>Linguistic Issues in Language Technology, Special issue on Modality in Natural Language Understanding</em>, volume 14 (2), Stanford, CA. CSLI Publications.</p>

Citation

Title:

<p>Marasović, A. and Frank, A. (2016). Multilingual modal sense classification using a convolutional neural network. In <em>Proceedings of the 1st Workshop on Representation Learning for NLP,</em> pages 111&ndash;120, August 11, 2016, Berlin, Germany. Association for Computational Linguistics.</p>

Identification Number:

https://www.aclweb.org/anthology/W16-1613

Bibliographic Citation:

<p>Marasović, A. and Frank, A. (2016). Multilingual modal sense classification using a convolutional neural network. In <em>Proceedings of the 1st Workshop on Representation Learning for NLP,</em> pages 111&ndash;120, August 11, 2016, Berlin, Germany. Association for Computational Linguistics.</p>

Other Study-Related Materials

Label:

MSC Data Set.zip

Notes:

application/zip