WikiWarsDE Corpus (doi:10.11588/data/10026)

Document Description

Citation

Title:

WikiWarsDE Corpus

Identification Number:

doi:10.11588/data/10026

Distributor:

heiDATA

Date of Distribution:

2014-08-13

Version:

1

Bibliographic Citation:

Strötgen, Jannik; Gertz, Michael, 2014, "WikiWarsDE Corpus", https://doi.org/10.11588/data/10026, heiDATA, V1

Study Description

Citation

Title:

WikiWarsDE Corpus

Identification Number:

doi:10.11588/data/10026

Authoring Entity:

Strötgen, Jannik (Institute of Computer Science)

Gertz, Michael (Institute of Computer Science)

Producer:

Strötgen, Jannik

Gertz, Michael

Date of Production:

2011

Distributor:

heiDATA: Heidelberg Research Data Repository

Access Authority:

Strötgen, Jannik

Date of Deposit:

2014-08-06

Holdings Information:

https://doi.org/10.11588/data/10026

Study Scope

Keywords:

Computer and Information Science, temporal expressions, temporal tagging, annotated corpus

Abstract:

The WikiWarsDE corpus is a German-language corpus of Wikipedia articles annotated with temporal expressions. Its creation was motivated by the English WikiWars corpus (Mazur & Dale 2010), and it was developed to support research on temporal information extraction and normalization. The 22 documents contain 95,604 tokens and 2,240 temporal expressions, annotated following the TIDES TIMEX2 annotation guidelines.
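As a rough illustration of working with TIMEX2-style annotations, the following minimal Python sketch pulls temporal expressions and their attributes out of an inline-annotated string. It assumes annotations appear as inline `<TIMEX2 ...>` tags with a `VAL` attribute, as described in the TIDES guidelines. This record does not describe the corpus file layout, so check the readme before relying on that assumption. The sample sentence and the helper name `extract_timex2` are hypothetical and not taken from the corpus.

```python
import re

# Hypothetical sample sentence for illustration; NOT taken from the corpus.
SAMPLE = (
    'Der Krieg begann <TIMEX2 VAL="1914-07-28">am 28. Juli 1914</TIMEX2> '
    'und endete <TIMEX2 VAL="1918-11-11">am 11. November 1918</TIMEX2>.'
)

# Matches a non-nested TIMEX2 element: group 1 = raw attributes, group 2 = surface text.
TIMEX2_PATTERN = re.compile(r'<TIMEX2\b([^>]*)>(.*?)</TIMEX2>', re.DOTALL)
# Matches one NAME="value" attribute pair inside the opening tag.
ATTR_PATTERN = re.compile(r'(\w+)="([^"]*)"')


def extract_timex2(text):
    """Return a list of (surface_text, attributes) for non-nested TIMEX2 tags."""
    results = []
    for match in TIMEX2_PATTERN.finditer(text):
        attrs = dict(ATTR_PATTERN.findall(match.group(1)))
        results.append((match.group(2), attrs))
    return results


expressions = extract_timex2(SAMPLE)
for surface, attrs in expressions:
    # Print the surface form next to its normalized value, if present.
    print(surface, attrs.get("VAL"))
```

TIMEX2 permits nested expressions, which this regular-expression sketch does not handle. A proper XML or SGML parser would be the safer choice for the real files.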

Methodology and Processing

Sources Statement

Data Sources:

German Wikipedia (http://de.wikipedia.org)

Data Access

Citation Requirement:

Please cite Strötgen & Gertz (2011) if you use the corpus in your work.

Other Study Description Materials

Related Materials

WikiWars corpus (http://timexportal.wikidot.com/wikiwars)

Related Studies

Mazur, P. & Dale, R. (2010). WikiWars: A New Corpus for Research on Temporal Expressions. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 913-922.

Related Publications

Citation

Title:

WikiWarsDE: A German Corpus of Narratives Annotated with Temporal Expressions

Bibliographic Citation:

Strötgen, J. & Gertz, M. (2011). WikiWarsDE: A German Corpus of Narratives Annotated with Temporal Expressions. Proceedings of the Conference of the German Society for Computational Linguistics and Language Technology (GSCL), pages 129-134.

Other Study-Related Materials

Label:

readme.txt

Text:

README

Notes:

text/plain; charset=UTF-8

Other Study-Related Materials

Label:

WikiWarsDE_20110412.zip

Text:

Notes:

application/zip