Twitter Titling Corpus (ICPSR doi:10.11588/data/IOHXDF)

Document Description

Citation

Title:

Twitter Titling Corpus

Identification Number:

doi:10.11588/data/IOHXDF

Distributor:

heiDATA

Date of Distribution:

2019-08-23

Version:

1

Bibliographic Citation:

van den Berg, Esther, 2019, "Twitter Titling Corpus", https://doi.org/10.11588/data/IOHXDF, heiDATA, V1, UNF:6:+F3lLKziwMvjy+xyktkilw== [fileUNF]

Study Description

Citation

Title:

Twitter Titling Corpus

Identification Number:

doi:10.11588/data/IOHXDF

Authoring Entity:

van den Berg, Esther (Department of Computational Linguistics, Heidelberg University (2016-2019); Leibniz Institute for the German Language (since 2019))

Date of Production:

2019

Distributor:

heiDATA

Date of Distribution:

2019-08-23

Study Scope

Keywords:

Arts and Humanities, Computer and Information Science, sentiment, entity framing, twitter corpus, annotated tweets, political discourse, computational social science, social media

Topic Classification:

sentiment, entity framing

Abstract:

The Twitter Titling Corpus contains 4002 stance-annotated tweets, collected between 20 June 2017 and 30 August 2017, that mention 6 presidents. Each tweet is annotated for the naming form used to refer to the president, for the purpose of a study on the relation between naming variation and stance (cited below).

This data is to be used for research purposes only.

Columns

  • tweet_id: id of the tweet

  • president: person entity mentioned in the tweet who was president at the time of collection

  • country: country of which the president was president at the time of collection

  • stance: positive, neutral or negative sentiment towards the president

  • naming_form: form used to refer to the president, one of:

      • first-name (FN)

      • last-name (LN)

      • first-name last-name (FNLN)

      • title last-name (TLN)

      • title first-name last-name (TFNLN)
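The naming-form abbreviations listed above can be sketched as a small lookup table. This is only an illustration: whether the file stores the abbreviation (e.g. TLN) or the spelled-out form is not stated here, and the names NAMING_FORMS and uses_title are hypothetical helpers, not part of the corpus.

```python
# Abbreviation -> spelled-out naming form, as listed in the abstract.
# Assumption: the naming_form column holds the abbreviations; check the
# actual file before relying on this.
NAMING_FORMS = {
    "FN": "first-name",
    "LN": "last-name",
    "FNLN": "first-name last-name",
    "TLN": "title last-name",
    "TFNLN": "title first-name last-name",
}


def uses_title(code):
    """Return True if the naming form includes a title (hypothetical helper)."""
    return NAMING_FORMS[code].startswith("title")


titled = sorted(code for code in NAMING_FORMS if uses_title(code))
```

A grouping like this may be useful when comparing stance between titled and untitled references, which is the kind of contrast the abstract describes.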

Kind of Data:

textual data, CSV text file format

Methodology and Processing

File Description--f2992

File: twitter_titling_corpus.tab

  • Number of cases: 4002

  • No. of variables per record: 5

  • Type of File: text/tab-separated-values

Notes:

UNF:6:+F3lLKziwMvjy+xyktkilw==
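A minimal sketch of reading a file with this layout, using only the Python standard library. The sample rows are invented for illustration and are not taken from the corpus; the column names follow the variable list in this codebook, and the function name read_corpus is hypothetical. The sketch keeps tweet_id as text because tweet ids have about 18 digits, more than a 64-bit float can represent exactly.

```python
import csv
import io

# Hypothetical sample in the tab-separated layout described above.
# The rows are invented placeholders, not real corpus entries.
SAMPLE = (
    "tweet_id\tpresident\tcountry\tstance\tnaming_form\n"
    "876384022162792451\tExample President\tExampleland\t-1\tLN\n"
    "908972604907646981\tExample President\tExampleland\t1\tTLN\n"
)


def read_corpus(handle):
    """Read tab-separated rows, keeping tweet_id as a string."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        # Assumption: stance is stored as -1, 0 or 1 (possibly as "-1.0").
        row["stance"] = int(float(row["stance"]))
        rows.append(row)
    return rows


rows = read_corpus(io.StringIO(SAMPLE))

# Parsing an id as a float silently changes it, which is why it stays text.
original_id = int(rows[0]["tweet_id"])
id_survives_float = int(float(rows[0]["tweet_id"])) == original_id
```

To read the distributed file instead of the sample, one could pass an open file handle, for example open("twitter_titling_corpus.tab", newline="", encoding="utf-8"); the encoding is an assumption.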

Variable Description

List of Variables:

Variables

tweet_id

f2992 Location:

Summary Statistics: Valid 4002.0; Min. 8.7638402216279245E17; Max. 9.0897260490764698E17; Mean 8.9118533562741018E17; StDev 6.106012454735068E15

Variable Format: numeric

Notes: UNF:6:an8VxuVKpcfx/rHjAomG1A==

president

f2992 Location:

Variable Format: character

Notes: UNF:6:lZVz3D4rEUxxoJUntTXp5w==

country

f2992 Location:

Variable Format: character

Notes: UNF:6:DLsyd05I+3WEHmm5/LCbeg==

stance

f2992 Location:

Summary Statistics: Valid 4002.0; Min. -1.0; Max. 1.0; Mean -0.3325837081459241; StDev 0.8092706848478298

Variable Format: numeric

Notes: UNF:6:ME0yoCN1YdPdIqdjB64yJA==
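The summary statistics for stance could be recomputed along the following lines. This is a sketch under assumptions: the toy values are invented, the function name summarize is hypothetical, and StDev is taken to be the sample standard deviation (n - 1 denominator), which this codebook does not state.

```python
import statistics


def summarize(values):
    """Compute the statistics reported in this codebook for a numeric column."""
    return {
        "Valid": len(values),
        "Min": min(values),
        "Max": max(values),
        "Mean": statistics.mean(values),
        # Assumption: sample standard deviation (n - 1 denominator).
        "StDev": statistics.stdev(values),
    }


# Invented toy values standing in for the stance column.
toy_stance = [-1, -1, 0, 1]
summary = summarize(toy_stance)
```

A negative mean, as reported for the full corpus, indicates that values below zero outweigh values above zero.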

naming_form

f2992 Location:

Variable Format: character

Notes: UNF:6:pU8g9YS+6B+4skb1CBpRDA==