Figure Eight Disaster Tweets — corpus_disaster • text2map.corpora

A dataset of 10,876 tweets that used disaster-related terms, e.g., "aftershock," "twister," and "rescued." A team of coders tagged whether each tweet was "relevant" (actually about a disaster) or "not relevant" (about a movie or a joke, for example). This dataset was originally collected by the company Figure Eight, which was acquired by Appen in March 2019. It was also used in an NLP competition hosted on Kaggle.

Usage

data("corpus_disaster")

Format

A data frame with 10860 rows and 3 variables.

Source

https://www.kaggle.com/competitions/nlp-getting-started

Variables

doc_id. Unique ID for each tweet
text. Text of the tweet
relevant. Coder-assigned tag indicating whether the tweet is "Relevant" to a disaster or "Not Relevant"
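
Examples

The following is a minimal sketch of how the dataset might be loaded and inspected. It assumes the text2map.corpora package is installed, and it relies only on the three column names listed under Variables. It makes no assumption about the exact spelling of the tag values, so the counts are taken directly from whatever labels the data contains.

```r
# Sketch only: runs the inspection if text2map.corpora is installed,
# and does nothing otherwise.
if (requireNamespace("text2map.corpora", quietly = TRUE)) {
  data("corpus_disaster", package = "text2map.corpora")

  # Check the dimensions and column types against the Format section
  print(dim(corpus_disaster))
  str(corpus_disaster)

  # Count tweets per tag, including any missing values
  print(table(corpus_disaster$relevant, useNA = "ifany"))

  # View the first few tweets alongside their tags
  print(head(corpus_disaster[, c("doc_id", "text", "relevant")], 3))
}
```

Tabulating the relevant column first is a quick way to see the class balance before using the tags as labels for a classifier.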