A dataset of 10,876 tweets that contain disaster-related terms such as "aftershock," "twister," and "rescued." A team of coders tagged each tweet as "relevant" (actually about a disaster) or "not relevant" (about a movie or a joke, for example). The dataset was originally collected by the company Figure Eight, which was acquired by Appen in March 2019. It was also used in an NLP competition hosted on Kaggle.
Usage
data("corpus_disaster")