A dataset of 44,244 news articles. Of these, 21,416 are labeled "real" and all come from Reuters.com; the remaining 22,828 are labeled "fake" and were collected from a variety of unreliable news websites. Veracity was determined using the political fact-checking organization Politifact and Wikipedia. Articles with 10 or fewer words were removed.

data("corpus_isot_fake_news")

Format

A data frame with 44,244 rows and 5 variables.

Source

https://www.uvic.ca/ecs/ece/isot/datasets/fake-news/index.php

Variables

  • doc_id. Unique identifier for each article

  • title. Title of the news article

  • text. Text of the article

  • date. Publication date of the article

  • rating. Veracity of the article (real or fake)

References

Ahmed H, Traore I, Saad S. (2018). "Detecting opinion spams and fake news using text classification", Journal of Security and Privacy, 1(1)