A dataset of 2,000 news articles randomly sampled from the ISOT Fake News Dataset: 1,000 sampled from the "fake" articles and 1,000 sampled from the "real" articles. Articles labeled "real" are all from Reuters.com. Articles labeled "fake" were collected from a variety of unreliable news websites. Veracity was determined using the political fact-checking organization Politifact and Wikipedia. Articles with 10 or fewer words were removed.

data(corpus_isot_fake_news2k)

Format

A data frame with 2,000 rows and 5 variables.

Source

https://www.uvic.ca/ecs/ece/isot/datasets/fake-news/index.php

Variables

  • doc_id. Unique identifier for each article

  • title. Title of the news article

  • text. Text of the article

  • date. Publication date of the article

  • rating. Veracity of the article (real or fake)
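
The variables above can be explored with base R alone. The sketch below is illustrative only: it builds a small stand-in data frame (`toy`, with invented values and a placeholder date format) that has the same five columns, then applies a word-count filter like the one described for this dataset and tabulates the rating classes. The whitespace-based word count is an assumption; the original filtering method is not documented here.

```r
# Toy stand-in with the five documented columns.
# All values are invented for illustration; the date format is a placeholder.
toy <- data.frame(
  doc_id = c("a1", "a2", "a3", "a4"),
  title  = c("Title one", "Title two", "Title three", "Title four"),
  text   = c(
    "Officials said on Tuesday that the budget vote would be delayed until next week pending review.",
    "Too short to keep.",
    "You will not believe what this senator was caught doing behind closed doors last night in Washington.",
    "Markets closed higher on Friday after the central bank signaled that it would hold interest rates steady."
  ),
  date   = c("2017-01-03", "2017-01-04", "2017-01-05", "2017-01-06"),
  rating = c("real", "fake", "fake", "real"),
  stringsAsFactors = FALSE
)

# Count whitespace-separated words in each article body (assumed definition of "word").
n_words <- lengths(strsplit(trimws(toy$text), "\\s+"))

# Keep only articles with more than 10 words, mirroring the documented filter.
kept <- toy[n_words > 10, ]

# Tabulate the rating classes that remain.
balance <- table(kept$rating)
print(balance)
```

After loading the real data with data(corpus_isot_fake_news2k), the same call, table(corpus_isot_fake_news2k$rating), should show 1,000 articles per class if the data match the description above.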

References

Ahmed H, Traore I, Saad S. (2018). "Detecting opinion spams and fake news using text classification." Journal of Security and Privacy, 1(1).