
Function reference
-
corpus_senti_bench4k - Subset of 6 Corpora for the SentiStrength Benchmark
-
corpus_annual_review - Abstracts from the Annual Review of Sociology, 2020
-
corpus_atn_immigr - Balanced Sample of Immigration-Related Articles from the All the News Corpus
-
corpus_beyonce - Lyrics of Beyonce's Songs
-
corpus_cmu_blogs100 - Sample of 100 Blogposts from the CMU 2008 Political Blog Corpus
-
corpus_envsociology - Environmental Sociology Article Abstracts, 1990-2014
-
corpus_europarl_subset - Sample from European Parliament Proceedings Parallel Corpus
-
corpus_finefoods10k - Subset of Amazon Fine Food Reviews Corpus, 2011-2012
-
corpus_isot_fake_news2k - Sample of 2,000 Articles from the ISOT Fake News Dataset
-
corpus_ittpr - Immigration Think Tank Press Release (ITTPR) Corpus, 1998-2020
-
corpus_presidential - U.S. Presidential Speeches, 1952-1996
-
corpus_reddit_aita10k - Subset of Community Ethical Judgements on Real-Life Anecdotes Corpus
-
corpus_taylor_swift - Lyrics of Taylor Swift's Songs
-
corpus_tng_season5 - Lines from Star Trek: The Next Generation, Season 5
-
corpus_usnss - National Security Strategy of the United States, 1987-2017
-
corpus_senti_bench - 6 Corpora for the SentiStrength Benchmark
-
corpus_disaster - Figure Eight Disaster Tweets
-
corpus_enron - Internal Emails from Enron Email Corpus
-
corpus_nytimes_covid - New York Times Articles about COVID-19, 2020
-
corpus_web_dubois - Lines from Three Books by W.E.B. Du Bois
-
corpus_isot_fake_news - ISOT Fake News Dataset
-
corpus_dsj_vox - DSJ VOX Articles Corpus, 2014-2017
-
corpus_pitchfork - Pitchfork Reviews, 1999-2019
-
corpus_atn - All The News (ATN) Corpus 1.0, 2015-2017
-
corpus_atn2 - All The News (ATN) Corpus 2.0, 2016-2020
-
corpus_finefoods - Amazon Fine Food Reviews Corpus, 2011-2012
-
corpus_reddit_aita - Community Ethical Judgements on Real-Life Anecdotes Corpus
-
corpus_black_mirror - Lines from Black Mirror
-
corpus_scifi_pulp - 20th Century Science Fiction Pulp Magazines
-
corpus_moral_stories - Moral Stories
-
download_corpus() - Download specified corpus
-
tweetids_covid - Tweet IDs for 1,922 tweets using #Covid19 collected in 2021
-
tweetids_covid_geo - Tweet IDs for 1,999 geo-tagged tweets using #Covid19 collected in 2021
-
tweetids_gme - Tweet IDs for 15,594 tweets using $GME (the GameStop ticker)
-
tweetids_stayhome - Tweet IDs for 23,737 tweets using #StayHome collected in 2021