A dataset containing eight English stoplists. It is used
with the get_stoplist()
function.
stoplists
A data frame with 1775 rows and 2 variables.
The stoplists include:
"tiny2020": Tiny (2020) list of 33 words (Default)
"snowball2001": Snowball (2001) list of 127 words
"snowball2014": Updated Snowball (2014) list of 175 words
"van1979": van Rijsbergen's (1979) list of 250 words
"fox1990": Christopher Fox's (1990) list of 421 words
"smart1993": Original SMART (1993) list of 570 words
"onix2000": ONIX (2000) list of 196 words
"nltk2001": Python's NLTK (2009) list of 179 words
Tiny 2020 is a very small stoplist of the most frequent English conjunctions, articles, prepositions, and demonstratives (N=17). It also includes the 8 forms of the copular verb "to be" and the 8 most frequent personal (singular and plural) pronouns (excluding gendered and possessive pronouns).
No contractions are included.
Variables:
words. The words to be stopped
source. The source of the list each word belongs to
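
Because the data frame stores every list in long format, a single list can be pulled out by filtering on the source column. The sketch below illustrates that idea in base R. It uses a toy data frame (toy_stoplists, a hypothetical name) with the same two columns rather than the packaged data, and it assumes the source column holds labels such as "tiny2020"; the rows shown are placeholders, not the actual contents of any list.

```r
# Toy stand-in with the same two columns as `stoplists`.
# The rows are illustrative placeholders, not the packaged data.
toy_stoplists <- data.frame(
  words  = c("the", "and", "of", "the", "and", "is"),
  source = c("tiny2020", "tiny2020", "tiny2020",
             "snowball2001", "snowball2001", "snowball2001"),
  stringsAsFactors = FALSE
)

# Keep only the words belonging to one list, selected by its source label.
tiny_words <- toy_stoplists$words[toy_stoplists$source == "tiny2020"]

# Count how many words each list contributes.
list_sizes <- table(toy_stoplists$source)

# Drop the stopped words from a tokenized sentence.
tokens <- c("the", "cat", "and", "the", "hat")
kept <- tokens[!tokens %in% tiny_words]
```

With the real dataset loaded, the same subsetting pattern should apply to stoplists directly, although get_stoplist() is the intended interface.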