This is an R package containing several English-language dictionary datasets useful for text analysis. See also text2map.


Because this is primarily a dataset package, we will not be submitting it to CRAN. You can install the latest version from GitLab:
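A minimal sketch of a GitLab install, assuming the `remotes` package is used. The repository path below is a hypothetical placeholder, not the package's actual path, and must be replaced before running.

```r
# Hypothetical placeholder: substitute the package's real GitLab path
# in the form "namespace/repository".
repo <- "namespace/repository"

# remotes provides install_gitlab(); install it first if it is missing.
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}

# Download, build, and install the package from GitLab.
remotes::install_gitlab(repo)
```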




These English dictionaries contain hand-ranked and inferred word “norms” as well as frequency and rank information from various corpora.

  • sensorimotor: Lancaster Sensorimotor Norms, N = 40,000 (Lynott et al. 2020)
  • concreteness: Concreteness Ratings, N = 40,000 (Brysbaert et al. 2014)
  • nrc_vad: NRC Valence, Arousal, and Dominance (Mohammad et al. 2018)
  • wkb_vad: WKB Valence, Arousal, and Dominance (Warriner et al. 2013)
  • bootstrap_mrc: Bootstrapped MRC Psycholinguistic Features (Paetzold and Specia 2016)
  • english_freqs: Word Frequency/Rank Lists from Four Corpora
  • elp_lexical: English Lexicon Project (ELP) Dictionaries (Balota et al. 2007)
  • subtlexus_freqs: SUBTLEXus Word Frequency/Ranks (Brysbaert and New 2009)
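
As a rough illustration of how a norms dictionary like those listed above might be used, the sketch below scores a short text by averaging the ratings of the words it contains. It uses a toy data frame in place of a real dictionary: the column names (`term`, `score`) and all values are invented for the example and are not taken from the package's datasets, so check the actual column names after loading a dictionary.

```r
# Toy stand-in for a norms dictionary. Column names and values are
# invented for illustration and are NOT taken from the package's data.
toy_norms <- data.frame(
  term  = c("apple", "table", "justice", "idea"),
  score = c(5.0, 4.9, 1.5, 1.6),
  stringsAsFactors = FALSE
)

text <- "The idea of justice sat on the table"

# Lowercase the text and split on runs of non-letters to get tokens.
tokens <- strsplit(tolower(text), "[^a-z]+")[[1]]
tokens <- tokens[nzchar(tokens)]

# Look up each token in the dictionary; unmatched tokens become NA.
matched <- toy_norms$score[match(tokens, toy_norms$term)]

# Average rating over the tokens that were found in the dictionary.
mean_score <- mean(matched, na.rm = TRUE)

# Share of tokens covered by the dictionary.
coverage <- mean(!is.na(matched))
```

Reporting coverage alongside the mean is a design choice worth keeping with real dictionaries, since a score based on only a few matched words can be misleading.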