Improvements

  • Added functionality
    • dtm_builder() includes an option to return a dense base R matrix
    • dtm_stopper() includes an option to remove terms based on their rank (e.g., the top 10); stopping by count and stopping by proportion are now two separate options

Improvements

  • Added functionality to dtm_stopper() to stop words by document or term frequencies
    • Nomenclature changed: stop_freq was renamed to stop_termfreq
  • Added functionality to dtm_resampler() to resample documents to a proportion of their length or to a fixed length N
  • Added and clarified documentation
  • Added a NEWS.md file to track changes to the package