Downloads a designated corpus for the
text2map.corpora package from its host on GitLab. Some corpora are
included when the package is installed; the others
need to be downloaded only once per machine.
Downloading may take a while. If no location is specified,
the file is saved in the package's data folder, allowing the
corpus to be loaded with load_corpus(). If a location other than the
package's data folder is specified, the corpus can be loaded with
load_corpus(corpus, location).
Arguments
- corpus
Character string indicating corpus name to be downloaded
- location
Default is NULL and will save in the R package data folder. If desired, specify saving the corpus elsewhere (Note: if saved elsewhere, use load_corpus(corpus, location) instead of data())
- force
Default FALSE. If corpus already exists locally, download will be stopped unless TRUE.
- quiet
Logical (default FALSE) to mute messages
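As an illustration of how these arguments fit together, the sketch below wraps the download and load steps in one helper. It assumes the function documented on this page is exported as download_corpus() (the name is not stated above). fetch_and_load() and the placeholder corpus name are hypothetical and exist only for this example.

```r
# Hypothetical helper: download a corpus, then load it from the same place.
# Assumption: the function documented here is exported as download_corpus().
fetch_and_load <- function(corpus, location = NULL,
                           force = FALSE, quiet = FALSE) {
  if (!requireNamespace("text2map.corpora", quietly = TRUE)) {
    stop("The text2map.corpora package must be installed first.")
  }

  # force = TRUE would re-download even if the corpus already exists locally
  text2map.corpora::download_corpus(
    corpus = corpus, location = location, force = force, quiet = quiet
  )

  if (is.null(location)) {
    # Saved in the package's data folder
    text2map.corpora::load_corpus(corpus)
  } else {
    # Saved elsewhere, so the location must be passed when loading
    text2map.corpora::load_corpus(corpus, location)
  }
}

# Example call (not run here; "corpus_name" is a placeholder):
# fetch_and_load("corpus_name", location = tempdir())
```

Passing the same location to both calls keeps the download and load steps in sync, which is the main point of the note under the location argument.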
Details
The function tries to download in the following format priority:
- .qs2: Fastest loading (recommended)
- .fst: Fast loading, data.frame only
- .rda: Standard R format with best compression
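The priority order above can be sketched as a simple fallback. This is only an illustration of the idea, not the package's actual implementation: pick_format() and its available argument are hypothetical names, with available standing in for whichever formats the host offers for a given corpus.

```r
# Hypothetical helper: return the first format, in priority order,
# that appears among the formats available for a corpus.
pick_format <- function(available, priority = c("qs2", "fst", "rda")) {
  # Subsetting `priority` (not `available`) preserves the priority order
  hits <- priority[priority %in% available]
  if (length(hits) == 0) {
    stop("None of the supported formats is available.")
  }
  hits[1]
}

pick_format(c("rda", "fst"))  # "fst"
pick_format("rda")            # "rda"
```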
The package also includes curated lists of Tweet IDs which (in theory) can be "rehydrated" to rebuild a Tweet corpus:
- tweetids_covid_geo
- tweetids_covid
- tweetids_gme
- tweetids_stayhome
