79k diachronic English-language SGNS word embeddings, 12 decades, British News corpus
vecs_sgns200_british_news.Rd
Roughly 79 thousand SGNS (skip-gram with negative sampling) word embeddings from Pedrazzini and McGillivray (2022), trained on a corpus of 19th-century British newspapers divided into decades. This is a list of 12 elements, in which each element is the embedding matrix for one decade, 1800-1910. Each matrix has 78,879 rows (one vector per word) and 200 columns (dimensions). Note that every matrix shares the same vocabulary: a word that does not appear in a given decade still has a row in that decade's matrix, but the row contains only zeros.
References
Pedrazzini, Nilo & Barbara McGillivray. 2022. Diachronic word embeddings from 19th-century British newspapers [Data set]. Zenodo. doi:10.5281/zenodo.7181682
Examples
if (FALSE) {
## download the model (once per machine)
download_pretrained("vecs_sgns200_british_news")
## load the model each session
data("vecs_sgns200_british_news")
## check dims
length(vecs_sgns200_british_news) == 12L
all(dim(vecs_sgns200_british_news[[1]]) == c(78879, 200))
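## a minimal sketch, not taken from the package documentation:
## flag words that are absent from a decade (rows with only zero values)
## so they can be dropped before computing similarities;
## assumes each list element is a base R numeric matrix
decade_1 <- vecs_sgns200_british_news[[1]]
absent <- rowSums(decade_1 != 0) == 0
## number of vocabulary items with no vector in this decade
sum(absent)
## keep only the words attested in this decade
decade_1_present <- decade_1[!absent, , drop = FALSE]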
}