Package index • text2map.pretrained

Functions

Functions for downloading and loading pretrained models

Functions for managing downloaded models

Structural topic models (bundled with package)

stm_envsoc: Pre-estimated STM Model for Environmental Sociology Abstracts
stm_fiction_cohort: Pre-estimated STM Model for Fiction-Author Cohort Study

Static word embedding models (must download first)

vecs_fasttext300_commoncrawl: 2 million English-language fastText word embeddings
vecs_fasttext300_wiki_news: 1 million English-language fastText word embeddings
vecs_fasttext300_wiki_news_subword: 1 million English-language fastText word embeddings, with subword information
vecs_glove200_twitter: 1.2 million English-language GloVe word embeddings trained on Twitter (200 dimensions)
vecs_glove300_metal_lyrics: 52k English-language GloVe word embeddings trained on metal lyrics
vecs_glove300_wiki_gigaword: 400k English-language GloVe word embeddings (300 dimensions)
vecs_glove50_twitter: 1.2 million English-language GloVe word embeddings trained on Twitter (50 dimensions)
vecs_cbow300_googlenews: 3 million English-language CBOW word embeddings trained on Google News corpus
vecs_sgns300_bnc_pos: 160k English-language SGNS word embeddings trained on the British National Corpus
vecs_sgns300_googlengrams_kte_en: 1 million English-language SGNS word embeddings trained on Google N-Grams
vecs_svd20_metal_bpe: 3,299 SVD subword embeddings from BPE-tokenized metal lyrics
vecs_svd20_metal_position: 74 lo-fi SVD positional embeddings from metal lyrics
vecs_svd20_metal_type: 54k lo-fi SVD word-type embeddings from metal lyrics

Temporal/historical word embedding models (must download first)

vecs_sgns300_coha_histwords: 50k diachronic English-language SGNS word embeddings over 20 decades
vecs_sgns300_googlengrams_histwords: 100k diachronic English-language SGNS word embeddings, 20 decades, Google Books corpus
vecs_sgns300_googlengrams_histwords_de: 100k diachronic German-language SGNS word embeddings, 20 decades, Google Books corpus
vecs_sgns300_googlengrams_histwords_fr: 100k diachronic French-language SGNS word embeddings, 20 decades, Google Books corpus
vecs_sgns300_googlengrams_histwords_zh: 30k diachronic Chinese-language SGNS word embeddings, 5 decades, Google Books corpus
vecs_sgns300_googlengrams_fic_histwords: 100k diachronic English-language SGNS word embeddings, 20 decades, Google Books Fiction corpus
vecs_svd300_googlengrams_histwords: 75k diachronic English-language SVD word embeddings, 20 decades, Google Books corpus
vecs_sgns200_british_news: 79k diachronic English-language SGNS word embeddings, 12 decades, British News corpus