Embeddings Functions

Functions for word embeddings

CMDist()

Calculate Concept Mover's Distance

CoCA()

Performs Concept Class Analysis (CoCA)

get_centroid()

Word embedding semantic centroid extractor

get_direction()

Word embedding semantic direction extractor

get_regions()

Word embedding semantic region extractor

get_anchors()

Gets anchor terms from precompiled anchor lists

find_projection()

Find the 'projection matrix' to a semantic vector

find_rejection()

Find the 'rejection matrix' from a semantic vector

find_transformation()

Find a specified matrix transformation

General Functions

General functions for text analysis

vocab_builder()

A fast unigram vocabulary builder

seq_builder()

Represent Documents as Token-Integer Sequences

get_stoplist()

Gets stoplist from precompiled lists

tiny_gender_tagger()

A very tiny "gender" tagger

DTM Functions

Functions for document-term matrices

dtm_builder()

A fast unigram DTM builder

dtm_stopper()

Removes terms from a DTM based on rules

dtm_stats()

Gets DTM summary statistics

dtm_resampler()

Resamples an input DTM to generate new DTMs

dtm_melter()

Melt a DTM into a triplet data frame

Textnet Functions

Functions for textual networks

doc_centrality()

Find a specified document centrality metric

doc_similarity()

Find similarities between documents

Datasets

Included datasets

anchor_lists

A dataset of anchor lists

stoplists

A dataset of stoplists

ft_wv_sample

Sample of fastText embeddings

jfk_speech

Full Text of JFK's Rice Speech

meta_shakespeare

Metadata for Shakespeare's First Folio

Methods

Methods for specific classes

plot(<CoCA>)

Plot CoCA

print(<CoCA>)

Prints CoCA class information