Find a specified document centrality metric — doc

Given a document-term matrix or a document-similarity matrix, this function returns the specified text network-based centrality measure for the documents. The supported measures are currently degree, eigenvector, betweenness, and spanning.

doc_centrality(mat, method, alpha = 1L, two_mode = TRUE)

Arguments

mat: Document-term matrix with terms as columns or a document-similarity matrix with documents as rows and columns.
method: Character vector indicating centrality method, including "degree", "eigen", "span", and "between".
alpha: Number (default = 1) indicating the tuning parameter for weighted metrics.
two_mode: Logical (default = TRUE) indicating whether the input matrix is two-mode (i.e., a document-term matrix) or one-mode (i.e., a document-similarity matrix).

Value

A data frame with two columns.

Details

If a document-term matrix is provided, the function first computes the one-mode document-level projection with tcrossprod() to obtain the document-similarity matrix. If a one-mode document-similarity matrix is provided, this step is skipped. This allows document similarities to be obtained by other methods, such as Word Mover's Distance (see doc_similarity). The diagonal is ignored in all calculations.
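The projection step described above can be illustrated with base R alone. This is a minimal sketch on a made-up toy matrix (dtm_toy and dsm_toy are hypothetical names, not package objects); it shows what tcrossprod() produces and what ignoring the diagonal means, and it is not necessarily how the function handles these steps internally.

```r
# Toy document-term matrix: 3 documents (rows) by 4 terms (columns).
# The documents and counts are invented for illustration.
dtm_toy <- matrix(
  c(1, 1, 0, 0,
    0, 1, 1, 0,
    0, 0, 1, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("d1", "d2", "d3"), c("a", "b", "c", "d"))
)

# One-mode projection: entry [i, j] is the dot product of the
# term counts of documents i and j (equivalent to dtm %*% t(dtm)).
dsm_toy <- tcrossprod(dtm_toy)

# Drop self-similarity by zeroing the diagonal.
diag(dsm_toy) <- 0

dsm_toy
```

Here d1 and d2 share one term, as do d2 and d3, while d1 and d3 share none, so d2 is the only document tied to both of the others.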

Document centrality methods include:

degree: Opsahl's weighted degree centrality with tuning parameter "alpha"
between: vertex betweenness centrality using Brandes' method
eigen: eigenvector centrality using Freeman's method
span: modified Burt's constraint following Stoltz and Taylor's method; it uses the tuning parameter "alpha", and the output is scaled
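To make the role of "alpha" in the degree measure concrete, the sketch below computes the weighted degree defined in Opsahl et al. (2010): the number of ties raised to (1 - alpha) multiplied by the total tie weight raised to alpha. The helper weighted_degree() and the toy matrix are hypothetical and written only for illustration; the package's own implementation may differ in details such as scaling.

```r
# Hypothetical helper, not part of the package: weighted degree
# centrality as defined in Opsahl et al. (2010).
weighted_degree <- function(dsm, alpha = 1) {
  diag(dsm) <- 0              # ignore self-similarity
  k <- rowSums(dsm > 0)       # number of ties per document
  s <- rowSums(dsm)           # total tie weight (strength) per document
  k^(1 - alpha) * s^alpha
}

# Toy one-mode similarity matrix with invented weights.
dsm_toy <- matrix(
  c(0, 2, 0,
    2, 0, 4,
    0, 4, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("d1", "d2", "d3"), c("d1", "d2", "d3"))
)

weighted_degree(dsm_toy, alpha = 1)    # tie weights only
weighted_degree(dsm_toy, alpha = 0)    # tie counts only
weighted_degree(dsm_toy, alpha = 0.5)  # balances counts and weights
```

With alpha = 1 the measure reduces to the sum of tie weights, and with alpha = 0 it reduces to the count of ties; values in between trade off the two.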

References

Brandes, Ulrik (2001) 'A faster algorithm for betweenness centrality' Journal of Mathematical Sociology. 25(2):163-177 doi:10.1080/0022250X.2001.9990249
Opsahl, Tore, et al. (2010) 'Node centrality in weighted networks: Generalizing degree and shortest paths.' Social Networks. 32(3):245-251 doi:10.1016/j.socnet.2010.03.006
Stoltz, Dustin; Taylor, Marshall (2019) 'Textual Spanning: Finding Discursive Holes in Text Networks' Socius. doi:10.1177/2378023119827674

Author

Dustin Stoltz

Examples


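# load the package (assumed here to be text2map, which provides
# doc_centrality(), dtm_builder(), and the jfk_speech data)
library(text2map)

# load example text
data(jfk_speech)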

# minimal preprocessing
jfk_speech$sentence <- tolower(jfk_speech$sentence)
jfk_speech$sentence <- gsub("[[:punct:]]+", " ", jfk_speech$sentence)

# create DTM
dtm <- dtm_builder(jfk_speech, sentence, sentence_id)

ddeg <- doc_centrality(dtm, method = "degree")
deig <- doc_centrality(dtm, method = "eigen")
dbet <- doc_centrality(dtm, method = "between")
dspa <- doc_centrality(dtm, method = "span")

# with a document-similarity matrix (dsm)

dsm <- doc_similarity(dtm, method = "cosine")
ddeg <- doc_centrality(dsm, method = "degree", two_mode = FALSE)