Given a document-term matrix or a document-similarity matrix, this function returns specified text network-based centrality measures. Currently, this includes degree, eigenvector, betweenness, and spanning.
doc_centrality(mat, method, alpha = 1L, two_mode = TRUE)
mat: Document-term matrix with terms as columns, or a document-similarity matrix with documents as rows and columns.
method: Character vector indicating the centrality method: "degree", "eigen", "span", or "between".
alpha: Number (default = 1) indicating the tuning parameter for weighted metrics.
two_mode: Logical (default = TRUE) indicating whether the input matrix is two-mode (i.e., a document-term matrix) or one-mode (i.e., a document-similarity matrix).
A dataframe with two columns
If a document-term matrix is provided, the function obtains the one-mode
document-level projection using tcrossprod(), which yields the
document-similarity matrix. If a one-mode document-similarity matrix is
provided, this step is skipped. This way, document similarities may be
obtained using other methods, such as Word Mover's Distance (see
doc_similarity).
The diagonal is ignored in all calculations.
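The projection step described above can be illustrated with base R alone. This is a minimal sketch, not the function's internal code: the toy matrix `dtm_toy` and its counts are invented for illustration, and zeroing the diagonal is only one assumed way of "ignoring" it.

```r
# Toy document-term matrix: 3 documents (rows) by 4 terms (columns).
# The counts are made up for illustration only.
dtm_toy <- matrix(
  c(1, 0, 2, 0,
    0, 1, 1, 0,
    1, 1, 0, 3),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("d1", "d2", "d3"), c("t1", "t2", "t3", "t4"))
)

# One-mode projection: entry [i, j] is the dot product of the term
# counts of documents i and j.
dsm_toy <- tcrossprod(dtm_toy)

# Assumption: "ignoring" the diagonal is modeled here by zeroing the
# self-similarity entries.
diag(dsm_toy) <- 0

dsm_toy
```

The result is a symmetric document-by-document matrix, which is the same shape of input expected when two_mode = FALSE.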
Document centrality methods include:
degree: Opsahl's weighted degree centrality with tuning parameter "alpha"
between: vertex betweenness centrality using Brandes' method
eigen: eigenvector centrality using Freeman's method
span: Modified Burt's constraint following Stoltz and Taylor's method; uses the tuning parameter "alpha", and the output is scaled.
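To make the role of "alpha" in the degree method concrete, the sketch below applies the weighted degree formula from Opsahl et al. (2010), k^(1 - alpha) * s^alpha, where k is a document's number of ties and s is the sum of its tie weights. The helper name `opsahl_degree` and the toy matrix are hypothetical, and whether doc_centrality() computes the measure in exactly this way is an assumption.

```r
# Hypothetical helper: Opsahl-style weighted degree on a one-mode matrix.
opsahl_degree <- function(dsm, alpha = 1) {
  diag(dsm) <- 0                 # the diagonal is ignored
  k <- rowSums(dsm > 0)          # number of ties per document
  s <- rowSums(dsm)              # total tie weight (strength)
  out <- k^(1 - alpha) * s^alpha
  out[k == 0] <- 0               # isolates get zero, avoiding 0 * Inf
  out
}

# Toy document-similarity matrix (values invented for illustration).
dsm_toy <- matrix(
  c(0, 2, 1,
    2, 0, 1,
    1, 1, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("d1", "d2", "d3"), c("d1", "d2", "d3"))
)

deg_strength <- opsahl_degree(dsm_toy, alpha = 1)    # tie weights only
deg_count    <- opsahl_degree(dsm_toy, alpha = 0)    # tie counts only
deg_mixed    <- opsahl_degree(dsm_toy, alpha = 0.5)  # blend of both
```

With alpha = 1 the measure reduces to the sum of tie weights, with alpha = 0 it reduces to the count of ties, and values in between trade off the two.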
Brandes, Ulrik
(2001) 'A faster algorithm for betweenness centrality.'
Journal of Mathematical Sociology. 25(2):163-177.
doi:10.1080/0022250X.2001.9990249
Opsahl, Tore, et al.
(2010) 'Node centrality in weighted networks: Generalizing degree
and shortest paths.' Social Networks. 32(3):245-251.
doi:10.1016/j.socnet.2010.03.006
Stoltz, Dustin; Taylor, Marshall
(2019) 'Textual Spanning: Finding Discursive Holes in Text Networks.'
Socius. doi:10.1177/2378023119827674
# load the package (assumed here to be text2map) and the example text
library(text2map)
data(jfk_speech)
# minimal preprocessing
jfk_speech$sentence <- tolower(jfk_speech$sentence)
jfk_speech$sentence <- gsub("[[:punct:]]+", " ", jfk_speech$sentence)
# create DTM
dtm <- dtm_builder(jfk_speech, sentence, sentence_id)
ddeg <- doc_centrality(dtm, method = "degree")
deig <- doc_centrality(dtm, method = "eigen")
dbet <- doc_centrality(dtm, method = "between")
dspa <- doc_centrality(dtm, method = "span")
# with a document-similarity matrix (dsm)
dsm <- doc_similarity(dtm, method = "cosine")
ddeg <- doc_centrality(dsm, method = "degree", two_mode = FALSE)