Evaluate anchor sets in defining semantic directions

This function evaluates how well an anchor set defines a semantic direction. Anchors must be a two-column data.frame or a list of length 2. Currently, the function implements only the "PairDir" metric developed by Boutyline and Johnston (2023).

test_anchors(anchors, wv, method = c("pairdir"), all = FALSE, summarize = TRUE)

Arguments

anchors: A data frame or list of juxtaposed 'anchor' terms
wv: Matrix of word embedding vectors (a.k.a. an embedding model) with rows as terms.
method: Which metric to use for the evaluation (currently only "pairdir").
all: Logical (default FALSE). Whether to evaluate all possible pairwise combinations of the two sets of anchors. If FALSE, only the input pairs are evaluated and the two anchor sets must be of equal length.
summarize: Logical (default TRUE). If TRUE, returns a data frame with the AVERAGE score for the input pairs along with each pair's contribution. If FALSE, returns a list with each offset matrix, each contribution, and the average score.

Value

A data frame (if summarize = TRUE) or a list (if summarize = FALSE).

Details

According to Boutyline and Johnston (2023):

"We find that PairDir -- a measure of parallelism between the offset vectors (and thus of the internal reliability of the estimated relation) -- consistently outperforms other reliability metrics in explaining axis accuracy."

Boutyline and Johnston consider only analyst-specified pairs. However, if all = TRUE, all pairwise combinations of terms between the two sets are evaluated. This allows anchor sets of unequal length, but it increases computational cost considerably.
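
To make "parallelism between the offset vectors" and the all = TRUE expansion more concrete, the following is a minimal sketch in base R. It is not the package's implementation and not necessarily the exact PairDir formula from Boutyline and Johnston (2023). It assumes that parallelism can be approximated by the mean pairwise cosine similarity among offset vectors. The matrix wv_toy holds invented values, and cos_sim() is a hypothetical helper defined only for this illustration.

```r
# Toy embedding matrix with invented values (rows are terms); illustration only
wv_toy <- rbind(
  rest   = c( 1.0, 0.2, 0.0),
  stay   = c( 0.9, 0.1, 0.1),
  stand  = c( 0.8, 0.0, 0.3),
  coming = c(-1.0, 0.1, 0.2),
  move   = c(-0.8, 0.3, 0.0)
)

a <- c("rest", "stay", "stand")
z <- c("coming", "move")  # unequal lengths are possible when sets are crossed

# Every a-z combination, analogous in spirit to all = TRUE (3 * 2 = 6 pairs)
all_pairs <- expand.grid(a = a, z = z, stringsAsFactors = FALSE)

# One offset vector per pair: vector(a) minus vector(z)
offsets <- wv_toy[all_pairs$a, , drop = FALSE] -
  wv_toy[all_pairs$z, , drop = FALSE]
rownames(offsets) <- paste(all_pairs$a, all_pairs$z, sep = "-")

# Hypothetical helper: cosine similarity between all rows of a matrix
cos_sim <- function(m) {
  unit <- m / sqrt(rowSums(m^2))  # scale each row to unit length
  unit %*% t(unit)
}

sims <- cos_sim(offsets)

# Assumed summary: mean similarity over distinct pairs of offset vectors;
# values near 1 indicate that the offsets point in nearly the same direction
parallelism <- mean(sims[upper.tri(sims)])
parallelism
```

In this sketch the number of offset vectors is length(a) * length(z), and the number of offset-to-offset comparisons grows roughly with the square of that count, which illustrates why crossing the two sets becomes expensive as the anchor sets grow.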

References

Boutyline, Andrei, and Ethan Johnston. 2023. “Forging Better Axes: Evaluating and Improving the Measurement of Semantic Dimensions in Word Embeddings.” doi:10.31235/osf.io/576h3

Examples



# load example word embeddings
data(ft_wv_sample)

df_anchors <- data.frame(
  a = c("rest", "rested", "stay", "stand"),
  z = c("coming", "embarked", "fast", "move")
)

test_anchors(df_anchors, ft_wv_sample)
#>       anchor_pair   pair_dir
#> 1         AVERAGE 0.13890810
#> 2     rest-coming 0.18960552
#> 3 rested-embarked 0.18302837
#> 4       stay-fast 0.10699562
#> 5      stand-move 0.07600288

test_anchors(df_anchors, ft_wv_sample, all = TRUE)
#>        anchor_pair  pair_dir
#> 1          AVERAGE 0.2748587
#> 2      rest-coming 0.3153744
#> 3    rested-coming 0.2752213
#> 4      stay-coming 0.2356302
#> 5     stand-coming 0.2242636
#> 6    rest-embarked 0.3004799
#> 7  rested-embarked 0.3048728
#> 8    stay-embarked 0.2208549
#> 9   stand-embarked 0.2094862
#> 10       rest-fast 0.3272416
#> 11     rested-fast 0.3054702
#> 12       stay-fast 0.3019808
#> 13      stand-fast 0.2737485
#> 14       rest-move 0.3153754
#> 15     rested-move 0.2671968
#> 16       stay-move 0.2791464
#> 17      stand-move 0.2413955