R/utils-embedding-vectors.R
test_anchors.Rd
This function evaluates how well an anchor set defines a semantic direction. Anchors must be a two-column data.frame or a list of length 2. Currently, the function implements only the "PairDir" metric developed by Boutyline and Johnston (2023).
test_anchors(anchors, wv, method = c("pairdir"), all = FALSE, summarize = TRUE)
A data frame or list of juxtaposed 'anchor' terms.
Matrix of word embedding vectors (a.k.a. an embedding model) with rows as terms.
Which metric to use for evaluation (currently only "pairdir").
Logical (default FALSE). Whether to evaluate all possible pairwise combinations of two sets of anchors. If FALSE, only the input pairs are used in evaluation and anchor sets must be of equal lengths.
Logical (default TRUE). Returns a data frame with the AVERAGE score for the input pairs along with each pair's contribution. If summarize = FALSE, returns a list with each offset matrix, each contribution, and the average score.
A data frame or a list, depending on summarize.
According to Boutyline and Johnston (2023):
"We find that PairDir -- a measure of parallelism between the offset vectors (and thus of the internal reliability of the estimated relation) -- consistently outperforms other reliability metrics in explaining axis accuracy."
Boutyline and Johnston only consider analyst-specified pairs. However, if all = TRUE, all pairwise combinations of terms between the two sets are evaluated. This allows for anchor sets of unequal length, but it increases computational complexity considerably.
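The idea of "parallelism between the offset vectors" can be illustrated with a minimal base R sketch. This is not the package's implementation. The helper name offset_parallelism, the toy matrix wv_toy and its values, the a-minus-z offset direction, and averaging the cosines over distinct pairs of offsets are all assumptions made for illustration. The second half shows how crossing the two sets, in the spirit of all = TRUE, accommodates unequal set lengths while multiplying the number of offsets to compare.

```r
# Toy embedding matrix: rows are terms, columns are dimensions.
# The values are invented for illustration only.
wv_toy <- rbind(
  rest     = c(1.0, 0.0, 0.2),
  rested   = c(0.8, 0.2, 0.1),
  stay     = c(0.9, 0.1, 0.0),
  coming   = c(0.0, 1.0, 0.1),
  embarked = c(0.2, 0.9, 0.0),
  move     = c(0.1, 0.8, 0.3)
)

# Hypothetical helper (an assumption, not part of any package):
# mean cosine similarity among the offset vectors of the given pairs.
offset_parallelism <- function(a, z, wv) {
  # One offset vector per pair: a-term minus z-term
  offsets <- wv[a, , drop = FALSE] - wv[z, , drop = FALSE]
  # Scale each offset to unit length so dot products are cosines
  unit <- offsets / sqrt(rowSums(offsets^2))
  cos_mat <- unit %*% t(unit)
  # Average over distinct pairs of offsets (upper triangle only)
  mean(cos_mat[upper.tri(cos_mat)])
}

a <- c("rest", "rested", "stay")
z <- c("coming", "embarked", "move")

# Analyst-specified pairs only: a[i] is matched with z[i]
paired <- offset_parallelism(a, z, wv_toy)

# Every a-term crossed with every z-term; the sets may differ in length
grid <- expand.grid(a = a, z = z[1:2], stringsAsFactors = FALSE)
nrow(grid) # 3 x 2 = 6 offsets instead of 3
crossed <- offset_parallelism(grid$a, grid$z, wv_toy)
```

Because the number of crossed pairs is the product of the two set sizes, and the cosine matrix grows with the square of that product, the cost of the crossed variant rises quickly as anchors are added.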
Boutyline, Andrei, and Ethan Johnston. 2023. “Forging Better Axes: Evaluating and Improving the Measurement of Semantic Dimensions in Word Embeddings.” doi:10.31235/osf.io/576h3
# load example word embeddings
data(ft_wv_sample)
df_anchors <- data.frame(
a = c("rest", "rested", "stay", "stand"),
z = c("coming", "embarked", "fast", "move")
)
test_anchors(df_anchors, ft_wv_sample)
#> anchor_pair pair_dir
#> 1 AVERAGE 0.13890810
#> 2 rest-coming 0.18960552
#> 3 rested-embarked 0.18302837
#> 4 stay-fast 0.10699562
#> 5 stand-move 0.07600288
test_anchors(df_anchors, ft_wv_sample, all = TRUE)
#> anchor_pair pair_dir
#> 1 AVERAGE 0.2748587
#> 2 rest-coming 0.3153744
#> 3 rested-coming 0.2752213
#> 4 stay-coming 0.2356302
#> 5 stand-coming 0.2242636
#> 6 rest-embarked 0.3004799
#> 7 rested-embarked 0.3048728
#> 8 stay-embarked 0.2208549
#> 9 stand-embarked 0.2094862
#> 10 rest-fast 0.3272416
#> 11 rested-fast 0.3054702
#> 12 stay-fast 0.3019808
#> 13 stand-fast 0.2737485
#> 14 rest-move 0.3153754
#> 15 rested-move 0.2671968
#> 16 stay-move 0.2791464
#> 17 stand-move 0.2413955