perm_tester()
carries out Monte Carlo permutation tests for model
p-values from two-tailed, left-tailed, and/or right-tailed hypothesis
testing.
perm_tester(
data,
model,
perm_var = NULL,
strat_var = NULL,
statistic,
perm_n = 1000,
alternative = "all",
alpha = 0.05,
seed = NULL
)
The dataframe from which the model is estimated.
The model which will be estimated and re-estimated.
The variable in the model that will be permuted.
Defaults to NULL,
which takes the first term in the formula of the model.
Categorical variable for within-stratum permutations. Defaults to NULL.
The name of the model statistic you want to "grab" after re-running the model with each permutation to compare to the original model statistic.
The total number of permutations. Defaults to 1000.
The alternative hypothesis. One of "two.sided", "left",
"right", or "all". Defaults to "all",
which reports the p-value statistics for all three
alternative hypotheses.
Alpha level for the hypothesis test. Defaults to 0.05.
Optional seed for reproducibility of the p-value statistics. Defaults to NULL.
Returns a data frame with the observed statistic (stat), the
p-values (P_left for left-tailed, P_right for right-tailed, and/or
P_two for two-tailed), and the standard errors and confidence
intervals for those p-values, respectively.
perm_tester()
can be used to derive p-values under the randomization
model of inference. There are various reasons one might want to do this---
with text data, and observational data more generally, this might be
because the corpus/sample is not a random sample from a target population.
In such cases, population model p-values might not make much sense since
the asymptotically-derived standard errors from which they are constructed
themselves do not make sense. We might therefore want to make inferences
on the basis of whether or not randomness, as a data-generating mechanism,
might reasonably account for a statistic at least as extreme as the one
we observed. perm_tester()
works from this idea.
perm_tester()
works like this. First, the model (supplied to the model
parameter) is run on the observed data. Second, we take some statistic of
interest, which we indicate with the statistic
parameter, and set it to
the side. Third, a variable, perm_var
, is permuted---meaning the observed
values for the rows of data
on perm_var
are randomly reshuffled. Fourth,
we estimate the model again, this time with the permuted perm_var
. Fifth,
we grab that same statistic
. We repeat steps two through
five a total of perm_n
times, each time tallying the number of times the
statistic
from the permutation-derived model is greater than or equal to
(for a right-tailed test), less-than or equal to (for a left-tailed test),
and/or has an absolute value greater than or equal to (for a two-tailed test)
the statistic
from the "real" model.
If we divide those tallies by the total number of permutations, then we
get randomization-based p-values. This is what perm_tester()
does. The
null hypothesis is that randomness could likely generate the statistic
that we observe. The alternative hypothesis is that randomness alone likely
can't account for the observed statistic.
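The procedure above can be sketched in base R. This is a minimal right-tailed illustration on simulated data, not `perm_tester()`'s actual internals; the object names (`df`, `obs_stat`, `tally_right`) are hypothetical.

```r
# Minimal right-tailed permutation-test sketch on simulated data.
set.seed(8675309)
df <- data.frame(y = rnorm(100), x = rnorm(100))

# Steps 1-2: fit the model on the observed data, set the statistic aside.
obs_stat <- coef(lm(y ~ x, data = df))["x"]

perm_n <- 200
tally_right <- 0
for (i in seq_len(perm_n)) {
  df_perm <- df
  df_perm$x <- sample(df_perm$x)  # Step 3: reshuffle the permuted variable
  perm_stat <- coef(lm(y ~ x, data = df_perm))["x"]  # Steps 4-5: refit, grab
  if (perm_stat >= obs_stat) tally_right <- tally_right + 1
}

# Dividing the tally by perm_n gives the right-tailed p-value.
p_right <- tally_right / perm_n
```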
We then reject the null hypothesis if the p-value is below a threshold indicated
with alpha
, which, as in population-based inference, is the probability
below which we are willing to reject the null hypothesis when it is actually
true. So if the p-value is below, say, alpha
= 0.05 and we're performing
a right-tailed test, then fewer than 5% of the statistics derived from the
permutation-based models are greater than or equal to our observed
statistic. We would then reject the null, as it is unlikely (based on our alpha
threshold) that randomness as a data-generating mechanism can account
for a test statistic at least as large as the one we observed.
In most cases, analysts probably cannot expect to perform "exact" permutation
tests where every possible permutation is accounted for---i.e., where
perm_n
equals the total number of possible permutations. Instead, we
can take random samples of the "population" of permutations. perm_tester()
does this, and reports the standard errors and (1 - alpha
) confidence
intervals for the p-values.
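Under a binomial normal approximation (an assumption here; the package's exact interval construction may differ), the Monte Carlo standard error and confidence interval for an estimated p-value can be sketched as:

```r
# Normal-approximation sketch of the Monte Carlo standard error and
# (1 - alpha) confidence interval for a permutation p-value estimate.
p_hat <- 0.032   # hypothetical p-value estimate
perm_n <- 1000   # number of permutations sampled
alpha <- 0.05

se <- sqrt(p_hat * (1 - p_hat) / perm_n)   # binomial standard error
z <- qnorm(1 - alpha / 2)
ci <- c(lower = p_hat - z * se, upper = p_hat + z * se)
```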
perm_tester()
can also perform stratified permutation tests, where the observed
perm_var values are permuted within groups. This can be done by setting
strat_var to the grouping variable.
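A within-stratum shuffle can be illustrated with base R's ave(); the data frame and column names here are hypothetical, not the package's internals.

```r
# Within-stratum permutation: values of x are reshuffled only inside
# each level of the stratification variable g, never across strata.
df <- data.frame(
  x = 1:8,
  g = rep(c("a", "b"), each = 4)  # stratification variable
)
set.seed(1)
df$x_perm <- ave(df$x, df$g, FUN = sample)
```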
Taylor, Marshall A. (2020)
'Visualization Strategies for Regression Estimates with Randomization
Inference' Stata Journal 20(2):309-335.
doi:10.1177/1536867X20930999
.
Darlington, Richard B. and Andrew F. Hayes (2016)
Regression analysis and linear models: Concepts, applications, and implementation.
Guilford Publications.
Ernst, Michael D. (2004)
'Permutation Methods: A Basis for Exact Inference' Statistical Science
19(4):676-685.
doi:10.1214/088342304000000396
.
Manly, Bryan F. J. (2007)
Randomization, Bootstrap and Monte Carlo Methods in Biology.
Chapman and Hall/CRC.
doi:10.1201/9781315273075
.
# \donttest{
data <- text2map::meta_shakespeare
model <- lm(body_count ~ boas_problem_plays + year + genre, data = data)
# without stratified permutations, two-sided test
out1 <- perm_tester(
data = data,
model = model,
statistic = "coefficients",
perm_n = 40,
alternative = "two.sided",
alpha = .01,
seed = 8675309
)
# with stratified permutations, two-sided test
out2 <- perm_tester(
data = data,
model = model,
strat_var = "boas_problem_plays",
statistic = "coefficients",
perm_n = 40,
alternative = "two.sided",
alpha = .01,
seed = 8675309
)
# }