perm_tester() carries out Monte Carlo permutation tests to derive model p-values for two-tailed, left-tailed, and/or right-tailed hypothesis tests.

perm_tester(
  data,
  model,
  perm_var = NULL,
  strat_var = NULL,
  statistic,
  perm_n = 1000,
  alternative = "all",
  alpha = 0.05,
  seed = NULL
)

Arguments

data

The data frame from which the model is estimated.

model

The model which will be estimated and re-estimated.

perm_var

The variable in the model that will be permuted. Defaults to NULL, which takes the first Y (outcome) term in the model formula.

strat_var

Categorical variable for within-stratum permutations. Defaults to NULL.

statistic

The name of the model statistic you want to "grab" after re-estimating the model with each permutation, to compare to the original model statistic.

perm_n

The total number of permutations. Defaults to 1000.

alternative

The alternative hypothesis. One of "two.sided", "left", "right", or "all". Defaults to "all", which reports the p-value statistics for all three alternative hypotheses.

alpha

Alpha level for the hypothesis test. Defaults to 0.05.

seed

Optional seed for reproducibility of the p-value statistics. Defaults to NULL.

Value

Returns a data frame with the observed statistic (stat), the p-values (P_left for left-tailed, P_right for right-tailed, and/or P_two for two-tailed), and the standard errors and confidence intervals for those p-values.

Details

perm_tester() can be used to derive p-values under the randomization model of inference. There are various reasons one might want to do this---with text data, and observational data more generally, it might be because the corpus/sample is not a random sample from a target population. In such cases, population-model p-values might not make much sense, since the asymptotically derived standard errors from which they are constructed do not themselves make sense. We might therefore want to make inferences on the basis of whether or not randomness, as a data-generating mechanism, might reasonably account for a statistic at least as extreme as the one we observed. perm_tester() works from this idea.

perm_tester() works like this. First, the model (supplied to the model parameter) is run on the observed data. Second, we take some statistic of interest, indicated with the statistic parameter, and set it to the side. Third, a variable, perm_var, is permuted---meaning the observed values for the rows of data on perm_var are randomly reshuffled. Fourth, we estimate the model again, this time with the permuted perm_var. Fifth, we grab that same statistic. We repeat steps three through five a total of perm_n times, each time tallying the number of times the statistic from the permutation-derived model is greater than or equal to (for a right-tailed test), less than or equal to (for a left-tailed test), and/or has an absolute value greater than or equal to (for a two-tailed test) the statistic from the "real" model.
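To make the procedure concrete, here is a minimal sketch of that loop in base R. The data frame df, the formula y ~ x + z, and the focal coefficient on x are illustrative assumptions, not the package's internals:

perm_n <- 1000  # total number of permutations
set.seed(123)   # analogous to the seed argument, for reproducibility

# Step 1: fit the model on the observed data
fit_obs <- lm(y ~ x + z, data = df)
# Step 2: set aside the statistic of interest (here, the coefficient on x)
stat_obs <- coef(fit_obs)["x"]

perm_stats <- numeric(perm_n)
for (i in seq_len(perm_n)) {
  df_perm <- df
  # Step 3: randomly reshuffle the focal variable
  df_perm$x <- sample(df_perm$x)
  # Step 4: re-estimate the model on the permuted data
  fit_perm <- lm(y ~ x + z, data = df_perm)
  # Step 5: grab the same statistic from the permuted model
  perm_stats[i] <- coef(fit_perm)["x"]
}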

If we divide those tallies by the total number of permutations, then we get randomization-based p-values. This is what perm_tester() does. The null hypothesis is that randomness could likely generate the statistic that we observe. The alternative hypothesis is that randomness alone likely can't account for the observed statistic.
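Continuing the illustrative sketch above, those tallies become p-values as simple proportions of the permutation distribution:

# Right-tailed: permuted statistic at least as large as the observed one
p_right <- mean(perm_stats >= stat_obs)
# Left-tailed: permuted statistic at least as small as the observed one
p_left <- mean(perm_stats <= stat_obs)
# Two-tailed: permuted statistic at least as extreme in absolute value
p_two <- mean(abs(perm_stats) >= abs(stat_obs))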

We then reject the null hypothesis if the p-value is below a threshold indicated with alpha, which, as in population-based inference, is the probability below which we are willing to reject the null hypothesis when it is actually true. So if the p-value is below, say, alpha = 0.05 and we're performing a right-tailed test, then fewer than 5% of the statistics derived from the permutation-based models are greater than or equal to our observed statistic. We would then reject the null, as it is unlikely (based on our alpha threshold) that randomness as a data-generating mechanism can account for a test statistic at least as large as the one we observed.

In most cases, analysts probably cannot expect to perform "exact" permutation tests where every possible permutation is accounted for---i.e., where perm_n equals the total number of possible permutations. Instead, we can take random samples of the "population" of permutations. perm_tester() does this, and reports the standard errors and (1 - alpha) confidence intervals for the p-values.
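Because each p-value is itself a proportion estimated from perm_n random draws, its Monte Carlo uncertainty can be summarized in the usual way for a proportion. One common normal-approximation form looks like the following (an illustrative assumption; the package may compute these quantities differently):

# Monte Carlo standard error of a permutation p-value, treated as a proportion
se_p <- sqrt(p_right * (1 - p_right) / perm_n)
# Normal-approximation (1 - alpha) confidence interval for the p-value
alpha <- 0.05
z <- qnorm(1 - alpha / 2)
ci_p <- c(p_right - z * se_p, p_right + z * se_p)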

perm_tester() can also perform stratified permutation tests, where the observed perm_var values are shuffled only within groups. This can be done by setting strat_var to the grouping variable, as sketched below.
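A within-stratum shuffle can be sketched in base R like so, where g stands in for a hypothetical grouping variable (again illustrative, not the package's internals):

# Reshuffle x within each level of the grouping variable g,
# so values only move among rows that share a stratum
df_perm$x <- ave(df_perm$x, df_perm$g, FUN = sample)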

References

Taylor, Marshall A. (2020) 'Visualization Strategies for Regression Estimates with Randomization Inference.' Stata Journal 20(2):309-335. doi:10.1177/1536867X20930999.

Darlington, Richard B. and Andrew F. Hayes (2016) Regression Analysis and Linear Models: Concepts, Applications, and Implementation. Guilford Publications.

Ernst, Michael D. (2004) 'Permutation Methods: A Basis for Exact Inference.' Statistical Science 19(4):676-685. doi:10.1214/088342304000000396.

Manly, Bryan F. J. (2007) Randomization, Bootstrap and Monte Carlo Methods in Biology. Chapman and Hall/CRC. doi:10.1201/9781315273075.

Author

Marshall Taylor and Dustin Stoltz

Examples

# \donttest{
data <- text2map::meta_shakespeare

model <- lm(body_count ~ boas_problem_plays + year + genre, data = data)

# without stratified permutations, two-sided test
out1 <- perm_tester(
  data = data,
  model = model,
  statistic = "coefficients",
  perm_n = 40,
  alternative = "two.sided",
  alpha = .01,
  seed = 8675309
)

# with stratified permutations, two-sided test
out2 <- perm_tester(
  data = data,
  model = model,
  strat_var = "boas_problem_plays",
  statistic = "coefficients",
  perm_n = 40,
  alternative = "two.sided",
  alpha = .01,
  seed = 8675309
)
# }