
Calculate one or multiple bootstrap p-values, given a bootstrap sample of test statistics.

Usage

bootstrap_pvals(
  boot_stat,
  stat,
  alternative = "two-sided",
  B_vals = length(boot_stat),
  reps = 1L,
  enlist = FALSE,
  seed = NULL
)

Arguments

boot_stat

vector of bootstrap replications of a test statistic.

stat

numeric value of the test statistic based on the original sample.

alternative

a character string specifying the alternative hypothesis, must be one of "two-sided" (the default), "greater" or "less".

B_vals

vector of sub-sample sizes for which to calculate p-values. Setting B_vals = length(boot_stat) (the default) will return a single p-value calculated on the full set of bootstrap replications. For B_vals < length(boot_stat), p-values will be calculated after sub-sampling (without replacement) the bootstrap replications.

reps

integer value for the number of sub-sample p-values to generate when B_vals < length(boot_stat), with a default of reps = 1.

enlist

logical indicating whether to wrap the returned values in an unnamed list, with a default of FALSE. Setting enlist = TRUE makes it easier to store the output as a single entry in a tibble.

seed

Single numeric value to which the random number generator seed will be set. Default is NULL, which does not set a seed.
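The sub-sampling controlled by B_vals and reps can be illustrated with a short base-R sketch. This is not the package's implementation: the helper name subsample_pvals, the simulated boot_stat values, and the seed are all assumptions made for illustration, and the sketch assumes the two-sided alternative with a strict inequality.

```r
# Hypothetical helper (not part of the package): draw `reps` sub-samples of
# size B without replacement and compute a two-sided p-value on each one.
subsample_pvals <- function(boot_stat, stat, B, reps = 1L) {
  stopifnot(B <= length(boot_stat))
  replicate(reps, {
    sub_stat <- sample(boot_stat, size = B, replace = FALSE)
    mean(abs(sub_stat) > abs(stat))
  })
}

set.seed(20240101)        # arbitrary seed, used only to make the sketch reproducible
boot_stat <- rnorm(399)   # simulated stand-in for real bootstrap replications
stat <- 1.5               # made-up value of the original test statistic

# One vector of length `reps` for each requested sub-sample size
pvals <- lapply(
  c(49, 99, 199),
  function(B) subsample_pvals(boot_stat, stat, B = B, reps = 4L)
)
```

When B equals length(boot_stat), sampling without replacement only reorders the replications, so every repetition yields the same p-value as the full set. This is why reps matters only for B_vals < length(boot_stat).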

Value

The format of the output depends on B_vals and reps. If only a single value of B_vals is specified and reps = 1, then the function returns a vector with a single p-value. If only a single value of B_vals is specified but B_vals < length(boot_stat) and reps > 1, then the function returns a vector of p-values, with an entry for each sub-sample replication. If B_vals is a vector of multiple values, then the function returns a list with one entry per entry of B_vals, where each entry is a vector of length reps with entries for each sub-sample replication.

If enlist = TRUE, then results will be wrapped in an unnamed list, which makes it easier to store the output in a tibble.

Details

p-values are calculated by comparing stat to the distribution of boot_stat, which is taken to represent the null distribution of the test statistic. If alternative = "two-sided" (the default), then the p-value is the proportion of the bootstrap sample where the absolute value of the bootstrapped statistic exceeds the absolute value of the original statistic. If alternative = "greater", then the p-value is the proportion of the bootstrap sample where the value of the bootstrapped statistic is larger than the original statistic. If alternative = "less", then the p-value is the proportion of the bootstrap sample where the value of the bootstrapped statistic is less than the original statistic.
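The three definitions above can be written out as a minimal base-R sketch. The helper name pval_sketch and the toy values are hypothetical, and the sketch assumes strict inequalities, following the wording "exceeds", "larger than" and "less than". The package's own handling of ties may differ.

```r
# Hypothetical helper (not part of the package): proportion of bootstrap
# replications that are more extreme than the original statistic.
pval_sketch <- function(boot_stat, stat, alternative = "two-sided") {
  switch(
    alternative,
    "two-sided" = mean(abs(boot_stat) > abs(stat)),
    "greater"   = mean(boot_stat > stat),
    "less"      = mean(boot_stat < stat),
    stop("alternative must be one of 'two-sided', 'greater' or 'less'")
  )
}

# Toy values chosen so the proportions can be checked by hand
boot_stat <- c(-2.5, -1.0, -0.3, 0.4, 1.2, 2.8)
stat <- 1.1

pval_sketch(boot_stat, stat, "two-sided")  # 3 of 6: |-2.5|, |1.2|, |2.8| exceed 1.1
pval_sketch(boot_stat, stat, "greater")    # 2 of 6: 1.2 and 2.8
pval_sketch(boot_stat, stat, "less")       # 4 of 6: -2.5, -1.0, -0.3, 0.4
```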

References

Davison, A.C. and Hinkley, D.V. (1997). _Bootstrap Methods and Their Application_, Chapter 4. Cambridge University Press.

Examples

# generate data from two distinct populations
dat <- data.frame(
  group = rep(c("A","B"), c(40, 50)),
  y = c(
    rgamma(40, shape = 7, scale = 2),
    rgamma(50, shape = 3, scale = 4)
  )
)
stat <- t.test(y ~ group, data = dat)$statistic

# create bootstrap replications under the null of no difference
boot_dat <- dat
booties <- replicate(399, {
  boot_dat$group <- sample(dat$group)
  t.test(y ~ group, data = boot_dat)$statistic
})

# calculate bootstrap p-values from full set of bootstrap replicates
bootstrap_pvals(boot_stat = booties, stat = stat)
#>   bootstraps      pval
#> 1        399 0.6666667

# calculate multiple bootstrap p-values using sub-sampling of replicates
bootstrap_pvals(
  boot_stat = booties, stat = stat,
  B_vals = 199,
  reps = 4L
)
#>   bootstraps                                       pval
#> 1        199 0.7035176, 0.7236181, 0.6482412, 0.6281407

# calculate multiple bootstrap p-values using sub-sampling of replicates,
# for each of several sub-sample sizes.
bootstrap_pvals(
  boot_stat = booties, stat = stat,
  B_vals = c(49,99,199),
  reps = 4L
)
#>   bootstraps                                       pval
#> 1         49 0.5714286, 0.6326531, 0.6122449, 0.6734694
#> 2         99 0.6060606, 0.6666667, 0.6565657, 0.6868687
#> 3        199 0.6532663, 0.6884422, 0.6783920, 0.7135678