Skip to contents

Given a set of bootstrap confidence intervals calculated across sub-samples with different numbers of replications, extrapolates confidence interval coverage and width of bootstrap confidence intervals to a specified (larger) number of bootstraps. The function also calculates the associated Monte Carlo standard errors. The confidence interval percentage is based on how you calculated the lower and upper bounds.

Usage

extrapolate_rejection(
  data,
  pvalue_subsamples,
  B_target = Inf,
  alpha = 0.05,
  nested = FALSE,
  format = "wide"
)

Arguments

data

data frame or tibble containing the simulation results.

pvalue_subsamples

list or name of column from data containing list of confidence intervals calculated based on sub-samples with different numbers of replications.

B_target

number of bootstrap replications to which the criteria should be extrapolated, with a default of B = Inf.

alpha

scalar or vector indicating the nominal alpha level(s). Default value is set to the conventional .05.

nested

logical value controlling the format of the output. If FALSE (the default), then the results will be returned as a data frame with rows for each distinct number of bootstraps. If TRUE, then the results will be returned as a data frame with a single row, with each performance criterion containing a nested data frame.

format

character string controlling the format of the output when CI_subsamples has results for more than one type of confidence interval. If "wide" (the default), then each performance criterion will have a separate column for each CI type. If "long", then each performance criterion will be a single variable, with separate rows for each CI type.

Value

A tibble containing the number of simulation iterations, performance criteria estimate(s) and the associated MCSE.

References

Boos DD, Zhang J (2000). “Monte Carlo evaluation of resampling-based hypothesis tests.” Journal of the American Statistical Association, 95(450), 486--492. doi:10.1080/01621459.2000.10474226 .

Examples


# function to generate data from two distinct populations
dgp <- function(N_A, N_B, shape_A, scale_A, shape_B, scale_B) {
  data.frame(
    group = rep(c("A","B"), c(N_A, N_B)),
      y = c(
        rgamma(N_A, shape = shape_A, scale = scale_A),
        rgamma(N_B, shape = shape_B, scale = scale_B)
      )
  )
}

# function to do a bootstrap t-test
estimator <- function(
    dat,
    B_vals = c(49,59,89,99), # number of booties to evaluate
    pval_reps = 4L
) {
  stat <- t.test(y ~ group, data = dat)$statistic

  # create bootstrap replications under the null of no difference
  boot_dat <- dat
  booties <- replicate(max(B_vals), {
    boot_dat$group <- sample(dat$group)
    t.test(y ~ group, data = boot_dat)$statistic
  })

  # calculate multiple bootstrap p-values using sub-sampling of replicates
  res <- data.frame(stat = stat)

  res$pvalue_subsamples <- bootstrap_pvals(
    boot_stat = booties,
    stat = stat,
    B_vals = B_vals,
    reps = pval_reps,
    enlist = TRUE
  )

  res
}

# create simulation driver
simulate_boot_pvals <- bundle_sim(
  f_generate = dgp,
  f_analyze = estimator
)

# replicate the bootstrap process
x <- simulate_boot_pvals(
  reps = 120L,
  N_A = 40, N_B = 50,
  shape_A = 7, scale_A = 2,
  shape_B = 4, scale_B = 3,
  B_vals = c(49, 99, 149, 199),
  pval_reps = 3L
)

extrapolate_rejection(
  data = x,
  pvalue_subsamples = pvalue_subsamples,
  B_target = 1999,
  alpha = c(.01, .05, .10)
)
#>   K_boot_rejection bootstraps boot_rej_rate_alpha_01 boot_rej_rate_alpha_05
#> 1              120         49              0.2000000              0.3777778
#> 2              120         99              0.1611111              0.3361111
#> 3              120        149              0.2027778              0.3388889
#> 4              120        199              0.1916667              0.3250000
#> 5              120       1999              0.1851262              0.3119213
#>   boot_rej_rate_alpha_10 boot_rej_rate_mcse_alpha_01
#> 1              0.5000000                  0.03333333
#> 2              0.5000000                  0.03080599
#> 3              0.5055556                  0.03621869
#> 4              0.5333333                  0.03608237
#> 5              0.5242474                  0.03641514
#>   boot_rej_rate_mcse_alpha_05 boot_rej_rate_mcse_alpha_10
#> 1                  0.04117277                  0.04230372
#> 2                  0.04070952                  0.04303315
#> 3                  0.04155837                  0.04427782
#> 4                  0.04293585                  0.04573296
#> 5                  0.04404141                  0.04686037

extrapolate_rejection(
  data = x,
  pvalue_subsamples = pvalue_subsamples,
  B_target = Inf,
  alpha = c(.01, .05, .10),
  nested = TRUE
)
#>   K_boot_rejection            bootstraps
#> 1              120 49, 99, 149, 199, Inf
#>                                                                                                                                                         boot_rej_rate
#> 1 0.2000000, 0.1611111, 0.2027778, 0.1916667, 0.1849391, 0.3777778, 0.3361111, 0.3388889, 0.3250000, 0.3103042, 0.5000000, 0.5000000, 0.5055556, 0.5333333, 0.5249696
#>                                                                                                                                                                   boot_rej_rate_mcse
#> 1 0.03333333, 0.03080599, 0.03621869, 0.03608237, 0.03665103, 0.04117277, 0.04070952, 0.04155837, 0.04293585, 0.04432085, 0.04230372, 0.04303315, 0.04427782, 0.04573296, 0.04713944