Generalize Treatment Effects to External Populations (Transportability)
Estimates Population Average Treatment Effects (PATE) in an external
population using a fitted prince_bart model and external data, possibly from complex sample surveys.
Assumes that covariates X capture all sources of effect heterogeneity,
allowing the conditional complier effect CATE_C(x) to generalize to the
target population.
Usage
general_BART(
  princebart_fit,
  newdata,
  subpop = NULL,
  psu = NULL,
  weights = NULL,
  fast_propensity = TRUE,
  n_cores = 1L,
  seed = NULL,
  verbose = FALSE
)
Arguments
- princebart_fit
A fitted prince_bart object with keep_trees = TRUE.
- newdata
A data.frame containing the external population (e.g., survey data). Covariates present in the source data but missing from newdata will be automatically detected and multiply imputed using auxiliary BART models.
- subpop
Logical vector of length nrow(newdata) indicating which units belong to the target subpopulation. Default is all units.
- psu
Vector of primary sampling unit identifiers for newdata. Required for complex survey inference.
- weights
Vector of survey weights for newdata. Default is equal weights.
- fast_propensity
Logical; if TRUE (default), compute the instrument propensity e = P(Z|X) once, before imputation, using only covariates common to both source and external data. This is much faster. If FALSE, compute e for each MI-completed dataset (slower, but statistically more faithful when the propensity depends strongly on imputed variables).
- n_cores
Number of cores for parallel computation. Default is 1.
- seed
Random seed for reproducibility.
- verbose
Logical; print progress messages. Default is FALSE.
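Before calling the function, it can help to see which source covariates are absent from the external data, since those are the ones described above as being multiply imputed. The following is a minimal base R sketch on toy data; the data frames and the `source_covariates` vector are hypothetical, and this is not the package's internal detection code.

```r
# Toy source and external data frames (hypothetical names and columns).
set.seed(1)
source_data <- data.frame(X1 = rnorm(5), X2 = rnorm(5), X3 = rnorm(5))
survey_data <- data.frame(X1 = rnorm(8), X3 = rnorm(8),
                          survey_weight = runif(8, 0.5, 2))

# Covariates used when fitting the source model (assumed known to the user).
source_covariates <- c("X1", "X2", "X3")

# Covariates that would need imputation in the external data.
missing_covariates <- setdiff(source_covariates, names(survey_data))

# Covariates shared by both data sets (the ones a propensity model
# fit before imputation could use).
common_covariates <- intersect(source_covariates, names(survey_data))

# Survey vectors should line up with the rows of the external data.
stopifnot(length(survey_data$survey_weight) == nrow(survey_data))
```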
Value
An object of class general_pate containing:
- pate: Point estimate (posterior mean) of the PATE
- ci: 95% interval
- sd: Posterior standard deviation
- draws: Vector of posterior draws of the PATE
- y0: Array of predicted Y(0) values (units x iterations x chains)
- y1: Array of predicted Y(1) values (units x iterations x chains)
- subpop: Subpopulation indicator used
- psu: PSU identifiers used
- weights: Survey weights used
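The summary elements can be understood as functions of the posterior draws. The sketch below uses simulated draws in place of a real fit, and it assumes the interval is an equal-tailed quantile interval; the package may compute it differently.

```r
# Hypothetical posterior draws of the PATE; with a real fit these would be
# taken from the `draws` element of the returned object.
set.seed(1)
draws <- rnorm(2000, mean = 0.3, sd = 0.1)

summary_pate <- list(
  pate = mean(draws),  # posterior mean
  sd   = sd(draws),    # posterior standard deviation
  # Equal-tailed 95% interval (an assumption about how `ci` is defined).
  ci   = quantile(draws, probs = c(0.025, 0.975), names = FALSE)
)
```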
Details
This function implements a multi-step procedure:
Multiple imputation of missing covariates in external data using auxiliary BART models fit on source data.
Feature expansion: compute instrument propensity e = P(Z|X) in external data using BART fit on source data.
Predict potential outcomes Y(0) and Y(1) using saved princeBART trees.
Estimate PATE using Bayesian bootstrap for complex survey data.
The resulting PATE is a population-level estimand defined over the specified target population.
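The final step can be illustrated with a small base R sketch of a Bayesian bootstrap that resamples at the PSU level. The function name `bb_pate` and the `tau` matrix (units x posterior draws of Y(1) - Y(0)) are hypothetical, and the scheme shown is one common way to combine posterior draws with survey weights, not necessarily the one implemented in the package.

```r
# Sketch of a PSU-level Bayesian bootstrap for a weighted PATE.
# tau:     units x draws matrix of predicted Y(1) - Y(0) (hypothetical input)
# weights: survey weights, one per unit
# psu:     primary sampling unit identifier, one per unit
bb_pate <- function(tau, weights, psu) {
  psu_index <- as.integer(factor(psu))  # map PSU labels to 1..G
  n_psu <- max(psu_index)
  vapply(seq_len(ncol(tau)), function(s) {
    # Exp(1) draws; the ratio below normalizes them, so they act as
    # Dirichlet(1, ..., 1) weights over PSUs.
    g <- rexp(n_psu)
    w <- weights * g[psu_index]         # units in one PSU share a bootstrap weight
    sum(w * tau[, s]) / sum(w)          # weighted mean effect for posterior draw s
  }, numeric(1))
}

# Simulated inputs: 20 PSUs of 10 units each, 500 posterior draws.
set.seed(42)
n <- 200
n_draws <- 500
psu <- rep(1:20, each = 10)
weights <- runif(n, 0.5, 2)
tau <- matrix(rnorm(n * n_draws, mean = 1, sd = 0.5), nrow = n)

draws <- bb_pate(tau, weights, psu)
c(pate = mean(draws), sd = sd(draws))
```

To restrict the estimate to a subpopulation under this sketch, one would pass only the rows of `tau`, `weights`, and `psu` for which the subpopulation indicator is TRUE.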
The key identifying assumption is that \(\mathrm{CATE}_C(x)\) is transportable, meaning that conditional on X, treatment effects for compliers in the source study equal conditional effects in the target population.
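One way to write this assumption and the resulting estimand in symbols is given below. The notation \(\tau_T(x)\) for the conditional effect in the target population, \(S\) for the set of units in the target subpopulation, and \(w_i\) for the survey weights is introduced here for illustration; the weighted-average form is an assumption about how the weights enter, not a statement of the package's exact estimator.

\[
\mathrm{CATE}_C(x) = \tau_T(x) \quad \text{for all } x \text{ in the target population},
\qquad
\mathrm{PATE} = \frac{\sum_{i \in S} w_i \, \mathrm{CATE}_C(x_i)}{\sum_{i \in S} w_i}.
\]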
For sensitivity analyses (overlap trimming, confounding bounds), use
general_BART_overlap and general_BART_transportability
on the returned object.
Examples
if (FALSE) { # \dontrun{
  # Fit princeBART on source study
  fit <- prince_BART(Y ~ X1 + X2 + X3 | Z | W, data = source_data,
                     keep_trees = TRUE, n_samples = 1000)

  # Generalize to external survey population
  pate <- general_BART(
    princebart_fit = fit,
    newdata = survey_data,
    subpop = survey_data$eligible == 1,
    psu = survey_data$cluster_id,
    weights = survey_data$survey_weight,
    n_cores = 4
  )

  # Run sensitivity analyses on the result
  overlap <- general_BART_overlap(pate, threshold = 0.05)
  sens <- general_BART_transportability(pate, gamma = 2)
  print(pate)
} # }