Constructs a data frame of grid points suitable for f_hat_from_models()
from any reference patient-level dataset. This is convenient when the
patient-level covariate space is multidimensional and there is no
obvious one-dimensional grid.
Arguments
- reference_data
A data frame (or matrix) of reference patient-level covariates. May be a held-out target-population sample, the pooled source covariates, or any plausible reference distribution.
- n_grid
Optional integer giving the desired grid size. If NULL or >= nrow(reference_data), the full reference data is returned.
- seed
Optional integer seed for reproducibility (used only when sub-sampling).
Details
If n_grid is NULL or at least nrow(reference_data), the reference
data is returned unchanged. Otherwise n_grid rows are sampled uniformly
at random (without replacement). The reference data should be on the
same scale and have the same columns as the data each centre's model
was fitted on.
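The sub-sampling rule described above can be sketched in a few lines of base R. This is an illustrative reimplementation under stated assumptions, not the package's source: the name build_grid_sketch is hypothetical, and the sketch assumes that the seed is set only on the sub-sampling path and that matrix input is coerced with as.data.frame().

```r
# Hypothetical sketch of the documented behaviour; not the package source.
build_grid_sketch <- function(reference_data, n_grid = NULL, seed = NULL) {
  # Assumption: matrices are coerced so a data frame is always returned.
  reference_data <- as.data.frame(reference_data)
  n <- nrow(reference_data)

  # NULL, or n_grid at least as large as the data: return every row.
  if (is.null(n_grid) || n_grid >= n) {
    return(reference_data)
  }

  # Assumption: the seed matters only when rows are actually sampled.
  if (!is.null(seed)) set.seed(seed)

  # Uniform sampling of row indices without replacement.
  idx <- sample.int(n, size = n_grid, replace = FALSE)
  reference_data[idx, , drop = FALSE]
}

ref <- data.frame(x1 = rnorm(20), x2 = runif(20))
nrow(build_grid_sketch(ref))                         # all 20 rows
nrow(build_grid_sketch(ref, n_grid = 5, seed = 42))  # 5 sampled rows
```

Because rows are drawn without replacement, the original row names are preserved in the result, which makes it easy to trace each grid point back to the reference data.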
The empirical distribution of the returned grid implicitly defines the
\(\mu\) measure used downstream. Pass uniform grid_weights (the
default) to weight each grid point equally; pass non-uniform
grid_weights to weight by an external reference distribution.
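As a rough illustration of the two weighting options, the sketch below builds a uniform weight vector and a non-uniform one that reweights the grid to match an external distribution of sex. The target shares are made up for the example, and normalising the weights to sum to one is an assumption; check what the function receiving grid_weights expects.

```r
# Illustrative only: the grid and the target shares below are made up.
set.seed(1)
grid <- data.frame(age = rnorm(50, 60, 10),
                   sex = sample(c("F", "M"), 50, replace = TRUE))

# Uniform weights: every grid point counts equally.
w_uniform <- rep(1 / nrow(grid), nrow(grid))

# Non-uniform weights: reweight so that sex matches assumed external shares.
target_share <- c(F = 0.6, M = 0.4)           # assumed external distribution
grid_share   <- prop.table(table(grid$sex))   # shares observed in the grid
w_raw <- target_share[grid$sex] / as.numeric(grid_share[grid$sex])
w_nonuniform <- as.numeric(w_raw / sum(w_raw))  # assumption: sum to 1

# Weighted share of each sex now equals the target (0.6 for F, 0.4 for M).
tapply(w_nonuniform, grid$sex, sum)
```

Each grid point is weighted by the ratio of its target share to its observed share, so the reweighted grid reproduces the external distribution for that covariate while leaving the grid points themselves unchanged.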
Examples
set.seed(1)
ref <- data.frame(age = rnorm(500, 60, 10),
                  bp  = rnorm(500, 130, 15),
                  sex = sample(c("F", "M"), 500, replace = TRUE))
grid <- build_grid(ref, n_grid = 50, seed = 1)
nrow(grid)
#> [1] 50
head(grid)
#>          age        bp sex
#> 324 41.30211 131.21499   M
#> 167 57.44973 109.91799   M
#> 129 53.18340 125.20321   M
#> 418 57.48835 140.33866   F
#> 471 51.86756 136.83815   M
#> 299 59.49434  91.05833   F