| project {projpred} | R Documentation |
Project the posterior of the reference model onto the parameter space of a single submodel consisting of a specific combination of predictor terms or (after variable selection) onto the parameter space of a single or multiple submodels of specific sizes.
project(
  object,
  nterms = NULL,
  solution_terms = NULL,
  refit_prj = TRUE,
  ndraws = 400,
  nclusters = NULL,
  seed = sample.int(.Machine$integer.max, 1),
  regul = 1e-04,
  ...
)
object |
An object which can be used as input to |
nterms |
Only relevant if |
solution_terms |
If not |
refit_prj |
A single logical value indicating whether to fit the
submodels (again) ( |
ndraws |
Only relevant if |
nclusters |
Only relevant if |
seed |
Pseudorandom number generation (PRNG) seed by which the same
results can be obtained again if needed. If |
regul |
A number giving the amount of ridge regularization when projecting onto (i.e., fitting) submodels which are GLMs. Regularization is usually unnecessary, but a small amount can sometimes help to avoid numerical problems. |
... |
Arguments passed to |
Arguments ndraws and nclusters are automatically truncated at
the number of posterior draws in the reference model (which is 1 for
datafits). Using fewer draws or clusters than there are posterior draws
in the reference model may result in slightly inaccurate projection
performance. Computation time increases linearly with the values of these
arguments.
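The truncation described above can be pictured with a minimal base-R sketch. The helper cap_at_posterior_draws() is hypothetical and only illustrates the rule; it is not part of projpred and does not claim to mirror its internals.

```r
# Hypothetical helper (not part of projpred): cap a requested number of
# draws or clusters at the number of posterior draws that are available.
cap_at_posterior_draws <- function(requested, n_post_draws) {
  stopifnot(length(requested) == 1, length(n_post_draws) == 1)
  min(requested, n_post_draws)
}

cap_at_posterior_draws(400, 1000)  # request is below the limit, kept as-is
cap_at_posterior_draws(400, 250)   # request exceeds the limit, capped
cap_at_posterior_draws(10, 1)      # datafit: only one "draw" is available
```

Under this rule, a request larger than the number of posterior draws behaves the same as requesting all of them.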
If the projection is performed onto a single submodel (i.e.,
length(nterms) == 1 || !is.null(solution_terms)), the return value is an
object of class projection, which is a list containing the following
elements:
dis |
Projected draws for the dispersion parameter. |
kl |
The KL divergence from the submodel to the reference model. |
weights |
Weights for the projected draws. |
solution_terms |
A character vector of the submodel's predictor terms, ordered in the way in which the terms were added to the submodel. |
submodl |
A list containing the submodel fits (one fit per projected draw). |
p_type |
A single logical value indicating whether the reference model's posterior draws have been clustered for the projection (TRUE) or not (FALSE). |
refmodel |
The reference model object. |
If the projection is performed onto more than one submodel, the output from
above is returned for each submodel, giving a list with one element for
each submodel.
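Because the return value is either a single projection object or a list of them, downstream code may want to treat both shapes uniformly. The following is a minimal sketch under stated assumptions: as_projection_list() is a hypothetical helper (not part of projpred), and the mock objects only imitate the class and two of the element names listed above; real objects contain more elements.

```r
# Hypothetical helper (not part of projpred): always return a list of
# "projection" objects, whichever of the two shapes was returned.
as_projection_list <- function(x) {
  if (inherits(x, "projection")) list(x) else x
}

# Mock objects standing in for real project() output (illustration only;
# real objects carry more elements):
mock_single <- structure(
  list(solution_terms = c("X1", "X3"), p_type = TRUE),
  class = "projection"
)
mock_multi <- list(
  structure(list(solution_terms = "X1", p_type = TRUE), class = "projection"),
  mock_single
)

# The same loop now handles both cases:
all_prjs <- c(as_projection_list(mock_single), as_projection_list(mock_multi))
for (prj in all_prjs) {
  cat(length(prj$solution_terms), "term(s):",
      paste(prj$solution_terms, collapse = ", "), "\n")
}
```

The check relies only on the class name projection stated above, so it should not depend on how many elements a real object contains.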
if (requireNamespace("rstanarm", quietly = TRUE)) {
  # Data:
  dat_gauss <- data.frame(y = df_gaussian$y, df_gaussian$x)

  # The "stanreg" fit which will be used as the reference model (with small
  # values for `chains` and `iter`, but only for technical reasons in this
  # example; this is not recommended in general):
  fit <- rstanarm::stan_glm(
    y ~ X1 + X2 + X3 + X4 + X5, family = gaussian(), data = dat_gauss,
    QR = TRUE, chains = 2, iter = 500, refresh = 0, seed = 9876
  )

  # Variable selection (here without cross-validation and with small values
  # for `nterms_max`, `nclusters`, and `nclusters_pred`, but only for the
  # sake of speed in this example; this is not recommended in general):
  vs <- varsel(fit, nterms_max = 3, nclusters = 5, nclusters_pred = 10,
               seed = 5555)

  # Projection onto the best submodel with 2 predictor terms (with a small
  # value for `nclusters`, but only for the sake of speed in this example;
  # this is not recommended in general):
  prj_from_vs <- project(vs, nterms = 2, nclusters = 10, seed = 9182)

  # Projection onto an arbitrary combination of predictor terms (with a small
  # value for `nclusters`, but only for the sake of speed in this example;
  # this is not recommended in general):
  prj <- project(fit, solution_terms = c("X1", "X3", "X5"), nclusters = 10,
                 seed = 9182)
}