| quantileReg {randomForestSRC} | R Documentation |
Determines the conditional quantiles and conditional density for a regression forest. Applies to both univariate and multivariate forests. Can be used for both training and testing purposes.
## S3 method for class 'rfsrc' quantileReg(obj, oob = TRUE, prob = (1:10) / 10, newdata = NULL)
obj |
A previously grown forest. |
oob |
Return OOB (out-of-bag) quantiles? If false, in-bag values are returned. |
prob |
Target quantile probabilities. |
newdata |
Test data (optional) over which conditional quantiles are evaluated over. |
Given a regression forest, or a multivariate forest with at least one regression outcome, returns the conditional quantiles for the target outcomes. Also returns the conditional density, which can be used to calculate conditional moments, such as the mean and standard deviation.
Quantiles for each of the requested probabilities, conditional density (and conditional cdf) for each unique y-value, are returned for each data point in the training set, or if a test set is provided, the values are returned for each data point in the test data. If more than one target outcome is available, the returned object will be a list of length equal to the number of target outcomes.
Hemant Ishwaran and Udaya B. Kogalur
Meinshausen N. (2006) Quantile Regression Forests, Journal of Machine Learning Research, 7:983–999.
## ------------------------------------------------------------
## regression example
## ------------------------------------------------------------
## grow call, followed by quantileReg call
o <- rfsrc(mpg ~ ., mtcars)
qo <- quantileReg(o, prob = c(0.25, 0.5, 0.75))
## calculate the conditional mean, compare to OOB predicted value
c.mean <- qo$density %*% qo$yunq
print(data.frame(c.mean = c.mean, pred.oob = o$predicted.oob))
## calculate conditional standard deviation
c.std <- sqrt(qo$density %*% qo$yunq^2 - c.mean ^ 2)
quant <- qo$quantile
colnames(quant) <- paste("q", 100 * qo$prob, sep = "")
print(data.frame(quant, c.std))
## ------------------------------------------------------------
## train/test regression example
## ------------------------------------------------------------
## grow call, followed by quantileReg call
o <- rfsrc(mpg ~ ., mtcars[1:20,])
qo <- quantileReg(o, newdata = mtcars[-(1:20),], prob = c(0.25, 0.5, 0.75))
## calculate test set conditional mean and standard deviation
c.mean <- qo$density %*% qo$yunq
c.std <- sqrt(qo$density %*% qo$yunq^2 - c.mean ^ 2)
quant <- qo$quant
colnames(quant) <- paste("q", 100 * qo$prob, sep = "")
print(data.frame(quant, c.mean, c.std))
## ------------------------------------------------------------
## multivariate mixed outcomes example
## ------------------------------------------------------------
dta <- mtcars
dta$cyl <- factor(dta$cyl)
dta$carb <- factor(dta$carb, ordered = TRUE)
o <- rfsrc(cbind(carb, mpg, cyl, disp) ~., data = dta)
qo <- quantileReg(o)
print(head(qo$mpg$quant))
print(head(qo$disp$quant))
## ------------------------------------------------------------
## quantile regression plot for Boston Housing data
## ------------------------------------------------------------
if (library("mlbench", logical.return = TRUE)) {
## apply quantile regression to Boston Housing data
data(BostonHousing)
o <- rfsrc(medv ~ ., BostonHousing)
qo <- quantileReg(o, prob = c(0.25, 0.5, 0.75))
## quantile data for plotting
quant.dat <- qo$quant
y <- o$yvar
## quantile regression plot
plot(range(y), range(quant.dat), xlab = "y",
ylab = ".25-.75 Quantiles", type = "n")
jitter.y <- jitter(y, 10)
points(jitter.y, quant.dat[, 2], pch = 15, col = 4, cex = 0.75)
segments(jitter.y, quant.dat[, 2], jitter.y, quant.dat[, 1], col = "grey")
segments(jitter.y, quant.dat[, 2], jitter.y, quant.dat[, 3], col = "grey")
points(jitter.y, quant.dat[, 1], pch = "-", cex = 1)
points(jitter.y, quant.dat[, 3], pch = "-", cex = 1)
abline(0, 1, lty = 2, col = 2)
}
## ------------------------------------------------------------
## example of quantile regression for ordinal data
## ------------------------------------------------------------
## use the wine data for illustration
data(wine, package = "randomForestSRC")
## regression call -- request forest weights
o <- rfsrc(quality ~ ., wine, ntree = 100, forest.wt = "oob")
## run quantile regression/extract "probabilities" = density values
qo <- quantileReg(o, prob = (1:100)/100)
yunq <- qo$yunq
yvar <- factor(cut(o$yvar, c(-1, yunq), labels = yunq))
qo.dens <- qo$density
colnames(qo.dens) <- yunq
qo.class <- randomForestSRC:::bayes.rule(qo.dens)
qo.confusion <- table(yvar, qo.class)
qo.err <- 1-diag(qo.confusion)/rowSums(qo.confusion)
qo.confusion <- cbind(qo.confusion, qo.err)
print(qo.confusion)
cat("Normalized Brier:", 100 * randomForestSRC:::brier(yvar, qo.dens), "\n")