quantileReg {randomForestSRC}R Documentation

Quantile Regression Forests

Description

Determines the conditional quantiles and conditional density for a regression forest. Applies to both univariate and multivariate forests. Can be used for both training and testing purposes.

Usage

## S3 method for class 'rfsrc'
quantileReg(obj, oob = TRUE, prob = (1:10) / 10, newdata = NULL)

Arguments

obj

A previously grown forest.

oob

Return OOB (out-of-bag) quantiles? If false, in-bag values are returned.

prob

Target quantile probabilities.

newdata

Test data (optional) over which conditional quantiles are evaluated over.

Details

Given a regression forest, or a multivariate forest with at least one regression outcome, returns the conditional quantiles for the target outcomes. Also returns the conditional density, which can be used to calculate conditional moments, such as the mean and standard deviation.

Value

Quantiles for each of the requested probabilities, conditional density (and conditional cdf) for each unique y-value, are returned for each data point in the training set, or if a test set is provided, the values are returned for each data point in the test data. If more than one target outcome is available, the returned object will be a list of length equal to the number of target outcomes.

Author(s)

Hemant Ishwaran and Udaya B. Kogalur

References

Meinshausen N. (2006) Quantile Regression Forests, Journal of Machine Learning Research, 7:983–999.

See Also

rfsrc

Examples


## ------------------------------------------------------------
## regression example
## ------------------------------------------------------------

## grow call, followed by quantileReg call
o <- rfsrc(mpg ~ ., mtcars)
qo <- quantileReg(o, prob = c(0.25, 0.5, 0.75))

## calculate the conditional mean, compare to OOB predicted value
c.mean <- qo$density %*% qo$yunq
print(data.frame(c.mean = c.mean, pred.oob = o$predicted.oob))

## calculate conditional standard deviation
c.std <- sqrt(qo$density %*% qo$yunq^2 - c.mean ^ 2)
quant <- qo$quantile
colnames(quant) <- paste("q", 100 * qo$prob, sep = "")
print(data.frame(quant, c.std))


## ------------------------------------------------------------
## train/test regression example
## ------------------------------------------------------------

## grow call, followed by quantileReg call
o <- rfsrc(mpg ~ ., mtcars[1:20,])
qo <- quantileReg(o, newdata = mtcars[-(1:20),], prob = c(0.25, 0.5, 0.75))

## calculate test set conditional mean and standard deviation
c.mean <- qo$density %*% qo$yunq
c.std <- sqrt(qo$density %*% qo$yunq^2 - c.mean ^ 2)
quant <- qo$quant
colnames(quant) <- paste("q", 100 * qo$prob, sep = "")
print(data.frame(quant, c.mean, c.std))


## ------------------------------------------------------------
## multivariate mixed outcomes example
## ------------------------------------------------------------

dta <- mtcars
dta$cyl <- factor(dta$cyl)
dta$carb <- factor(dta$carb, ordered = TRUE)
o <- rfsrc(cbind(carb, mpg, cyl, disp) ~., data = dta)
qo <- quantileReg(o)

print(head(qo$mpg$quant))
print(head(qo$disp$quant))


## ------------------------------------------------------------
## quantile regression plot for Boston Housing data
## ------------------------------------------------------------

if (library("mlbench", logical.return = TRUE)) {

  ## apply quantile regression to Boston Housing data
  data(BostonHousing)
  o <- rfsrc(medv ~ ., BostonHousing)
  qo <- quantileReg(o, prob = c(0.25, 0.5, 0.75))

  ## quantile data for plotting 
  quant.dat <- qo$quant
  y <- o$yvar

  ## quantile regression plot
  plot(range(y), range(quant.dat), xlab = "y",
       ylab = ".25-.75 Quantiles", type = "n")
  jitter.y <- jitter(y, 10)
  points(jitter.y, quant.dat[, 2], pch = 15, col = 4, cex = 0.75)
  segments(jitter.y, quant.dat[, 2], jitter.y, quant.dat[, 1], col = "grey")
  segments(jitter.y, quant.dat[, 2], jitter.y, quant.dat[, 3], col = "grey")
  points(jitter.y, quant.dat[, 1], pch = "-", cex = 1)
  points(jitter.y, quant.dat[, 3], pch = "-", cex = 1)
  abline(0, 1, lty = 2, col = 2)

}

## ------------------------------------------------------------
## example of quantile regression for ordinal data
## ------------------------------------------------------------

 ## use the wine data for illustration
 data(wine, package = "randomForestSRC")

 ## regression call -- request forest weights
 o <- rfsrc(quality ~ ., wine, ntree = 100, forest.wt = "oob")

 ## run quantile regression/extract "probabilities" = density values
 qo <- quantileReg(o, prob = (1:100)/100)
 yunq <- qo$yunq
 yvar <- factor(cut(o$yvar, c(-1, yunq), labels = yunq)) 
 qo.dens <- qo$density
 colnames(qo.dens) <- yunq 
 qo.class <- randomForestSRC:::bayes.rule(qo.dens)
 qo.confusion <- table(yvar, qo.class)
 qo.err <- 1-diag(qo.confusion)/rowSums(qo.confusion)
 qo.confusion <- cbind(qo.confusion, qo.err)
 print(qo.confusion)
 cat("Normalized Brier:", 100 * randomForestSRC:::brier(yvar, qo.dens), "\n")


[Package randomForestSRC version 2.7.0 Index]