e.cp3o {ecp}    R Documentation

CHANGE POINTS ESTIMATION BY PROBABILISTICALLY PRUNED OBJECTIVE (VIA E-STATISTIC)

Description

An algorithm for multiple change point analysis that uses dynamic programming and probabilistic pruning. The E-statistic is used as the goodness-of-fit measure.

Usage

e.cp3o(Z, K=1, delta=29, alpha=1, eps=0.01, verbose=FALSE)

Arguments

Z

A T x d matrix containing the length T time series with d-dimensional observations.

K

The maximum number of change points.

delta

The window size used to calculate the complete portion of our approximate test statistic. This also corresponds to one less than the minimum segment size.

alpha

The moment index used for determining the distance between and within segments.

eps

The epsilon probability used for the probabilistic pruning procedure.

verbose

A flag indicating if status updates should be printed.
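
The constraints implied by the arguments above can be checked before calling the function. The following is a minimal sketch in base R. check_cp3o_input is a hypothetical helper, not part of ecp, and the feasibility rule (K + 1 segments, each with at least delta + 1 observations) is an assumption derived from the description of delta, not a documented check.

```r
# Hypothetical helper (not part of ecp): sanity-check inputs before e.cp3o().
check_cp3o_input <- function(Z, K = 1, delta = 29) {
  # A plain numeric vector is treated as a univariate series (T x 1 matrix).
  if (!is.matrix(Z)) Z <- matrix(Z, ncol = 1)
  stopifnot(is.numeric(Z), K >= 1, delta >= 1)
  n_obs <- nrow(Z)
  min_seg <- delta + 1  # minimum segment size stated in the documentation
  # Assumption: K change points need K + 1 segments of at least min_seg rows.
  list(Z = Z, n_obs = n_obs, min_seg = min_seg,
       feasible = (K + 1) * min_seg <= n_obs)
}

chk <- check_cp3o_input(rnorm(300), K = 7, delta = 29)
dim(chk$Z)      # number of rows and columns of the coerced matrix
chk$feasible    # (7 + 1) * 30 = 240 observations needed, 300 available
```

With the default delta=29, a series of 100 observations could hold at most two change points under this assumption, so a large K would never be reached.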

Details

Segmentations are found through the use of dynamic programming and probabilistic pruning. The computational complexity of this method is O(KT^2), where K is the maximum number of change points, and T is the number of observations.
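
To give a sense of what the goodness-of-fit measure compares, the sketch below computes the standard sample energy distance between two segments in base R, with alpha applied as an exponent on the Euclidean distances. energy_stat is a hypothetical name. This is not the package's internal code: the statistic used by e.cp3o may be normalized differently and, as the delta argument indicates, is evaluated through a windowed approximation.

```r
# Sample energy distance between two segments X and Y (rows are observations).
# Hypothetical illustration; not the implementation used inside ecp.
energy_stat <- function(X, Y, alpha = 1) {
  X <- as.matrix(X)
  Y <- as.matrix(Y)
  n <- nrow(X)
  m <- nrow(Y)
  # Pairwise Euclidean distances over the pooled sample, raised to alpha.
  D <- as.matrix(dist(rbind(X, Y)))^alpha
  idx_x <- 1:n
  idx_y <- (n + 1):(n + m)
  between  <- mean(D[idx_x, idx_y])  # average distance across segments
  within_x <- mean(D[idx_x, idx_x])  # average distance inside X (zero diagonal included)
  within_y <- mean(D[idx_y, idx_y])  # average distance inside Y (zero diagonal included)
  2 * between - within_x - within_y
}

set.seed(1)
a <- rnorm(100)        # reference segment
b <- rnorm(100, 3)     # segment with a shifted mean
d <- rnorm(100)        # segment from the same distribution as a
energy_stat(a, b)      # large: the distributions differ
energy_stat(a, d)      # close to zero: same distribution
```

Larger values indicate segments whose distributions differ more, which is why a statistic of this kind can serve as a measure of how well a set of change points separates the series.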

Value

The returned value is a list with the following components.

number

The estimated number of change points.

estimates

The location of the change points estimated by the procedure.

gofM

A vector of goodness-of-fit values for differing numbers of change points. The first entry corresponds to a single change point, the second to two change points, and so on.

cpLoc

A list of all the optimal change point locations for differing numbers of change points. The first component corresponds to a single change point, the second to two change points, and so on.

time

The total amount of time taken to estimate the change point locations.
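
A common next step is to turn the estimated locations into segment labels. The sketch below uses a mock list with the documented component names; its values are made up for illustration and do not come from an actual e.cp3o run. It also assumes that each entry of estimates is the first index of a new segment, which should be verified against real output.

```r
# Mock result with documented component names; values are made up
# for illustration and are not output from e.cp3o().
res <- list(number = 2, estimates = c(101, 201))
n_obs <- 300  # length of the (hypothetical) series

# Assumption: each estimate is the first index of a new segment.
segment <- findInterval(seq_len(n_obs), res$estimates) + 1
table(segment)  # observations per segment

# Start and end index of each segment.
starts <- c(1, res$estimates)
ends   <- c(res$estimates - 1, n_obs)
segments <- data.frame(segment = seq_along(starts), start = starts, end = ends)
segments
```

The segment vector can be passed as a grouping factor to functions such as split or tapply to summarize each segment separately.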

Author(s)

Nicholas A. James

References

Rizzo M.L., Szekely G.J. (2005). Hierarchical clustering via joint between-within distances: Extending Ward's minimum variance method. Journal of Classification.

Rizzo M.L., Szekely G.J. (2010). DISCO analysis: A nonparametric extension of analysis of variance. The Annals of Applied Statistics.

Examples

set.seed(400)
# Univariate series: the distribution changes after observations 100 and 200
x1 <- matrix(c(rnorm(100), rnorm(100, 3), rnorm(100, 0, 2)))
y1 <- e.cp3o(Z = x1, K = 7, delta = 29, alpha = 1, eps = 0.01, verbose = FALSE)
# View estimated change point locations
y1$estimates
# Bivariate series: the mean shifts after observation 100 (requires the MASS package)
x2 <- rbind(MASS::mvrnorm(100, c(0, 0), diag(2)), MASS::mvrnorm(100, c(2, 2), diag(2)))
y2 <- e.cp3o(Z = x2, K = 4, delta = 29, alpha = 1, eps = 0.01, verbose = FALSE)
# View estimated change point locations
y2$estimates
# View all possible segmentations for differing numbers of change points
y2$cpLoc

[Package ecp version 3.0.0 Index]