| sjc.kgap {sjPlot} | R Documentation |
An implementation of the gap statistic algorithm from Tibshirani, Walther, and Hastie's
"Estimating the number of clusters in a data set via the gap statistic".
This function calls the clusGap function of the
cluster package to calculate the data for the plot.
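As a rough sketch of the underlying computation, the gap statistic can be obtained by calling clusGap from the cluster package directly. The use of kmeans as the clustering function, the scaling step, the seed, and the small K.max and B values are assumptions chosen for illustration; this page does not state which clustering function sjc.kgap passes on.

```r
library(cluster)

set.seed(123)                        # the reference samples are random
x <- scale(as.matrix(iris[, 1:4]))   # scaling is an illustrative choice

# kmeans is assumed here as the clustering function; nstart is passed on to it
gap <- clusGap(x, FUNcluster = kmeans, K.max = 5, B = 20,
               verbose = FALSE, nstart = 10)

# one row per k, with columns logW, E.logW, gap and SE.sim
print(gap$Tab)
```

The gap and SE.sim columns of the resulting table are the quantities from which an "optimal" number of clusters is then chosen.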
sjc.kgap(x, max = 10, B = 100, SE.factor = 1, method = "Tibs2001SEmax", plotResults = TRUE)
x |
matrix, where rows are observations and columns are individual dimensions, to compute and plot the gap statistic (according to a uniform reference distribution). |
max |
maximum number of clusters to consider, must be at least two. Default is 10. |
B |
integer, number of Monte Carlo ("bootstrap") samples. Default is 100. |
SE.factor |
[When |
method |
character string indicating how the "optimal" number of clusters,
k^, is computed from the gap statistics (and their standard deviations),
or more generally how the location k^ of the maximum of f[k] should be
determined. Default is "Tibs2001SEmax".
|
plotResults |
logical, if |
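The effect of the method and SE.factor arguments can be illustrated with maxSE from the cluster package, which applies such a selection rule to a vector of gap values and their standard errors. The numbers below are made up for illustration, and whether sjc.kgap calls maxSE internally is an assumption.

```r
library(cluster)

# hypothetical gap values and standard errors for k = 1, ..., 6
gap_vals <- c(0.10, 0.45, 0.80, 0.82, 0.83, 0.81)
se_vals  <- rep(0.05, 6)

# smallest k whose gap is within SE.factor standard errors of the gap at k + 1
k_tibs <- maxSE(gap_vals, se_vals, method = "Tibs2001SEmax", SE.factor = 1)

# location of the global maximum, ignoring the standard errors
k_global <- maxSE(gap_vals, se_vals, method = "globalmax")

print(c(Tibs2001SEmax = k_tibs, globalmax = k_global))
```

With these values the standard-error rule settles on a smaller k than the plain global maximum, which is the usual motivation for preferring it: it favours the more parsimonious solution when later gains are within noise.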
An object containing the data frame used for plotting, the ggplot object, and the number of clusters found.
Tibshirani R, Walther G, Hastie T (2001) Estimating the number of clusters in a data set via the gap statistic. J. R. Statist. Soc. B, 63, Part 2, pp. 411-423
Maechler M, Rousseeuw P, Struyf A, Hubert M, Hornik K (2013) cluster: Cluster Analysis Basics and Extensions. R package version 1.14.4.
## Not run: 
# plot gap statistic and determine best number of clusters
# in mtcars dataset
sjc.kgap(mtcars)

# and in iris dataset
sjc.kgap(iris[, 1:4])
## End(Not run)