| kpeaks-package {kpeaks} | R Documentation |
All partitioning clustering algorithms require the number of clusters, k, as an input argument. In unsupervised learning applications, an optimal value of k is commonly determined by using internal validity indexes. Because these indexes suggest a k value computed from the clustering results of several runs of a clustering algorithm, they are computationally expensive. In contrast, 'kpeaks' makes it possible to estimate k before running any clustering algorithm. It is based on a simple, novel technique that uses descriptive statistics of the peak counts of the features in a dataset.
The package 'kpeaks' contains five functions and one synthetically created dataset for testing purposes. To suggest an estimate of k, the function findk internally calls the functions genpolygon and findpolypeaks, in that order. The frequency polygons can be inspected visually with the function plotpolygon. Using rmshoulders is recommended to flatten or remove any shoulder peaks around the main peaks of a frequency polygon.
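The idea of counting peaks in the frequency polygon of each feature can be illustrated with a minimal base R sketch. It does not use the 'kpeaks' functions and is not the package's implementation: the helper count_peaks, the toy data frame, the equal-width binning, and the simple peak rule are all assumptions made for illustration. The description above does not say which descriptive statistics findk reports, so the summary at the end is only an example.

```r
# Hypothetical helper (NOT part of 'kpeaks'): count the local maxima of the
# frequency polygon of one feature, built from equal-width histogram bins.
count_peaks <- function(x, nbins = 10) {
  # nbins equal-width classes spanning the range of x
  breaks <- seq(min(x), max(x), length.out = nbins + 1)
  freqs <- hist(x, breaks = breaks, plot = FALSE)$counts
  # pad with zeros so that peaks in the first and last classes are detected
  f <- c(0, freqs, 0)
  i <- 2:(length(f) - 1)
  # simple rule: strictly higher than the left neighbour and
  # at least as high as the right neighbour
  sum(f[i] > f[i - 1] & f[i] >= f[i + 1])
}

# Toy data (assumption): f1 has two well-separated groups, f2 is evenly spread
toy <- data.frame(
  f1 = c(0.9, 1.0, 1.0, 1.1, 1.2, 4.9, 5.0, 5.0, 5.1, 5.2),
  f2 = c(2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9)
)

# Peak count per feature
peak_counts <- sapply(toy, count_peaks, nbins = 5)
print(peak_counts)

# Descriptive statistics of the peak counts as candidate estimates of k
print(c(mean = mean(peak_counts),
        median = median(peak_counts),
        max = max(peak_counts)))
```

With real data, the number of classes and the treatment of small or shoulder peaks strongly affect the counts, which is what the binning and threshold options of the package functions are meant to address.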
Zeynel Cebeci, Cagatay Cebeci
findk,
findpolypeaks,
genpolygon,
plotpolygon,
rmshoulders