| DataVisualizations-package {DataVisualizations} | R Documentation |
Gives access to data visualization methods that are relevant from the data scientist's point of view. The flagship idea of 'DataVisualizations' is the mirrored density plot (MD-plot) for either classified or non-classified multivariate data, published in Thrun, M.C. et al.: "Analyzing the Fine Structure of Distributions" (2020), PLoS ONE, <DOI:10.1371/journal.pone.0238835>. In that publication, the MD-plot is reported to outperform the box-and-whisker diagram (box plot), the violin plot, the bean plot and the geom_violin plot of ggplot2.
Furthermore, a collection of visualization methods for univariate data is provided. For exploratory data analysis, 'DataVisualizations' makes it possible to inspect the distribution of each feature of a dataset visually through a combination of four methods, one of which is the Pareto density estimation (PDE) of the probability density function (pdf). Additionally, the package provides visualizations of the distribution of distances using PDE, the scatter-density plot using PDE for two variables, the Shepard density plot and the Bland-Altman plot.
For classified high-dimensional data, a number of visualizations are described, for example the heat map and the silhouette plot. A political map of the world or of Germany can be visualized with additional information defined by a classification of countries or regions. Extending the political map further, a simple function for a choropleth map is available, which is useful for measurements across a geographic area. For categorical features, pie charts, slope charts and fan plots, improved by the ABC analysis, are available. More detailed explanations are found in the book by Thrun, M.C.: "Projection-Based Clustering through Self-Organization and Swarm Intelligence" (2018) <DOI:10.1007/978-3-658-20540-9>.
For a brief introduction to DataVisualizations, please see the vignette "A Quick Tour in Data Visualizations".
Please see http://www.deepbionics.org/. Depending on the context, please cite either [Thrun, 2018] for visualizations in the context of clustering or [Thrun/Ultsch, 2018] for other visualizations.
For the Mirrored Density Plot (MD plot), please cite [Thrun et al., 2020] and see the extensive vignette at https://md-plot.readthedocs.io/en/latest/index.html. The MD plot is also available for Python at https://pypi.org/project/md-plot/
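The idea of mirroring a density estimate around a vertical axis can be sketched in base R. This is only an illustration under stated assumptions: it uses the Gaussian kernel density estimate from `density()` in place of the Pareto density estimation, the sample `x` is synthetic, and the names `half_width`, `outline_x` and `outline_y` are hypothetical. It is not the package's implementation of `MDplot`.

```r
# Base-R sketch of the mirrored-density idea.
# Assumption: a Gaussian KDE (stats::density) stands in for PDE.
set.seed(1)
# Synthetic bimodal sample, not taken from the package data
x <- c(rnorm(300, mean = 0, sd = 1), rnorm(200, mean = 5, sd = 0.7))

# Kernel density estimate on a regular grid of 256 values
d <- density(x, n = 256)

# Scale the density so that its widest point has half-width 0.4
half_width <- 0.4 * d$y / max(d$y)

# Mirror the density around the vertical line at position 1:
# right outline going up, then left outline going back down
outline_x <- c(1 + half_width, rev(1 - half_width))
outline_y <- c(d$x, rev(d$x))

# Empty plot region, then the mirrored density as a filled polygon
plot(NA, xlim = c(0.4, 1.6), ylim = range(d$x), xaxt = "n",
     xlab = "", ylab = "value", main = "Mirrored density (base-R sketch)")
axis(1, at = 1, labels = "x")
polygon(outline_x, outline_y, col = "steelblue", border = NA)
```

Both modes of the sample appear as separate bulges of the polygon, whereas a box plot of the same data would show a single box.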
Index of help topics:
ABCbarplot Barplot with Sorted Data Colored by ABCanalysis
AccountingInformation_PrimeStandard_Q3_2019 Accounting Information in the Prime Standard in Q3 in 2019 (AI_PS_Q3_2019)
BimodalityAmplitude Bimodality Amplitude
ChoroplethPostalCodesAndAGS_Germany Postal Codes and AGS of Germany for a Choropleth Map
Choroplethmap Plots the Choropleth Map
ClassBoxplot Creates Boxplot for all classes
ClassMDplot Class MDplot for Data w.r.t. all classes
ClassPDEplot PDE Plot for all classes
ClassPDEplotMaxLikeli Create PDE plot for all classes with maximum likelihood
Classplot Classplot
CombineCols Combine vectors of various lengths
Crosstable Crosstable plot
DataVisualizations-package Visualizations of High-Dimensional Data
DefaultColorSequence Default color sequence for plots
DensityScatter Scatter Density Plot
DualaxisClassplot Dualaxis Classplot
DualaxisLinechart DualaxisLinechart
Fanplot The fan plot
FundamentalData_Q1_2018 Fundamental Data of the 1st Quarter in 2018
GoogleMapsCoordinates Google Maps with marked coordinates
Heatmap Heatmap for Clustering
HeatmapColors Default color sequence for plots
ITS Income Tax Share
InspectBoxplots Inspect Boxplots
InspectCorrelation Inspect the Correlation
InspectDistances Inspection of Distance-Distribution
InspectScatterplots Pairwise scatterplots and optimal histograms
InspectStandardization QQplot of Data versus Normalized Data
InspectVariable Visualization of Distribution of one variable
JitterUniqueValues Jitters Unique Values
Lsun3D Lsun3D inspired by FCPS
MAplot Minus versus Add plot
MDplot Mirrored Density plot (MD-plot)
MDplot4multiplevectors Mirrored Density plot (MD-plot) for Multiple Vectors
MTY Municipal Income Tax Yield
OptimalNoBins Optimal Number Of Bins
PDEplot PDE plot
PDEscatter Scatter Density Plot
ParetoDensityEstimation Pareto Density Estimation V2
ParetoRadius ParetoRadius for distributions
Piechart The pie chart
Pixelmatrix Plot of a Pixel Matrix
Plot3D 3D plot of points
PlotMissingvalues Plot of the Amount Of Missing Values
PlotProductratio Product-Ratio Plot
PmatrixColormap P-Matrix colors
QQplot QQplot with a Linear Fit
ShepardDensityScatter Shepard PDE scatter
Sheparddiagram Draws a Shepard Diagram
SignedLog Signed Log
Silhouetteplot Silhouette plot of classified data.
Slopechart Slope Chart
SmoothedDensitiesXY Smoothed Densities X with Y
StatPDEdensity Pareto Density Estimation
Worldmap plots a world map by country codes
categoricalVariable A categorical Feature.
inPSphere2D 2D data points in Pareto Sphere
stat_pde_density Calculate Pareto density estimation for ggplot2 plots
world_country_polygons world_country_polygons
zplot Plotting for 3 dimensional data
Michael Thrun, Felix Pape, Onno Hansen-Goos, Alfred Ultsch
Maintainer: Michael Thrun <m.thrun@gmx.net>
[Thrun, 2018] Thrun, M. C.: Projection-Based Clustering through Self-Organization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 978-3-658-20539-3, DOI 10.1007/978-3-658-20540-9, 2018.
[Thrun/Ultsch, 2018] Thrun, M. C. & Ultsch, A.: Effects of the payout system of income taxes to municipalities in Germany, in Papiez, M. & Smiech, S. (eds.), Proc. 12th Professor Aleksander Zelias International Conference on Modelling and Forecasting of Socio-Economic Phenomena, pp. 533-542, Cracow: Foundation of the Cracow University of Economics, Cracow, Poland, 2018.
[Thrun et al., 2020] Thrun, M. C., Gehlert, T. & Ultsch, A.: Analyzing the Fine Structure of Distributions, PLoS ONE, Vol. 15(10), pp. 1-66, DOI 10.1371/journal.pone.0238835, 2020.
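library(DataVisualizations)
# Pixel matrix of the data and distribution of the pairwise distances
data("Lsun3D")
Data = Lsun3D$Data
Pixelmatrix(Data)
InspectDistances(as.matrix(dist(Data)))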
# Income tax share (ITS) versus municipal income tax yield (MTY)
data("ITS")
data("MTY")
Inds = which(ITS < 900 & MTY < 8000)
plot(ITS[Inds], MTY[Inds], main = 'Bimodality is not visible in normal scatter plot')
PDEscatter(ITS[Inds], MTY[Inds], xlab = 'ITS in EUR',
           ylab = 'MTY in EUR', main = 'Pareto Density Estimation indicates Bimodality')
MAlist = MAplot(ITS, MTY)
data("Lsun3D")
Cls = Lsun3D$Cls
Data = Lsun3D$Data
# The scatter plot of the first two dimensions shows a clear cluster structure
plot(Data[, 1:2], col = Cls)
# However, the silhouette plot does not indicate a very good clustering in clusters 1 and 2
Silhouetteplot(Data, Cls = Cls)
# Heat map of the distance matrix with the classification Cls
Heatmap(as.matrix(dist(Data)), Cls = Cls)
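The examples above do not call the MD-plot or the Pareto density estimation directly, so a minimal sketch of both may be useful. It rests on several assumptions: that the DataVisualizations package is installed, that `MDplot` accepts a numeric matrix and by default returns a ggplot2 object, and that `ParetoDensityEstimation` returns a list with the elements `kernels` and `paretoDensity`. The variable name `pde` is hypothetical; please check the help pages of both functions before relying on this.

```r
# Sketch only: every call below assumes the DataVisualizations package is
# installed; nothing is run if it is missing.
pde <- NULL
if (requireNamespace("DataVisualizations", quietly = TRUE)) {
  library(DataVisualizations)
  data("Lsun3D")
  Data <- Lsun3D$Data

  # MD-plot of all features; assumed to return a ggplot2 object by default
  print(MDplot(Data))

  # Pareto density estimation of the first feature.
  # Assumed return value: a list with elements kernels and paretoDensity.
  pde <- ParetoDensityEstimation(Data[, 1])
  plot(pde$kernels, pde$paretoDensity, type = "l",
       xlab = "value", ylab = "PDE")
}
```

The `requireNamespace` guard keeps the script from failing when the package is not available, which is convenient when the snippet is pasted into a larger analysis.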