| bertinplot {seriation} | R Documentation |
Plot a data matrix of cases and variables. Each value is represented by a symbol, and large values are highlighted. Note that Bertin arranges the cases horizontally and the variables as rows. The matrix can be rearranged using seriation techniques to make structure in the data visible (see de Falguerolles et al., 1997).
# grid-based plot
bertinplot(x, order = NULL, panel.function = panel.bars,
  highlight = TRUE, row_labels = TRUE, col_labels = TRUE,
  flip_axes = TRUE, ...)

# ggplot2-based plot
ggbertinplot(x, order = NULL, geom = "bar",
  highlight = TRUE, row_labels = TRUE, col_labels = TRUE,
  flip_axes = TRUE, prop = FALSE, ...)
x |
a data matrix. Note that following Bertin,
columns are variables and rows are cases. This behavior can be
reversed using |
order |
an object of class |
panel.function |
a function to produce the symbols. Currently
available functions are |
geom |
visualization type. Available geometries are: "tile", "rectangle", "circle", "line", "bar", "none". |
highlight |
a logical scalar indicating whether to use highlighting.
If |
row_labels, col_labels |
a logical indicating if row and column labels in |
flip_axes |
logical indicating whether to swap cases and variables
in the plot. The default ( |
prop |
logical; change the aspect ratio so that cells in the image have equal width and height. |
... |
|
The plot is organized as a matrix of symbols. The symbols are drawn
by a panel function, where all symbols of a row are drawn
by one call of the function (using vectorization). The interface for the
panel function is panel.myfunction(value, spacing, hl).
value is the vector of values for a row scaled between 0 and 1,
spacing contains the relative space between symbols and
hl is a logical vector indicating which symbol should be highlighted.
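The panel interface described above can be illustrated with a small custom panel function. This is a minimal sketch rather than code from the package: panel.ticks is a hypothetical name, and the sketch assumes that the panel is drawn in a grid viewport whose native x scale places symbol i at x = i and whose y scale runs from 0 to 1. The stand-alone check at the end emulates such a viewport by hand, so it needs only the grid package.

```r
library(grid)

# Hypothetical panel function (not part of seriation): draws a short
# horizontal tick at the height of each value in the row.
#   value   - numeric vector scaled to [0, 1], one entry per symbol
#   spacing - relative space left between neighbouring symbols
#   hl      - logical vector, TRUE for symbols to highlight
panel.ticks <- function(value, spacing, hl) {
  half <- (1 - spacing) / 2            # half the width of one tick
  pos <- seq_along(value)              # assumed x position of each symbol
  grid.segments(
    x0 = pos - half, x1 = pos + half,
    y0 = value, y1 = value,
    default.units = "native",
    gp = gpar(col = ifelse(hl, "black", "grey60"), lwd = 2)
  )
}

# Stand-alone check without seriation: emulate one row of the plot.
# The viewport scales below are an assumption about what bertinplot sets up.
value <- c(0.1, 0.8, 0.4, 1.0)
hl <- value > mean(value)              # highlight values above the mean
grid.newpage()
pushViewport(viewport(xscale = c(0.5, length(value) + 0.5), yscale = c(0, 1)))
panel.ticks(value, spacing = 0.2, hl = hl)
popViewport()
```

With seriation loaded, such a function would be passed as bertinplot(x, order, panel.function = panel.ticks). All symbols of a row are handled in one vectorized call, so the function should not loop over individual values.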
Cut lines can be added to an existing Bertin plot
using bertin_cut_line(x = NULL, y = NULL).
x/y can be a number indicating where to
draw the cut line between two columns/rows. If both x
and y are specified, then one can select a row/column and the
other can select a range, so that the line spans only part
of the row/column. It is important to
call bertinplot() with the option pop = FALSE.
ggbertinplot calls ggpimage and all additional parameters are passed on.
Michael Hahsler
de Falguerolles, A., Friedrich, F., Sawitzki, G. (1997): A Tribute to J. Bertin's Graphical Data Analysis. In: Proceedings of the SoftStat '97 (Advances in Statistical Software 6), 11–20.
ser_permutation,
seriate,
Package grid.
data("Irish")
scale_by_rank <- function(x) apply(x, 2, rank)
x <- scale_by_rank(Irish[,-6])
# Use the sum of absolute rank differences
order <- c(
seriate(dist(x, "minkowski", p = 1)),
seriate(dist(t(x), "minkowski", p = 1))
)
# Plot
bertinplot(x, order)
# Some alternative displays
bertinplot(x, order, panel = panel.tiles, shading_col = bluered(100), highlight = FALSE)
bertinplot(x, order, panel = panel.circles, spacing = -.2)
bertinplot(x, order, panel = panel.rectangles)
bertinplot(x, order, panel = panel.lines)
# Plot with cut lines (we manually set the order here)
order <- ser_permutation(c(21, 16, 19, 18, 14, 12, 20, 15,
17, 26, 13, 41, 7, 11, 5, 23, 28, 34, 31, 1, 38, 40,
3, 39, 4, 27, 24, 8, 37, 36, 25, 30, 33, 35, 2,
22, 32, 29, 10, 6, 9),
c(4, 2, 1, 6, 8, 7, 5, 3))
bertinplot(x, order, pop=FALSE)
bertin_cut_line(, 4) ## horizontal line between rows 4 and 5
bertin_cut_line(, 7) ## separate "Right to Life" from the rest
bertin_cut_line(14, c(0, 4)) ## separate a block of large values (vertically)
# ggplot2-based plots
if (require("ggplot2")) {
library(ggplot2)
# Default plot uses bars and highlighting values larger than the mean
ggbertinplot(x, order)
# highlight values in the 4th quartile
ggbertinplot(x, order, highlight = quantile(x, probs = .75))
# Use different geoms. "none" lets the user specify their own geom.
# Variables set are row, col and x (for the value).
ggbertinplot(x, order, geom = "tile", prop = TRUE)
ggbertinplot(x, order, geom = "rectangle")
ggbertinplot(x, order, geom = "rectangle", prop = TRUE)
ggbertinplot(x, order, geom = "circle")
ggbertinplot(x, order, geom = "line")
# Tiles with diverging color scale
ggbertinplot(x, order, geom = "tile", prop = TRUE) +
scale_fill_gradient2(midpoint = mean(x))
# Custom geom (geom = "none"). Defined variables are row, col, and x for the value
ggbertinplot(x, order, geom = "none", prop = FALSE) +
geom_point(aes(x = col, y = row, size = x, color = x > 30), pch = 15) +
scale_size(range = c(1, 10))
# Use a ggplot2 theme with theme_set()
old_theme <- theme_set(theme_minimal() +
theme(panel.grid = element_blank())
)
ggbertinplot(x, order, geom = "bar")
theme_set(old_theme)
}