PPC-distributions {bayesplot}    R Documentation

PPC distributions

Description

Compare the empirical distribution of the data y to the distributions of simulated/replicated data yrep from the posterior predictive distribution. See the Plot Descriptions section, below, for details.

Usage

ppc_hist(y, yrep, ..., binwidth = NULL, freq = TRUE)

ppc_boxplot(y, yrep, ..., notch = TRUE, size = 0.5, alpha = 1)

ppc_freqpoly(y, yrep, ..., binwidth = NULL, freq = TRUE, size = 0.25,
  alpha = 1)

ppc_freqpoly_grouped(y, yrep, group, ..., binwidth = NULL, freq = TRUE,
  size = 0.25, alpha = 1)

ppc_dens(y, yrep, ..., trim = FALSE, size = 0.5, alpha = 1)

ppc_dens_overlay(y, yrep, ..., trim = FALSE, size = 0.25, alpha = 0.7)

ppc_ecdf_overlay(y, yrep, ..., pad = TRUE, size = 0.25, alpha = 0.7)

ppc_violin_grouped(y, yrep, group, ..., probs = c(0.1, 0.5, 0.9), size = 1,
  alpha = 1, y_draw = c("violin", "points", "both"), y_size = 1,
  y_alpha = 1, y_jitter = 0.1)

Arguments

y

A vector of observations. See Details.

yrep

An S by N matrix of draws from the posterior predictive distribution, where S is the size of the posterior sample (or subset of the posterior sample used to generate yrep) and N is the number of observations (the length of y). The columns of yrep should be in the same order as the data points in y for the plots to make sense. See Details for additional instructions.

...

Currently unused.

binwidth

An optional value used as the binwidth argument to geom_histogram to override the default binwidth.

freq

For histograms, freq=TRUE (the default) puts count on the y-axis. Setting freq=FALSE puts density on the y-axis. (For many plots the y-axis text is off by default. To view the count or density labels on the y-axis see the yaxis_text convenience function.)

notch

A logical scalar passed to geom_boxplot. Unlike for geom_boxplot, the default is notch=TRUE.

size, alpha

Passed to the appropriate geom to control the appearance of the yrep distributions.

group

A grouping variable (a vector or factor) the same length as y. Each value in group is interpreted as the group level pertaining to the corresponding value of y.

trim

A logical scalar passed to geom_density.

pad

A logical scalar passed to stat_ecdf.

probs

A numeric vector passed to geom_violin's draw_quantiles argument to specify at which quantiles to draw horizontal lines. Set to NULL to remove the lines.

y_draw

For ppc_violin_grouped, a string specifying how to draw y: "violin" (default), "points" (jittered points), or "both".

y_jitter, y_size, y_alpha

For ppc_violin_grouped, if y_draw is "points" or "both" then y_size, y_alpha, and y_jitter are passed to the size, alpha, and width arguments of geom_jitter to control the appearance of the y points. The default shown in Usage is y_jitter = 0.1; setting y_jitter = NULL lets ggplot2 determine the amount of jitter.
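
The shape requirements for y, yrep, and group described above can be checked before plotting. The sketch below uses simulated stand-in data (the values of N and S, the random draws, and the group labels are assumptions for illustration, not output from a fitted model) and only base R.

```r
set.seed(1)
N <- 12   # number of observations (illustrative)
S <- 200  # number of posterior draws (illustrative)
y <- rnorm(N)
# Stand-in for posterior predictive draws: one simulated dataset per row
yrep <- matrix(rnorm(S * N), nrow = S, ncol = N)
# Hypothetical grouping variable: one label per element of y
group <- factor(rep(c("a", "b", "c"), times = c(3, 4, 5)))

# Columns of yrep correspond to the elements of y, in the same order
stopifnot(is.vector(y), is.matrix(yrep), ncol(yrep) == length(y))
# group must have one entry per observation
stopifnot(length(group) == length(y))

dim(yrep)     # S rows (draws) by N columns (observations)
table(group)  # number of observations in each group
```

If either stopifnot() call fails, the usual causes are a transposed yrep (N by S instead of S by N) or a group vector built from a differently filtered data set.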

Details

For Binomial data, the plots will typically be most useful if y and yrep contain the "success" proportions (not discrete "success" or "failure" counts).
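
One way to prepare proportions from Binomial counts is sketched below. The trials vector, the success probability, and the simulated counts are hypothetical stand-ins; in practice y and yrep would come from the data and the fitted model.

```r
set.seed(3)
N <- 20
S <- 100
# Hypothetical number of trials for each observation
trials <- sample(5:15, N, replace = TRUE)
y_counts <- rbinom(N, size = trials, prob = 0.3)
# Stand-in for posterior predictive counts: S rows, N columns
yrep_counts <- t(replicate(S, rbinom(N, size = trials, prob = 0.3)))

# Convert counts to "success" proportions.
# sweep() divides column j of yrep_counts by trials[j].
y_prop <- y_counts / trials
yrep_prop <- sweep(yrep_counts, 2, trials, "/")

range(yrep_prop)  # all values lie in [0, 1]
```

The resulting y_prop and yrep_prop can then be supplied as the y and yrep arguments, for example ppc_dens_overlay(y_prop, yrep_prop[1:25, ]).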

Value

A ggplot object that can be further customized using the ggplot2 package.

Plot Descriptions

ppc_hist, ppc_freqpoly, ppc_dens, ppc_boxplot

A separate histogram, shaded frequency polygon, smoothed kernel density estimate, or box and whiskers plot is displayed for y and each dataset (row) in yrep. For these plots yrep should therefore contain only a small number of rows. See the Examples section.

ppc_freqpoly_grouped

A separate frequency polygon is plotted for each level of a grouping variable for y and each dataset (row) in yrep. For this plot yrep should therefore contain only a small number of rows. See the Examples section.

ppc_dens_overlay, ppc_ecdf_overlay

Kernel density or empirical CDF estimates of each dataset (row) in yrep are overlaid, with the distribution of y itself on top (and in a darker shade).

ppc_violin_grouped

The density estimate of yrep within each level of a grouping variable is plotted as a violin with horizontal lines at the quantiles given by probs (by default the 10%, 50%, and 90% quantiles). y is overlaid on the plot as a violin, as points, or as both, depending on the y_draw argument.
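
To get a feel for what the horizontal lines in ppc_violin_grouped summarize, the quantiles of yrep pooled within each group can be computed directly. This is a base R sketch on simulated stand-in data and is not the package's internal code. geom_violin draws its lines at quantiles of the density estimate, so the plotted positions may differ slightly from these sample quantiles.

```r
set.seed(4)
N <- 30
S <- 50
# Hypothetical grouping variable and stand-in posterior predictive draws
group <- rep(c("g1", "g2", "g3"), each = 10)
yrep <- matrix(rnorm(S * N), nrow = S, ncol = N)
probs <- c(0.1, 0.5, 0.9)

# Pool all draws for the observations (columns) in each group, then summarize
pooled_q <- sapply(split(seq_len(N), group), function(idx) {
  quantile(as.vector(yrep[, idx]), probs = probs)
})
pooled_q  # rows: quantiles, columns: groups
```

Because the draws are pooled across all rows of yrep within a group, using the full yrep matrix here does not produce one summary per row.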

References

Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., and Rubin, D. B. (2013). Bayesian Data Analysis. Chapman & Hall/CRC Press, London, third edition. (Ch. 6)

See Also

Other PPCs: PPC-discrete, PPC-errors, PPC-intervals, PPC-loo, PPC-overview, PPC-scatterplots, PPC-test-statistics

Examples

color_scheme_set("brightblue")
y <- example_y_data()
yrep <- example_yrep_draws()
dim(yrep)
ppc_dens_overlay(y, yrep[1:25, ])

ppc_ecdf_overlay(y, yrep[sample(nrow(yrep), 25), ])


# for ppc_hist, ppc_dens, ppc_freqpoly, and ppc_boxplot use a subset of the
# yrep rows so only a few (instead of nrow(yrep)) histograms are plotted
ppc_hist(y, yrep[1:8, ])


color_scheme_set("red")
ppc_boxplot(y, yrep[1:8, ])

# wizard hat plot
color_scheme_set("blue")
ppc_dens(y, yrep[200:202, ])


ppc_freqpoly(y, yrep[1:3,], alpha = 0.1, size = 1, binwidth = 5)

# if groups are different sizes then the 'freq' argument can be useful
group <- example_group_data()
ppc_freqpoly_grouped(y, yrep[1:3,], group) + yaxis_text()

ppc_freqpoly_grouped(y, yrep[1:3,], group, freq = FALSE) + yaxis_text()


# ppc_violin_grouped does not need a small number of rows
# (it pools the yrep draws within groups)
color_scheme_set("gray")
ppc_violin_grouped(y, yrep, group, size = 1.5)

ppc_violin_grouped(y, yrep, group, alpha = 0)

# change how y is drawn
ppc_violin_grouped(y, yrep, group, alpha = 0, y_draw = "points", y_size = 1.5)
ppc_violin_grouped(y, yrep, group, alpha = 0, y_draw = "both",
                   y_size = 1.5, y_alpha = 0.5, y_jitter = 0.33)



[Package bayesplot version 1.4.0 Index]