| TOne {DescTools} | R Documentation |
Create a table summarizing continuous, categorical and dichotomous variables, optionally stratified by one or more variables, and perform appropriate statistical tests.
TOne(x, grp = NA, add.length = TRUE, colnames = NULL, vnames = NULL,
total = TRUE, align = "\\l",
FUN = NULL, TEST = NULL, intref = "high",
fmt = list(abs = Fmt("abs"), num = Fmt("num"), per = Fmt("per"),
pval = as.fmt(fmt = "*", na.form = " ")) )
x |
a data.frame containing all the variables to be included in the table. |
grp |
the grouping variable. |
add.length |
logical. If set to |
colnames |
a vector of column names for the result table. |
vnames |
a vector of variable names to be placed in the first column instead of the real names. |
total |
logical (default |
align |
the character on whose position the strings will be aligned. Left alignment can be requested by setting |
FUN |
the function to be used as location and dispersion measure for numeric (including integer) variables (typically |
TEST |
a list of functions to be used to test the variables. Must be named as "num", "cat" and "dich". |
intref |
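one out of "high" (default) or "low", defining whether the high or the low value of a dichotomous integer or logical variable is reported. |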
fmt |
format codes for absolute, numeric and percentage values, and for the p-values of the tests. |
In research, the characteristics of a study population are often summarised in a "Table 1" containing descriptive statistics of the variables used, such as mean and standard deviation for continuous variables and proportions for categorical variables. In many cases, two or more groups are compared within the framework of the scientific question. Creating such a table can be very time-consuming, so there is a need for a flexible function that helps with the task. TOne() is meant to be as easy to use as possible and yet flexible enough that the essential design elements can be freely defined.
To this end, variables are divided into three groups: numeric, factor and dichotomous variables (those having exactly two values or levels). Depending on the group, the descriptives and the appropriate tests are chosen. By default, mean/sd is reported for numeric variables, and differences between groups are tested with the Kruskal-Wallis test. For categorical variables the absolute and relative frequencies are calculated and tested with a chi-square test. The tests can be changed with the argument TEST, which must be a list containing elements named "num", "cat" and "dich". Each test must be a function with arguments (x, g), returning something similar to a p-value. The legend text of the test, which is appended to the table, can be set with the element lbl; in the examples, each element of TEST is itself a list holding the test function as fun and the legend text as lbl.
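The shape of such test functions can be tried out in base R before they are handed to TOne(). The following is a minimal sketch, not the package's internal code: the names num_test, cat_test, dich_test and my_tests are made up for illustration, and the list layout (fun plus lbl) is assumed from the examples on this page.

```r
# Each test takes the variable x and the grouping g and returns a p-value.
# (Function and object names here are illustrative, not part of DescTools.)
num_test  <- function(x, g) kruskal.test(x, g)$p.value
cat_test  <- function(x, g) chisq.test(table(x, g))$p.value
dich_test <- function(x, g) fisher.test(table(x, g))$p.value

# Named list in the layout used by the examples: fun = test, lbl = legend text
my_tests <- list(
  num  = list(fun = num_test,  lbl = "Kruskal-Wallis test"),
  cat  = list(fun = cat_test,  lbl = "Chi-Square test"),
  dich = list(fun = dich_test, lbl = "Fisher exact test")
)

# Call the functions on their own to check that each yields a single p-value
p_num  <- my_tests$num$fun(iris$Sepal.Length, iris$Species)
p_dich <- my_tests$dich$fun(mtcars$am, mtcars$vs)
```

If the functions behave as expected, the list could then be passed on as TOne(..., TEST = my_tests).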
Great importance was attached to the free definition of the number formats. By default, the optionally definable format templates of DescTools are used. Deviations from these can be passed as arguments to the function. Formats can be defined for integers, floating point numbers, percentages and for the p-values of statistical tests. All options of the function Format() are available and can be provided as a list. The examples show several variants.
The function returns a character matrix, which can easily be subset or combined with other matrices. An interface for ToWrd() is available so that the matrix can be transferred to MS Word. Both font and alignment are freely selectable in the Word table.
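Subsetting and combining a character matrix works with ordinary matrix operations. The sketch below uses a base R stand-in built from iris instead of a real TOne() result, so the layout of rows and columns is an assumption and does not reproduce what TOne() returns; the helper desc and the object names are illustrative only.

```r
# Stand-in for a TOne() result: a character matrix of "mean (sd)" strings,
# one row per variable and one column per group (layout is assumed)
desc <- function(x) sprintf("%.1f (%.1f)", mean(x), sd(x))
tab  <- sapply(split(iris[, -5], iris$Species),
               function(d) sapply(d, desc))

# Subset rows by name pattern, keeping the matrix structure
sepal <- tab[grep("^Sepal", rownames(tab)), , drop = FALSE]
petal <- tab[grep("^Petal", rownames(tab)), , drop = FALSE]

# Recombine the pieces in a different order
combined <- rbind(petal, sepal)
combined
```

The same indexing and rbind()/cbind() calls should apply to the matrix returned by TOne(), since it is a character matrix as well.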
a character matrix
Andri Signorell <andri@signorell.net>
WrdTable(), ToWrd.TOne()
options(scipen = 8)
opt <- DescToolsOptions()
# define some special formats for count data, percentages and numeric results
# (those will be supported by TOne)
Fmt(abs = as.fmt(digits = 0, big.mark = "'")) # counts
Fmt(per = as.fmt(digits = 1, fmt = "%")) # percentages
Fmt(num = as.fmt(digits = 1, big.mark = "'")) # numeric
TOne(x = d.pizza[, c("temperature", "delivery_min", "driver", "wine_ordered")],
     grp = d.pizza$quality)
# define median/IQR as describing functions for the numeric variables
TOne(iris[, -5], iris[, 5],
     FUN = function(x) {
       gettextf("%s / %s",
                Format(median(x, na.rm = TRUE), digits = 1),
                Format(IQR(x, na.rm = TRUE), digits = 3))
     }
)
# replace kruskal.test by ANOVA and report the p.value
# Change tests for all the types
TOne(x = iris[, -5], grp = iris[, 5],
     FUN = function(x) gettextf("%s / %s",
                                Format(mean(x, na.rm = TRUE), digits = 1),
                                Format(sd(x, na.rm = TRUE), digits = 3)),
     TEST = list(
       num  = list(fun = function(x, g){summary(aov(x ~ g))[[1]][1, "Pr(>F)"]},
                   lbl = "ANOVA"),
       cat  = list(fun = function(x, g){chisq.test(table(x, g))$p.value},
                   lbl = "Chi-Square test"),
       dich = list(fun = function(x, g){fisher.test(table(x, g))$p.value},
                   lbl = "Fisher exact test")),
     fmt = list(abs = Fmt("abs"), num = Fmt("num"), per = Fmt("per"),
                pval = as.fmt(fmt = "*", na.form = " "))
)
# dichotomous integer or logical values can be reported by the high or low value
x <- sample(x = c(0, 1), size = 100, prob = c(0.3, 0.7), replace = TRUE)
y <- sample(x = c(0, 1), size = 100, prob = c(0.3, 0.7), replace = TRUE) == 1
z <- factor(sample(x = c(0, 1), size = 100, prob = c(0.3, 0.7), replace = TRUE))
g <- sample(x = letters[1:4], size = 100, replace = TRUE)
d.set <- data.frame(x = x, y = y, z = z, g = g)
TOne(d.set[1:3], d.set$g, intref = "low")
TOne(d.set[1:3], d.set$g, intref = "high")
# intref does not affect factors; use relevel() to change the reported level
TOne(data.frame(z = relevel(z, "1")), g)
TOne(data.frame(z = z), g)
options(opt)
## Not run:
# Send the whole table to Word
wrd <- GetNewWrd()
ToWrd(
  TOne(x = d.pizza[, c("temperature", "delivery_min", "driver", "wine_ordered")],
       grp = d.pizza$quality,
       fmt = list(num = Fmt("num", digits = 1))
  ),
  font = list(name = "Arial narrow", size = 8),
  align = c("l", "r")  # this will be recycled: left-right-left-right ...
)
## End(Not run)