| concat.split {splitstackshape} | R Documentation |
The concat.split function takes a column with multiple values, splits
the values into a list or into separate columns, and returns a new
data.frame or data.table.
concat.split(data, split.col, sep = ",", structure = "compact", mode = NULL, type = NULL, drop = FALSE, fixed = FALSE, fill = NA, ...)
data |
The source data.frame or data.table. |
split.col |
The variable that needs to be split; can be specified either by the column number or the variable name. |
sep |
The character separating each value (defaults to ","). |
structure |
Can be either "compact", "expanded", or "list". Defaults to "compact". |
mode |
Can be either "binary" or "value". |
type |
Can be either "numeric" or "character". |
drop |
Logical (whether to remove the original variable from the output
or not). Defaults to FALSE. |
fixed |
Is the input for the sep value fixed, or a regular expression? |
fill |
The "fill" value for missing values when structure = "expanded". Defaults to NA. |
... |
Additional arguments to |
structure
"compact" creates as many columns as
the maximum length of the resulting split. This is the most useful
general-case application of this function.
When the input is numeric,
"expanded" creates as many columns as the maximum value of the input
data. This is most useful when converting to mode = "binary".
"list" creates a single new column that is structurally a
list within a data.frame or data.table.
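The three structures can be pictured with a small base R sketch. This is only an illustration of the resulting shapes, not the package's implementation: the toy vector x and the helper objects are hypothetical, and the "expanded" result assumes a presence indicator of 1 with NA for absent values.

```r
# Toy input (hypothetical): each element holds comma-separated numeric values.
x <- c("1,2,4", "2,3", "5")

# "list": one list element per row, with that row's values kept together.
as_list <- lapply(strsplit(x, ",", fixed = TRUE), as.numeric)

# "compact": as many columns as the longest split; shorter rows padded with NA.
max_len <- max(lengths(as_list))
compact <- t(sapply(as_list, function(v) c(v, rep(NA, max_len - length(v)))))

# "expanded" (binary flavour, assumed 1/NA coding): as many columns as the
# largest value; column j is 1 when value j occurs in the row.
max_val <- max(unlist(as_list))
expanded <- t(sapply(as_list, function(v) {
  out <- rep(NA_real_, max_val)
  out[v] <- 1
  out
}))

dim(compact)   # 3 rows, 3 columns (longest split has 3 values)
dim(expanded)  # 3 rows, 5 columns (largest value is 5)
```

The column counts differ because "compact" depends on how many values a row holds, while "expanded" depends on how large those values are.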
fixed
When structure = "expanded" or
structure = "list", it is possible to supply a regular expression
containing the characters to split on. For example, to split on ",",
";", or "|", you can set sep = ",|;|\\|" or sep =
"[,;|]", and fixed = FALSE to split on any of those characters.
This is more of a "legacy" or "convenience" wrapper function
encompassing the features available in the separate functions
cSplit, concat.split.compact,
concat.split.list, and concat.split.expanded.
Ananda Mahto
cSplit, concat.split.compact,
concat.split.expanded, concat.split.list,
concat.split.multiple
## Load some data
temp <- head(concat.test)

# Split up the second column, selecting by column number
concat.split(temp, 2)

# ... or by name, and drop the offensive first column
concat.split(temp, "Likes", drop = TRUE)

# The "Hates" column uses a different separator
concat.split(temp, "Hates", sep = ";", drop = TRUE)

## Not run:
# You'll get a warning here, when trying to retain the original values
concat.split(temp, 2, mode = "value", drop = TRUE)
## End(Not run)

# Try again. Notice the differing number of resulting columns
concat.split(temp, 2, structure = "expanded", mode = "value", type = "numeric", drop = TRUE)

# Let's try splitting some strings... Same syntax
concat.split(temp, 3, drop = TRUE)

# Strings can also be split to binary representations
concat.split(temp, 3, structure = "expanded", type = "character", fill = 0, drop = TRUE)

# Split up the "Likes column" into a list variable; retain original column
head(concat.split(concat.test, 2, structure = "list", drop = FALSE))

# View the structure of the output to verify
# that the new column is a list; note the
# difference between "Likes" and "Likes_list".
str(concat.split(temp, 2, structure = "list", drop = FALSE))