cSplit {splitstackshape}    R Documentation

Split Concatenated Values into Separate Values

Description

The cSplit function is designed to quickly and conveniently split concatenated data into separate values.

Usage

cSplit(indt, splitCols, sep = ",", direction = "wide", fixed = TRUE,
  drop = TRUE, stripWhite = TRUE, makeEqual = NULL, type.convert = TRUE)

Arguments

indt

The input data.frame or data.table.

splitCols

The column or columns that need to be split.

sep

The value or values that serve as the delimiter within each column. This can be a single value if all columns share the same delimiter, or a vector with one delimiter per column, given in the same order as splitCols.

direction

The desired direction of the results, either "wide" or "long".

fixed

Logical. Should the split character be treated as a fixed pattern (TRUE) or a regular expression (FALSE)? Defaults to TRUE.
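As a minimal sketch of the fixed = FALSE case, the following splits a column whose values are separated by either a comma or a semicolon. The data frame df and its column names are hypothetical, made up for illustration. stripWhite is switched off only as a precaution (the sample data has no whitespace around the delimiters), so the regular expression is used for splitting alone.

```r
library(splitstackshape)

# Hypothetical sample data: two different delimiters mixed in one column
df <- data.frame(id = 1:2,
                 vals = c("a,b;c", "d;e,f"),
                 stringsAsFactors = FALSE)

# fixed = FALSE treats sep as a regular expression;
# the character class [,;] matches a comma or a semicolon
out <- cSplit(df, "vals", sep = "[,;]", fixed = FALSE, stripWhite = FALSE)
out
```

With the default fixed = TRUE, the same sep would be searched for as the literal string "[,;]", which does not occur in this data.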

drop

Logical. Should the original concatenated column be dropped? Defaults to TRUE.

stripWhite

Logical. If there is whitespace around the delimiter in the concatenated columns, should it be stripped prior to splitting? Defaults to TRUE.

makeEqual

Logical. Should all groups be made the same length? Defaults to NULL (see Usage), in which case the function chooses the behavior itself.

type.convert

Logical. Should type.convert be used to convert the result of each column? Conversion adds a little to the execution time. Defaults to TRUE.
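The effect of type.convert can be seen by comparing column classes with and without conversion. This is a small sketch on hypothetical data (df, id and nums are illustrative names, not part of the package).

```r
library(splitstackshape)

# Hypothetical sample data: numbers stored as concatenated strings
df <- data.frame(id = 1:2,
                 nums = c("1,2", "3,4"),
                 stringsAsFactors = FALSE)

# Default: the split pieces are passed through type.convert
converted <- cSplit(df, "nums", ",")

# Conversion skipped: the split pieces stay as character strings
raw <- cSplit(df, "nums", ",", type.convert = FALSE)

sapply(converted, class)
sapply(raw, class)
```

Skipping the conversion may be worthwhile when the pieces are meant to stay as text, or when a later step handles the types.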

Value

A data.table with the values split into new columns or rows.
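To illustrate the two shapes of the result, the sketch below splits the same hypothetical column once per direction and compares the dimensions. The object names (df, wide, long) are chosen here for illustration only.

```r
library(splitstackshape)

# Hypothetical sample data: rows with different numbers of values
df <- data.frame(id = 1:2,
                 vals = c("a,b,c", "d,e"),
                 stringsAsFactors = FALSE)

# "wide": one new column per split position, one row per input row
wide <- cSplit(df, "vals", ",")

# "long": one row per split value
long <- cSplit(df, "vals", ",", direction = "long")

dim(wide)
dim(long)
```

In the wide result, rows with fewer values than the longest row are padded with NA.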

Note

The cSplit function replaces most of the earlier concat.split* functions. The earlier functions remain for compatibility purposes, but they are now essentially wrappers for the cSplit function.

If you know that every row in the column will yield the same number of values after being split, you should use the cSplit_f function instead, which uses fread instead of strsplit and is generally faster.

Author(s)

Ananda Mahto

See Also

concat.split, cSplit_f

Examples

## Sample data
temp <- head(concat.test)

## Split the "Likes" column
cSplit(temp, "Likes")

## Split the "Likes" and "Hates" columns --
##   they have different delimiters...
cSplit(temp, c("Likes", "Hates"), c(",", ";"))

## Split "Siblings" into a long form...
cSplit(temp, "Siblings", ",", direction = "long")

## Split "Siblings" into a long form, removing extra whitespace
##   (stripWhite = TRUE is the default; it is spelled out here)
cSplit(temp, "Siblings", ",", direction = "long", stripWhite = TRUE)

## Split a vector
y <- c("a_b_c", "a_b", "c_a_b")
cSplit(as.data.table(y), "y", "_")

[Package splitstackshape version 1.4.2 Index]