| cSplit {splitstackshape} | R Documentation |
The cSplit function is designed to quickly and conveniently split
concatenated data into separate values.
cSplit(indt, splitCols, sep = ",", direction = "wide", fixed = TRUE, drop = TRUE, stripWhite = TRUE, makeEqual = NULL, type.convert = TRUE)
indt |
The input data.frame or data.table. |
splitCols |
The column or columns that need to be split. |
sep |
The value or values that serve as the delimiter within each column.
This can be a single value if all columns have the same delimiter, or a
vector of values in the same order as the columns named in
splitCols. |
direction |
The desired direction of the results, either "wide" or "long". |
fixed |
Logical. Should the split character be treated as a fixed
pattern (TRUE) or a regular expression (FALSE)? Defaults to TRUE. |
drop |
Logical. Should the original concatenated column be dropped?
Defaults to TRUE. |
stripWhite |
Logical. If there is whitespace around the delimiter in
the concatenated columns, should it be stripped prior to splitting? Defaults
to TRUE. |
makeEqual |
Logical. Should all groups be made to be the same length?
Defaults to NULL. |
type.convert |
Logical. Should type.convert be used to convert the type of each resulting column? Defaults to TRUE. |
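The effect of the fixed and stripWhite arguments can be illustrated with base R alone. The following is a minimal sketch on made-up strings using strsplit and gsub; it is not cSplit's internal code, and the object names are hypothetical.

```r
# Base R illustration only; this is not how cSplit is implemented.
x <- c("a.b", "c.d.e")

# fixed = TRUE: "." is treated as a literal full stop.
literal <- strsplit(x, ".", fixed = TRUE)

# fixed = FALSE: "." is a regular expression matching any character,
# so every character acts as a delimiter and only empty strings remain.
regex <- strsplit(x, ".", fixed = FALSE)

# Removing whitespace around the delimiter before splitting.
y <- "a , b,c"
stripped <- strsplit(gsub("\\s*,\\s*", ",", y), ",", fixed = TRUE)[[1]]
```

When the delimiter is a regular-expression metacharacter such as "." or "|", the fixed setting changes the result entirely, which is why a literal match is the safer default.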
A data.table with the values
split into new columns or rows.
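The two result shapes, new columns or new rows, can be sketched in base R. This is only a conceptual illustration on a hypothetical two-row data.frame; the column names and the NA padding are choices made for this sketch, and cSplit's actual output is a data.table.

```r
# Hypothetical data; base R sketch of the two shapes, not cSplit itself.
df <- data.frame(id = 1:2, vals = c("a,b", "c"), stringsAsFactors = FALSE)
parts <- strsplit(df$vals, ",", fixed = TRUE)

# Long shape: one row per split value, with id repeated as needed.
long <- data.frame(id = rep(df$id, lengths(parts)),
                   vals = unlist(parts),
                   stringsAsFactors = FALSE)

# Wide shape: one column per position, padded with NA for shorter rows.
n <- max(lengths(parts))
m <- do.call(rbind, lapply(parts, function(p) p[seq_len(n)]))
colnames(m) <- paste0("vals_", seq_len(n))
wide <- data.frame(id = df$id, m, stringsAsFactors = FALSE)
```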
The cSplit function replaces most of the earlier
concat.split* functions. The earlier functions remain for
compatibility purposes, but now they are essentially wrappers for the
cSplit function.
If you know that every row of the column will yield the same number of values after being split, you should use the cSplit_f function instead, which uses fread instead of strsplit and is generally faster.
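Whether a column meets that condition can be checked before choosing between the two functions. The helper below is a hypothetical sketch in base R, not part of splitstackshape; it assumes a literal delimiter, and because strsplit does not count a trailing empty field, a value such as "a,b," is treated as having two pieces.

```r
# Hypothetical helper (not part of splitstackshape): returns TRUE when
# every element of x splits into the same number of pieces on a literal sep.
same_split_length <- function(x, sep = ",") {
  n <- lengths(strsplit(as.character(x), sep, fixed = TRUE))
  length(unique(n)) <= 1
}

balanced <- same_split_length(c("a,b,c", "d,e,f"))  # every row has 3 pieces
ragged   <- same_split_length(c("a,b,c", "d,e"))    # rows differ in length
```

If the check returns FALSE, the column is ragged and cSplit is the appropriate choice.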
Ananda Mahto
## Sample data
temp <- head(concat.test)
## Split the "Likes" column
cSplit(temp, "Likes")
## Split the "Likes" and "Hates" columns --
## they have different delimiters...
cSplit(temp, c("Likes", "Hates"), c(",", ";"))
## Split "Siblings" into a long form...
cSplit(temp, "Siblings", ",", direction = "long")
## Split "Siblings" into a long form, removing extra whitespace
## (stripWhite = TRUE is already the default; it is spelled out here)
cSplit(temp, "Siblings", ",", direction = "long", stripWhite = TRUE)
## Split a vector
y <- c("a_b_c", "a_b", "c_a_b")
## (as.data.table is qualified in case data.table is not attached)
cSplit(data.table::as.data.table(y), "y", "_")