| hard_arrange {disk.frame} | R Documentation |
hard_arrange sorts by the 'by' variables and also reorganizes the chunks so that every unique 'by' grouping is contained in a single chunk. In other words, all rows that share the same 'by' value end up in the same chunk.
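To make the "same value, same chunk" guarantee concrete, here is a minimal base R sketch of the idea. It is an illustration under assumptions, not how disk.frame implements hard_arrange: the names `keys`, `chunk_of_key` and `chunk_id` are hypothetical, and the chunks are plain in-memory data frames rather than files on disk.

```r
# Toy data in shuffled order; the 'by' column here is Species.
set.seed(1)
df <- iris[sample(nrow(iris)), ]

# 1. Sort by the key.
sorted <- df[order(df$Species), ]

# 2. Assign whole key values to chunks, so that no key value is split
#    across two chunks (hypothetical assignment rule for illustration).
n_chunks <- 2
keys <- unique(as.character(sorted$Species))
chunk_of_key <- ceiling(seq_along(keys) * n_chunks / length(keys))
chunk_id <- chunk_of_key[match(sorted$Species, keys)]
chunks <- split(sorted, chunk_id)

# 3. Check the guarantee: each Species value lives in exactly one chunk.
chunks_per_key <- vapply(keys, function(k) {
  sum(vapply(chunks, function(ch) any(ch$Species == k), logical(1)))
}, numeric(1))
stopifnot(all(chunks_per_key == 1))
```

A plain sort that cuts chunks at fixed row counts would not give this guarantee, because a run of equal keys could straddle a chunk boundary.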
hard_arrange(df, ..., add = FALSE, .drop = FALSE)
## S3 method for class 'data.frame'
hard_arrange(df, ...)
## S3 method for class 'disk.frame'
hard_arrange(
df,
...,
outdir = tempfile("tmp_disk_frame_hard_arrange"),
nchunks = disk.frame::nchunks(df),
overwrite = TRUE
)
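| df | a disk.frame |
| ... | the variables to arrange by; rows sharing the same values of these variables end up in the same chunk |
| add | same as dplyr::arrange |
| .drop | same as dplyr::arrange |
| outdir | the output directory; defaults to a temporary path |
| nchunks | the number of chunks in the output; defaults to disk.frame::nchunks(df) |
| overwrite | whether to overwrite the output directory; defaults to TRUE |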
library(disk.frame)

iris.df = as.disk.frame(iris, nchunks = 2)

# arrange iris.df by Species and ensure rows with the same Species are in the same chunk
iris_hard.df = hard_arrange(iris.df, Species)

get_chunk(iris_hard.df, 1)
get_chunk(iris_hard.df, 2)

# clean up
delete(iris.df)
delete(iris_hard.df)
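The sketch below shows one way to pass the disk.frame method's arguments explicitly and then check the chunking guarantee on the result. It assumes the disk.frame package is installed and uses only the arguments listed in the usage above. The output path and the helper name `species_by_chunk` are hypothetical choices for illustration.

```r
library(disk.frame)
setup_disk.frame(workers = 1)

iris.df <- as.disk.frame(iris, nchunks = 2)

# Write the rearranged data to an explicit (hypothetical) output directory.
out_path <- file.path(tempdir(), "iris_hard_arrange_demo")
iris_hard.df <- hard_arrange(
  iris.df,
  Species,
  outdir = out_path,
  nchunks = 2,
  overwrite = TRUE
)

# Collect the distinct Species values found in each output chunk.
species_by_chunk <- lapply(seq_len(nchunks(iris_hard.df)), function(i) {
  unique(as.character(get_chunk(iris_hard.df, i)$Species))
})

# If every Species value sits in a single chunk, no value is repeated
# when the per-chunk lists are combined.
all_species <- unlist(species_by_chunk)
stopifnot(!anyDuplicated(all_species))

# clean up
delete(iris.df)
delete(iris_hard.df)
```

Choosing `outdir` explicitly keeps the result at a known location instead of the default temporary path. Note that `overwrite = TRUE` replaces whatever already exists in that directory.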