| chunk.apply {iotools} | R Documentation |
chunk.apply processes input in chunks and applies FUN
to each chunk, collecting the results.
chunk.apply(input, FUN, ..., CH.MERGE = rbind, CH.MAX.SIZE = 33554432, parallel = 1)
chunk.tapply(input, FUN, ..., sep = "\t", CH.MERGE = rbind, CH.MAX.SIZE = 33554432)
input |
Either a chunk reader or a file name or connection that will be used to create a chunk reader |
FUN |
Function to apply to each chunk |
... |
Additional parameters passed to FUN |
sep |
For chunk.tapply, gives the separator for the key over which to apply. Each line is split at the first separator, and the resulting value is treated as the key over which the function is applied. |
CH.MERGE |
Function to call to merge results from all
chunks. Common values are |
CH.MAX.SIZE |
Maximal size of each chunk in bytes (the default, 33554432, is 32 MiB) |
parallel |
the number of parallel processes to use in the calculation (*nix only). |
The result of calling CH.MERGE on all chunk results.
The input to FUN is the raw chunk, so it is typically
advisable to use mstrsplit or a similar function as the
first step in FUN.
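As a minimal sketch of the note above, the snippet below hand-builds a raw vector to stand in for one chunk and parses it with mstrsplit. The sample data and the names `chunk` and `m` are made up for illustration, and the tab separator is an assumption about the input (mstrsplit defaults to a different separator).

```r
library(iotools)

# Hypothetical stand-in for the raw chunk that FUN would receive:
# three tab-separated lines encoded as a raw vector.
chunk <- charToRaw("1\ta\n2\tb\n3\tc\n")

# Split the raw chunk into a character matrix, one row per line.
m <- mstrsplit(chunk, sep = "\t")

# Columns are character, so convert before doing arithmetic.
first_col <- as.numeric(m[, 1])
```

Because the chunk arrives as raw bytes, any numeric work inside FUN needs an explicit conversion such as as.numeric after the split.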
Simon Urbanek
## Not run:
## compute quantiles of the first variable for each chunk
## of at most 100 kB (1e5 bytes) size
chunk.apply("input.file.txt",
            function(o) {
              m = mstrsplit(o)
              quantile(as.numeric(m[,1]), c(0.25, 0.5, 0.75))
            }, CH.MAX.SIZE=1e5)
## End(Not run)
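The example above depends on an existing input file. The following self-contained sketch writes a small temporary file first so it can be run as-is. The file contents, the tab separator, and the use of sum as CH.MERGE are illustrative assumptions, not part of the original documentation.

```r
library(iotools)

# Write 100 tab-separated lines (a number and a letter) to a temporary file.
path <- tempfile(fileext = ".txt")
writeLines(paste(1:100, letters[(0:99 %% 26) + 1], sep = "\t"), path)

# FUN returns the sum of the first column for its chunk;
# CH.MERGE = sum then adds the per-chunk sums into one total.
total <- chunk.apply(path,
                     function(o) {
                       m <- mstrsplit(o, sep = "\t")
                       sum(as.numeric(m[, 1]))
                     },
                     CH.MERGE = sum)

# Remove the temporary file.
unlink(path)
```

Choosing a merge function that matches what FUN returns matters here: sum suits scalar per-chunk results, while the default rbind suits per-chunk vectors or tables.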