This function writes a dataset to disk. By writing to more efficient
binary storage formats, and by specifying relevant partitioning, you can
make the data much faster to read and query.
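As a minimal sketch of the idea above, the following writes a built-in data frame as a partitioned Parquet dataset and reads one partition back. The choice of `mtcars`, the `cyl` partition column, and the temporary output directory are illustrative assumptions, and `format` is passed explicitly so the example does not rely on the default.

```r
library(arrow)
library(dplyr)

# Hypothetical output location: a fresh temporary directory.
out_dir <- tempfile("mtcars_ds_")

# group_by() columns become the partition keys, so one
# subdirectory is written per distinct value of cyl.
mtcars %>%
  group_by(cyl) %>%
  write_dataset(out_dir, format = "parquet")

# Inspect the files that were written.
list.files(out_dir, recursive = TRUE)

# Reading back with a filter on the partition column only needs
# to touch the files in the matching directory.
six_cyl <- open_dataset(out_dir) %>%
  filter(cyl == 6) %>%
  collect()
```

Partitioning on a column that queries commonly filter by is what lets a reader skip whole directories instead of scanning every file.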
dataset |
Dataset, RecordBatch, Table, arrow_dplyr_query, or
data.frame. If dataset is an arrow_dplyr_query or grouped_df, the
schema and partitioning will be taken from the result of any select()
and group_by() operations done on the dataset. filter() queries will be
applied to restrict the rows that are written.
Note that select()-ed columns may not be renamed.
|
path |
string path, URI, or SubTreeFileSystem referencing a directory
to write to (directory will be created if it does not exist)
|
format |
file format to write the dataset to. Currently supported
formats are "feather" (aka "ipc") and "parquet". Default is to write to the
same format as dataset.
|
partitioning |
Partitioning or a character vector of columns to
use as partition keys (to be written as path segments). Default is to
use the current group_by() columns.
|
basename_template |
string template for the names of files to be written.
Must contain "{i}", which will be replaced with an auto-incremented
integer to generate basenames of data files. For example, "part-{i}.feather"
will yield "part-0.feather", "part-1.feather", and so on.
|
hive_style |
logical: write partition segments as Hive-style
(key1=value1/key2=value2/file.ext) or as just bare values. Default is TRUE.
|
... |
additional format-specific arguments. For available Parquet
options, see write_parquet(). The available Feather options are:
- use_legacy_format: logical; write data formatted so that Arrow library
  versions 0.14 and lower can read it. Default is FALSE. You can also
  enable this by setting the environment variable ARROW_PRE_0_15_IPC_FORMAT=1.
- metadata_version: a string like "V5" or the equivalent integer indicating
  the Arrow IPC MetadataVersion. Default (NULL) will use the latest version,
  unless the environment variable ARROW_PRE_1_0_METADATA_VERSION=1 is set, in
  which case it will be V4.
- codec: a Codec which will be used to compress body buffers of written
  files. Default (NULL) will not compress body buffers.
|
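The following sketch shows how several of the arguments above might be combined when writing Feather files. The `gear` partition column, the "chunk-{i}.feather" template, and the temporary directory are assumptions chosen for illustration; the exact integers substituted for "{i}" may differ between versions of the package.

```r
library(arrow)

# Hypothetical output location: a fresh temporary directory.
out_dir <- tempfile("mtcars_feather_")

write_dataset(
  mtcars,
  out_dir,
  format = "feather",
  partitioning = "gear",                    # character vector of partition keys
  basename_template = "chunk-{i}.feather",  # must contain "{i}"
  hive_style = FALSE                        # bare values, e.g. "4/" not "gear=4/"
)

# With hive_style = FALSE, the path segments are the bare values of gear.
files <- list.files(out_dir, recursive = TRUE)
files
```

Because bare path segments do not carry the column name, a reader generally has to be told which column they represent, which is one reason Hive-style segments are the default.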