| ft_bucketizer {sparklyr} | R Documentation |
Similar to R's cut function, this transforms a numeric column
into a discretized column, with breaks specified through the splits
parameter.
ft_bucketizer(x, input_col = NULL, output_col = NULL, splits = NULL,
input_cols = NULL, output_cols = NULL, splits_array = NULL,
handle_invalid = "error", uid = random_string("bucketizer_"), ...)
x |
A |
input_col |
The name of the input column. |
output_col |
The name of the output column. |
splits |
A numeric vector of cutpoints, indicating the bucket boundaries. |
input_cols |
Names of input columns. |
output_cols |
Names of output columns. |
splits_array |
Parameter for specifying multiple splits parameters. Each element in this array can be used to map continuous features into buckets. |
handle_invalid |
(Spark 2.1.0+) Param for how to handle invalid entries. Options are 'skip' (filter out rows with invalid values), 'error' (throw an error), or 'keep' (keep invalid values in a special additional bucket). Default: "error" |
uid |
A character string used to uniquely identify the feature transformer. |
... |
Optional arguments; currently unused. |
The object returned depends on the class of x.
spark_connection: When x is a spark_connection, the function returns a ml_transformer,
a ml_estimator, or one of their subclasses. The object contains a pointer to
a Spark Transformer or Estimator object and can be used to compose
Pipeline objects.
ml_pipeline: When x is a ml_pipeline, the function returns a ml_pipeline with
the transformer or estimator appended to the pipeline.
tbl_spark: When x is a tbl_spark, a transformer is constructed then
immediately applied to the input tbl_spark, returning a tbl_spark
See http://spark.apache.org/docs/latest/ml-features.html for more information on the set of transformations available for DataFrame columns in Spark.
Other feature transformers: ft_binarizer,
ft_chisq_selector,
ft_count_vectorizer, ft_dct,
ft_elementwise_product,
ft_feature_hasher,
ft_hashing_tf, ft_idf,
ft_imputer,
ft_index_to_string,
ft_interaction, ft_lsh,
ft_max_abs_scaler,
ft_min_max_scaler, ft_ngram,
ft_normalizer,
ft_one_hot_encoder, ft_pca,
ft_polynomial_expansion,
ft_quantile_discretizer,
ft_r_formula,
ft_regex_tokenizer,
ft_sql_transformer,
ft_standard_scaler,
ft_stop_words_remover,
ft_string_indexer,
ft_tokenizer,
ft_vector_assembler,
ft_vector_indexer,
ft_vector_slicer, ft_word2vec