ml-tuning {sparklyr}    R Documentation

Spark ML – Tuning

Description

Perform hyper-parameter tuning using either k-fold cross validation or a train-validation split.

Usage

ml_cross_validator(x, estimator, estimator_param_maps, evaluator,
  num_folds = 3L, seed = NULL, uid = random_string("cross_validator_"),
  ...)

ml_train_validation_split(x, estimator, estimator_param_maps, evaluator,
  train_ratio = 0.75, seed = NULL,
  uid = random_string("train_validation_split_"), ...)

Arguments

x

A spark_connection, ml_pipeline, or tbl_spark.

estimator

An ml_estimator object.

estimator_param_maps

A named list of stages and hyper-parameter sets to tune. See details.

evaluator

An ml_evaluator object; see ml_evaluator.

num_folds

Number of folds for cross validation. Must be >= 2. Default: 3.

seed

A random seed. Set this value if you need your results to be reproducible across repeated calls.

uid

A character string used to uniquely identify the ML estimator.

...

Optional arguments; currently unused.

train_ratio

Fraction of the data used for training; the remainder is used for validation. Must be between 0 and 1. Default: 0.75.
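To make the shape of estimator_param_maps concrete, the following is a minimal sketch of a grid for a hypothetical pipeline that contains a random forest stage. The stage name random_forest and the parameter names num_trees, max_depth, and impurity are assumptions chosen for illustration; this page does not specify them. The sketch only builds the list and counts candidate combinations in plain R, so it needs no Spark connection.

```r
# Hypothetical grid: outer names identify pipeline stages,
# inner names identify the hyper-parameters to vary for that stage.
param_grid <- list(
  random_forest = list(
    num_trees = c(5L, 10L),
    max_depth = c(5L, 10L),
    impurity  = c("entropy", "gini")
  )
)

# Count the candidate parameter combinations per stage, assuming
# every combination of the listed values is evaluated.
combos_per_stage <- vapply(
  param_grid,
  function(stage) nrow(expand.grid(stage, stringsAsFactors = FALSE)),
  integer(1)
)
n_candidates <- prod(combos_per_stage)

print(n_candidates)
```

Under that assumption, the grid size multiplies quickly: each candidate is fit once per fold under cross validation, but only once under a single train-validation split.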

Details

ml_cross_validator() performs k-fold cross validation, while ml_train_validation_split() performs tuning on a single pair of training and validation datasets.
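The difference between the two strategies can be illustrated without Spark. The sketch below is a conceptual illustration in base R, not Spark's actual splitting code: it assigns 150 hypothetical rows to 3 folds, then draws a single 75/25 train-validation split. The variable names and the row count are assumptions made for the example.

```r
set.seed(42)            # fixed seed so the partition is reproducible
n_rows      <- 150L     # hypothetical dataset size
num_folds   <- 3L
train_ratio <- 0.75

# K-fold: shuffle fold labels across the rows so each row gets one fold.
# Each fold serves once as validation data; the rest is training data.
fold_id <- sample(rep(seq_len(num_folds), length.out = n_rows))
kfold_splits <- lapply(seq_len(num_folds), function(k) {
  list(train = which(fold_id != k), validation = which(fold_id == k))
})

# Train-validation split: one random partition, evaluated once.
train_idx <- sample(n_rows, size = floor(train_ratio * n_rows))
tv_split <- list(
  train      = train_idx,
  validation = setdiff(seq_len(n_rows), train_idx)
)

# Sizes of the validation sets under each strategy
print(vapply(kfold_splits, function(s) length(s$validation), integer(1)))
print(length(tv_split$validation))
```

In this sketch every row is used for validation exactly once under k-fold, whereas the single split validates on one fixed subset. That makes the single split cheaper but more sensitive to how the rows happen to be partitioned.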

Value

The object returned depends on the class of x (a spark_connection, ml_pipeline, or tbl_spark).
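As one illustration of the spark_connection case, the sketch below wraps a cross-validation run in a hypothetical helper, tune_with_cv(). It assumes a pipeline of ft_r_formula() followed by ml_random_forest_classifier(), a Species column in the data (as in iris), and that the random_forest entry in the grid is matched to the random forest stage; none of these choices come from this page. Calls are namespace-qualified so that defining the helper does not itself require a Spark session.

```r
# Hypothetical helper: tune a random forest with k-fold cross validation.
# `sc` is an open spark_connection; `data` is a tbl_spark with a Species column.
tune_with_cv <- function(sc, data, num_folds = 3L, seed = 1234L) {
  # Build the estimator: formula-based feature assembly, then the classifier.
  pipeline <- sparklyr::ml_pipeline(sc)
  pipeline <- sparklyr::ft_r_formula(pipeline, Species ~ .)
  pipeline <- sparklyr::ml_random_forest_classifier(pipeline)

  # Hyper-parameter values to try for the random forest stage (assumed names).
  grid <- list(
    random_forest = list(
      num_trees = c(5L, 10L),
      max_depth = c(5L, 10L)
    )
  )

  # With x = sc, the validator is created unfitted and is fit explicitly below.
  cv <- sparklyr::ml_cross_validator(
    sc,
    estimator            = pipeline,
    estimator_param_maps = grid,
    evaluator            = sparklyr::ml_multiclass_classification_evaluator(sc),
    num_folds            = num_folds,
    seed                 = seed
  )

  # Fit the validator on the data and return the fitted result.
  sparklyr::ml_fit(cv, data)
}
```

With a local Spark installation, the helper could be called as tune_with_cv(sc, iris_tbl), where sc comes from spark_connect(master = "local") and iris_tbl from sdf_copy_to(sc, iris). Replacing ml_cross_validator() with ml_train_validation_split(), and num_folds with train_ratio, gives the single-split variant.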


[Package sparklyr version 0.7.0 Index]