Select Best Models and Prep Final Outputs

final_models(
  run_info,
  average_models = TRUE,
  max_model_average = 3,
  weekly_to_daily = TRUE,
  parallel_processing = NULL,
  inner_parallel = FALSE,
  num_cores = NULL
)

Arguments

run_info

Run info created using the set_run_info() function.

average_models

If TRUE, create simple averages of individual models and save the most accurate one.

max_model_average

Maximum number of models to average together. Model averages are created for combinations of 2 models up to either this value or the total number of models run, whichever is smaller.

weekly_to_daily

If TRUE, convert a weekly forecast down to the daily level by evenly splitting the value across each day of the week. This helps when aggregating up to higher temporal levels like month or quarter.

parallel_processing

Default of NULL runs no parallel processing and forecasts each individual time series one after another. 'local_machine' leverages all cores of the machine Finn is running on. 'spark' runs the time series in parallel on a Spark cluster in Azure Databricks or Azure Synapse.

inner_parallel

If TRUE, run components of the forecast process within a single time series in parallel. Can only be used if parallel_processing is set to NULL or 'spark'.

num_cores

Number of cores to use when parallel processing is enabled, whether on a local machine or within Azure. Default of NULL uses the total number of cores on the machine minus one; the value cannot exceed that amount.
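
As a sketch of how these arguments fit together (assuming a run_info object already created with set_run_info() and models trained with train_models()), a call that tests model averages and parallelizes across local cores might look like:

```r
# Hypothetical call illustrating the averaging and parallel arguments;
# assumes run_info exists and train_models() has already finished.
final_models(
  run_info,
  average_models = TRUE,                  # also evaluate simple model averages
  max_model_average = 2,                  # average at most 2 models together
  parallel_processing = "local_machine",  # forecast time series in parallel locally
  num_cores = 4                           # cap the number of worker cores at 4
)
```

With parallel_processing = "local_machine", inner_parallel must stay FALSE, since per-series inner parallelism is only allowed when parallel_processing is NULL or 'spark'.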

Value

Final model outputs are written to disk.

Examples

# \donttest{
data_tbl <- timetk::m4_monthly %>%
  dplyr::rename(Date = date) %>%
  dplyr::mutate(id = as.character(id)) %>%
  dplyr::filter(
    Date >= "2013-01-01",
    Date <= "2015-06-01"
  )

run_info <- set_run_info()
#> Finn Submission Info
#>  Experiment Name: finn_fcst
#>  Run Name: finn_fcst-20240805T151546Z
#> 

prep_data(run_info,
  input_data = data_tbl,
  combo_variables = c("id"),
  target_variable = "value",
  date_type = "month",
  forecast_horizon = 3
)
#>  Prepping Data
#>  Prepping Data [3.2s]
#> 

prep_models(run_info,
  models_to_run = c("arima", "ets"),
  back_test_scenarios = 3
)
#>  Creating Model Workflows
#>  Creating Model Workflows [150ms]
#> 
#>  Creating Model Hyperparameters
#>  Creating Model Hyperparameters [136ms]
#> 
#>  Creating Train Test Splits
#>  Turning ensemble models off since no multivariate models were chosen to run.
#>  Creating Train Test Splits [279ms]
#> 

train_models(run_info,
  run_global_models = FALSE
)
#>  Training Individual Models
#>  Training Individual Models [21.7s]
#> 

final_models(run_info)
#>  Selecting Best Models
#>  Selecting Best Models [903ms]
#> 
# }
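
After final_models() completes, the outputs written to disk can be read back into R. A sketch using a retrieval helper (get_forecast_data() is assumed here; check the finnts reference for the exact function in your installed version):

```r
# Sketch: read the final forecast outputs saved by final_models().
# get_forecast_data() is an assumed helper, not confirmed above.
finn_output_tbl <- get_forecast_data(run_info)
head(finn_output_tbl)
```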