Copyright (c) Microsoft Corporation.
Licensed under the MIT License.

We fit some simple models to the orange juice data for illustrative purposes. Here, each model is actually a group of models, one for each combination of store and brand. This is the standard approach taken in statistical forecasting, and is supported out-of-the-box by the tidyverts framework.

Note that the model training process is embarrassingly parallel on 3 levels:

This lets us speed up the training significantly. While the fable::model function can fit multiple models in parallel, we will run it sequentially here and instead parallelise by dataset. This avoids contention for cores, and also results in the simplest code. As a guard against returning invalid results, we also specify the argument .safely=FALSE; this forces model to throw an error if a model algorithm fails.

srcdir <- here::here("R_utils")
for(src in dir(srcdir, full.names=TRUE)) source(src)

load_objects("grocery_sales", "data.Rdata")

cl <- make_cluster(libs=c("tidyr", "dplyr", "fable", "tsibble", "feasts"))

oj_modelset_basic <- parallel::parLapply(cl, oj_train, function(df)
{
    model(df,
        mean=MEAN(logmove),
        naive=NAIVE(logmove),
        drift=RW(logmove ~ drift()),
        arima=ARIMA(logmove ~ pdq() + PDQ(0, 0, 0)),
        .safely=FALSE
    )
})
oj_fcast_basic <- parallel::clusterMap(cl, get_forecasts, oj_modelset_basic, oj_test)

save_objects(oj_modelset_basic, oj_fcast_basic,
             example="grocery_sales", file="model_basic.Rdata")

oj_fcast_basic %>%
    bind_rows() %>%
    mutate_at(-(1:3), exp) %>%
    eval_forecasts()
ABCDEFGHIJ0123456789
mean
<dbl>
naive
<dbl>
drift
<dbl>
arima
<dbl>
69.77178109.8926112.054868.70597

The ARIMA model does the best of the simple models, but not any better than a simple mean.

Having fit some basic models, we can also try an exponential smoothing model, fit using the ETS function. Unlike the others, ETS does not currently support time series with missing values; we therefore have to use one of the other models to impute missing values first via the interpolate function.

oj_modelset_ets <- parallel::clusterMap(cl, function(df, basicmod)
{
    df %>%
        interpolate(object=select(basicmod, -c(mean, naive, drift))) %>%
        model(
            ets=ETS(logmove ~ error("A") + trend("A") + season("N")),
            .safely=FALSE
        )
}, oj_train, oj_modelset_basic)

oj_fcast_ets <- parallel::clusterMap(cl, get_forecasts, oj_modelset_ets, oj_test)

destroy_cluster(cl)

save_objects(oj_modelset_ets, oj_fcast_ets,
             example="grocery_sales", file="model_ets.Rdata")

oj_fcast_ets %>%
    bind_rows() %>%
    mutate_at(-(1:3), exp) %>%
    eval_forecasts()
ABCDEFGHIJ0123456789
ets
<dbl>
74.2348

The ETS model does worse than the ARIMA model, something that should not be a surprise given the lack of strong seasonality and trend in this dataset. We conclude that any simple univariate approach is unlikely to do well.

LS0tCnRpdGxlOiBCYXNpYyBtb2RlbHMKb3V0cHV0OiBodG1sX25vdGVib29rCi0tLQoKX0NvcHlyaWdodCAoYykgTWljcm9zb2Z0IENvcnBvcmF0aW9uLl88YnIvPgpfTGljZW5zZWQgdW5kZXIgdGhlIE1JVCBMaWNlbnNlLl8KCmBgYHtyLCBlY2hvPUZBTFNFLCByZXN1bHRzPSJoaWRlIiwgbWVzc2FnZT1GQUxTRX0KbGlicmFyeSh0aWR5cikKbGlicmFyeShkcGx5cikKbGlicmFyeSh0c2liYmxlKQpsaWJyYXJ5KGZlYXN0cykKbGlicmFyeShmYWJsZSkKbGlicmFyeSh1cmNhKQpgYGAKCldlIGZpdCBzb21lIHNpbXBsZSBtb2RlbHMgdG8gdGhlIG9yYW5nZSBqdWljZSBkYXRhIGZvciBpbGx1c3RyYXRpdmUgcHVycG9zZXMuIEhlcmUsIGVhY2ggbW9kZWwgaXMgYWN0dWFsbHkgYSBfZ3JvdXBfIG9mIG1vZGVscywgb25lIGZvciBlYWNoIGNvbWJpbmF0aW9uIG9mIHN0b3JlIGFuZCBicmFuZC4gVGhpcyBpcyB0aGUgc3RhbmRhcmQgYXBwcm9hY2ggdGFrZW4gaW4gc3RhdGlzdGljYWwgZm9yZWNhc3RpbmcsIGFuZCBpcyBzdXBwb3J0ZWQgb3V0LW9mLXRoZS1ib3ggYnkgdGhlIHRpZHl2ZXJ0cyBmcmFtZXdvcmsuCgotIGBtZWFuYDogVGhpcyBpcyBqdXN0IGEgc2ltcGxlIG1lYW4uCi0gYG5haXZlYDogQSByYW5kb20gd2FsayBtb2RlbCB3aXRob3V0IGFueSBvdGhlciBjb21wb25lbnRzLiBUaGlzIGFtb3VudHMgdG8gc2V0dGluZyBhbGwgZm9yZWNhc3QgdmFsdWVzIHRvIHRoZSBsYXN0IG9ic2VydmVkIHZhbHVlLgotIGBkcmlmdGA6IFRoaXMgYWRqdXN0cyB0aGUgYG5haXZlYCBtb2RlbCB0byBpbmNvcnBvcmF0ZSBhIHN0cmFpZ2h0LWxpbmUgdHJlbmQuCi0gYGFyaW1hYDogQW4gQVJJTUEgbW9kZWwgd2l0aCB0aGUgcGFyYW1ldGVyIHZhbHVlcyBlc3RpbWF0ZWQgZnJvbSB0aGUgZGF0YS4KCk5vdGUgdGhhdCB0aGUgbW9kZWwgdHJhaW5pbmcgcHJvY2VzcyBpcyBlbWJhcnJhc3NpbmdseSBwYXJhbGxlbCBvbiAzIGxldmVsczoKCi0gV2UgaGF2ZSBtdWx0aXBsZSBpbmRlcGVuZGVudCB0cmFpbmluZyBkYXRhc2V0czsKLSBGb3Igd2hpY2ggd2UgZml0IG11bHRpcGxlIGluZGVwZW5kZW50IG1vZGVsczsKLSBXaXRoaW4gd2hpY2ggd2UgaGF2ZSBpbmRlcGVuZGVudCBzdWItbW9kZWxzIGZvciBlYWNoIHN0b3JlIGFuZCBicmFuZC4KClRoaXMgbGV0cyB1cyBzcGVlZCB1cCB0aGUgdHJhaW5pbmcgc2lnbmlmaWNhbnRseS4gV2hpbGUgdGhlIGBmYWJsZTo6bW9kZWxgIGZ1bmN0aW9uIGNhbiBmaXQgbXVsdGlwbGUgbW9kZWxzIGluIHBhcmFsbGVsLCB3ZSB3aWxsIHJ1biBpdCBzZXF1ZW50aWFsbHkgaGVyZSBhbmQgaW5zdGVhZCBwYXJhbGxlbGlzZSBieSBkYXRhc2V0LiBUaGlzIGF2b2lkcyBjb250ZW50aW9uIGZvciBjb3JlcywgYW5kIGFsc28gcmVzdWx0cyBpbiB0aGUgc2ltcGxlc3QgY29kZS4gQXMgYSBndWFyZCBhZ2FpbnN0IHJldHVybmluZyBpbnZhbGlkIHJlc3VsdHMsIHdlIGFsc28gc3BlY2lmeSB0aGUgYXJndW1lbnQgYC5zYWZlbHk9RkFMU0VgOyB0aGlzIGZvcmNlcyBgbW9kZWxgIHRvIHRocm93IGFuIGVycm9yIGlmIGEgbW9kZWwgYWxnb3JpdGhtIGZhaWxzLgoKYGBge3J9CnNyY2RpciA8LSBoZXJlOjpoZXJlKCJSX3V0aWxzIikKZm9yKHNyYyBpbiBkaXIoc3JjZGlyLCBmdWxsLm5hbWVzPVRSVUUpKSBzb3VyY2Uoc3JjKQoKbG9hZF9vYmplY3RzKCJncm9jZXJ5X3NhbGVzIiwgImRhdGEuUmRhdGEiKQoKY2wgPC0gbWFrZV9jbHVzdGVyKGxpYnM9YygidGlkeXIiLCAiZHBseXIiLCAiZmFibGUiLCAidHNpYmJsZSIsICJmZWFzdHMiKSkKCm9qX21vZGVsc2V0X2Jhc2ljIDwtIHBhcmFsbGVsOjpwYXJMYXBwbHkoY2wsIG9qX3RyYWluLCBmdW5jdGlvbihkZikKewogICAgbW9kZWwoZGYsCiAgICAgICAgbWVhbj1NRUFOKGxvZ21vdmUpLAogICAgICAgIG5haXZlPU5BSVZFKGxvZ21vdmUpLAogICAgICAgIGRyaWZ0PVJXKGxvZ21vdmUgfiBkcmlmdCgpKSwKICAgICAgICBhcmltYT1BUklNQShsb2dtb3ZlIH4gcGRxKCkgKyBQRFEoMCwgMCwgMCkpLAogICAgICAgIC5zYWZlbHk9RkFMU0UKICAgICkKfSkKb2pfZmNhc3RfYmFzaWMgPC0gcGFyYWxsZWw6OmNsdXN0ZXJNYXAoY2wsIGdldF9mb3JlY2FzdHMsIG9qX21vZGVsc2V0X2Jhc2ljLCBval90ZXN0KQoKc2F2ZV9vYmplY3RzKG9qX21vZGVsc2V0X2Jhc2ljLCBval9mY2FzdF9iYXNpYywKICAgICAgICAgICAgIGV4YW1wbGU9Imdyb2Nlcnlfc2FsZXMiLCBmaWxlPSJtb2RlbF9iYXNpYy5SZGF0YSIpCgpval9mY2FzdF9iYXNpYyAlPiUKICAgIGJpbmRfcm93cygpICU+JQogICAgbXV0YXRlX2F0KC0oMTozKSwgZXhwKSAlPiUKICAgIGV2YWxfZm9yZWNhc3RzKCkKYGBgCgpUaGUgQVJJTUEgbW9kZWwgZG9lcyB0aGUgYmVzdCBvZiB0aGUgc2ltcGxlIG1vZGVscywgYnV0IG5vdCBhbnkgYmV0dGVyIHRoYW4gYSBzaW1wbGUgbWVhbi4KCkhhdmluZyBmaXQgc29tZSBiYXNpYyBtb2RlbHMsIHdlIGNhbiBhbHNvIHRyeSBhbiBleHBvbmVudGlhbCBzbW9vdGhpbmcgbW9kZWwsIGZpdCB1c2luZyB0aGUgYEVUU2AgZnVuY3Rpb24uIFVubGlrZSB0aGUgb3RoZXJzLCBgRVRTYCBkb2VzIG5vdCBjdXJyZW50bHkgc3VwcG9ydCB0aW1lIHNlcmllcyB3aXRoIG1pc3NpbmcgdmFsdWVzOyB3ZSB0aGVyZWZvcmUgaGF2ZSB0byB1c2Ugb25lIG9mIHRoZSBvdGhlciBtb2RlbHMgdG8gaW1wdXRlIG1pc3NpbmcgdmFsdWVzIGZpcnN0IHZpYSB0aGUgYGludGVycG9sYXRlYCBmdW5jdGlvbi4KCmBgYHtyfQpval9tb2RlbHNldF9ldHMgPC0gcGFyYWxsZWw6OmNsdXN0ZXJNYXAoY2wsIGZ1bmN0aW9uKGRmLCBiYXNpY21vZCkKewogICAgZGYgJT4lCiAgICAgICAgaW50ZXJwb2xhdGUob2JqZWN0PXNlbGVjdChiYXNpY21vZCwgLWMobWVhbiwgbmFpdmUsIGRyaWZ0KSkpICU+JQogICAgICAgIG1vZGVsKAogICAgICAgICAgICBldHM9RVRTKGxvZ21vdmUgfiBlcnJvcigiQSIpICsgdHJlbmQoIkEiKSArIHNlYXNvbigiTiIpKSwKICAgICAgICAgICAgLnNhZmVseT1GQUxTRQogICAgICAgICkKfSwgb2pfdHJhaW4sIG9qX21vZGVsc2V0X2Jhc2ljKQoKb2pfZmNhc3RfZXRzIDwtIHBhcmFsbGVsOjpjbHVzdGVyTWFwKGNsLCBnZXRfZm9yZWNhc3RzLCBval9tb2RlbHNldF9ldHMsIG9qX3Rlc3QpCgpkZXN0cm95X2NsdXN0ZXIoY2wpCgpzYXZlX29iamVjdHMob2pfbW9kZWxzZXRfZXRzLCBval9mY2FzdF9ldHMsCiAgICAgICAgICAgICBleGFtcGxlPSJncm9jZXJ5X3NhbGVzIiwgZmlsZT0ibW9kZWxfZXRzLlJkYXRhIikKCm9qX2ZjYXN0X2V0cyAlPiUKICAgIGJpbmRfcm93cygpICU+JQogICAgbXV0YXRlX2F0KC0oMTozKSwgZXhwKSAlPiUKICAgIGV2YWxfZm9yZWNhc3RzKCkKYGBgCgpUaGUgRVRTIG1vZGVsIGRvZXMgX3dvcnNlXyB0aGFuIHRoZSBBUklNQSBtb2RlbCwgc29tZXRoaW5nIHRoYXQgc2hvdWxkIG5vdCBiZSBhIHN1cnByaXNlIGdpdmVuIHRoZSBsYWNrIG9mIHN0cm9uZyBzZWFzb25hbGl0eSBhbmQgdHJlbmQgaW4gdGhpcyBkYXRhc2V0LiBXZSBjb25jbHVkZSB0aGF0IGFueSBzaW1wbGUgdW5pdmFyaWF0ZSBhcHByb2FjaCBpcyB1bmxpa2VseSB0byBkbyB3ZWxsLgo=