Train models
Having explored the retail data, let’s fit some models to it. We’ll use the first 25.5 years of data for model training, and the remaining 10 years for validation.
drift is a simple random walk model incorporating a drift term.
sdrift is the seasonal counterpart to drift, i.e. the random walk is by season.
ar is an ARIMA model with all seasonal and nonseasonal terms chosen from the data.
ets_auto is an ETS model with the form of the components chosen from the data (either additive or multiplicative).
ets_fixed is an ETS model where the components are all additive, based on examining the plots in the previous notebook.
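To make the two baseline models concrete, here is a minimal base R sketch of the textbook drift forecast and one common seasonal analogue. The function names `drift_forecast` and `seasonal_drift_forecast` are hypothetical and the series is made up; the fable implementations of NAIVE and SNAIVE may differ in details such as how the back-transformed mean is adjusted.

```r
# Textbook drift method: last observation plus h times the average
# historical change per period.
drift_forecast <- function(y, h) {
  n <- length(y)
  slope <- (y[n] - y[1]) / (n - 1)   # average change per period
  y[n] + seq_len(h) * slope
}

# Seasonal analogue (an assumed formulation): repeat the last observed
# season and add the average year-on-year change once per season ahead.
seasonal_drift_forecast <- function(y, h, period = 12) {
  n <- length(y)
  slope <- mean(diff(y, lag = period))    # average seasonal difference
  last_season <- y[(n - period + 1):n]
  steps <- ceiling(seq_len(h) / period)   # number of seasons ahead
  rep(last_season, length.out = h) + steps * slope
}

# Made-up series growing 10% per period, modelled on the log scale
log_turnover <- log(c(100, 110, 121, 133.1))
exp(drift_forecast(log_turnover, h = 2))
```

Because the drift is estimated on the log scale, the back-transformed forecasts keep growing at a constant percentage rate. This is why a drift model can run away from the data if growth slows.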
In addition, a nice feature of the model function is that it can fit models in parallel by leveraging the future and future.apply packages. Here, we use the multisession plan to create a background cluster of R processes for this purpose.
library(dplyr)
library(tsibbledata)
library(tsibble)
library(feasts)
library(fable)
library(future)
plan(multisession)
# Split into training (up to Dec 2008) and validation (from Jan 2009) sets
aus_retail_tr <- aus_retail %>%
    filter(Month <= yearmonth("2008 Dec"))
aus_retail_vl <- aus_retail %>%
    filter(Month > yearmonth("2008 Dec"))

# Fit all five model types to every state/industry series
mods <- model(aus_retail_tr,
    drift=NAIVE(log(Turnover) ~ drift()),
    sdrift=SNAIVE(log(Turnover) ~ drift()),
    ar=ARIMA(log(Turnover)),
    ets_auto=ETS(log(Turnover)),
    ets_fixed=ETS(log(Turnover) ~ error("A") + trend("A") + season("A"))
)
Warning in sqrt(diag(best$var.coef)): NaNs produced
Warning in sqrt(diag(best$var.coef)): NaNs produced
nrow(mods)
[1] 150
Note that there are 150 separate models for each of the above, corresponding to all observed combinations of state/territory and industry (not every industry is represented in each state). The ability to parallelise model training is thus very useful.
Let’s examine the resulting output, for one time series. The plotted output from autoplot includes the point forecasts along with the 80% and 95% prediction intervals, for each model. To compare these results to the actual turnover in the period, we pass the validation dataset to autoplot in the data argument. The actual turnover is given by the black line.
library(ggplot2)
fcasts <- forecast(mods, new_data=aus_retail_vl)
fcasts %>%
    filter(Industry == "Food retailing", State == "New South Wales") %>%
    autoplot(data=aus_retail_vl) +
        theme(legend.position="bottom")
The main feature of this plot is that the drift model is almost comically bad. Not only does it fail to capture the seasonal pattern in the data, but it also severely overestimates the growth in turnover in the validation period.
We can redo the plot, but omitting this one model and using only the 80% prediction intervals:
fcasts %>%
    filter(Industry == "Food retailing", State == "New South Wales", .model != "drift") %>%
    autoplot(data=aus_retail_vl, level=80) +
        theme(legend.position="bottom")
This plot shows that, in fact, all of the models are systematically overestimating the growth in turnover (although the actual growth is still mostly within the prediction intervals). To see whether this is limited to this particular time series, we can also aggregate up the forecasts to the state level and plot them. There is a wart to be aware of: some time series actually end before the validation period, so we need to exclude them from the aggregation to avoid distorting the results.
# Actual turnover by state over the validation period
state_vl <- aus_retail_vl %>%
    group_by(State) %>%
    summarise(Turnover=sum(Turnover))

# Point forecasts summed to state level, with the actuals appended as ".response"
fcasts_state <- fcasts %>%
    filter(Month > yearmonth("2008 Dec"), .model != "drift") %>%
    group_by(State, .model) %>%
    summarise(Turnover=sum(.mean)) %>%
    bind_rows(state_vl) %>%
    mutate(.model=ifelse(is.na(.model), ".response", .model))

# Plot forecasts against actuals for one state, on a log scale
fcasts_state_plot <- function(state)
{
    fcasts_state %>%
        filter(State == state) %>%
        ungroup() %>%
        update_tsibble(key=.model) %>%
        autoplot(Turnover) +
            theme(legend.position="bottom") +
            scale_y_log10() +
            annotation_logticks() +
            ggtitle(state)
}
fcasts_state_plot("New South Wales")
fcasts_state_plot("Victoria")
fcasts_state_plot("Queensland")
fcasts_state_plot("South Australia")
fcasts_state_plot("Western Australia")
fcasts_state_plot("Tasmania")
fcasts_state_plot("Northern Territory")
fcasts_state_plot("Australian Capital Territory")
This shows that all of the forecasts are systematically overestimating the trend. What’s causing this? The likely reason is how we split the data into training and validation periods. The training data ends at the end of 2008, in the middle of the global financial crisis, so the models have mostly learned the pre-crisis growth rate; the validation data, by contrast, starts at a point where the economy is at a low and only beginning to recover from the crisis.
Update models to 2013
To test this hypothesis, let’s refit the models, but this time with the training period extended to the end of 2013. The drift model is omitted as it is clearly inappropriate for the data.
aus_retail_2013_tr <- aus_retail %>%
    filter(Month <= yearmonth("2013 Dec"))
aus_retail_2013_vl <- aus_retail %>%
    filter(Month > yearmonth("2013 Dec"))

mods_2013 <- model(aus_retail_2013_tr,
    sdrift=SNAIVE(log(Turnover) ~ drift()),
    ar=ARIMA(log(Turnover)),
    ets_auto=ETS(log(Turnover)),
    ets_fixed=ETS(log(Turnover) ~ error("A") + trend("A") + season("A"))
)
Warning in sqrt(diag(best$var.coef)): NaNs produced
Warning in sqrt(diag(best$var.coef)): NaNs produced
Warning in sqrt(diag(best$var.coef)): NaNs produced
Warning in sqrt(diag(best$var.coef)): NaNs produced
Warning in sqrt(diag(best$var.coef)): NaNs produced
fcasts_2013 <- forecast(mods_2013, new_data=aus_retail_2013_vl)
fcasts_state_2013 <- fcasts_2013 %>%
    group_by(State, .model) %>%
    summarise(Turnover=sum(.mean)) %>%
    bind_rows(state_vl) %>%
    mutate(.model=ifelse(is.na(.model), ".response", .model))

fcasts_state_2013_plot <- function(state)
{
    fcasts_state_2013 %>%
        filter(State == state) %>%
        ungroup() %>%
        update_tsibble(key=.model) %>%
        autoplot(Turnover) +
            theme(legend.position="bottom") +
            scale_y_log10() +
            annotation_logticks() +
            ggtitle(state)
}
fcasts_state_2013_plot("New South Wales")
fcasts_state_2013_plot("Victoria")
fcasts_state_2013_plot("Queensland")
fcasts_state_2013_plot("South Australia")
fcasts_state_2013_plot("Western Australia")
fcasts_state_2013_plot("Tasmania")
fcasts_state_2013_plot("Northern Territory")
fcasts_state_2013_plot("Australian Capital Territory")
The plots show much better agreement between forecasts and actuals, especially for the larger states (NSW and Victoria). Nevertheless, there is still substantial error for the smaller states. This is probably because these states were hit harder by the global recession and took longer to recover.
Measuring accuracy
A variety of point estimate accuracy measures are provided in the fabletools package. In general, you should not put too much emphasis on such measures as they play down the uncertainty inherent in any statistical inference task, let alone forecasting; remember to look at the prediction intervals as well to guide you on whether a model is adequate. Also, it’s better to treat these as relative measures, to help us decide which of a number of competing models to use, rather than looking at the absolute accuracy.
Nevertheless, let’s examine some accuracy scores for the different model types. For this dataset, the MASE (mean absolute scaled error) and MAPE (mean absolute percentage error) measures are appropriate. MAPE is simple and easy to explain to a nontechnical audience, while MASE has better statistical properties.
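As a reference for what these two measures compute, here is a minimal base R sketch. The helper names `mape` and `mase` and the numbers are made up for illustration; the fabletools versions take residuals rather than forecasts as their first argument and have more options. MASE here is scaled by the in-sample error of a seasonal naive forecast, matching the seasonal scaling used later in this notebook.

```r
# Mean absolute percentage error: average absolute error relative to actuals
mape <- function(actual, forecast) {
  mean(abs((actual - forecast) / actual)) * 100
}

# Mean absolute scaled error: average absolute error divided by the average
# absolute error of a (seasonal) naive forecast on the training data
mase <- function(actual, forecast, train, period = 12) {
  scale <- mean(abs(diff(train, lag = period)))
  mean(abs(actual - forecast)) / scale
}

# Made-up example with a non-seasonal series (period = 1)
train  <- c(10, 12, 14, 16)
actual <- c(18, 20)
fc     <- c(17, 22)

mape(actual, fc)
mase(actual, fc, train, period = 1)
```

A MASE below 1 means the forecasts have a smaller average error than the naive benchmark had on the training data; here the benchmark error is 2 and the forecast error is 1.5, giving 0.75.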
Unaggregated accuracy
The accuracy scores by state, and overall, are given below. These are calculated by obtaining the individual accuracy scores for each time series, and then averaging them.
library(tidyr)
aus_retail_agg <- aggregate_key(aus_retail, State*Industry, Turnover=sum(Turnover))
acc <- accuracy(fcasts_2013, aus_retail_agg, measures=list(MASE=MASE, MAPE=MAPE))
acc %>%
    mutate(State=as.character(State)) %>%
    group_by(State, .model) %>%
    summarise(across(MASE:MAPE, mean)) %>%
    pivot_wider(id_cols=State, names_from=.model, values_from=MASE:MAPE)
`summarise()` regrouping output by 'State' (override with `.groups` argument)
acc %>%
    group_by(.model) %>%
    summarise(across(MASE:MAPE, mean)) %>%
    pivot_wider(names_from=.model, values_from=MASE:MAPE)
`summarise()` ungrouping output (override with `.groups` argument)
Aggregated accuracy
A possibly undesirable feature of the measures above is that they treat all combinations of state and industry equally. In some scenarios this is reasonable; here, we might suppose that smaller states/industries in terms of turnover should be given less weight than larger ones. This is the implicit assumption when analysing the data by aggregating it to the state or industry level, for example.
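The effect of weighting can be seen in a toy example. The numbers below are invented, and a turnover-weighted mean is only one simple way of down-weighting small series; it is not the calculation used in the rest of this section.

```r
# Hypothetical per-series MAPE scores and average turnover (made-up numbers)
scores <- data.frame(
  series   = c("large", "small"),
  mape     = c(2, 10),
  turnover = c(900, 100)
)

# Every series counts equally
unweighted <- mean(scores$mape)

# Series with more turnover contribute more to the overall score
weighted <- weighted.mean(scores$mape, w = scores$turnover)

c(unweighted = unweighted, weighted = weighted)
```

The unweighted score is 6, while the weighted score is 2.8, because the poorly forecast series accounts for only a tenth of total turnover.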
Here are the weighted/aggregated accuracy scores. The code is somewhat more involved, as for MASE we also need to obtain the suitable aggregated training series.
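# One row per series and month, with one column of point forecasts per model.
# .model is left out of id_cols because it is already used in names_from.
fcasts_2013_wide <- fcasts_2013 %>%
    as_tibble() %>%
    pivot_wider(id_cols=c(State, Industry, Month), names_from=.model, values_from=.mean) %>%
    inner_join(aus_retail_2013_vl, by=c("State", "Industry", "Month"))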
fcasts_2013_wide %>%
    group_by(State) %>%
    group_modify(function(.x, .y)
    {
        traindata <- aus_retail_2013_tr %>%
            filter(State == .y$State) %>%
            summarise(Turnover=sum(Turnover))
        summarise(.x, across(sdrift:ets_fixed, list(
            MASE=function(x) MASE(x - .x$Turnover, traindata$Turnover, .period=12, d=FALSE, D=TRUE),
            MAPE=function(x) MAPE(x - .x$Turnover, .x$Turnover)
        )))
    }) %>%
    select(State, contains("MASE"), contains("MAPE"))

fcasts_2013_wide %>%
    group_modify(function(.x, .y)
    {
        traindata <- summarise(aus_retail_2013_tr, Turnover=sum(Turnover))
        summarise(.x, across(sdrift:ets_fixed, list(
            MASE=function(x) MASE(x - .x$Turnover, traindata$Turnover, .period=12, d=FALSE, D=TRUE),
            MAPE=function(x) MAPE(x - .x$Turnover, .x$Turnover)
        )))
    }) %>%
    select(contains("MASE"), contains("MAPE"))
This broadly confirms the patterns seen in the plots above. The sdrift model performs worst, which is unsurprising given that it is simplistic by design. The ETS models perform best, probably because this particular dataset exhibits very clear trends and seasonal patterns. The forecast accuracy is best for the bigger states (NSW and Victoria) and worst for the Northern Territory and Western Australia.
Comments
Risks of forecasting
There is a particularly timely and important observation to make regarding this dataset. From above, we saw that updating the models to use the data up to 2013 gave better forecast accuracy, especially for NSW and Victoria. Assuming that we were only interested in these two states, what would happen if we were to use the models to obtain forecasts for 2020 and beyond? Despite the good results on past data, they would almost certainly be very wide of the mark. This is because even the best model could not possibly anticipate the massive global recession caused by the COVID-19 pandemic. (Of course, the same would apply for any subjective forecast based on expert knowledge, so this is not an endorsement of judgemental forecasting.)
These results demonstrate the risks inherent in forecasting, especially when there is a strong trend. Even a trend that appears to be stable over time can change for reasons not captured in the data, resulting in systematic forecast errors. Any assessment of model performance should be interpreted in context, with the possibility of external shifts taken into account. Prediction intervals are better than relying on point forecasts, but these still assume that the training data contains shocks similar to those that will occur in the future: that is, they can deal with “known unknowns”, but not “unknown unknowns”.
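The dependence of prediction intervals on past variability can be illustrated with a minimal base R sketch for a plain random walk forecast. The function name `naive_interval` and the data are made up, and the standard deviation of past changes is used as a rough stand-in for the residual standard deviation that a fitted model would supply.

```r
# Approximate prediction intervals for a naive (random walk) forecast:
# the width is driven entirely by the spread of past one-step changes
# and grows with the square root of the horizon.
naive_interval <- function(y, h, level = 0.95) {
  sigma <- sd(diff(y))                  # spread of historical changes
  z <- qnorm(1 - (1 - level) / 2)       # normal quantile for the level
  steps <- seq_len(h)
  last <- y[length(y)]
  data.frame(
    h     = steps,
    point = rep(last, h),
    lower = last - z * sigma * sqrt(steps),
    upper = last + z * sigma * sqrt(steps)
  )
}

# A calm made-up history gives narrow intervals
calm <- c(100, 101, 100, 101, 100, 101)
naive_interval(calm, h = 3)
```

If the next observations were to drop to, say, 80, they would fall far outside these intervals. Nothing in the history suggests a move of that size is possible, which is the "unknown unknowns" problem in miniature.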
Many models vs one model
In this case study, the aus_retail dataset contained 150 separate time series, one for every combination of state and industry. Each of our models, like ar and ets, is actually a family of models, one per time series. This so-called many-models approach to forecasting can be contrasted to the one model approach, where we fit a single model to the entire dataset, and use variables like state and industry as predictor variables.
In general, the many-models approach often produces better results than using one model. The reason for this can be seen in the time plots of the individual states and industries: the trends, and seasonal patterns, vary considerably from one time series to the next. It is often difficult to capture this systematic variation in a single model.
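The contrast can be sketched with a toy example in base R, using simple linear trends rather than ARIMA or ETS models. The data are simulated and all names are hypothetical. The single model is deliberately restricted to a shared slope to show what goes wrong when series-level differences are not captured.

```r
set.seed(42)

# Two simulated series with opposite trends
toy <- data.frame(
  series = rep(c("a", "b"), each = 24),
  t      = rep(1:24, times = 2)
)
toy$y <- ifelse(toy$series == "a", 1 + 0.5 * toy$t, 10 - 0.2 * toy$t) +
  rnorm(nrow(toy), sd = 0.1)

# Many-models: one regression per series, each with its own trend
many <- lapply(split(toy, toy$series), function(d) lm(y ~ t, data = d))
slopes <- sapply(many, function(m) coef(m)[["t"]])
slopes

# One-model: a single regression with series as a predictor (shared slope)
one <- lm(y ~ t + series, data = toy)
coef(one)[["t"]]

# Compare the residual sums of squares of the two approaches
many_rss <- sum(sapply(many, function(m) sum(resid(m)^2)))
one_rss  <- sum(resid(one)^2)
c(many = many_rss, one = one_rss)
```

The per-series fits recover one upward and one downward trend, while the shared-slope model averages them and fits both series poorly. A single model can in principle capture this with interaction terms, but doing so for 150 series with differing seasonality quickly becomes unwieldy.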
The tidyverts framework currently only supports the many-models approach, but work is in progress to support one-model as well.
Regression-based modelling
Something that has not been attempted here is a regression-based approach. In the discussion of the forecast performance by state, it was mentioned that some states had been hit harder than others by the global financial crisis, which affected the accuracy of the forecasts. If we had suitable data on economic indicators (household income, unemployment, inflation, etc.), we could use these as predictor variables in a regression model for retail turnover. This would allow us to capture the different trends by state, and thus generate more accurate forecasts.
The main drawback of regression-based forecasting is that values for the predictor variables themselves must be available in the periods for which a forecast is required. Obtaining these values is often a difficult forecasting problem in its own right, and forecasting economic indicators in particular is notoriously hard. Hence we will not have gained anything.
Nevertheless, in some circumstances a regression-based approach can be competitive with univariate forecasting. An example is retail demand at the individual-product level, where we have a large set of product features with which to model demand, and these features can reasonably be assumed to be known (or controllable) in the future. With enough data, machine learning algorithms such as gradient boosting or deep learning networks can be used to fit one large model to the entire dataset, and the predictions used for forecasting. Such models are beyond the scope of this case study.