The AI agent is a tool-calling orchestration layer that sits on top of the core finnts pipeline. It uses an LLM to choose Finn run inputs, execute forecasts, evaluate accuracy, and iterate toward your goal. You keep control through a few inputs (data, horizon, optional regressors, performance goal, iteration budget); the agent does the rest.
Core Agent Functions:

- iterate_forecast(): run the agent to iterate toward a best forecast run.
- update_forecast(): update forecasts with new data, using models trained in previous agent runs, optionally re-invoking the agent if accuracy degrades.
- ask_agent(): ask natural language questions about your forecast results and get data-driven answers.

Each run is versioned (agent_version and run_id) and writes log files. Use these helpers to retrieve outputs:

- get_best_agent_run(agent_info)
- get_agent_forecast(agent_info)
- ask_agent(agent_info, question)
Set up environment variables for Azure OpenAI (example):
# Sys.setenv(
# AZURE_OPENAI_ENDPOINT = "<your-endpoint>",
# AZURE_OPENAI_API_KEY = "<your-key>",
# AZURE_OPENAI_API_VERSION = "<api-version>"
# )
Below is a complete flow using the built-in M4 monthly sample.
library(finnts)
library(dplyr)
project <- set_project_info(
  project_name = "ai_agent_demo",
  path = tempdir(), # or a persistent folder
  combo_variables = c("id"),
  target_variable = "value",
  date_type = "month", # day|week|month|quarter|year
  fiscal_year_start = 1 # fiscal month (1 = Jan)
)
Tip: path controls where logs/forecasts/EDA artifacts are saved. It supports the local filesystem, Azure Blob (via AzureStor::blob_container), or Microsoft 365 drives (ms365r) via the storage_object argument.
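For example, here is a hedged sketch of pointing the project at Azure Blob storage instead of a local folder. The container URL and environment variable name are placeholders, and passing the container through storage_object in set_project_info() is an assumption based on the tip above.

# Hedged sketch: save artifacts to Azure Blob instead of a local folder.
# The container URL and AZURE_STORAGE_KEY variable are placeholders; the
# storage_object argument is assumed from the tip above.
cont <- AzureStor::blob_container(
  "https://<account>.blob.core.windows.net/<container>",
  key = Sys.getenv("AZURE_STORAGE_KEY")
)

project_blob <- set_project_info(
  project_name = "ai_agent_demo",
  path = "finnts_runs", # folder path inside the container
  storage_object = cont,
  combo_variables = c("id"),
  target_variable = "value",
  date_type = "month",
  fiscal_year_start = 1
)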
The historical input data needs a date column named Date (class Date) and the target_variable column, along with the combo variables declared above:
hist_data <- timetk::m4_monthly %>%
  dplyr::filter(date >= as.Date("2013-01-01")) %>%
  dplyr::rename(Date = date) %>%
  dplyr::mutate(id = as.character(id))
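A quick optional sanity check that the data matches what the project expects before handing it to the agent:

# Optional sanity check: the columns declared in set_project_info() should be
# present with the right types.
stopifnot(
  inherits(hist_data$Date, "Date"),
  is.character(hist_data$id),
  is.numeric(hist_data$value)
)
dplyr::glimpse(hist_data)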
Two LLM connections drive the agent:

- driver_llm: executes the workflow via tool calling.
- reason_llm (optional): does the chain-of-thought style reasoning / EDA digestion to choose Finn run inputs step by step.
driver_llm <- ellmer::chat_azure_openai(model = "gpt-4o-mini")
reason_llm <- ellmer::chat_azure_openai(model = "o4-mini") # can be the same or a more advanced reasoning model
agent <- set_agent_info(
  project_info = project,
  driver_llm = driver_llm,
  reason_llm = reason_llm, # optional but recommended
  input_data = hist_data,
  forecast_horizon = 6, # number of future periods
  external_regressors = NULL, # e.g., c("Price","Promo")
  allow_hierarchical_forecast = FALSE, # set TRUE to let agent use hierarchies
  overwrite = TRUE # start a fresh run_id if inputs changed
)
This writes the versioned inputs into path/input_data/ (hashed by combo/run) and logs the new agent_version/run_id.
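If you want to see what landed on disk at this point, you can list the folders directly (path = tempdir() in this demo; the folder layout follows the notes at the end of this vignette):

# Inspect the artifacts written so far (path = tempdir() as set above).
list.files(file.path(tempdir(), "input_data"), recursive = TRUE)
list.files(file.path(tempdir(), "logs"), recursive = TRUE)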
iterate_forecast(
  agent_info = agent,
  weighted_mape_goal = 0.05, # your accuracy target of 5%
  max_iter = 3 # stop after N iterations if not hitting goal
)
What happens under the hood: the agent digests the EDA, chooses Finn run inputs, runs the forecast, and scores the weighted MAPE against your goal, repeating until the goal is met or the iteration budget (max_iter) is exhausted.
best_runs <- get_best_agent_run(agent_info = agent, full_run_info = TRUE)
head(best_runs)
fcst <- get_agent_forecast(agent_info = agent)
head(fcst)
- best_runs summarizes, for each time series combo, the best run inputs used when calling the Finn forecast process.
- fcst returns the consolidated forecast table (if hierarchical reconciliation was used, this is the reconciled output); a quick summary sketch follows.
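As a quick way to eyeball the future periods per series, something like the snippet below can help. The column names used here (Type, Combo, Target) are assumptions about the shape of get_agent_forecast() output, so check names(fcst) before relying on them.

# Hedged sketch: total forecasted value per combo for the future periods.
# Column names ("Type", "Combo", "Target") are assumptions -- verify with names(fcst).
fcst %>%
  dplyr::filter(Type == "Future") %>%
  dplyr::group_by(Combo) %>%
  dplyr::summarise(
    periods = dplyr::n(),
    total_forecast = sum(Target, na.rm = TRUE),
    .groups = "drop"
  )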
After running iterate_forecast() or update_forecast(), you can use ask_agent() to ask natural language questions about your results. The agent analyzes your forecast data, model configurations, and EDA outputs to provide data-driven answers.
ask_agent() creates an LLM-driven workflow that:

1. Plans the analysis steps needed to answer your question.
2. Executes R code to analyze the relevant data.
3. Generates a natural language answer based on the results.
# Ask about forecast accuracy
answer <- ask_agent(
  agent_info = agent,
  question = "What is the average weighted MAPE across all time series?"
)

# Ask about models used
answer <- ask_agent(
  agent_info = agent,
  question = "Which models were selected as best for each time series?"
)

# Ask about data quality
answer <- ask_agent(
  agent_info = agent,
  question = "Were there any missing values or outliers in the data?"
)

# Ask about specific forecasts
answer <- ask_agent(
  agent_info = agent,
  question = "What are the forecasted values for M750 for the next 3 months?"
)

# Ask about time series characteristics
answer <- ask_agent(
  agent_info = agent,
  question = "Which time series show strong seasonality patterns?"
)

# Ask comparative questions
answer <- ask_agent(
  agent_info = agent,
  question = "Which time series have the highest forecast uncertainty?"
)
ask_agent() has access to three main data sources:

- Forecast data (via get_agent_forecast()): future predictions, back-test results, model selections, confidence intervals.
- Model configurations (via get_best_agent_run()): feature engineering settings, transformations applied, model hyperparameters.
- EDA outputs (via get_eda_data()): time series characteristics, seasonality, stationarity tests, data quality metrics.

The agent automatically determines which data sources to use based on your question.
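If you want to look at any of these sources yourself instead of going through ask_agent(), the helpers above can be called directly. The call below is a sketch only: the exact signature and return shape of get_eda_data() are assumptions here (taking agent_info like the other helpers).

# Hedged sketch: pull the EDA outputs directly. Assumes get_eda_data() accepts
# agent_info like the other helpers; check its documentation for the real signature.
eda <- get_eda_data(agent_info = agent)
str(eda, max.level = 1)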
When you have new input data, keep the same project and create a new agent run with updated input_data. Then call update_forecast():
# suppose you've appended more months to hist_data; here we simply reuse the
# sample data up to 2016-06 to stand in for the refreshed history:
hist_data2 <- hist_data %>% dplyr::filter(Date <= as.Date("2016-06-01"))
agent2 <- set_agent_info(
  project_info = project,
  driver_llm = driver_llm,
  reason_llm = reason_llm,
  input_data = hist_data2,
  forecast_horizon = 6,
  overwrite = TRUE # required to create a new agent version when running update_forecast()
)
update_forecast(
  agent_info = agent2,
  weighted_mape_goal = 0.05,
  allow_iterate_forecast = TRUE, # if degradation detected, allow the agent to re-iterate
  max_iter = 2 # cap re-iteration cost
)
updated_fcst <- get_agent_forecast(agent2)

# Ask questions about the updated forecast
answer <- ask_agent(
  agent_info = agent2,
  question = "Summarize the forecast accuracy."
)
What update_forecast() does: it updates the forecasts with the new data, reusing the models trained in previous agent runs, and checks whether accuracy has degraded. If it has and allow_iterate_forecast = TRUE, it will invoke the iterate loop (bounded by max_iter) to recover accuracy.
Set allow_hierarchical_forecast = TRUE in set_agent_info() to let the agent decide whether a hierarchical forecast approach should be used instead of the default bottoms_up approach. When the agent selects a hierarchy, it will:

- train at the selected aggregate(s),
- reconcile down to the bottom level,
- produce a reconciled get_agent_forecast() output.

For background and manual control, see the "Hierarchical Forecasting" vignette. A minimal sketch of opting in is shown below.
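The sketch reuses the objects created earlier and only flips the flag. Note that with a single combo variable ("id") there is no real hierarchy to exploit; in practice the hierarchy comes from having multiple combo variables (for example region and product).

# Hedged sketch: allow the agent to consider hierarchical approaches.
# With only one combo variable ("id") this is illustrative only.
agent_hier <- set_agent_info(
  project_info = project,
  driver_llm = driver_llm,
  reason_llm = reason_llm,
  input_data = hist_data,
  forecast_horizon = 6,
  allow_hierarchical_forecast = TRUE,
  overwrite = TRUE
)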
If you pass external_regressors = c("Price", "Promo", ...), the agent uses the values supplied in input_data for the selected columns; a hedged sketch follows.
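A minimal sketch, assuming your historical data carries the driver columns. hist_data_with_drivers, Price, and Promo are placeholder names and are not part of the M4 sample data.

# Hedged sketch: pass driver columns as external regressors.
# "hist_data_with_drivers", "Price", and "Promo" are placeholders.
agent_xreg <- set_agent_info(
  project_info = project,
  driver_llm = driver_llm,
  reason_llm = reason_llm,
  input_data = hist_data_with_drivers, # assumed to contain Price and Promo columns
  forecast_horizon = 6,
  external_regressors = c("Price", "Promo"),
  overwrite = TRUE
)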
Parallel processing options:

- parallel_processing = "local_machine" runs each time series in parallel across local cores.
- parallel_processing = "spark" executes combos on an Azure Databricks/Synapse Spark cluster (see the "Parallel Processing" vignette).
- inner_parallel = TRUE parallelizes work inside a combo (useful when outer parallelism is NULL or "spark").
- num_cores = NULL defaults to all cores minus one.

A hedged sketch of turning on local parallelism is shown below.
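The snippet assumes these arguments are accepted by iterate_forecast() itself; that placement is an assumption, so check the function documentation (they may instead belong in set_agent_info() or elsewhere).

# Hedged sketch: run combos in parallel on the local machine. Whether
# iterate_forecast() accepts these arguments directly is an assumption.
iterate_forecast(
  agent_info = agent,
  weighted_mape_goal = 0.05,
  max_iter = 3,
  parallel_processing = "local_machine",
  inner_parallel = FALSE,
  num_cores = NULL # defaults to all cores minus one
)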
You normally won't need this, but for audits these artifacts are written under path:

- path/input_data/…
- path/eda/…
- path/logs/… (includes the hashed *-agent_run.csv and *-agent_best_run.* for each version)
- path/forecasts/… (condensed, reconciled or per-model outputs)

Use the helpers first; dig into files only if you must.