automl.automl

size

def size(learner_classes: dict, config: dict) -> float

Size function.

Returns:

The mem size in bytes for a config.

AutoML Objects

class AutoML(BaseEstimator)

The AutoML class.

Example:

automl = AutoML()
automl_settings = {
    "time_budget": 60,
    "metric": 'accuracy',
    "task": 'classification',
    "log_file_name": 'mylog.log',
}
automl.fit(X_train = X_train, y_train = y_train, **automl_settings)

init

def __init__(**settings)

Constructor.

Many settings in fit() can be passed to the constructor too. If an argument in fit() is provided, it will override the setting passed to the constructor. If an argument in fit() is not provided but provided in the constructor, the value passed to the constructor will be used.

Arguments:

metric - A string of the metric name or a function, e.g., 'accuracy', 'roc_auc', 'roc_auc_ovr', 'roc_auc_ovo', 'roc_auc_weighted', 'roc_auc_ovo_weighted', 'roc_auc_ovr_weighted', 'f1', 'micro_f1', 'macro_f1', 'log_loss', 'mae', 'mse', 'r2', 'mape'. Default is 'auto'. For a full list of supported built-in metrics, please refer to https://microsoft.github.io/FLAML/docs/Use-Cases/Task-Oriented-AutoML#optimization-metric If passing a customized metric function, the function needs to have the following input arguments:

def custom_metric(
    X_test, y_test, estimator, labels,
    X_train, y_train, weight_test=None, weight_train=None,
    config=None, groups_test=None, groups_train=None,
):
    return metric_to_minimize, metrics_to_log

which returns a float number as the minimization objective, and a dictionary as the metrics to log. E.g.,

def custom_metric(
    X_val, y_val, estimator, labels,
    X_train, y_train, weight_val=None, weight_train=None,
    *args,
):
    from sklearn.metrics import log_loss
    import time

    start = time.time()
    y_pred = estimator.predict_proba(X_val)
    pred_time = (time.time() - start) / len(X_val)
    val_loss = log_loss(y_val, y_pred, labels=labels, sample_weight=weight_val)
    y_pred = estimator.predict_proba(X_train)
    train_loss = log_loss(y_train, y_pred, labels=labels, sample_weight=weight_train)
    alpha = 0.5
    return val_loss * (1 + alpha) - alpha * train_loss, {
        "val_loss": val_loss,
        "train_loss": train_loss,
        "pred_time": pred_time,
    }

Note: When passing a custom metric function, pass the function itself (e.g., metric=custom_metric), not the result of calling it (e.g., metric=custom_metric(...)). FLAML will call your function internally during the training process.

task - A string of the task type, e.g., 'classification', 'regression', 'ts_forecast', 'rank', 'seq-classification', 'seq-regression', 'summarization', or an instance of the Task class.
n_jobs - An integer of the number of threads for training | default=-1. Use all available resources when n_jobs == -1.
log_file_name - A string of the log file name | default="". To disable logging, set it to be an empty string "".
estimator_list - A list of strings for estimator names, or 'auto'. e.g., ['lgbm', 'xgboost', 'xgb_limitdepth', 'catboost', 'rf', 'extra_tree'].
time_budget - A float number of the time budget in seconds. Use -1 if no time limit.
max_iter - An integer of the maximal number of iterations.
sample - A boolean of whether to sample the training data during search.
ensemble - boolean or dict | default=False. Whether to perform ensemble after search. Can be a dict with keys 'passthrough' and 'final_estimator' to specify the passthrough and final_estimator in the stacker. The dict can also contain 'n_jobs' as the key to specify the number of jobs for the stacker.
Note - The hyperparameters of a custom 'final_estimator' are NOT automatically tuned. If you provide an estimator instance (e.g., CatBoostClassifier()), it will use the parameters you specified or their defaults. If 'final_estimator' is not provided, the best model found during the search will be used as the final estimator.
eval_method - A string of resampling strategy, one of ['auto', 'cv', 'holdout'].
split_ratio - A float of the valiation data percentage for holdout.
n_splits - An integer of the number of folds for cross - validation.
log_type - Specifies which logs to save. One of ['better', 'all']. Default is 'better'.
- 'better': Logs configs and models (if model_history is True) only when the loss improves, to log_file_name and MLflow, respectively.
- 'all': Logs all configs and models (if model_history is True), regardless of performance.
Note - Configs are always logged to MLflow if MLflow logging is enabled.
model_history - A boolean of whether to keep the best model per estimator. Make sure memory is large enough if setting to True. Default False.
log_training_metric - A boolean of whether to log the training metric for each model.
mem_thres - A float of the memory size constraint in bytes.
pred_time_limit - A float of the prediction latency constraint in seconds. It refers to the average prediction time per row in validation data.
train_time_limit - None or a float of the training time constraint in seconds for each trial. Only valid for sequential search.
verbose - int, default=3 | Controls the verbosity, higher means more messages.
verbose=0 - logger level = CRITICAL
verbose=1 - logger level = ERROR
verbose=2 - logger level = WARNING
verbose=3 - logger level = INFO
verbose=4 - logger level = DEBUG
verbose>5 - logger level = NOTSET
retrain_full - bool or str, default=True | whether to retrain the selected model on the full training data when using holdout. True - retrain only after search finishes; False - no retraining; 'budget' - do best effort to retrain without violating the time budget.
split_type - str or splitter object, default="auto" | the data split type.
- A valid splitter object is an instance of a derived class of scikit-learn KFold and have split and get_n_splits methods with the same signatures. Set eval_method to "cv" to use the splitter object.
- Valid str options depend on different tasks. For classification tasks, valid choices are ["auto", 'stratified', 'uniform', 'time', 'group']. "auto" -> stratified. For regression tasks, valid choices are ["auto", 'uniform', 'time', 'group']. "auto" -> uniform. For time series forecast tasks, must be "auto" or 'time'. For ranking task, must be "auto" or 'group'.
hpo_method - str, default="auto" | The hyperparameter optimization method. By default, CFO is used for sequential search and BlendSearch is used for parallel search. No need to set when using flaml's default search space or using a simple customized search space. When set to 'bs', BlendSearch is used. BlendSearch can be tried when the search space is complex, for example, containing multiple disjoint, discontinuous subspaces. When set to 'random', random search is used.
starting_points - A dictionary or a str to specify the starting hyperparameter config for the estimators | default="static". If str:
- if "data", use data-dependent defaults;
- if "data:path" use data-dependent defaults which are stored at path;
- if "static", use data-independent defaults. If dict, keys are the name of the estimators, and values are the starting hyperparameter configurations for the corresponding estimators. The value can be a single hyperparameter configuration dict or a list of hyperparameter configuration dicts. In the following code example, we get starting_points from the automl object and use them in the new_automl object. e.g.,

from flaml import AutoML
automl = AutoML()
X_train, y_train = load_iris(return_X_y=True)
automl.fit(X_train, y_train)
starting_points = automl.best_config_per_estimator

new_automl = AutoML()
new_automl.fit(X_train, y_train, starting_points=starting_points)

seed - int or None, default=None | The random seed for hpo.
n_concurrent_trials - [In preview] int, default=1 | The number of concurrent trials. When n_concurrent_trials > 1, flaml performes parallel tuning and installation of ray or spark is required: pip install flaml[ray] or pip install flaml[spark]. Please check here for more details about installing Spark.
keep_search_state - boolean, default=False | Whether to keep data needed for model search after fit(). By default the state is deleted for space saving.
preserve_checkpoint - boolean, default=True | Whether to preserve the saved checkpoint on disk when deleting automl. By default the checkpoint is preserved.
early_stop - boolean, default=False | Whether to stop early if the search is considered to converge.
force_cancel - boolean, default=False | Whether to forcely cancel Spark jobs if the search time exceeded the time budget.
mlflow_exp_name - str, default=None | The name of the mlflow experiment. This should be specified if enable mlflow autologging on Spark. Otherwise it will log all the results into the experiment of the same name as the basename of main entry file.
append_log - boolean, default=False | Whetehr to directly append the log records to the input log file if it exists.
auto_augment - boolean, default=True | Whether to automatically augment rare classes.
min_sample_size - int, default=MIN_SAMPLE_TRAIN | the minimal sample size when sample=True.
use_ray - boolean or dict. If boolean: default=False | Whether to use ray to run the training in separate processes. This can be used to prevent OOM for large datasets, but will incur more overhead in time. If dict: the dict contains the keywords arguments to be passed to ray.tune.run.
use_spark - boolean, default=False | Whether to use spark to run the training in parallel spark jobs. This can be used to accelerate training on large models and large datasets, but will incur more overhead in time and thus slow down training in some cases. GPU training is not supported yet when use_spark is True. For Spark clusters, by default, we will launch one trial per executor. However, sometimes we want to launch more trials than the number of executors (e.g., local mode). In this case, we can set the environment variable FLAML_MAX_CONCURRENT to override the detected num_executors. The final number of concurrent trials will be the minimum of n_concurrent_trials and num_executors.
free_mem_ratio - float between 0 and 1, default=0. The free memory ratio to keep during training.
metric_constraints - list, default=[] | The list of metric constraints. Each element in this list is a 3-tuple, which shall be expressed in the following format: the first element of the 3-tuple is the name of the metric, the second element is the inequality sign chosen from ">=" and "<=", and the third element is the constraint value. E.g., ('val_loss', '<=', 0.1). Note that all the metric names in metric_constraints need to be reported via the metrics_to_log dictionary returned by a customized metric function. The customized metric function shall be provided via the metric key word argument of the fit() function or the automl constructor. Find an example in the 4th constraint type in this doc. If pred_time_limit is provided as one of keyword arguments to fit() function or the automl constructor, flaml will automatically (and under the hood) add it as an additional element in the metric_constraints. Essentially 'pred_time_limit' specifies a constraint about the prediction latency constraint in seconds.
custom_hp - dict, default=None | The custom search space specified by user. It is a nested dict with keys being the estimator names, and values being dicts per estimator search space. In the per estimator search space dict, the keys are the hyperparameter names, and values are dicts of info ("domain", "init_value", and "low_cost_init_value") about the search space associated with the hyperparameter (i.e., per hyperparameter search space dict). When custom_hp is provided, the built-in search space which is also a nested dict of per estimator search space dict, will be updated with custom_hp. Note that during this nested dict update, the per hyperparameter search space dicts will be replaced (instead of updated) by the ones provided in custom_hp. Note that the value for "domain" can either be a constant or a sample.Domain object. e.g.,

custom_hp = {
     "transformer_ms": {
         "model_path": {
             "domain": "albert-base-v2",
         },
         "learning_rate": {
             "domain": tune.choice([1e-4, 1e-5]),
         }
     }
 }

skip_transform - boolean, default=False | Whether to pre-process data prior to modeling.
allow_label_overlap - boolean, default=True | For classification tasks with holdout evaluation, whether to allow label overlap between train and validation sets. When True (default), uses a fast strategy that adds the first instance of missing labels to the set that is missing them, which may create some overlap. When False, uses a precise but slower strategy that intelligently re-splits instances to avoid overlap when possible. Only affects classification tasks with holdout evaluation method.
fit_kwargs_by_estimator - dict, default=None | The user specified keywords arguments, grouped by estimator name. e.g.,

fit_kwargs_by_estimator = {
    "transformer": {
        "output_dir": "test/data/output/",
        "fp16": False,
    }
}

mlflow_logging - boolean, default=True | Whether to log the training results to mlflow. Not valid if mlflow is not installed.

getstate

def __getstate__()

Customize pickling to avoid serializing runtime-only objects.

MLflow's sklearn flavor serializes estimators via (cloud)pickle. During AutoML fitting we may attach an internal mlflow integration instance which holds concurrent.futures.Future objects and executors containing thread locks, which are not picklable.

config_history

@property
def config_history() -> dict

A dictionary of iter->(estimator, config, time), storing the best estimator, config, and the time when the best model is updated each time.

model

@property
def model()

An object with predict() and predict_proba() method (for classification), storing the best trained model.

best_model_for_estimator

def best_model_for_estimator(estimator_name: str)

Return the best model found for a particular estimator.

Arguments:

estimator_name - a str of the estimator's name.

Returns:

An object storing the best model for estimator_name. If model_history was set to False during fit(), then the returned model is untrained unless estimator_name is the best estimator. If model_history was set to True, then the returned model is trained.

best_estimator

@property
def best_estimator()

A string indicating the best estimator found.

best_iteration

@property
def best_iteration()

An integer of the iteration number where the best config is found.

best_config

@property
def best_config()

A dictionary of the best configuration.

The returned config dictionary can be used to:

Pass as starting_points to a new AutoML run.
Initialize the corresponding FLAML estimator directly.
Initialize the original model (e.g., LightGBM, XGBoost) after converting FLAML-specific parameters.

Notes:

The config contains FLAML's search space parameters, which may differ from the original model's parameters. For example, FLAML uses log_max_bin for LightGBM instead of max_bin. Use the FLAML estimator's config2params() method to convert to the original model's parameters.

Example:

from flaml import AutoML
from flaml.automl.model import LGBMEstimator
from lightgbm import LGBMClassifier
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

# Train with AutoML
automl = AutoML()
automl.fit(X, y, task="classification", time_budget=10)

# Get the best config
best_config = automl.best_config
print("Best config:", best_config)
# Example output: {'n_estimators': 4, 'num_leaves': 4, 'min_child_samples': 20,
#                  'learning_rate': 0.1, 'log_max_bin': 8, ...}

# Option 1: Use FLAML estimator directly (handles parameter conversion internally)
flaml_estimator = LGBMEstimator(task="classification", **best_config)
flaml_estimator.fit(X, y)

# Option 2: Convert to original model parameters using config2params()
# This converts FLAML-specific params (e.g., log_max_bin -> max_bin)
original_params = flaml_estimator.params  # or use flaml_estimator.config2params(best_config)
print("Original model params:", original_params)
# Example output: {'n_estimators': 4, 'num_leaves': 4, 'min_child_samples': 20,
#                  'learning_rate': 0.1, 'max_bin': 255, ...}  # log_max_bin converted to max_bin

# Now use with original LightGBM
lgbm_model = LGBMClassifier(**original_params)
lgbm_model.fit(X, y)

best_config_per_estimator

@property
def best_config_per_estimator()

A dictionary of all estimators' best configuration.

Returns a dictionary where keys are estimator names (e.g., 'lgbm', 'xgboost') and values are the best hyperparameter configurations found for each estimator. The config may include FLAML_sample_size which indicates the sample size used during training.

This is useful for:

Passing as starting_points to a new AutoML run for warm-starting.
Comparing the best configurations across different estimators.
Initializing the original models after converting FLAML-specific parameters.

Notes:

The configs contain FLAML's search space parameters, which may differ from the original models' parameters. Use each estimator's config2params() method to convert to the original model's parameters.

Example:

from flaml import AutoML
from flaml.automl.model import LGBMEstimator, XGBoostEstimator
from lightgbm import LGBMClassifier
from xgboost import XGBClassifier
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

# Train with AutoML
automl = AutoML()
automl.fit(X, y, task="classification", time_budget=30,
           estimator_list=['lgbm', 'xgboost'])

# Get best configs for all estimators
configs = automl.best_config_per_estimator
print(configs)
# Example output: {'lgbm': {'n_estimators': 4, 'num_leaves': 4, 'log_max_bin': 8, ...},
#                  'xgboost': {'n_estimators': 4, 'max_leaves': 4, ...}}

# Use as starting points for a new AutoML run (warm start)
new_automl = AutoML()
new_automl.fit(X, y, task="classification", time_budget=30,
               starting_points=configs)

# Or convert to original model parameters for direct use
if configs.get('lgbm'):
    lgbm_config = configs['lgbm'].copy()
    lgbm_config.pop('FLAML_sample_size', None)  # Remove FLAML internal param
    flaml_lgbm = LGBMEstimator(task="classification", **lgbm_config)
    original_lgbm_params = flaml_lgbm.params  # Converted params (log_max_bin -> max_bin), or use flaml_lgbm.config2params(lgbm_config)
    lgbm_model = LGBMClassifier(**original_lgbm_params)
    lgbm_model.fit(X, y)

if configs.get('xgboost'):
    xgb_config = configs['xgboost'].copy()
    xgb_config.pop('FLAML_sample_size', None)  # Remove FLAML internal param
    flaml_xgb = XGBoostEstimator(task="classification", **xgb_config)
    original_xgb_params = flaml_xgb.params  # Converted params
    xgb_model = XGBClassifier(**original_xgb_params)
    xgb_model.fit(X, y)

best_loss_per_estimator

@property
def best_loss_per_estimator()

A dictionary of all estimators' best loss.

best_loss

@property
def best_loss()

A float of the best loss found.

best_result

@property
def best_result()

Result dictionary for model trained with the best config.

metrics_for_best_config

@property
def metrics_for_best_config()

Returns a float of the best loss, and a dictionary of the auxiliary metrics to log associated with the best config. These two objects correspond to the returned objects by the customized metric function for the config with the best loss.

best_config_train_time

@property
def best_config_train_time()

A float of the seconds taken by training the best config.

supported_metrics

@property
def supported_metrics()

Returns a tuple of supported metrics for the task.

Returns:

metrics Tuple - sklearn metrics from sklearn package; huggingface metrics from datasets package; spark metrics from pyspark package

feature_transformer

@property
def feature_transformer()

Returns AutoML Transformer

label_transformer

@property
def label_transformer()

Returns AutoML label transformer

classes_

@property
def classes_()

A numpy array of shape (n_classes,) for class labels.

time_to_find_best_model

@property
def time_to_find_best_model() -> float

Time taken to find best model in seconds.

predict

def predict(X: np.ndarray | DataFrame | list[str] | list[list[str]]
            | psDataFrame, **pred_kwargs)

Predict label from features.

Arguments:

X - A numpy array or pandas dataframe or pyspark.pandas dataframe of featurized instances, shape n * m, or for time series forcast tasks: a pandas dataframe with the first column containing timestamp values (datetime type) or an integer n for the predict steps (only valid when the estimator is arima or sarimax). Other columns in the dataframe are assumed to be exogenous variables (categorical or numeric).
**pred_kwargs - Other key word arguments to pass to predict() function of the searched learners, such as per_device_eval_batch_size.

multivariate_X_test = DataFrame({
    'timeStamp': pd.date_range(start='1/1/2022', end='1/07/2022'),
    'categorical_col': ['yes', 'yes', 'no', 'no', 'yes', 'no', 'yes'],
    'continuous_col': [105, 107, 120, 118, 110, 112, 115]
})
model.predict(multivariate_X_test)

Returns:

A array-like of shape n * 1: each element is a predicted label for an instance.

predict_proba

def predict_proba(X, **pred_kwargs)

Predict the probability of each class from features, only works for classification problems.

Arguments:

X - A numpy array of featurized instances, shape n * m.
**pred_kwargs - Other key word arguments to pass to predict_proba() function of the searched learners, such as per_device_eval_batch_size.

Returns:

A numpy array of shape n * c. c is the # classes. Each element at (i, j) is the probability for instance i to be in class j.

preprocess

def preprocess(X: np.ndarray | DataFrame | list[str] | list[list[str]]
               | psDataFrame)

Preprocess data using task-level preprocessing.

This method applies task-level preprocessing transformations to the input data, including handling of data types, sparse matrices, and feature transformations that were learned during the fit phase. This should be called before any estimator-level preprocessing.

Arguments:

X - A numpy array or pandas dataframe or pyspark.pandas dataframe of featurized instances, shape n * m, or for time series forecast tasks: a pandas dataframe with the first column containing timestamp values (datetime type) or an integer n for the predict steps (only valid when the estimator is arima or sarimax). Other columns in the dataframe are assumed to be exogenous variables (categorical or numeric).

Returns:

Preprocessed data in the same format as input (numpy array, DataFrame, etc.).

Raises:

AttributeError - If the model has not been fitted yet.

Example:

```python
automl = AutoML()
automl.fit(X_train, y_train, task="classification")

# Apply task-level preprocessing to new data
X_test_preprocessed = automl.preprocess(X_test)

# Required when calling a single ensemble component directly, since
# `automl.model.estimators_[i]` was fitted on already-preprocessed
# data and cannot consume raw input (see issue `1136`):
automl_ensemble = AutoML()
automl_ensemble.fit(X_train, y_train, task="classification", ensemble=True)
X_test_preprocessed = automl_ensemble.preprocess(X_test)
component_pred = automl_ensemble.model.estimators_[0].predict(X_test_preprocessed)
```

add_learner

def add_learner(learner_name, learner_class)

Add a customized learner.

Arguments:

learner_name - A string of the learner's name.
learner_class - A subclass of flaml.automl.model.BaseEstimator.

get_estimator_from_log

def get_estimator_from_log(log_file_name: str, record_id: int,
                           task: str | Task)

Get the estimator from log file.

Arguments:

log_file_name - A string of the log file name.
record_id - An integer of the record ID in the file, 0 corresponds to the first trial.
task - A string of the task type, 'binary', 'multiclass', 'regression', 'ts_forecast', 'rank', or an instance of the Task class.

Returns:

An estimator object for the given configuration.

retrain_from_log

def retrain_from_log(log_file_name,
                     X_train=None,
                     y_train=None,
                     dataframe=None,
                     label=None,
                     time_budget=np.inf,
                     task: str | Task | None = None,
                     eval_method=None,
                     split_ratio=None,
                     n_splits=None,
                     split_type=None,
                     groups=None,
                     n_jobs=-1,
                     train_best=True,
                     train_full=False,
                     record_id=-1,
                     auto_augment=None,
                     custom_hp=None,
                     skip_transform=None,
                     preserve_checkpoint=True,
                     fit_kwargs_by_estimator=None,
                     **fit_kwargs)

Retrain from log file.

This function is intended to retrain the logged configurations. NOTE: In some rare case, the last config is early stopped to meet time_budget and it's the best config. But the logged config's ITER_HP (e.g., n_estimators) is not reduced.

Arguments:

log_file_name - A string of the log file name.
X_train - A numpy array or dataframe of training data in shape n*m. For time series forecast tasks, the first column of X_train must be the timestamp column (datetime type). Other columns in the dataframe are assumed to be exogenous variables (categorical or numeric).
y_train - A numpy array or series of labels in shape n*1.
dataframe - A dataframe of training data including label column. For time series forecast tasks, dataframe must be specified and should have at least two columns: timestamp and label, where the first column is the timestamp column (datetime type). Other columns in the dataframe are assumed to be exogenous variables (categorical or numeric).
label - A str of the label column name, e.g., 'label';
Note - If X_train and y_train are provided, dataframe and label are ignored; If not, dataframe and label must be provided.
time_budget - A float number of the time budget in seconds.
task - A string of the task type, e.g., 'classification', 'regression', 'ts_forecast', 'rank', 'seq-classification', 'seq-regression', 'summarization', or an instance of Task class.
eval_method - A string of resampling strategy, one of ['auto', 'cv', 'holdout'].
split_ratio - A float of the validation data percentage for holdout.
n_splits - An integer of the number of folds for cross-validation.
split_type - str or splitter object, default="auto" | the data split type.
- A valid splitter object is an instance of a derived class of scikit-learn KFold and have split and get_n_splits methods with the same signatures. Set eval_method to "cv" to use the splitter object.
- Valid str options depend on different tasks. For classification tasks, valid choices are ["auto", 'stratified', 'uniform', 'time', 'group']. "auto" -> stratified. For regression tasks, valid choices are ["auto", 'uniform', 'time', 'group']. "auto" -> uniform. For time series forecast tasks, must be "auto" or 'time'. For ranking task, must be "auto" or 'group'.
groups - None or array-like | Group labels (with matching length to y_train) or groups counts (with sum equal to length of y_train) for training data.
n_jobs - An integer of the number of threads for training | default=-1. Use all available resources when n_jobs == -1.
train_best - A boolean of whether to train the best config in the time budget; if false, train the last config in the budget.
train_full - A boolean of whether to train on the full data. If true, eval_method and sample_size in the log file will be ignored.
record_id - the ID of the training log record from which the model will be retrained. By default record_id = -1 which means this will be ignored. record_id = 0 corresponds to the first trial, and when record_id >= 0, time_budget will be ignored.
auto_augment - boolean, default=True | Whether to automatically augment rare classes.
custom_hp - dict, default=None | The custom search space specified by user Each key is the estimator name, each value is a dict of the custom search space for that estimator. Notice the domain of the custom search space can either be a value or a sample.Domain object.

custom_hp = {
    "transformer_ms": {
        "model_path": {
            "domain": "albert-base-v2",
        },
        "learning_rate": {
            "domain": tune.choice([1e-4, 1e-5]),
        }
    }
}

fit_kwargs_by_estimator - dict, default=None | The user specified keywords arguments, grouped by estimator name. e.g.,

fit_kwargs_by_estimator = {
    "transformer": {
        "output_dir": "test/data/output/",
        "fp16": False,
    }
}

**fit_kwargs - Other key word arguments to pass to fit() function of the searched learners, such as sample_weight. Below are a few examples of estimator-specific parameters:
period - int | forecast horizon for all time series forecast tasks. This is the number of time steps ahead to forecast (e.g., period=12 means forecasting 12 steps into the future). This represents the forecast horizon used during model training. Note: during prediction, the output length equals the length of X_test. FLAML automatically handles feature engineering for you - sklearn-based models (lgbm, rf, xgboost, etc.) will have lagged features created automatically, while time series native models (prophet, arima, sarimax) use their built-in forecasting capabilities. You do NOT need to manually create lagged features of the target variable.
gpu_per_trial - float, default = 0 | A float of the number of gpus per trial, only used by TransformersEstimator, XGBoostSklearnEstimator, and TemporalFusionTransformerEstimator.
group_ids - list of strings of column names identifying a time series, only used by TemporalFusionTransformerEstimator, required for 'ts_forecast_panel' task. group_ids is a parameter for TimeSeriesDataSet object from PyTorchForecasting. For other parameters to describe your dataset, refer to TimeSeriesDataSet PyTorchForecasting. To specify your variables, use static_categoricals, static_reals, time_varying_known_categoricals, time_varying_known_reals, time_varying_unknown_categoricals, time_varying_unknown_reals, variable_groups. To provide more information on your data, use max_encoder_length, min_encoder_length, lags.
log_dir - str, default = "lightning_logs" | Folder into which to log results for tensorboard, only used by TemporalFusionTransformerEstimator.
max_epochs - int, default = 20 | Maximum number of epochs to run training, only used by TemporalFusionTransformerEstimator.
batch_size - int, default = 64 | Batch size for training model, only used by TemporalFusionTransformerEstimator and TCNEstimator.

search_space

@property
def search_space() -> dict

Search space.

Must be called after fit(...) (use max_iter=0 and retrain_final=False to prevent actual fitting).

Returns:

A dict of the search space.

low_cost_partial_config

@property
def low_cost_partial_config() -> dict

Low cost partial config.

Returns:

A dict. (a) if there is only one estimator in estimator_list, each key is a hyperparameter name. (b) otherwise, it is a nested dict with 'ml' as the key, and a list of the low_cost_partial_configs as the value, corresponding to each learner's low_cost_partial_config; the estimator index as an integer corresponding to the cheapest learner is appended to the list at the end.

cat_hp_cost

@property
def cat_hp_cost() -> dict

Categorical hyperparameter cost

Returns:

A dict. (a) if there is only one estimator in estimator_list, each key is a hyperparameter name. (b) otherwise, it is a nested dict with 'ml' as the key, and a list of the cat_hp_cost's as the value, corresponding to each learner's cat_hp_cost; the cost relative to lgbm for each learner (as a list itself) is appended to the list at the end.

points_to_evaluate

@property
def points_to_evaluate() -> dict

Initial points to evaluate.

Returns:

A list of dicts. Each dict is the initial point for each learner.

resource_attr

@property
def resource_attr() -> str | None

Attribute of the resource dimension.

Returns:

A string for the sample size attribute (the resource attribute in AutoML) or None.

min_resource

@property
def min_resource() -> float | None

Attribute for pruning.

Returns:

A float for the minimal sample size or None.

max_resource

@property
def max_resource() -> float | None

Attribute for pruning.

Returns:

A float for the maximal sample size or None.

pickle

def pickle(output_file_name)

Serialize the AutoML instance to a pickle file.

Notes:

When the trained estimator(s) are Spark-based, they may hold references to SparkContext/SparkSession via Spark ML objects. Such objects are not safely picklable and can cause pickling/broadcast errors.

This method externalizes Spark ML models into an adjacent artifact directory and stores only lightweight metadata in the pickle.

load_pickle

@classmethod
def load_pickle(cls, input_file_name: str, load_spark_models: bool = True)

Load an AutoML instance saved by :meth:pickle.

Arguments:

input_file_name - Path to the pickle file created by :meth:pickle.
load_spark_models - Whether to load externalized Spark ML models back into the estimator objects. If False, Spark estimators will remain without their underlying Spark model and cannot be used for predict.

Returns:

The deserialized AutoML instance.

trainable

@property
def trainable() -> Callable[[dict], float | None]

Training function.

Returns:

A function that evaluates each config and returns the loss.

metric_constraints

@property
def metric_constraints() -> list

Metric constraints.

Returns:

A list of the metric constraints.

fit

def fit(X_train=None,
        y_train=None,
        dataframe=None,
        label=None,
        metric=None,
        task: str | Task | None = None,
        n_jobs=None,
        log_file_name=None,
        estimator_list=None,
        time_budget=None,
        max_iter=None,
        sample=None,
        ensemble=None,
        eval_method=None,
        log_type=None,
        model_history=None,
        split_ratio=None,
        n_splits=None,
        log_training_metric=None,
        mem_thres=None,
        pred_time_limit=None,
        train_time_limit=None,
        X_val=None,
        y_val=None,
        sample_weight_val=None,
        groups_val=None,
        groups=None,
        verbose=None,
        retrain_full=None,
        split_type=None,
        learner_selector=None,
        hpo_method=None,
        starting_points=None,
        seed=None,
        n_concurrent_trials=None,
        keep_search_state=None,
        preserve_checkpoint=True,
        early_stop=None,
        force_cancel=None,
        append_log=None,
        auto_augment=None,
        min_sample_size=None,
        use_ray=None,
        use_spark=None,
        free_mem_ratio=0,
        metric_constraints=None,
        custom_hp=None,
        time_col=None,
        cv_score_agg_func=None,
        skip_transform=None,
        allow_label_overlap=True,
        mlflow_logging=None,
        fit_kwargs_by_estimator=None,
        mlflow_exp_name=None,
        **fit_kwargs)

Find a model for a given task.

Arguments:

X_train - A numpy array or a pandas dataframe of training data in shape (n, m). For time series forecsat tasks, the first column of X_train must be the timestamp column (datetime type). Other columns in the dataframe are assumed to be exogenous variables (categorical or numeric). When using ray, X_train can be a ray.ObjectRef.
y_train - A numpy array or a pandas series of labels in shape (n, ).
dataframe - A dataframe of training data including label column. For time series forecast tasks, dataframe must be specified and must have at least two columns, timestamp and label, where the first column is the timestamp column (datetime type). Other columns in the dataframe are assumed to be exogenous variables (categorical or numeric). When using ray, dataframe can be a ray.ObjectRef.
label - A str of the label column name for, e.g., 'label';
Note - If X_train and y_train are provided, dataframe and label are ignored; If not, dataframe and label must be provided.
metric - A string of the metric name or a function, e.g., 'accuracy', 'roc_auc', 'roc_auc_ovr', 'roc_auc_ovo', 'roc_auc_weighted', 'roc_auc_ovo_weighted', 'roc_auc_ovr_weighted', 'f1', 'micro_f1', 'macro_f1', 'log_loss', 'mae', 'mse', 'r2', 'mape'. Default is 'auto'. For a full list of supported built-in metrics, please refer to https://microsoft.github.io/FLAML/docs/Use-Cases/Task-Oriented-AutoML#optimization-metric If passing a customized metric function, the function needs to have the following input arguments:

def custom_metric(
    X_test, y_test, estimator, labels,
    X_train, y_train, weight_test=None, weight_train=None,
    config=None, groups_test=None, groups_train=None,
):
    return metric_to_minimize, metrics_to_log

which returns a float number as the minimization objective, and a dictionary as the metrics to log. E.g.,

def custom_metric(
    X_val, y_val, estimator, labels,
    X_train, y_train, weight_val=None, weight_train=None,
    *args,
):
    from sklearn.metrics import log_loss
    import time

    start = time.time()
    y_pred = estimator.predict_proba(X_val)
    pred_time = (time.time() - start) / len(X_val)
    val_loss = log_loss(y_val, y_pred, labels=labels, sample_weight=weight_val)
    y_pred = estimator.predict_proba(X_train)
    train_loss = log_loss(y_train, y_pred, labels=labels, sample_weight=weight_train)
    alpha = 0.5
    return val_loss * (1 + alpha) - alpha * train_loss, {
        "val_loss": val_loss,
        "train_loss": train_loss,
        "pred_time": pred_time,
    }

task - A string of the task type, e.g., 'classification', 'regression', 'ts_forecast_regression', 'ts_forecast_classification', 'rank', 'seq-classification', 'seq-regression', 'summarization', or an instance of Task class
n_jobs - An integer of the number of threads for training | default=-1. Use all available resources when n_jobs == -1.
log_file_name - A string of the log file name | default="". To disable logging, set it to be an empty string "".
estimator_list - A list of strings for estimator names, or 'auto'. e.g., ['lgbm', 'xgboost', 'xgb_limitdepth', 'catboost', 'rf', 'extra_tree'].
time_budget - A float number of the time budget in seconds. Use -1 if no time limit.
max_iter - An integer of the maximal number of iterations.
NOTE - when both time_budget and max_iter are unspecified, only one model will be trained per estimator.
sample - A boolean of whether to sample the training data during search.
ensemble - boolean or dict | default=False. Whether to perform ensemble after search. Can be a dict with keys 'passthrough' and 'final_estimator' to specify the passthrough and final_estimator in the stacker. The dict can also contain 'n_jobs' as the key to specify the number of jobs for the stacker.
Note - The hyperparameters of a custom 'final_estimator' are NOT automatically tuned. If you provide an estimator instance (e.g., CatBoostClassifier()), it will use the parameters you specified or their defaults. If 'final_estimator' is not provided, the best model found during the search will be used as the final estimator.
eval_method - A string of resampling strategy, one of ['auto', 'cv', 'holdout'].
split_ratio - A float of the valiation data percentage for holdout.
n_splits - An integer of the number of folds for cross - validation.
log_type - A string of the log type, one of ['better', 'all']. 'better' only logs configs with better loss than previos iters 'all' logs all the tried configs.
model_history - A boolean of whether to keep the trained best model per estimator. Make sure memory is large enough if setting to True. Default value is False. If False, best_model_for_estimator would return a untrained model for non-best learner.
log_training_metric - A boolean of whether to log the training metric for each model.
mem_thres - A float of the memory size constraint in bytes.
pred_time_limit - A float of the prediction latency constraint in seconds. It refers to the average prediction time per row in validation data.
train_time_limit - None or a float of the training time constraint in seconds for each trial. Only valid for sequential search.
X_val - None or a numpy array or a pandas dataframe of validation data.
y_val - None or a numpy array or a pandas series of validation labels.
sample_weight_val - None or a numpy array of the sample weight of validation data of the same shape as y_val.
groups_val - None or array-like | group labels (with matching length to y_val) or group counts (with sum equal to length of y_val) for validation data. Need to be consistent with groups.
groups - None or array-like | Group labels (with matching length to y_train) or groups counts (with sum equal to length of y_train) for training data.
verbose - int, default=3 | Controls the verbosity, higher means more messages.
verbose=0 - logger level = CRITICAL
verbose=1 - logger level = ERROR
verbose=2 - logger level = WARNING
verbose=3 - logger level = INFO
verbose=4 - logger level = DEBUG
verbose>5 - logger level = NOTSET
retrain_full - bool or str, default=True | whether to retrain the selected model on the full training data when using holdout. True - retrain only after search finishes; False - no retraining; 'budget' - do best effort to retrain without violating the time budget.
split_type - str or splitter object, default="auto" | the data split type.
- A valid splitter object is an instance of a derived class of scikit-learn KFold and have split and get_n_splits methods with the same signatures. Set eval_method to "cv" to use the splitter object.
- Valid str options depend on different tasks. For classification tasks, valid choices are ["auto", 'stratified', 'uniform', 'time', 'group']. "auto" -> stratified. For regression tasks, valid choices are ["auto", 'uniform', 'time', 'group']. "auto" -> uniform. For time series forecast tasks, must be "auto" or 'time'. For ranking task, must be "auto" or 'group'.
hpo_method - str, default="auto" | The hyperparameter optimization method. By default, CFO is used for sequential search and BlendSearch is used for parallel search. No need to set when using flaml's default search space or using a simple customized search space. When set to 'bs', BlendSearch is used. BlendSearch can be tried when the search space is complex, for example, containing multiple disjoint, discontinuous subspaces. When set to 'random', random search is used.
starting_points - A dictionary or a str to specify the starting hyperparameter config for the estimators | default="data". If str:
- if "data", use data-dependent defaults;
- if "data:path" use data-dependent defaults which are stored at path;
- if "static", use data-independent defaults. If dict, keys are the name of the estimators, and values are the starting hyperparameter configurations for the corresponding estimators. The value can be a single hyperparameter configuration dict or a list of hyperparameter configuration dicts. In the following code example, we get starting_points from the automl object and use them in the new_automl object. e.g.,

from flaml import AutoML
automl = AutoML()
X_train, y_train = load_iris(return_X_y=True)
automl.fit(X_train, y_train)
starting_points = automl.best_config_per_estimator

new_automl = AutoML()
new_automl.fit(X_train, y_train, starting_points=starting_points)

seed - int or None, default=None | The random seed for hpo.
n_concurrent_trials - [In preview] int, default=1 | The number of concurrent trials. When n_concurrent_trials > 1, flaml performes parallel tuning and installation of ray or spark is required: pip install flaml[ray] or pip install flaml[spark]. Please check here for more details about installing Spark.
keep_search_state - boolean, default=False | Whether to keep data needed for model search after fit(). By default the state is deleted for space saving.
preserve_checkpoint - boolean, default=True | Whether to preserve the saved checkpoint on disk when deleting automl. By default the checkpoint is preserved.
early_stop - boolean, default=False | Whether to stop early if the search is considered to converge.
force_cancel - boolean, default=False | Whether to forcely cancel the PySpark job if overtime.
mlflow_exp_name - str, default=None | The name of the mlflow experiment. This should be specified if enable mlflow autologging on Spark. Otherwise it will log all the results into the experiment of the same name as the basename of main entry file.
append_log - boolean, default=False | Whetehr to directly append the log records to the input log file if it exists.
auto_augment - boolean, default=True | Whether to automatically augment rare classes.
min_sample_size - int, default=MIN_SAMPLE_TRAIN | the minimal sample size when sample=True.
use_ray - boolean or dict. If boolean: default=False | Whether to use ray to run the training in separate processes. This can be used to prevent OOM for large datasets, but will incur more overhead in time. If dict: the dict contains the keywords arguments to be passed to ray.tune.run.
use_spark - boolean, default=False | Whether to use spark to run the training in parallel spark jobs. This can be used to accelerate training on large models and large datasets, but will incur more overhead in time and thus slow down training in some cases.
free_mem_ratio - float between 0 and 1, default=0. The free memory ratio to keep during training.
metric_constraints - list, default=[] | The list of metric constraints. Each element in this list is a 3-tuple, which shall be expressed in the following format: the first element of the 3-tuple is the name of the metric, the second element is the inequality sign chosen from ">=" and "<=", and the third element is the constraint value. E.g., ('precision', '>=', 0.9). Note that all the metric names in metric_constraints need to be reported via the metrics_to_log dictionary returned by a customized metric function. The customized metric function shall be provided via the metric key word argument of the fit() function or the automl constructor. Find examples in this test. If pred_time_limit is provided as one of keyword arguments to fit() function or the automl constructor, flaml will automatically (and under the hood) add it as an additional element in the metric_constraints. Essentially 'pred_time_limit' specifies a constraint about the prediction latency constraint in seconds.
custom_hp - dict, default=None | The custom search space specified by user Each key is the estimator name, each value is a dict of the custom search space for that estimator. Notice the domain of the custom search space can either be a value of a sample.Domain object.

custom_hp = {
    "transformer_ms": {
        "model_path": {
            "domain": "albert-base-v2",
        },
        "learning_rate": {
            "domain": tune.choice([1e-4, 1e-5]),
        }
    }
}

time_col - for a time series task, name of the column containing the timestamps. If not provided, defaults to the first column of X_train/X_val
cv_score_agg_func - customized cross-validation scores aggregate function. Default to average metrics across folds. If specificed, this function needs to have the following input arguments:
- val_loss_folds: list of floats, the loss scores of each fold;
- log_metrics_folds: list of dicts/floats, the metrics of each fold to log.
This function should return the final aggregate result of all folds. A float number of the minimization objective, and a dictionary as the metrics to log or None. E.g.,

def cv_score_agg_func(val_loss_folds, log_metrics_folds):
    metric_to_minimize = sum(val_loss_folds)/len(val_loss_folds)
    metrics_to_log = None
    for single_fold in log_metrics_folds:
        if metrics_to_log is None:
            metrics_to_log = single_fold
        elif isinstance(metrics_to_log, dict):
            metrics_to_log = {k: metrics_to_log[k] + v for k, v in single_fold.items()}
        else:
            metrics_to_log += single_fold
    if metrics_to_log:
        n = len(val_loss_folds)
        metrics_to_log = (
            {k: v / n for k, v in metrics_to_log.items()}
            if isinstance(metrics_to_log, dict)
            else metrics_to_log / n
        )
    return metric_to_minimize, metrics_to_log

skip_transform - boolean, default=False | Whether to pre-process data prior to modeling.
allow_label_overlap - boolean, default=True | For classification tasks with holdout evaluation, whether to allow label overlap between train and validation sets. When True (default), uses a fast strategy that adds the first instance of missing labels to the set that is missing them, which may create some overlap. When False, uses a precise but slower strategy that intelligently re-splits instances to avoid overlap when possible. Only affects classification tasks with holdout evaluation method.
mlflow_logging - boolean, default=None | Whether to log the training results to mlflow. Default value is None, which means the logging decision is made based on AutoML.init's mlflow_logging argument. Not valid if mlflow is not installed.
fit_kwargs_by_estimator - dict, default=None | The user specified keywords arguments, grouped by estimator name. For TransformersEstimator, available fit_kwargs can be found from TrainingArgumentsForAuto. e.g.,

fit_kwargs_by_estimator = {
    "transformer": {
        "output_dir": "test/data/output/",
        "fp16": False,
    },
    "tft": {
        "max_encoder_length": 1,
        "min_encoder_length": 1,
        "static_categoricals": [],
        "static_reals": [],
        "time_varying_known_categoricals": [],
        "time_varying_known_reals": [],
        "time_varying_unknown_categoricals": [],
        "time_varying_unknown_reals": [],
        "variable_groups": {},
        "lags": {},
    }
}

**fit_kwargs - Other key word arguments to pass to fit() function of the searched learners, such as sample_weight. Below are a few examples of estimator-specific parameters:
period - int | forecast horizon for all time series forecast tasks. This is the number of time steps ahead to forecast (e.g., period=12 means forecasting 12 steps into the future). This represents the forecast horizon used during model training. Note: during prediction, the output length equals the length of X_test. FLAML automatically handles feature engineering for you - sklearn-based models (lgbm, rf, xgboost, etc.) will have lagged features created automatically, while time series native models (prophet, arima, sarimax) use their built-in forecasting capabilities. You do NOT need to manually create lagged features of the target variable.
gpu_per_trial - float, default = 0 | A float of the number of gpus per trial, only used by TransformersEstimator, XGBoostSklearnEstimator, and TemporalFusionTransformerEstimator.
group_ids - list of strings of column names identifying a time series, only used by TemporalFusionTransformerEstimator, required for 'ts_forecast_panel' task. group_ids is a parameter for TimeSeriesDataSet object from PyTorchForecasting. For other parameters to describe your dataset, refer to TimeSeriesDataSet PyTorchForecasting. To specify your variables, use static_categoricals, static_reals, time_varying_known_categoricals, time_varying_known_reals, time_varying_unknown_categoricals, time_varying_unknown_reals, variable_groups. To provide more information on your data, use max_encoder_length, min_encoder_length, lags.
log_dir - str, default = "lightning_logs" | Folder into which to log results for tensorboard, only used by TemporalFusionTransformerEstimator.
max_epochs - int, default = 20 | Maximum number of epochs to run training, only used by TemporalFusionTransformerEstimator.
batch_size - int, default = 64 | Batch size for training model, only used by TemporalFusionTransformerEstimator and TCNEstimator.

size​

AutoML Objects​

__init__​

__getstate__​

config_history​

model​

best_model_for_estimator​

best_estimator​

best_iteration​

best_config​

best_config_per_estimator​

best_loss_per_estimator​

best_loss​

best_result​

metrics_for_best_config​

best_config_train_time​

supported_metrics​

feature_transformer​

label_transformer​

classes_​

time_to_find_best_model​

predict​

predict_proba​

preprocess​

add_learner​

get_estimator_from_log​

retrain_from_log​

search_space​

low_cost_partial_config​

cat_hp_cost​

points_to_evaluate​

resource_attr​

min_resource​

max_resource​

pickle​

load_pickle​

trainable​

metric_constraints​

fit​

size

AutoML Objects

init

getstate

config_history

model

best_model_for_estimator

best_estimator

best_iteration

best_config

best_config_per_estimator

best_loss_per_estimator

best_loss

best_result

metrics_for_best_config

best_config_train_time

supported_metrics

feature_transformer

label_transformer

classes_

time_to_find_best_model

predict

predict_proba

preprocess

add_learner

get_estimator_from_log

retrain_from_log

search_space

low_cost_partial_config

cat_hp_cost

points_to_evaluate

resource_attr

min_resource

max_resource

pickle

load_pickle

trainable

metric_constraints

fit