# Huggingface Integration
## Introduction

This document outlines the integrations between Olive and Huggingface and shows how to use Huggingface resources within Olive.
## Input Model

Use the `HfModel` type if you want to optimize or evaluate a Huggingface model. The default `task` is `text-generation-with-past`.
### Huggingface Hub model

Olive can automatically retrieve models from the Huggingface hub:
"input_model":{
"type": "HfModel",
"model_path": "meta-llama/Llama-2-7b-hf"
}
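For instance, with the `olive-ai` package installed, a workflow that uses this input model can be launched from Python. This is a minimal sketch: the pass and engine sections are placeholders, and it assumes `olive.workflows.run` accepts an in-memory config dict (it also accepts a path to a JSON config file):

```python
# Minimal sketch: run an Olive workflow whose input model comes from the
# Huggingface hub. Assumes `olive-ai` is installed and that the rest of the
# workflow (passes, systems, evaluators) is filled in for your scenario.
from olive.workflows import run as olive_run

config = {
    "input_model": {
        "type": "HfModel",
        "model_path": "meta-llama/Llama-2-7b-hf",
        # "task" defaults to "text-generation-with-past"; override it here
        # if your model targets a different task.
    },
    # ... passes, systems, and evaluator config go here ...
}

olive_run(config)
```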
### Local model

If you have the Huggingface model prepared locally:
"input_model":{
"type": "HfModel",
"model_path": "path/to/local/model"
}
Note: You must also have the tokenizer and other necessary files in the same local directory.
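One simple way to prepare such a directory is to save the model and tokenizer to the same path with the standard `transformers` API. A quick sketch (the model name is illustrative):

```python
# Sketch: materialize a Huggingface model and its tokenizer into one local
# directory so Olive can load it via "model_path": "path/to/local/model".
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "facebook/opt-125m"  # illustrative, non-gated model
save_dir = "path/to/local/model"

AutoModelForCausalLM.from_pretrained(model_id).save_pretrained(save_dir)
AutoTokenizer.from_pretrained(model_id).save_pretrained(save_dir)
```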
### Azure ML model

Olive supports loading a model from your Azure Machine Learning workspace. Find detailed configurations here.

Example: Llama-2-7b from the Azure ML model catalog:
"input_model":{
"type": "HfModel",
"model_path": {
"type": "azureml_registry_model",
"name": "Llama-2-7b",
"registry_name": "azureml-meta",
"version": "13"
}
}
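If you want to sanity-check that the model is available in the registry before running Olive, here is a sketch with the `azure-ai-ml` SDK (assuming the package is installed and your credentials have access to the registry):

```python
# Sketch: look up a model in the azureml-meta registry with the azure-ai-ml SDK.
# Assumes `azure-ai-ml` and `azure-identity` are installed and you are signed in.
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

client = MLClient(credential=DefaultAzureCredential(), registry_name="azureml-meta")
model = client.models.get(name="Llama-2-7b", version="13")
print(model.id)
```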
### Model config loading

Olive can automatically retrieve model configurations from the Huggingface hub:

- Olive retrieves the model configuration from `transformers` for later use.
- Olive simplifies the process by automatically fetching configurations, such as the IO config and dummy inputs required for the `OnnxConversion` pass, from `OnnxConfig` if `optimum` is installed and the `model_type` and `task` are supported. This means there is no need to manually specify the IO config when using the `OnnxConversion` pass.
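For reference, the configuration fetched here is the same one `transformers` exposes through `AutoConfig`. A quick sketch with an illustrative, non-gated model:

```python
# Sketch: the model configuration Olive retrieves is what transformers'
# AutoConfig loads from the hub (model_type, hidden sizes, architectures, ...).
from transformers import AutoConfig

config = AutoConfig.from_pretrained("gpt2")  # illustrative model
print(config.model_type)  # e.g. "gpt2"
```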
You can also provide your own IO config, which will override the automatically fetched IO config and dummy inputs:
"input_model": {
"type": "HfModel",
"model_path": "meta-llama/Llama-2-7b-hf",
"io_config": {
"input_names": [ "input_ids", "attention_mask", "position_ids" ],
"output_names": [ "logits" ],
"input_shapes": [ [ 2, 8 ], [ 2, 8 ], [ 2, 8 ] ],
"input_types": [ "int64", "int64", "int64" ],
"dynamic_axes": {
"input_ids": { "0": "batch_size", "1": "sequence_length" },
"attention_mask": { "0": "batch_size", "1": "total_sequence_length" },
"position_ids": { "0": "batch_size", "1": "sequence_length" }
},
"dynamic_shapes": {
"input_ids": { "0": ["batch_size", 0, 8], "1": ["sequence_length", 0, 2048] },
"attention_mask": { "0": ["batch_size", 0, 8], "1": ["total_sequence_length", 0, 3072] },
"position_ids": { "0": ["batch_size", 0, 8], "1": ["sequence_length", 0, 2048] }
}
}
}
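To see how these fields are consumed, below is a rough sketch of the kind of `torch.onnx.export` call that can be built from them; `model` is a placeholder `torch.nn.Module`, and the exact call the `OnnxConversion` pass makes may differ:

```python
# Rough sketch: how the io_config above maps onto a torch.onnx.export call.
# `model` is a placeholder torch.nn.Module that returns logits.
import torch

def export_with_io_config(model: torch.nn.Module) -> None:
    # Dummy inputs derived from input_shapes/input_types: [2, 8] int64 tensors.
    input_ids = torch.zeros(2, 8, dtype=torch.int64)
    attention_mask = torch.ones(2, 8, dtype=torch.int64)
    position_ids = torch.arange(8, dtype=torch.int64).repeat(2, 1)

    torch.onnx.export(
        model,
        (input_ids, attention_mask, position_ids),
        "model.onnx",
        input_names=["input_ids", "attention_mask", "position_ids"],
        output_names=["logits"],
        dynamic_axes={
            "input_ids": {0: "batch_size", 1: "sequence_length"},
            "attention_mask": {0: "batch_size", 1: "total_sequence_length"},
            "position_ids": {0: "batch_size", 1: "sequence_length"},
        },
    )
```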
## Huggingface datasets

Olive supports automatically downloading and applying Huggingface datasets to Passes and Evaluators. Datasets can be added to the `data_configs` section of the configuration file with `"type": "HuggingfaceContainer"`. Read How to Configure Data for more information.

You can reference the dataset by its name in the Pass config.

Example: datasets in `data_configs`:
"data_configs": [{
"name": "oasst1_train",
"type": "HuggingfaceContainer",
"load_dataset_config": {
"data_name": "timdettmers/openassistant-guanaco",
"split": "train"
},
"pre_process_data_config": {
"text_cols": ["text"],
"strategy": "line-by-line",
"max_seq_len": 512,
"pad_to_max_len": false
}
}]
Pass config:
"session_params_tuning": {
"type": "OrtSessionParamsTuning",
"data_config": "oasst1_train"
}
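For reference, the `load_dataset_config` above is roughly equivalent to loading the dataset directly with the `datasets` library. A sketch, assuming `datasets` is installed:

```python
# Sketch: what the load_dataset_config above amounts to with the datasets API.
from datasets import load_dataset

dataset = load_dataset("timdettmers/openassistant-guanaco", split="train")
print(dataset[0]["text"][:100])  # the "text" column referenced in text_cols
```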
## Huggingface metrics

Huggingface metrics in Olive are backed by Huggingface `evaluate`. Refer to the Huggingface metrics page for a complete list of available metrics.

Example metric config:
```json
{
    "name": "accuracy",
    "type": "accuracy",
    "backend": "huggingface_metrics",
    "data_config": "oasst1_train",
    "sub_types": [
        {"name": "accuracy", "priority": -1},
        {"name": "f1"}
    ]
}
```
Please refer to How to configure metrics for more information on how to set up metrics.
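For reference, the two sub-types above correspond directly to metrics loaded through the `evaluate` library. A standalone sketch with illustrative data:

```python
# Sketch: computing the same metrics directly with Huggingface evaluate.
import evaluate

accuracy = evaluate.load("accuracy")
f1 = evaluate.load("f1")

predictions = [0, 1, 1, 0]  # illustrative model outputs
references = [0, 1, 0, 0]   # illustrative labels
print(accuracy.compute(predictions=predictions, references=references))
print(f1.compute(predictions=predictions, references=references))
```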
## Huggingface login

Certain gated models or datasets require you to log in to your Huggingface account before you can access them. If the Huggingface resources you are using require a token, add `hf_token: true` to the Olive system configuration. Olive will then automatically manage the Huggingface login process, allowing you to access these gated resources.
### Local system, Docker system, and Python environment system

For a local system, Docker system, or Python environment system, run `huggingface-cli login` in your terminal to log in to your Huggingface account. Find more details about login here.
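Alternatively, you can log in programmatically through the `huggingface_hub` API. A sketch (fill in your own token):

```python
# Sketch: programmatic equivalent of `huggingface-cli login`.
from huggingface_hub import login

login(token="<your_hf_token>")  # or call login() for an interactive prompt
```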
### AzureML system

Follow these steps to enable Huggingface login for an AzureML system:

1. Get your Huggingface token string from Settings -> Access Tokens.

2. Create or use an existing Azure Key Vault. Assume the key vault is named `my_keyvault_name`. Add a new secret named `hf-token`, and set its value to the token from the first step. Note that Olive reserves the `hf-token` secret name specifically for Huggingface login; do not use this name in this key vault for any other purpose.

3. Make sure you have an `azureml_client` section in your configuration file, and add a new attribute `keyvault_name` to it. For example:

   ```json
   "azureml_client": {
       "subscription_id": "<subscription_id>",
       "resource_group": "<resource_group>",
       "workspace_name": "<workspace_name>",
       "keyvault_name": "my_keyvault_name"
   }
   ```

4. Configure the Managed Service Identity (MSI) for the host compute or target compute. Detailed instructions can be found here. Then grant the host compute or target compute access to the key vault resource following this guide.

5. Add `hf_token: true` to the AzureML system configuration:

   ```json
   "aml_system": {
       "type": "AzureML",
       "config": {
           "hf_token": true
       }
   }
   ```
With the above steps, Olive can automatically retrieve your Huggingface token from the `hf-token` secret in the `my_keyvault_name` key vault and log in to your Huggingface account in the AML job.
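For reference, this retrieval step is conceptually similar to reading the secret with the `azure-keyvault-secrets` SDK. A sketch, assuming the package is installed and the compute's managed identity has access (`my_keyvault_name` is the placeholder from above):

```python
# Sketch: fetching the reserved hf-token secret from the key vault, which is
# conceptually what Olive does in the AML job before logging in.
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

client = SecretClient(
    vault_url="https://my_keyvault_name.vault.azure.net",
    credential=DefaultAzureCredential(),
)
hf_token = client.get_secret("hf-token").value
```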