OliveModels

The following models are available in Olive.

Model Configuration

class olive.model.ModelConfig(*, type: str, config: dict)[source]

Input model config which will be used to create the model handler.

For example, the config for llama2 looks like:

    {
        "input_model": {
            "type": "CompositePyTorchModel",
            "config": {
                "model_path": "llama_v2",
                "model_components": [
                    {
                        "name": "decoder_model",
                        "type": "PyTorchModel",
                        "config": {
                            "model_script": "user_script.py",
                            "io_config": {
                                "input_names": ["tokens", "position_ids", "attn_mask", ...],
                                "output_names": ["logits", "attn_mask_out", ...],
                                "dynamic_axes": {
                                    "tokens": { "0": "batch_size", "1": "seq_len" },
                                    "position_ids": { "0": "batch_size", "1": "seq_len" },
                                    "attn_mask": { "0": "batch_size", "1": "max_seq_len" },
                                    ...
                                }
                            },
                            "model_loader": "load_decoder_model",
                            "dummy_inputs_func": "decoder_inputs"
                        }
                    },
                    {
                        "name": "decoder_with_past_model",
                        "type": "PyTorchModel",
                        "config": {
                            "model_script": "user_script.py",
                            "io_config": {
                                "input_names": ["tokens_increment", "position_ids_increment", "attn_mask", ...],
                                "output_names": ["logits", "attn_mask_out", ...],
                                "dynamic_axes": {
                                    "tokens_increment": { "0": "batch_size", "1": "seq_len_increment" },
                                    "position_ids_increment": { "0": "batch_size", "1": "seq_len_increment" },
                                    "attn_mask": { "0": "batch_size", "1": "max_seq_len" },
                                    ...
                                }
                            },
                            "model_loader": "load_decoder_with_past_model",
                            "dummy_inputs_func": "decoder_with_past_inputs"
                        }
                    }
                ]
            }
        }
    }
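
The same input model can also be constructed programmatically. The following is a minimal sketch, assuming the elided ("...") entries above are filled in so that the file is valid JSON and that the configuration is stored in a file named config.json (a hypothetical name); it simply passes the "input_model" entry to ModelConfig using the type/config signature shown above.

    import json

    from olive.model import ModelConfig

    # Hypothetical file holding a configuration like the llama2 example above,
    # with the elided ("...") entries filled in so that it is valid JSON.
    with open("config.json") as f:
        config = json.load(f)

    # "input_model" carries exactly the type/config pair that ModelConfig expects.
    input_model = ModelConfig(**config["input_model"])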

ONNX Model Handler

class olive.model.ONNXModelHandler(model_path: str | Path | ResourcePath | ResourcePathConfig | None = None, onnx_file_name: str | None = None, inference_settings: dict | None = None, use_ort_extensions: bool = False, model_attributes: Dict[str, Any] | None = None, external_initializers_file_name: str | None = None, constant_inputs_file_name: str | None = None)[source]

ONNX model handler.

Besides the model loading functionality, the model handler also provides ONNX graph functionality through mixins:

the mixin class OnnxEpValidateMixin is used to validate execution providers, and the mixin class OnnxGraphMixin is used to support ONNX graph operations.
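
For illustration only, the sketch below constructs a handler with placeholder paths; it assumes that model_path may point either directly at an .onnx file or at a directory that is combined with onnx_file_name, as the signature above suggests.

    from olive.model import ONNXModelHandler

    # Minimal sketch with placeholder paths: a directory plus the file name of
    # the ONNX model inside it (both parameters appear in the signature above).
    onnx_model = ONNXModelHandler(
        model_path="outputs/optimized_model",
        onnx_file_name="model.onnx",
    )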

CompositeModel Model Handler

class olive.model.CompositeModelHandler(model_components: List[OliveModelHandler | Dict[str, Any]], model_component_names: List[str], model_attributes: Dict[str, Any] | None = None)[source]

CompositeModelHandler represents a model made up of multiple component models.

Its only responsibility is to provide get_model_components, which iterates over all the child models.

Whisper is an example of a composite model with encoder and decoder components. CompositeModelHandler is a collection of models; all child models in the container should have the same model type.
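
The sketch below builds a hypothetical Whisper-style composite from two ONNX components and then iterates it with get_model_components; the paths and component names are placeholders, and the exact shape of the items yielded by get_model_components is not spelled out here.

    from olive.model import CompositeModelHandler, ONNXModelHandler

    # Hypothetical Whisper-style composite: two child models of the same type.
    composite = CompositeModelHandler(
        model_components=[
            ONNXModelHandler(model_path="whisper/encoder_model.onnx"),
            ONNXModelHandler(model_path="whisper/decoder_model.onnx"),
        ],
        model_component_names=["encoder_model", "decoder_model"],
    )

    # get_model_components iterates over the child models.
    for component in composite.get_model_components():
        print(component)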

DistributedOnnxModel Model Handler

class olive.model.DistributedOnnxModelHandler(model_path: str | Path | ResourcePath | ResourcePathConfig | None, model_name_pattern: str, num_ranks: int, inference_settings: dict | None = None, use_ort_extensions: bool = False, model_attributes: Dict[str, Any] | None = None)[source]

OpenVINO Model Handler

class olive.model.OpenVINOModelHandler(model_path: str | Path | ResourcePath | ResourcePathConfig | None, model_attributes: Dict[str, Any] | None = None)[source]

OpenVINO model handler.

The main responsibility of OpenVINOModelHandler is to provide model loading for OpenVINO models.
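
A minimal sketch, assuming model_path points at a directory containing the OpenVINO IR files (for example model.xml and model.bin); the path is a placeholder.

    from olive.model import OpenVINOModelHandler

    # Placeholder path to a directory holding the OpenVINO IR (model.xml / model.bin).
    ov_model = OpenVINOModelHandler(model_path="models/openvino_ir")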

PyTorch Model Handler

class olive.model.PyTorchModelHandler(model_path: str | Path | ResourcePath | ResourcePathConfig | None = None, model_file_format: ModelFileFormat = ModelFileFormat.PYTORCH_ENTIRE_MODEL, model_loader: str | Callable | None = None, model_script: Path | str | None = None, script_dir: Path | str | None = None, io_config: Dict[str, Any] | IoConfig | str | Callable | None = None, dummy_inputs_func: str | Callable | None = None, hf_config: Dict[str, Any] | HfConfig | None = None, adapter_path: str | Path | ResourcePath | ResourcePathConfig | None = None, model_attributes: Dict[str, Any] | None = None)[source]

PyTorch model handler.

Besides loading the PyTorch model, the model handler also provides the following functionality (a construction sketch follows the list):
  • Get the model IO configuration from either the user-provided io_config or from hf_config; a user-provided io_config takes priority over hf_config.

  • Get the dummy inputs used to evaluate the latency of the PyTorch model.

  • Provide all kinds of Hugging Face model functionality through HfConfigMixin.
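
The sketch below mirrors the "decoder_model" component from the llama2 config above (the elided "..." entries are omitted); it assumes user_script.py defines load_decoder_model and decoder_inputs as described there.

    from olive.model import PyTorchModelHandler

    # Mirrors the "decoder_model" component of the llama2 example; user_script.py
    # is assumed to define load_decoder_model and decoder_inputs.
    decoder_model = PyTorchModelHandler(
        model_script="user_script.py",
        model_loader="load_decoder_model",
        dummy_inputs_func="decoder_inputs",
        io_config={
            "input_names": ["tokens", "position_ids", "attn_mask"],
            "output_names": ["logits", "attn_mask_out"],
            "dynamic_axes": {
                "tokens": {"0": "batch_size", "1": "seq_len"},
                "position_ids": {"0": "batch_size", "1": "seq_len"},
                "attn_mask": {"0": "batch_size", "1": "max_seq_len"},
            },
        },
    )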

DistributedPyTorchModel Model Handler

class olive.model.DistributedPyTorchModelHandler(model_path: str | Path | ResourcePath | ResourcePathConfig | None, model_name_pattern: str, num_ranks: int, model_file_format: ModelFileFormat = ModelFileFormat.PYTORCH_ENTIRE_MODEL, model_loader: str | Callable | None = None, model_script: Path | str | None = None, script_dir: Path | str | None = None, io_config: Dict[str, Any] | IoConfig | str | Callable | None = None, dummy_inputs_func: str | Callable | None = None, hf_config: Dict[str, Any] | HfConfig | None = None, adapter_path: str | Path | ResourcePath | ResourcePathConfig | None = None, model_attributes: Dict[str, Any] | None = None)[source]

SNPE Model Handler

class olive.model.SNPEModelHandler(input_names: List[str], input_shapes: List[List[int]], output_names: List[str], output_shapes: List[List[int]], model_path: str | Path | ResourcePath | ResourcePathConfig | None = None, model_attributes: Dict[str, Any] | None = None)[source]
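
A minimal sketch with hypothetical tensor names, shapes, and model path (SNPE models are typically DLC files, but the file name here is a placeholder).

    from olive.model import SNPEModelHandler

    # Hypothetical single-input, single-output SNPE model.
    snpe_model = SNPEModelHandler(
        input_names=["input"],
        input_shapes=[[1, 3, 224, 224]],
        output_names=["output"],
        output_shapes=[[1, 1000]],
        model_path="models/model.dlc",
    )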

CompositePyTorchModel Model Handler

class olive.model.CompositePyTorchModelHandler(model_components: List[Dict[str, Any]], **kwargs)[source]

The CompositePyTorchModel handler.

Its main responsibility is to create the list of child PyTorch models used to initialize a composite model.
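
The sketch below is a programmatic counterpart of the "model_components" list in the llama2 config above; each dict names a child PyTorch model and carries its type and config (the io_config entries are omitted here for brevity).

    from olive.model import CompositePyTorchModelHandler

    # Child component dicts mirroring the llama2 example above.
    composite = CompositePyTorchModelHandler(
        model_components=[
            {
                "name": "decoder_model",
                "type": "PyTorchModel",
                "config": {
                    "model_script": "user_script.py",
                    "model_loader": "load_decoder_model",
                    "dummy_inputs_func": "decoder_inputs",
                },
            },
            {
                "name": "decoder_with_past_model",
                "type": "PyTorchModel",
                "config": {
                    "model_script": "user_script.py",
                    "model_loader": "load_decoder_with_past_model",
                    "dummy_inputs_func": "decoder_with_past_inputs",
                },
            },
        ],
    )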