OliveModels¶
The following models are available in Olive.
Model Configuration¶
- class olive.model.ModelConfig(*, type: str, config: dict)[source]¶
Input model config which will be used to create the model handler.
For example, the config for llama2 looks like:

```json
{
    "input_model": {
        "type": "CompositePyTorchModel",
        "config": {
            "model_path": "llama_v2",
            "model_components": [
                {
                    "name": "decoder_model",
                    "type": "PyTorchModel",
                    "config": {
                        "model_script": "user_script.py",
                        "io_config": {
                            "input_names": ["tokens", "position_ids", "attn_mask", ...],
                            "output_names": ["logits", "attn_mask_out", ...],
                            "dynamic_axes": {
                                "tokens": { "0": "batch_size", "1": "seq_len" },
                                "position_ids": { "0": "batch_size", "1": "seq_len" },
                                "attn_mask": { "0": "batch_size", "1": "max_seq_len" },
                                ...
                            }
                        },
                        "model_loader": "load_decoder_model",
                        "dummy_inputs_func": "decoder_inputs"
                    }
                },
                {
                    "name": "decoder_with_past_model",
                    "type": "PyTorchModel",
                    "config": {
                        "model_script": "user_script.py",
                        "io_config": {
                            "input_names": ["tokens_increment", "position_ids_increment", "attn_mask", ...],
                            "output_names": ["logits", "attn_mask_out", ...],
                            "dynamic_axes": {
                                "tokens_increment": { "0": "batch_size", "1": "seq_len_increment" },
                                "position_ids_increment": { "0": "batch_size", "1": "seq_len_increment" },
                                "attn_mask": { "0": "batch_size", "1": "max_seq_len" },
                                ...
                            }
                        },
                        "model_loader": "load_decoder_with_past_model",
                        "dummy_inputs_func": "decoder_with_past_inputs"
                    }
                }
            ]
        }
    }
}
```
ONNX Model Handler¶
- class olive.model.ONNXModelHandler(model_path: str | Path | ResourcePath | ResourcePathConfig | None = None, onnx_file_name: str | None = None, inference_settings: dict | None = None, use_ort_extensions: bool = False, model_attributes: Dict[str, Any] | None = None, external_initializers_file_name: str | None = None, constant_inputs_file_name: str | None = None)[source]¶
ONNX model handler.
Besides the model loading functionalities, the model handler also provides ONNX graph functionality via mixin classes: OnnxEpValidateMixin is used to validate the execution providers, and OnnxGraphMixin is used to support ONNX graph operations.
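As a sketch, an ONNX model is typically referenced in an Olive config like the PyTorch example above, using this handler's parameters as config fields (the type string, file name, and execution provider value here are illustrative assumptions, not taken from this page):

```json
{
    "input_model": {
        "type": "ONNXModel",
        "config": {
            "model_path": "model.onnx",
            "inference_settings": {
                "execution_provider": ["CPUExecutionProvider"]
            }
        }
    }
}
```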
CompositeModel Model Handler¶
- class olive.model.CompositeModelHandler(model_components: List[OliveModelHandler | Dict[str, Any]], model_component_names: List[str], model_attributes: Dict[str, Any] | None = None)[source]¶
CompositeModel represents multiple component models.
The only responsibility of CompositeModelHandler is to provide a get_model_components method, which iterates over all the child models.
Whisper is an example of a composite model that has encoder and decoder components. CompositeModelHandler is a collection of models; all the child models in the container should have the same model type.
DistributedOnnxModel Model Handler¶
OpenVINO Model Handler¶
PyTorch Model Handler¶
- class olive.model.PyTorchModelHandler(model_path: str | Path | ResourcePath | ResourcePathConfig | None = None, model_file_format: ModelFileFormat = ModelFileFormat.PYTORCH_ENTIRE_MODEL, model_loader: str | Callable | None = None, model_script: Path | str | None = None, script_dir: Path | str | None = None, io_config: Dict[str, Any] | IoConfig | str | Callable | None = None, dummy_inputs_func: str | Callable | None = None, hf_config: Dict[str, Any] | HfConfig | None = None, adapter_path: str | Path | ResourcePath | ResourcePathConfig | None = None, model_attributes: Dict[str, Any] | None = None)[source]¶
PyTorch model handler.
- Besides loading PyTorch models, the model handler also provides the following functionalities:
Get the model IO configuration, either from a user-provided io_config or from hf_config; a user-provided io_config takes priority over hf_config.
Get the dummy inputs for the PyTorch model, used to evaluate latency.
All kinds of Hugging Face model functionalities via HfConfigMixin.
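A minimal config sketch for this handler, using the model_script/model_loader/dummy_inputs_func parameters from the signature above (the script, function, and tensor names are hypothetical placeholders):

```json
{
    "input_model": {
        "type": "PyTorchModel",
        "config": {
            "model_script": "user_script.py",
            "model_loader": "load_model",
            "dummy_inputs_func": "create_dummy_inputs",
            "io_config": {
                "input_names": ["input_ids"],
                "output_names": ["logits"]
            }
        }
    }
}
```

Here io_config is given directly; alternatively, hf_config can supply it, with the user-provided io_config taking priority when both are present.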
DistributedPyTorchModelHandler Model¶
- class olive.model.DistributedPyTorchModelHandler(model_path: str | Path | ResourcePath | ResourcePathConfig | None, model_name_pattern: str, num_ranks: int, model_file_format: ModelFileFormat = ModelFileFormat.PYTORCH_ENTIRE_MODEL, model_loader: str | Callable | None = None, model_script: Path | str | None = None, script_dir: Path | str | None = None, io_config: Dict[str, Any] | IoConfig | str | Callable | None = None, dummy_inputs_func: str | Callable | None = None, hf_config: Dict[str, Any] | HfConfig | None = None, adapter_path: str | Path | ResourcePath | ResourcePathConfig | None = None, model_attributes: Dict[str, Any] | None = None)[source]¶
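The distinctive parameters of this handler are model_name_pattern and num_ranks, which describe how the per-rank checkpoints are named and how many there are. A hedged config sketch (the type string, directory name, and pattern format are assumptions for illustration only):

```json
{
    "input_model": {
        "type": "DistributedPyTorchModel",
        "config": {
            "model_path": "model_dir",
            "model_name_pattern": "model_{:02d}",
            "num_ranks": 4
        }
    }
}
```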