OliveModels
The following model handlers are available in Olive.
Model Configuration
Hf Model Handler
- class olive.model.HfModelHandler(model_path: str | Path | dict | ResourcePathConfig | ResourcePath | None, task: str = 'text-generation-with-past', load_kwargs: Dict[str, Any] | HfLoadKwargs | None = None, io_config: Dict[str, Any] | IoConfig | str | None = None, adapter_path: str | Path | dict | ResourcePathConfig | ResourcePath | None = None, model_attributes: Dict[str, Any] | None = None, generative: bool = False)[source]
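For illustration, a minimal sketch of constructing an HfModelHandler from the signature above; the model id and the load_kwargs value are assumptions, not Olive defaults:

```python
from olive.model import HfModelHandler

# Wrap a Hugging Face model by id (or local path). Construction records the
# configuration; the underlying model is loaded on demand via load_model().
model = HfModelHandler(
    model_path="microsoft/phi-2",            # illustrative model id
    task="text-generation-with-past",        # default task per the signature above
    load_kwargs={"torch_dtype": "float16"},  # assumed pass-through to transformers loading
)
```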
Distributed Hf Model Handler
- class olive.model.DistributedHfModelHandler(model_path: str | Path | dict | ResourcePathConfig | ResourcePath | None, model_name_pattern: str, num_ranks: int, task: str, load_kwargs: Dict[str, Any] | HfLoadKwargs | None = None, io_config: Dict[str, Any] | IoConfig | None = None, model_attributes: Dict[str, Any] | None = None, generative: bool = False)[source]
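A hedged sketch of the distributed variant, which adds a per-rank naming pattern and a rank count; the directory layout and pattern below are assumptions for illustration:

```python
from olive.model import DistributedHfModelHandler

model = DistributedHfModelHandler(
    model_path="models/llama2-distributed",  # hypothetical directory of per-rank models
    model_name_pattern="model_{:02d}",       # hypothetical pattern formatted with the rank
    num_ranks=4,
    task="text-generation-with-past",
)
```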
PyTorch Model Handler
- class olive.model.PyTorchModelHandler(model_path: str | Path | dict | ResourcePathConfig | ResourcePath | None = None, model_file_format: ModelFileFormat = ModelFileFormat.PYTORCH_ENTIRE_MODEL, model_loader: str | Callable | None = None, model_script: Path | str | None = None, script_dir: Path | str | None = None, io_config: Dict[str, Any] | IoConfig | str | Callable | None = None, dummy_inputs_func: str | Callable | None = None, model_attributes: Dict[str, Any] | None = None, generative: bool = False)[source]
PyTorch model handler.
- Besides loading the PyTorch model, the model handler also provides the following functionality:
Get the model IO configuration from the user-provided io_config.
Get the dummy inputs used to evaluate the PyTorch model's latency.
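As a sketch of those two functionalities, a hypothetical handler whose loader and dummy-input function live in a user script; user_script.py, my_model_loader, and my_dummy_inputs are illustrative names, and the io_config keys shown are assumptions:

```python
from olive.model import PyTorchModelHandler

model = PyTorchModelHandler(
    model_script="user_script.py",        # hypothetical script defining the callables below
    model_loader="my_model_loader",       # assumed to return a torch.nn.Module
    dummy_inputs_func="my_dummy_inputs",  # assumed to return inputs for latency evaluation
    io_config={
        "input_names": ["input_ids"],
        "input_shapes": [[1, 128]],
        "output_names": ["logits"],
    },
)
```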
ONNX Model Handler
- class olive.model.ONNXModelHandler(model_path: str | Path | dict | ResourcePathConfig | ResourcePath | None = None, onnx_file_name: str | None = None, inference_settings: dict | None = None, use_ort_extensions: bool = False, model_attributes: Dict[str, Any] | None = None, external_initializers_file_name: str | None = None, constant_inputs_file_name: str | None = None, generative: bool = False)[source]
ONNX model handler.
Besides the model loading functionalities, the model handler also provides ONNX graph functionality through mixin classes: OnnxEpValidateMixin is used to validate execution providers, and OnnxGraphMixin is used to support ONNX graph operations.
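A minimal sketch of wrapping an existing ONNX file, assuming the file paths and the shape of inference_settings shown here:

```python
from olive.model import ONNXModelHandler

model = ONNXModelHandler(
    model_path="models/resnet",   # hypothetical directory containing the ONNX file
    onnx_file_name="model.onnx",  # hypothetical file name within that directory
    inference_settings={          # assumed ONNX Runtime session configuration
        "execution_provider": ["CPUExecutionProvider"],
    },
)
```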
Distributed ONNX Model Handler
- class olive.model.DistributedOnnxModelHandler(model_path: str | Path | dict | ResourcePathConfig | ResourcePath | None, model_name_pattern: str, num_ranks: int, inference_settings: dict | None = None, use_ort_extensions: bool = False, model_attributes: Dict[str, Any] | None = None, generative: bool = False)[source]
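As with the Hf variant, the distributed ONNX handler takes a per-rank naming pattern and a rank count; a sketch with assumed paths:

```python
from olive.model import DistributedOnnxModelHandler

model = DistributedOnnxModelHandler(
    model_path="models/llama2-onnx-distributed",  # hypothetical directory of per-rank models
    model_name_pattern="model_{:02d}",            # hypothetical pattern formatted with the rank
    num_ranks=2,
)
```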
OpenVINO Model Handler
SNPE Model Handler
Composite Model Handler
- class olive.model.CompositeModelHandler(model_components: List[OliveModelHandler | Dict[str, Any]], model_component_names: List[str], model_attributes: Dict[str, Any] | None = None)[source]
CompositeModelHandler represents multiple component models.
Its only responsibility is to provide get_model_components, which iterates over all the child models.
Whisper is an example of a composite model, with encoder and decoder components. CompositeModelHandler is a collection of models; all child models in the container should have the same model type.
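A hedged sketch of composing two ONNX components, as in the Whisper example; the file names are assumptions, and get_model_components is assumed to yield (name, component) pairs:

```python
from olive.model import CompositeModelHandler, ONNXModelHandler

# Illustrative encoder/decoder components; paths are assumptions.
encoder = ONNXModelHandler(model_path="models/whisper", onnx_file_name="encoder_model.onnx")
decoder = ONNXModelHandler(model_path="models/whisper", onnx_file_name="decoder_model.onnx")

composite = CompositeModelHandler(
    model_components=[encoder, decoder],
    model_component_names=["encoder", "decoder"],
)

# Iterate the child models, per the docstring above.
for name, component in composite.get_model_components():
    print(name, type(component).__name__)
```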