OliveModels

The following models are available in Olive.

Model Configuration

class olive.model.ModelConfig(*, type: str, config: dict)[source]

Input model config which will be used to create the model handler.
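
For illustration, a config selects a handler by its type string and forwards config as that handler's constructor arguments. A minimal sketch; the "HfModel" registry key, the checkpoint name, and the create_model() factory call below are assumptions, not documented above:

    from olive.model import ModelConfig

    model_config = ModelConfig(
        type="HfModel",  # assumed registry key for HfModelHandler
        config={"model_path": "microsoft/phi-2", "task": "text-generation"},
    )
    model = model_config.create_model()  # assumed factory method that builds the handler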

Hf Model Handler

class olive.model.HfModelHandler(model_path: str | Path | dict | ResourcePathConfig | ResourcePath | None, task: str = 'text-generation-with-past', load_kwargs: Dict[str, Any] | HfLoadKwargs | None = None, io_config: Dict[str, Any] | IoConfig | str | None = None, adapter_path: str | Path | dict | ResourcePathConfig | ResourcePath | None = None, model_attributes: Dict[str, Any] | None = None, generative: bool = False)[source]
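
A minimal construction sketch using only parameters from the signature above; the checkpoint name and load_kwargs values are illustrative, and load_model() is assumed to return the underlying transformers model:

    from olive.model import HfModelHandler

    model = HfModelHandler(
        model_path="microsoft/phi-2",            # illustrative checkpoint
        task="text-generation",
        load_kwargs={"torch_dtype": "float16"},  # assumed to be forwarded to the Hf loader
    )
    hf_model = model.load_model()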

Distributed Hf Model Handler

class olive.model.DistributedHfModelHandler(model_path: str | Path | dict | ResourcePathConfig | ResourcePath | None, model_name_pattern: str, num_ranks: int, task: str, load_kwargs: Dict[str, Any] | HfLoadKwargs | None = None, io_config: Dict[str, Any] | IoConfig | None = None, model_attributes: Dict[str, Any] | None = None, generative: bool = False)[source]
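
A sketch for per-rank checkpoints; all values are illustrative, and the assumption that model_name_pattern is formatted with each rank index is mine:

    from olive.model import DistributedHfModelHandler

    model = DistributedHfModelHandler(
        model_path="models/llama2_dist",    # illustrative directory of per-rank models
        model_name_pattern="model_{:02d}",  # assumed to resolve to model_00, model_01, ...
        num_ranks=4,
        task="text-generation-with-past",
    )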

PyTorch Model Handler

class olive.model.PyTorchModelHandler(model_path: str | Path | dict | ResourcePathConfig | ResourcePath | None = None, model_file_format: ModelFileFormat = ModelFileFormat.PYTORCH_ENTIRE_MODEL, model_loader: str | Callable | None = None, model_script: Path | str | None = None, script_dir: Path | str | None = None, io_config: Dict[str, Any] | IoConfig | str | Callable | None = None, dummy_inputs_func: str | Callable | None = None, model_attributes: Dict[str, Any] | None = None, generative: bool = False)[source]

PyTorch model handler.

Besides loading the PyTorch model, the handler also provides the following functionality (a usage sketch follows this list):
  • Get the model I/O configuration from the user-provided io_config.

  • Get the dummy inputs for the PyTorch model, used to evaluate latency.
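
A minimal sketch using a user script; "user_script.py", "load_model", and "get_dummy_inputs" are illustrative names for user-defined entry points referenced by name, not Olive APIs:

    from olive.model import PyTorchModelHandler

    model = PyTorchModelHandler(
        model_script="user_script.py",         # illustrative user script
        model_loader="load_model",             # name of a function defined in model_script
        dummy_inputs_func="get_dummy_inputs",  # name of a function defined in model_script
        io_config={
            "input_names": ["input_ids"],
            "input_shapes": [[1, 128]],
            "output_names": ["logits"],
        },
    )
    torch_model = model.load_model()
    dummy_inputs = model.get_dummy_inputs()  # the dummy inputs used to evaluate latency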

ONNX Model Handler

class olive.model.ONNXModelHandler(model_path: str | Path | dict | ResourcePathConfig | ResourcePath | None = None, onnx_file_name: str | None = None, inference_settings: dict | None = None, use_ort_extensions: bool = False, model_attributes: Dict[str, Any] | None = None, external_initializers_file_name: str | None = None, constant_inputs_file_name: str | None = None, generative: bool = False)[source]

ONNX model handler.

Besides the model loading functionality, the handler also provides ONNX graph functionality through mixin classes: OnnxEpValidateMixin is used to validate the execution providers, and OnnxGraphMixin supports ONNX graph operations.
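
A minimal construction sketch; the directory and file names are illustrative, the inference_settings schema shown is an assumption modeled on ONNX Runtime options, and prepare_session() is assumed to return an onnxruntime.InferenceSession:

    from olive.model import ONNXModelHandler

    model = ONNXModelHandler(
        model_path="models/resnet",   # assumed: a directory containing the model
        onnx_file_name="model.onnx",  # the graph file inside model_path
        inference_settings={"execution_provider": ["CPUExecutionProvider"]},  # assumed schema
    )
    session = model.prepare_session(device="cpu")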

Distributed ONNX Model Handler

class olive.model.DistributedOnnxModelHandler(model_path: str | Path | dict | ResourcePathConfig | ResourcePath | None, model_name_pattern: str, num_ranks: int, inference_settings: dict | None = None, use_ort_extensions: bool = False, model_attributes: Dict[str, Any] | None = None, generative: bool = False)[source]
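
Construction mirrors the distributed Hf handler above; values are illustrative and the pattern semantics are assumed:

    from olive.model import DistributedOnnxModelHandler

    model = DistributedOnnxModelHandler(
        model_path="models/llama2_dist_onnx",
        model_name_pattern="model_{:02d}",  # assumed to resolve each rank's model file
        num_ranks=4,
    )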

OpenVINO Model Handler

class olive.model.OpenVINOModelHandler(model_path: str | Path | dict | ResourcePathConfig | ResourcePath | None, model_attributes: Dict[str, Any] | None = None)[source]

OpenVINO model handler.

The main responsibility of OpenVINOModelHandler is to provide model loading for OpenVINO models.
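
A minimal sketch; the assumption that model_path is a directory containing the OpenVINO IR (model.xml/model.bin) is illustrative, as is the load_model() return value:

    from olive.model import OpenVINOModelHandler

    model = OpenVINOModelHandler(model_path="models/resnet_ov")  # illustrative IR directory
    ov_model = model.load_model()  # assumed to return the loaded OpenVINO model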

SNPE Model Handler

class olive.model.SNPEModelHandler(input_names: List[str], input_shapes: List[List[int]], output_names: List[str], output_shapes: List[List[int]], model_path: str | Path | dict | ResourcePathConfig | ResourcePath | None = None, model_attributes: Dict[str, Any] | None = None)[source]
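
A construction sketch; the I/O names and shapes are required by the signature, and all values below are illustrative:

    from olive.model import SNPEModelHandler

    model = SNPEModelHandler(
        model_path="models/mobilenet.dlc",  # illustrative DLC file
        input_names=["input"],
        input_shapes=[[1, 224, 224, 3]],
        output_names=["output"],
        output_shapes=[[1, 1000]],
    )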

Composite Model Handler

class olive.model.CompositeModelHandler(model_components: List[OliveModelHandler | Dict[str, Any]], model_component_names: List[str], model_attributes: Dict[str, Any] | None = None)[source]

CompositeModelHandler represents multiple component models.

Its only responsibility is to provide get_model_components, which iterates over all the child models. All child models in the container should have the same model type. Whisper, with its encoder and decoder components, is an example of a composite model.
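
A sketch of a Whisper-style encoder/decoder composite, per the description above; paths and names are illustrative, and the exact return shape of get_model_components is not documented here:

    from olive.model import CompositeModelHandler, ONNXModelHandler

    composite = CompositeModelHandler(
        model_components=[
            ONNXModelHandler(model_path="models/whisper", onnx_file_name="encoder.onnx"),
            ONNXModelHandler(model_path="models/whisper", onnx_file_name="decoder.onnx"),
        ],
        model_component_names=["encoder", "decoder"],
    )
    for component in composite.get_model_components():  # iterates the child models
        print(component)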