How-to#

Find more details on specific Olive capabilities, such as quantization, running workflows on remote compute, model packaging, model conversion, and more!

Set-up#

How to install Olive

Learn how to install olive-ai.

Install Olive
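For example, installing from PyPI (a minimal sketch; the extra names such as [cpu] pull in a matching ONNX Runtime package, but check the install guide for the current list):

```bash
# Install the core olive-ai package from PyPI
pip install olive-ai

# Optionally include an ONNX Runtime extra, e.g. for CPU-only targets
pip install olive-ai[cpu]
```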

Working with the CLI#

The Olive CLI provides a set of primitives, such as quantize, finetune, capture-onnx-graph, and auto-opt, that make it easy to optimize models and experiment with cutting-edge optimization strategies without defining workflows.

Tip

For users new to Olive, we recommend that you start with the CLI.

Auto Optimizer

Learn how to use the olive auto-opt command to take a PyTorch/Hugging Face model and turn it into an optimized ONNX model.

olive auto-opt
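As a sketch, a typical invocation pulls a model from Hugging Face and emits an ONNX model tuned for a chosen device and execution provider (flag names reflect recent Olive releases and the model id is only an example; confirm with `olive auto-opt --help`):

```bash
# Convert a Hugging Face model to ONNX and optimize it for CPU at int4 precision
olive auto-opt \
    --model_name_or_path HuggingFaceTB/SmolLM2-135M-Instruct \
    --device cpu \
    --provider CPUExecutionProvider \
    --precision int4 \
    --output_path models/smollm2
```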

Finetune

Learn how to use the olive finetune command to create (Q)LoRA adapters.

olive finetune
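A hedged sketch of a LoRA run (the short flags `-m`, `-d`, and `-o` and the `--method` option reflect recent releases; dataset-specific options such as a text template are omitted here, so verify with `olive finetune --help`):

```bash
# Produce a LoRA adapter for a Hugging Face model using a dataset from the Hub
olive finetune \
    --method lora \
    -m HuggingFaceTB/SmolLM2-135M-Instruct \
    -d nampdn-ai/tiny-codes \
    --max_steps 100 \
    -o models/smollm2-adapter
```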

Quantize

Learn how to use the olive quantize command to quantize your model with different precisions and techniques such as AWQ.

olive quantize
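For instance, to apply AWQ (a sketch; the available `--algorithm` values vary by version, so check `olive quantize --help`):

```bash
# Quantize a Hugging Face model with the AWQ algorithm
olive quantize \
    -m HuggingFaceTB/SmolLM2-135M-Instruct \
    --algorithm awq \
    -o models/smollm2-awq
```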

Execute Olive Workflows

Learn how to use the olive run command to execute an Olive workflow.

olive run
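The command takes the path to a workflow config file:

```bash
# Execute the workflow described in a JSON config (my_workflow.json is a placeholder)
olive run --config my_workflow.json
```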

Configure Workflows (Advanced)#

For more complex scenarios, you can create fully customized workflows that run any of the 40+ supported optimization techniques in sequence.

How to configure passes

Learn how to configure passes.

Configure pass
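As an illustrative sketch, the passes section of a workflow config maps a name of your choosing to a pass type plus its options; OnnxConversion and OrtTransformersOptimization are real pass types, while the option values here are example assumptions:

```json
{
  "passes": {
    "conversion": { "type": "OnnxConversion", "target_opset": 17 },
    "transformers_opt": { "type": "OrtTransformersOptimization", "model_type": "bert" }
  }
}
```

Passes run in the order they are listed, each consuming the output of the previous one.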

How to configure models

Learn how to configure models.

Configure models
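For example, a Hugging Face model is declared as the workflow's input model like this (a sketch; HfModel is the handler type in recent releases, and the repo id is only an example):

```json
{
  "input_model": {
    "type": "HfModel",
    "model_path": "HuggingFaceTB/SmolLM2-135M-Instruct",
    "task": "text-generation"
  }
}
```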

How to configure data

Learn how to configure data, such as pre- and post-processing instructions.

Configure data
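A hedged sketch of a data config that loads a Hugging Face dataset and tokenizes two text columns (field names follow the data config schema of recent releases; verify them against the how-to before relying on them):

```json
{
  "data_configs": [
    {
      "name": "eval_data",
      "type": "HuggingfaceContainer",
      "load_dataset_config": { "data_name": "glue", "subset": "mrpc", "split": "validation" },
      "pre_process_data_config": { "input_cols": ["sentence1", "sentence2"], "max_length": 128 }
    }
  ]
}
```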

How to configure metrics

Learn how to configure metrics such as accuracy, latency, throughput, and your own custom metrics.

Configure metrics
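As a sketch, metrics are grouped under a named evaluator; each metric has a type and one or more sub-types, with priority marking which result drives the search (names here follow recent releases and should be checked against the how-to):

```json
{
  "evaluators": {
    "common_evaluator": {
      "metrics": [
        { "name": "accuracy", "type": "accuracy", "sub_types": [{ "name": "accuracy_score", "priority": 1 }] },
        { "name": "latency", "type": "latency", "sub_types": [{ "name": "avg", "priority": 2 }] }
      ]
    }
  }
}
```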

How to package models

Learn how to package models for deployment.

Model packaging
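A minimal sketch, assuming the Zipfile packaging type available in recent releases (the placement of packaging_config in the config has changed across versions, so confirm with the how-to):

```json
{
  "packaging_config": { "type": "Zipfile", "name": "OutputModels" }
}
```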

How to configure systems

Learn how to configure systems, such as local and remote compute, to act as a host (the machine that executes the optimization) and/or a target (the machine the model will run inference on).

Configure systems
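For example, a local machine with a GPU can serve as both host and target (a sketch; LocalSystem and the accelerator fields reflect recent releases):

```json
{
  "systems": {
    "local_system": {
      "type": "LocalSystem",
      "accelerators": [{ "device": "gpu", "execution_providers": ["CUDAExecutionProvider"] }]
    }
  },
  "host": "local_system",
  "target": "local_system"
}
```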

How to use Auto Optimizer

Learn how to use Auto Optimizer, a tool that automatically creates the best model for you, in a workflow.

Auto Optimizer
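One way to enable it, as a sketch: leave out the passes section so Olive selects passes itself, and optionally steer it via auto_optimizer_config (this field and its options are an assumption based on earlier releases; verify against the how-to):

```json
{
  "input_model": { "type": "HfModel", "model_path": "HuggingFaceTB/SmolLM2-135M-Instruct" },
  "auto_optimizer_config": { "precisions": ["fp16"] }
}
```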

Model splitting

Learn how to split a model into multiple components.

Model splitting
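A hedged sketch using the splitting passes as they appear in recent releases (CaptureSplitInfo decides where to cut, SplitModel performs the cut; both pass names and the num_splits option are assumptions to verify against the how-to):

```json
{
  "passes": {
    "split_info": { "type": "CaptureSplitInfo", "num_splits": 2 },
    "split": { "type": "SplitModel" }
  }
}
```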

ONNX Graph Surgeon

Learn how to use ONNX Graph Surgeon to manipulate ONNX graphs.

ONNX Graph Surgeon
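As a sketch, individual surgeries are listed inside a GraphSurgeries pass; RenameInputs is one surgeon believed to ship with Olive, but the surgeon name and its fields are assumptions to check against the how-to:

```json
{
  "passes": {
    "surgery": {
      "type": "GraphSurgeries",
      "surgeries": [
        { "surgeon": "RenameInputs", "old_names": ["input"], "new_names": ["input_ids"] }
      ]
    }
  }
}
```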

Integrations#

How to integrate with Azure AI

Learn how to use integrations with Azure AI, such as model catalog, remote compute, and data/job artifacts.

Integrate with Azure AI
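For instance, a model from an Azure AI model catalog registry can be referenced directly as the input model's path (a sketch; the azureml_registry_model resource type and its fields are based on recent releases, and the registry and model names are placeholders):

```json
{
  "input_model": {
    "type": "HfModel",
    "model_path": {
      "type": "azureml_registry_model",
      "registry_name": "azureml",
      "name": "Phi-3-mini-4k-instruct",
      "version": "1"
    }
  }
}
```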

How to integrate with Hugging Face

Learn how to use integrations with Hugging Face, such as models, data, and metrics.

Integrate with Hugging Face
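For example, a model can be pulled straight from the Hub by its repo id, with loader options forwarded to Hugging Face (a sketch; load_kwargs as the pass-through field is an assumption based on recent releases, and the repo id is only an example):

```json
{
  "input_model": {
    "type": "HfModel",
    "model_path": "microsoft/phi-2",
    "load_kwargs": { "trust_remote_code": true }
  }
}
```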