# How-to
Find more details on specific Olive capabilities, such as quantization, running workflows on remote compute, model packaging, conversions, and more!
## Set-up
Learn how to install Olive.
## Working with the CLI
The Olive CLI provides a set of primitives, such as `quantize`, `finetune`, `capture-onnx-graph`, and `auto-opt`, that let you easily optimize models and experiment with different cutting-edge optimization strategies without needing to define workflows.
> **Tip:** For users new to Olive, we recommend starting with the CLI.
Auto Optimizer
Learn how to use the `olive auto-opt` command to take a PyTorch/Hugging Face model and turn it into an optimized ONNX model.
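For example, the following invocation downloads a model from Hugging Face and produces an ONNX model optimized for CPU. This is a minimal sketch: the model name is illustrative and flag spellings may vary by Olive version, so check `olive auto-opt --help`.

```bash
# Convert a Hugging Face model to ONNX and optimize it for CPU inference.
olive auto-opt \
    --model_name_or_path Qwen/Qwen2.5-0.5B-Instruct \
    --device cpu \
    --provider CPUExecutionProvider \
    --output_path models/qwen-optimized
```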
Quantize
Learn how to use the `olive quantize` command to quantize your model with different precisions and techniques, such as AWQ.
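A typical invocation looks like the sketch below; the algorithm and flag names should be verified against `olive quantize --help` for your version.

```bash
# Quantize the model weights with the AWQ algorithm.
olive quantize \
    --model_name_or_path Qwen/Qwen2.5-0.5B-Instruct \
    --algorithm awq \
    --output_path models/qwen-awq
```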
Execute Olive Workflows
Learn how to use the `olive run` command to execute an Olive workflow.
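For example, given a workflow defined in a JSON file:

```bash
# Execute the workflow described in my_workflow.json.
olive run --config my_workflow.json
```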
## Configure Workflows (Advanced)
For more complex scenarios, you can create fully customized workflows that run any of the 40+ supported optimization techniques in sequence.
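At its core, a workflow is a JSON file that names an input model and the passes to apply to it. The sketch below assumes the `HfModel` input type and the `OnnxConversion` and `OrtTransformersOptimization` passes; exact pass names and options are listed in the reference documentation.

```json
{
  "input_model": {
    "type": "HfModel",
    "model_path": "Qwen/Qwen2.5-0.5B-Instruct",
    "task": "text-generation"
  },
  "passes": {
    "conversion": { "type": "OnnxConversion", "target_opset": 17 },
    "transformer_opt": { "type": "OrtTransformersOptimization" }
  },
  "output_dir": "models/qwen"
}
```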
How to configure data
Learn how to configure data, including pre- and post-processing instructions.
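For instance, a data configuration that loads and tokenizes a Hugging Face dataset might look like the sketch below; the container type and field names may differ across Olive versions, so check the data-config reference.

```json
{
  "data_configs": [
    {
      "name": "wikitext_train",
      "type": "HuggingfaceContainer",
      "load_dataset_config": {
        "data_name": "wikitext",
        "subset": "wikitext-2-raw-v1",
        "split": "train"
      },
      "pre_process_data_config": { "max_seq_len": 1024 }
    }
  ]
}
```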
How to configure metrics
Learn how to configure metrics such as accuracy, latency, throughput, and your own custom metrics.
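Metrics attach to an evaluator. The sketch below defines an accuracy and a latency metric with prioritized sub-types; the field names are illustrative of the evaluator schema and should be checked against the reference.

```json
{
  "evaluators": {
    "common_evaluator": {
      "metrics": [
        {
          "name": "accuracy",
          "type": "accuracy",
          "sub_types": [ { "name": "accuracy_score", "priority": 1 } ]
        },
        {
          "name": "latency",
          "type": "latency",
          "sub_types": [ { "name": "avg", "priority": 2 } ]
        }
      ]
    }
  }
}
```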
How to configure systems
Learn how to configure systems, such as local and remote compute, to act as a host (the machine that runs the optimization) and/or a target (the machine on which the model will run inference).
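For example, a local machine with a GPU can serve as both host and target. This is a sketch; the accelerator fields and the top-level `host`/`target` keys may vary by version.

```json
{
  "systems": {
    "local_system": {
      "type": "LocalSystem",
      "accelerators": [
        { "device": "gpu", "execution_providers": ["CUDAExecutionProvider"] }
      ]
    }
  },
  "host": "local_system",
  "target": "local_system"
}
```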
How to use Auto Optimizer
Learn how to use Auto Optimizer, a tool that automatically creates the best model for you, in a workflow.
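As a rough sketch, enabling Auto Optimizer in a workflow amounts to supplying an `auto_optimizer_config` instead of an explicit `passes` section; the key and its options are assumptions to verify against the reference.

```json
{
  "input_model": {
    "type": "HfModel",
    "model_path": "Qwen/Qwen2.5-0.5B-Instruct"
  },
  "auto_optimizer_config": { "precisions": ["fp16"] }
}
```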
ONNX Graph Surgeon
Learn how to use ONNX Graph Surgeon to manipulate ONNX graphs.
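In a workflow, graph surgeries are expressed as a pass with a list of surgeons. The sketch below assumes a `GraphSurgeries` pass and a `RenameInputs` surgeon; the names and options should be verified against the pass reference.

```json
{
  "passes": {
    "surgery": {
      "type": "GraphSurgeries",
      "surgeries": [
        {
          "surgeon": "RenameInputs",
          "old_names": ["input"],
          "new_names": ["input_ids"]
        }
      ]
    }
  }
}
```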
## Integrations
How to integrate with Azure AI
Learn how to use integrations with Azure AI, such as the model catalog, remote compute, and data/job artifacts.
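For example, an input model can point at a model catalog entry rather than a local path. This is a sketch; the resource type and fields are assumptions based on the AzureML integration and should be verified against the reference.

```json
{
  "input_model": {
    "type": "HfModel",
    "model_path": {
      "type": "azureml_registry_model",
      "registry_name": "azureml",
      "name": "Phi-3-mini-4k-instruct",
      "version": "1"
    }
  }
}
```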
How to integrate with Hugging Face
Learn how to use integrations with Hugging Face, such as models, data, and metrics.
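For gated Hugging Face models, authenticate first and then pass the Hub model id directly as the input model; the `olive auto-opt` flags are as sketched earlier and may vary by version.

```bash
# Authenticate so Olive can download gated models from the Hugging Face Hub.
huggingface-cli login
# Pass a Hub model id directly as the input model.
olive auto-opt \
    --model_name_or_path meta-llama/Llama-3.2-1B-Instruct \
    --output_path models/llama
```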