Olive
0.2.0

SNPE related – Qualcomm HW

The Snapdragon Neural Processing Engine (SNPE) is a Qualcomm Snapdragon software-accelerated runtime for executing deep neural networks.

Olive provides tools to convert models from frameworks such as ONNX and TensorFlow to an SNPE Deep Learning Container (DLC) file and to quantize them to 8-bit fixed point for running on the Hexagon DSP. Olive uses the development tools available in the Snapdragon Neural Processing Engine SDK, also known as the Qualcomm Neural Processing SDK for AI.

Prerequisites

Download and unzip SNPE SDK

Download the SNPE SDK zip following instructions from Qualcomm.

Unzip the file and set the environment variable SNPE_ROOT to the path of the unzipped directory.
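
Before running the later steps, it can help to confirm that SNPE_ROOT names a real directory. The following is a minimal sketch in Python that assumes only what the text states (the variable name SNPE_ROOT). The helper name `resolve_snpe_root` is hypothetical and is not part of Olive or the SNPE SDK.

```python
import os
from pathlib import Path


def resolve_snpe_root(env=None):
    """Return the directory named by SNPE_ROOT, or raise a clear error.

    Illustrative helper only; not an Olive or SNPE API.
    """
    env = os.environ if env is None else env
    root = env.get("SNPE_ROOT")
    if not root:
        raise EnvironmentError(
            "SNPE_ROOT is not set; point it at the unzipped SNPE SDK directory"
        )
    path = Path(root).expanduser()
    if not path.is_dir():
        raise FileNotFoundError(
            f"SNPE_ROOT does not name an existing directory: {path}"
        )
    return path
```

Calling `resolve_snpe_root()` with no argument reads the current process environment; passing a plain dictionary makes the check easy to exercise in isolation.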

Note

The SNPE SDK development environment is limited to Ubuntu, specifically version 18.04. It might not work as expected on Ubuntu 20.04. We recommend using an Ubuntu 18.04 Docker container if you don’t have a machine running that OS.

Install SDK system dependencies

source $SNPE_ROOT/bin/dependencies.sh

Configure Olive SNPE

python -m olive.snpe.configure

Model Conversion

SNPEConversion converts ONNX or TensorFlow models to SNPE DLC. The DLC file can be loaded into the SNPE runtime for inference using one of the Snapdragon accelerated compute cores.

Please refer to SNPEConversion for more details about the pass and its config parameters.

Example Configuration

{
    "type": "SNPEConversion",
    "config": {
        "input_names": ["input"],
        "input_shapes": [[1, 299, 299, 3]],
        "output_names": ["InceptionV3/Predictions/Reshape_1"],
        "output_shapes": [[1, 1001]]
    }
}
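
The same pass definition can be built and checked in Python before it is written to a config file. This is a sketch, not Olive's own validation: the helper `check_io_lists` is hypothetical, and it assumes that each entry in a names list pairs with the entry at the same position in the matching shapes list.

```python
import json

# The pass definition from the example above, written as a Python dict.
snpe_conversion = {
    "type": "SNPEConversion",
    "config": {
        "input_names": ["input"],
        "input_shapes": [[1, 299, 299, 3]],
        "output_names": ["InceptionV3/Predictions/Reshape_1"],
        "output_shapes": [[1, 1001]],
    },
}


def check_io_lists(config):
    """Hypothetical sanity check: names and shapes must pair up one-to-one."""
    for kind in ("input", "output"):
        names = config[f"{kind}_names"]
        shapes = config[f"{kind}_shapes"]
        if len(names) != len(shapes):
            raise ValueError(
                f"{len(names)} {kind} names but {len(shapes)} {kind} shapes"
            )
        for shape in shapes:
            if not all(isinstance(d, int) and d > 0 for d in shape):
                raise ValueError(
                    f"{kind} shape {shape} must contain positive integers"
                )


check_io_lists(snpe_conversion["config"])
# json.dumps always emits strict JSON, so the result parses in any JSON reader.
pass_json = json.dumps(snpe_conversion, indent=4)
```

Generating the JSON from a dict rather than editing it by hand avoids syntax slips that strict JSON parsers reject.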

Post Training Quantization (PTQ)

SNPEQuantization quantizes the DLC file. Quantized DLC files use fixed-point representations of network parameters, generally 8-bit weights and 8- or 32-bit biases. Please refer to the corresponding documentation for more details.

Please refer to SNPEQuantization for more details about the pass and its config parameters.
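
To make the idea of an 8-bit fixed-point representation concrete, here is a small pure-Python sketch of generic affine (scale and offset) quantization. It illustrates the general technique only; it is an assumption for explanation and does not claim to reproduce the exact scheme the SNPE tools apply. The function names are hypothetical.

```python
def quantize_8bit(values):
    """Map floats to integer codes in [0, 255] using a shared scale and offset."""
    lo = min(min(values), 0.0)  # keep 0.0 inside the representable range
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 when every value is 0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize_8bit(codes, scale, lo):
    """Recover approximate floats from the 8-bit codes."""
    return [c * scale + lo for c in codes]


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
codes, scale, lo = quantize_8bit(weights)
restored = dequantize_8bit(codes, scale, lo)
# Each restored value lies within half a quantization step of the original.
```

The rounding error per value is bounded by half of `scale`, which is why a representative calibration dataset matters: it determines the value range and therefore the step size.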

Example Configuration

{
    "type": "SNPEQuantization",
    "config": {
        "data_dir": "data_dir",
        "user_script": "user_script.py",
        "dataloader_func": "create_quant_dataloader",
        "enable_htp": true
    }
}

Check out this file for an example implementation of "user_script.py" and "create_quant_dataloader".
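
The two example configurations on this page describe passes that run one after the other: conversion produces the DLC file and quantization consumes it. The sketch below assembles both into a single fragment. It assumes that an Olive workflow config groups passes under a top-level "passes" object keyed by user-chosen names; the keys `snpe_conversion` and `snpe_quantization` are illustrative labels, and the rest of the workflow config (input model, systems, evaluators) is left out.

```python
import json

# Pass definitions copied from the two example configurations on this page.
passes = {
    # Key names are user-chosen labels (an assumption), not fixed identifiers.
    "snpe_conversion": {
        "type": "SNPEConversion",
        "config": {
            "input_names": ["input"],
            "input_shapes": [[1, 299, 299, 3]],
            "output_names": ["InceptionV3/Predictions/Reshape_1"],
            "output_shapes": [[1, 1001]],
        },
    },
    "snpe_quantization": {
        "type": "SNPEQuantization",
        "config": {
            "data_dir": "data_dir",
            "user_script": "user_script.py",
            "dataloader_func": "create_quant_dataloader",
            "enable_htp": True,  # serialized as JSON true
        },
    },
}

# Python dicts keep insertion order, so conversion is listed before quantization.
workflow_fragment = json.dumps({"passes": passes}, indent=4)
```

Note that Python's `True` becomes JSON `true` on serialization, matching the quantization example.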

© Copyright 2023, olivedevteam@microsoft.com.
