Command Line Tools

Olive provides command line tools that are invoked through the olive command. They cover tasks such as running an Olive workflow, configuring the Qualcomm SDK, managing AzureML compute, and exporting adapter weights.

If olive is not in your PATH, you can run the command line tools by replacing olive with python -m olive.
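If you launch Olive from a script, the fallback described above can be automated. The following is a minimal sketch, not part of Olive: the helper name olive_command is hypothetical, and using the current interpreter (sys.executable) in place of a bare python is an assumption.

```python
import shutil
import sys


def olive_command(which=shutil.which):
    """Return the argv prefix used to invoke the Olive CLI.

    `which` is injectable so the PATH lookup can be exercised
    without Olive being installed.
    """
    if which("olive"):
        # The console script is on PATH: call it directly.
        return ["olive"]
    # Otherwise run the package as a module with the current interpreter.
    return [sys.executable, "-m", "olive"]
```

Append a sub-command and its arguments to the returned list, for example olive_command() + ["run", "--help"], and pass the result to subprocess.run.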

usage: olive

Sub-commands

run

Run an Olive workflow

olive run [-h] [--package-config PACKAGE_CONFIG] --run-config RUN_CONFIG
          [--setup] [--data-root DATA_ROOT] [--tempdir TEMPDIR]

Named Arguments

--package-config

For advanced users. Path to an optional package config file (JSON) that gives the location of individual pass module implementations and their dependencies. The configuration may also include user-owned, proprietary, or private pass implementations.

--run-config, --config

Path to the JSON config file

--setup

Whether to run environment setup

Default: False

--data-root, --data_root

The data root path for optimization

--tempdir

Root directory for tempfile directories and files
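When olive run is driven from a Python script, it can help to assemble the argument list in one place. The sketch below uses only the flags listed above; the function name build_run_command and the file paths are hypothetical and shown for illustration.

```python
import shlex


def build_run_command(run_config, package_config=None, setup=False,
                      data_root=None, tempdir=None):
    """Assemble the argv list for `olive run` from the documented options."""
    # --run-config is the only required argument.
    argv = ["olive", "run", "--run-config", run_config]
    if package_config:
        argv += ["--package-config", package_config]
    if setup:
        # --setup is a switch and takes no value.
        argv.append("--setup")
    if data_root:
        argv += ["--data-root", data_root]
    if tempdir:
        argv += ["--tempdir", tempdir]
    return argv


# Hypothetical paths, for illustration only.
command = build_run_command("my_workflow.json", data_root="./data")
print(shlex.join(command))
# prints: olive run --run-config my_workflow.json --data-root ./data
```

Passing the list to subprocess.run(command, check=True) avoids shell quoting issues with paths that contain spaces.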

configure-qualcomm-sdk

Configure Qualcomm SDK for Olive

olive configure-qualcomm-sdk [-h] --py_version {3.6,3.8} --sdk {snpe,qnn}

Named Arguments

--py_version

Possible choices: 3.6, 3.8

Python version: use 3.6 for TensorFlow 1.15 and 3.8 otherwise

--sdk

Possible choices: snpe, qnn

Qualcomm SDK: snpe or qnn
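The rule for choosing --py_version can be captured in a small helper. This is a sketch under the assumption that you know which TensorFlow version, if any, your models target; the function name qualcomm_sdk_command and its tensorflow_version parameter are hypothetical.

```python
def qualcomm_sdk_command(sdk, tensorflow_version=None):
    """Assemble argv for `olive configure-qualcomm-sdk`.

    tensorflow_version is a string such as "1.15.5", or None when
    TensorFlow is not involved.
    """
    if sdk not in ("snpe", "qnn"):
        raise ValueError("sdk must be 'snpe' or 'qnn'")
    # Per the help text: 3.6 for TensorFlow 1.15, 3.8 for everything else.
    uses_tf_1_15 = (
        tensorflow_version is not None
        and tensorflow_version.split(".")[:2] == ["1", "15"]
    )
    py_version = "3.6" if uses_tf_1_15 else "3.8"
    return ["olive", "configure-qualcomm-sdk",
            "--py_version", py_version, "--sdk", sdk]
```

For example, qualcomm_sdk_command("snpe", "1.15.5") selects 3.6, while qualcomm_sdk_command("qnn") selects 3.8.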

manage-aml-compute

Create or delete compute in your AzureML workspace

olive manage-aml-compute [-h] (--create | --delete)
                         [--subscription_id SUBSCRIPTION_ID]
                         [--resource_group RESOURCE_GROUP]
                         [--workspace_name WORKSPACE_NAME]
                         [--aml_config_path AML_CONFIG_PATH] --compute_name
                         COMPUTE_NAME [--vm_size VM_SIZE]
                         [--location LOCATION] [--min_nodes MIN_NODES]
                         [--max_nodes MAX_NODES]
                         [--idle_time_before_scale_down IDLE_TIME_BEFORE_SCALE_DOWN]

Named Arguments

--create, -c

Create new compute

Default: False

--delete, -d

Delete existing compute

Default: False

--subscription_id

Azure subscription ID

--resource_group

Name of the Azure resource group

--workspace_name

Name of the AzureML workspace

--aml_config_path

Path to AzureML config file. If provided, subscription_id, resource_group and workspace_name are ignored

--compute_name

Name of the compute to create or delete

--vm_size

VM size of the new compute. This is required if you are creating a compute instance

--location

Location of the new compute. This is required if you are creating a compute instance

--min_nodes

Minimum number of nodes

Default: 0

--max_nodes

Maximum number of nodes

Default: 2

--idle_time_before_scale_down

Idle time in seconds before scale-down

Default: 120
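The constraints in this section (exactly one of --create or --delete, --vm_size and --location needed when creating, and --aml_config_path taking precedence over the three workspace identifiers) can be checked before the command is launched. The sketch below is one way to do that from Python; the function name and the example values are hypothetical, and the checks mirror the help text rather than Olive's own validation.

```python
def build_manage_aml_compute_command(
    compute_name, create=False, delete=False, vm_size=None, location=None,
    aml_config_path=None, subscription_id=None, resource_group=None,
    workspace_name=None, min_nodes=None, max_nodes=None,
    idle_time_before_scale_down=None,
):
    """Assemble argv for `olive manage-aml-compute` with client-side checks."""
    if create == delete:
        raise ValueError("pass exactly one of create=True or delete=True")
    if create and not (vm_size and location):
        raise ValueError("vm_size and location are required when creating")

    argv = ["olive", "manage-aml-compute",
            "--create" if create else "--delete"]
    if aml_config_path:
        # The config file wins, so the three identifiers are not forwarded.
        argv += ["--aml_config_path", aml_config_path]
    else:
        for flag, value in (("--subscription_id", subscription_id),
                            ("--resource_group", resource_group),
                            ("--workspace_name", workspace_name)):
            if value:
                argv += [flag, value]
    argv += ["--compute_name", compute_name]
    if create:
        argv += ["--vm_size", vm_size, "--location", location]
        # Omitted values fall back to the CLI defaults (0, 2, and 120).
        for flag, value in (
            ("--min_nodes", min_nodes),
            ("--max_nodes", max_nodes),
            ("--idle_time_before_scale_down", idle_time_before_scale_down),
        ):
            if value is not None:
                argv += [flag, str(value)]
    return argv


# Hypothetical compute name, VM size, and region.
create_cmd = build_manage_aml_compute_command(
    "cpu-cluster", create=True, vm_size="Standard_DS3_v2",
    location="eastus", aml_config_path="aml_config.json", max_nodes=4,
)
```

Building the list up front surfaces a missing --vm_size or a conflicting --create/--delete pair before any call reaches Azure.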

export-adapters

Export LoRA adapter weights to a .npz file that will be consumed by ONNX models generated by the Olive ExtractedAdapters pass.

olive export-adapters [-h] [--adapter_path ADAPTER_PATH]
                      [--output_path OUTPUT_PATH] [--dtype {float32,float16}]
                      [--pack_weights] [--quantize_int4]
                      [--int4_block_size {16,32,64,128,256}]
                      [--int4_quantization_mode {symmetric,asymmetric}]

Named Arguments

--adapter_path

Path to the adapter weights saved after PEFT fine-tuning. Can be a local folder or a Hugging Face ID.

--output_path

Path to save the exported weights. Will be saved as a .npz file.

--dtype

Possible choices: float32, float16

Data type to save float weights as. If --quantize_int4 is set, this is the data type of the quantization scales.

Default: “float32”

--pack_weights

Whether to pack the weights. If set, the weights for each module type are packed into a single array.

Default: False

--quantize_int4

Quantize the weights to int4 using blockwise quantization.

Default: False

int4 quantization options

--int4_block_size

Possible choices: 16, 32, 64, 128, 256

Block size for int4 quantization.

Default: 32

--int4_quantization_mode

Possible choices: symmetric, asymmetric

Quantization mode for int4 quantization.

Default: “symmetric”
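To make the int4 options concrete, the following pure-Python sketch shows what blockwise quantization with a symmetric or asymmetric mode generally means. It is a generic illustration, not Olive's implementation: the function names, the choice of mapping the largest magnitude to 7, the zero-point convention, and the omission of packing two int4 codes per byte are all assumptions.

```python
def quantize_block(block, mode="symmetric"):
    """Quantize one block of floats; return (codes, scale, zero_point)."""
    if mode == "symmetric":
        # Signed int4 covers [-8, 7]; map the largest magnitude to 7.
        max_abs = max(abs(x) for x in block)
        scale = max_abs / 7 if max_abs else 1.0
        codes = [max(-8, min(7, round(x / scale))) for x in block]
        return codes, scale, 0
    if mode == "asymmetric":
        # Unsigned int4 covers [0, 15]; keep 0.0 inside the range.
        lo, hi = min(min(block), 0.0), max(max(block), 0.0)
        scale = (hi - lo) / 15 if hi > lo else 1.0
        zero_point = max(0, min(15, round(-lo / scale)))
        codes = [max(0, min(15, round(x / scale) + zero_point))
                 for x in block]
        return codes, scale, zero_point
    raise ValueError("mode must be 'symmetric' or 'asymmetric'")


def quantize_blockwise(weights, block_size=32, mode="symmetric"):
    """Split a flat list into blocks, each with its own scale."""
    return [quantize_block(weights[start:start + block_size], mode)
            for start in range(0, len(weights), block_size)]


def dequantize_blockwise(blocks):
    """Reconstruct approximate floats from (codes, scale, zero_point)."""
    restored = []
    for codes, scale, zero_point in blocks:
        restored.extend((code - zero_point) * scale for code in codes)
    return restored
```

A smaller block size gives each scale fewer values to cover, which tends to reduce rounding error at the cost of storing more scales. Those scales are the values that --dtype controls when --quantize_int4 is set.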