Command Line Tools

Olive provides command line tools that are invoked through the olive command. Use them to run an Olive workflow, configure the Qualcomm SDK, manage AzureML compute, export adapter weights, and more.

If olive is not in your PATH, you can run the command line tools by replacing olive with python -m olive.
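For example, the following two invocations are equivalent (run_config.json is a placeholder path):

```shell
# Run a workflow with olive on PATH
olive run --config run_config.json

# Equivalent invocation when olive is not on PATH
python -m olive run --config run_config.json
```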

usage: olive [-h] {run,configure-qualcomm-sdk,manage-aml-compute,export-adapters} ...



Run an olive workflow

olive run [-h] [--package-config PACKAGE_CONFIG] --run-config RUN_CONFIG
          [--setup] [--data-root DATA_ROOT] [--tempdir TEMPDIR]

Named Arguments


--package-config

For advanced users. Path to an optional package (JSON) config file with the location of individual pass module implementations and their corresponding dependencies. The configuration may also include user-owned/proprietary/private pass implementations.

--run-config, --config

Path to the JSON config file for the workflow run


--setup

Whether to run environment setup

Default: False

--data-root, --data_root

The data root path for optimization


--tempdir

Root directory for tempfile directories and files
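A minimal run config passed via --run-config might look like the sketch below. The exact schema depends on your Olive version, and the model path and pass name here are illustrative placeholders; consult the workflow configuration documentation for the fields your version expects.

```json
{
  "input_model": {
    "type": "PyTorchModel",
    "config": { "model_path": "model.pt" }
  },
  "passes": {
    "conversion": { "type": "OnnxConversion" }
  },
  "engine": {
    "output_dir": "outputs"
  }
}
```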


Configure Qualcomm SDK for Olive

olive configure-qualcomm-sdk [-h] --py_version {3.6,3.8} --sdk {snpe,qnn}

Named Arguments


--py_version

Possible choices: 3.6, 3.8

Python version: use 3.6 for TensorFlow 1.15 and 3.8 otherwise


--sdk

Possible choices: snpe, qnn

Qualcomm SDK: snpe or qnn
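For example, to configure the QNN SDK against Python 3.8 (using only the flags documented above):

```shell
olive configure-qualcomm-sdk --py_version 3.8 --sdk qnn
```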


Create or delete compute in your AzureML workspace

olive manage-aml-compute [-h] (--create | --delete)
                         [--subscription_id SUBSCRIPTION_ID]
                         [--resource_group RESOURCE_GROUP]
                         [--workspace_name WORKSPACE_NAME]
                         [--aml_config_path AML_CONFIG_PATH] --compute_name
                         COMPUTE_NAME [--vm_size VM_SIZE]
                         [--location LOCATION] [--min_nodes MIN_NODES]
                         [--max_nodes MAX_NODES]
                         [--idle_time_before_scale_down IDLE_TIME_BEFORE_SCALE_DOWN]

Named Arguments

--create, -c

Create new compute

Default: False

--delete, -d

Delete existing compute

Default: False


--subscription_id

Azure subscription ID


--resource_group

Name of the Azure resource group


--workspace_name

Name of the AzureML workspace


--aml_config_path

Path to AzureML config file. If provided, subscription_id, resource_group and workspace_name are ignored


--compute_name

Name of the new compute


--vm_size

VM size of the new compute. This is required if you are creating a compute instance


--location

Location of the new compute. This is required if you are creating a compute instance


--min_nodes

Minimum number of nodes

Default: 0


--max_nodes

Maximum number of nodes

Default: 2


--idle_time_before_scale_down

Idle seconds before scale down

Default: 120
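Putting the arguments above together, a create call might look like this (the subscription, resource group, workspace, compute name, VM size, and location values are placeholders; substitute your own):

```shell
olive manage-aml-compute --create \
    --subscription_id <subscription-id> \
    --resource_group <resource-group> \
    --workspace_name <workspace-name> \
    --compute_name cpu-cluster \
    --vm_size Standard_DS3_v2 \
    --location eastus \
    --min_nodes 0 \
    --max_nodes 2 \
    --idle_time_before_scale_down 120
```

To remove the same compute later, pass --delete instead of --create along with --compute_name and the workspace identifiers.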


Export LoRA adapter weights to a .npz file that can be consumed by ONNX models generated by Olive's ExtractedAdapters pass.

olive export-adapters [-h] [--adapter_path ADAPTER_PATH]
                      [--output_path OUTPUT_PATH] [--dtype {float32,float16}]
                      [--pack_weights] [--quantize_int4]
                      [--int4_block_size {16,32,64,128,256}]
                      [--int4_quantization_mode {symmetric,asymmetric}]

Named Arguments


--adapter_path

Path to the adapter weights saved after PEFT fine-tuning. Can be a local folder or a Hugging Face ID.


--output_path

Path to save the exported weights. Will be saved as a .npz file.


--dtype

Possible choices: float32, float16

Data type to save float weights as. If quantize_int4 is True, this is the data type of the quantization scales. Default is float32.

Default: "float32"


--pack_weights

Whether to pack the weights. If True, the weights for each module type will be packed into a single array.

Default: False


--quantize_int4

Quantize the weights to int4 using blockwise quantization.

Default: False

int4 quantization options


--int4_block_size

Possible choices: 16, 32, 64, 128, 256

Block size for int4 quantization. Default is 32.

Default: 32


--int4_quantization_mode

Possible choices: symmetric, asymmetric

Quantization mode for int4 quantization. Default is symmetric.

Default: "symmetric"
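To illustrate what the block size and quantization mode control, here is a minimal pure-Python sketch of blockwise symmetric int4 quantization. This is an illustrative toy, not Olive's implementation; function names and the [-8, 7] clamping convention are assumptions for the sketch.

```python
def quantize_int4_symmetric(weights, block_size=32):
    """Blockwise symmetric int4 quantization of a flat list of floats.

    Each block of `block_size` values shares one scale. In symmetric
    mode the zero-point is fixed at 0, so the scale maps the largest
    magnitude in the block onto the signed int4 range [-8, 7].
    (Asymmetric mode would additionally store a per-block zero-point.)
    """
    scales, qvalues = [], []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        amax = max(abs(v) for v in block)
        scale = amax / 7.0 if amax else 1.0
        scales.append(scale)
        # Round to the nearest int4 step and clamp to the valid range.
        qvalues.extend(max(-8, min(7, round(v / scale))) for v in block)
    return qvalues, scales

def dequantize(qvalues, scales, block_size=32):
    """Recover approximate float weights from int4 values and scales."""
    return [q * scales[i // block_size] for i, q in enumerate(qvalues)]
```

A larger block size stores fewer scales (smaller output file) but forces more values to share one scale, which generally increases quantization error.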