OrtSession

The OrtSession operator extends the official ONNX Runtime with additional functionality, such as dynamic batching, to boost throughput.
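To make the idea of batching concrete, here is a minimal sketch of the general technique: several pending requests are grouped so that one model call serves all of them. It does not show how pyis implements dynamic batching internally, and real dynamic batching typically gathers requests that arrive concurrently rather than from a ready-made list. The names `run_in_batches` and `toy_model` are hypothetical and exist only for this illustration.

```python
from typing import Callable, List, Sequence


def run_in_batches(
    requests: Sequence[List[int]],
    batch_size: int,
    model_fn: Callable[[List[List[int]]], List[int]],
) -> List[int]:
    """Group pending requests into batches and call the model once per batch."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    results: List[int] = []
    for start in range(0, len(requests), batch_size):
        batch = list(requests[start:start + batch_size])
        # One model call serves every request in this batch.
        results.extend(model_fn(batch))
    return results


# Hypothetical stand-in for a model: returns the token count of each request.
calls: List[int] = []


def toy_model(batch: List[List[int]]) -> List[int]:
    calls.append(len(batch))  # record how many requests each call served
    return [len(tokens) for tokens in batch]


pending = [[3, 6, 8], [1], [4, 2], [9, 9, 9, 9], [5]]
outputs = run_in_batches(pending, batch_size=2, model_fn=toy_model)
```

With five pending requests and a batch size of 2, the stand-in model is called three times instead of five, which is where the throughput gain comes from.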

APIs

class pyis.python.ops.OrtSession(self: ops.OrtSession, model_path: str, input_names: List[str], output_names: List[str], inter_op_thread_num: int = 1, intra_op_thread_num: int = 0, dynamic_batching: bool = False, batch_size: int = 1, ort_dll_file: str = '') → None

An augmented ONNX Runtime (ORT) session.

Create an ORT (ONNX Runtime) session.

Parameters
  • model_path (str) – path to the ONNX model

  • input_names (List[str]) – input names of the ONNX model

  • output_names (List[str]) – output names of the ONNX model

  • inter_op_thread_num (int) – number of inter-op threads, defaults to 1

  • intra_op_thread_num (int) – number of intra-op threads, defaults to 0, which uses all cores

  • dynamic_batching (bool) – whether to use dynamic batching, defaults to False

  • batch_size (int) – dynamic batch size, defaults to 1

  • ort_dll_file (str) – optional, path to a custom ORT DLL file
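The parameters above can be combined to enable dynamic batching. The following is a minimal sketch, assuming the constructor accepts the keyword arguments shown in the signature. The helper name `create_batched_session` and its default `batch_size` of 8 are illustrative choices, not part of pyis. The `ops_module` argument is expected to be `pyis.python.ops`; it is passed in explicitly so the helper can be exercised without pyis installed.

```python
from typing import Any, List


def create_batched_session(
    ops_module: Any,
    model_path: str,
    input_names: List[str],
    output_names: List[str],
    batch_size: int = 8,  # illustrative value, not a pyis recommendation
) -> Any:
    """Create an OrtSession with dynamic batching enabled.

    `ops_module` is expected to be `pyis.python.ops`.
    """
    return ops_module.OrtSession(
        model_path,
        input_names,
        output_names,
        inter_op_thread_num=1,
        intra_op_thread_num=0,  # 0 uses all cores, per the parameter list
        dynamic_batching=True,
        batch_size=batch_size,
    )
```

With pyis installed, a call might look like `create_batched_session(ops, 'example.onnx', ['query_token_ids'], ['class_label'])` after `from pyis.python import ops`.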

run(self: ops.OrtSession, arg0: List[numpy.ndarray]) → List[numpy.ndarray]

Run the ORT session with input tensors given as a list of NumPy arrays.

Parameters

arg0 (List[numpy.ndarray]) – input tensors as a list of NumPy arrays.

Returns

Output tensors as a list of NumPy arrays.
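Because run returns a plain list, it can be convenient to pair each output with its name. The sketch below assumes that the outputs come back in the same order as the `output_names` passed to the constructor; this page does not state that ordering explicitly, so treat it as an assumption. The helper `run_named` is hypothetical and works with any object that exposes a compatible `run` method.

```python
from typing import Any, Dict, List, Sequence


def run_named(
    session: Any,
    output_names: Sequence[str],
    inputs: List[Any],
) -> Dict[str, Any]:
    """Run the session and return its outputs keyed by output name.

    Assumes outputs are ordered like `output_names` (not confirmed here).
    """
    outputs = session.run(inputs)
    if len(outputs) != len(output_names):
        raise ValueError(
            f"expected {len(output_names)} outputs, got {len(outputs)}"
        )
    return dict(zip(output_names, outputs))
```

The length check catches a mismatch between the declared output names and what the session actually returns before the results are used.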

Example

# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT license.

import numpy as np
from pyis.python import ops
from typing import List

model_path: str = 'example.onnx'
input_names: List[str] = ['query_token_ids']
output_names: List[str] = ['class_label']

inputs: List[np.ndarray] = [np.array([3, 6, 8], dtype=np.int64)]

# create ort session object
ort_session: ops.OrtSession = ops.OrtSession(
    model_path,
    input_names,
    output_names,
    inter_op_thread_num=1,
    intra_op_thread_num=0,
    dynamic_batching=False
)

# run ort inference and get outputs
outputs: List[np.ndarray] = ort_session.run(inputs)