QuickStart#

In this notebook, we run Archai’s Quickstart example on Azure Machine Learning.

Prerequisites#

  • Python 3.7 or later

  • An Azure subscription

  • An Azure Resource Group

  • An Azure Machine Learning Workspace

This notebook also assumes you have a Python environment set up by running pip install -e .[aml] in your Archai repository root.

[2]:
from pathlib import Path

from IPython.display import display, Image
from IPython.core.display import HTML

from azure.ai.ml import Output, command

import archai.common.azureml_helper as aml_helper
import archai.common.notebook_helper as nb_helper

Get a handle to the workspace#

We load the workspace from a workspace configuration file.

[ ]:
ml_client = aml_helper.get_aml_client_from_file("../.azureml/config.json")
print(f'Using workspace: {ml_client.workspace_name} in resource group: {ml_client.resource_group_name}')
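
The helper above reads the workspace details from a JSON file. As a rough sketch, assuming the file follows the standard Azure ML workspace config.json layout (subscription_id, resource_group, workspace_name), the function below checks that the file is complete before it is handed to the helper. load_workspace_config is a hypothetical name used for illustration and is not part of Archai.

```python
import json
from pathlib import Path

# Keys found in a standard Azure ML workspace config.json (assumed layout).
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def load_workspace_config(path):
    """Read a workspace config file and fail early if a key is missing."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"{path} is missing keys: {', '.join(missing)}")
    return config
```

Failing early here gives a clearer error than a rejected authentication or lookup call later on.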

Create a compute cluster#

We provision a Linux compute cluster for this notebook. See the full list of VM sizes and prices.

[3]:
cpu_compute_name = "nas-cpu-cluster-D14-v2"
compute_cluster = aml_helper.create_compute_cluster(ml_client, cpu_compute_name)
You already have a cluster named nas-cpu-cluster-D14-v2, we'll reuse it as is.

Create an environment based on a YAML file#

Azure Machine Learning maintains a set of CPU and GPU Ubuntu Linux-based base images with common system dependencies. For the set of base images and their corresponding Dockerfiles, see the AzureML Containers repo.

[4]:
archai_job_env = aml_helper.create_environment_from_file(ml_client,
                                                         image="mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04:latest",
                                                         conda_file="conda.yaml",
                                                         version="0.0.1")
Environment with name aml-archai is registered to workspace, the environment version is 0.0.1

Create job#

[5]:
job = command(experiment_name="archai_quickstart",
              display_name="Archai's QuickStart",
              compute=cpu_compute_name,
              environment=f"{archai_job_env.name}:{archai_job_env.version}",
              code="main.py",
              outputs=dict(
                  output_path=Output(type="uri_folder", mode="rw_mount")
              ),
              command="python main.py --output_dir ${{outputs.output_path}}"
              )
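
The job's command passes the mounted output folder to main.py through --output_dir. The contents of main.py are not shown in this notebook, so the following is only a sketch of how such a script could read that argument and write into the folder. parse_args, write_marker, and the done.txt file name are hypothetical.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the single --output_dir argument used by the job command."""
    parser = argparse.ArgumentParser(description="Sketch of a job entry point.")
    parser.add_argument("--output_dir", type=Path, required=True,
                        help="Folder that Azure ML mounts for the job output.")
    return parser.parse_args(argv)


def write_marker(output_dir):
    """Create the output folder if needed and write a small marker file."""
    output_dir.mkdir(parents=True, exist_ok=True)
    marker = output_dir / "done.txt"
    marker.write_text("finished\n", encoding="utf-8")
    return marker
```

Anything the script writes under output_dir ends up in the job output named output_path.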

Run job#

[6]:
quickstart_job = ml_client.create_or_update(job)
Uploading main.py (< 1 MB): 100%|##########| 1.74k/1.74k [00:00<00:00, 6.91kB/s]


Open the job overview in Azure ML Studio in your web browser (this works when you are running this notebook in VS Code).

[7]:
import webbrowser
webbrowser.open(quickstart_job.services["Studio"].endpoint)

job_name = quickstart_job.name
print(f'Started job: {job_name}')
Started job: busy_shampoo_cqjgwy28gc
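
The job runs asynchronously, so its output only becomes available once it finishes. A minimal polling sketch follows. It assumes the status can be read with something like ml_client.jobs.get(job_name).status and that "Completed", "Failed", and "Canceled" are the terminal states; check both against your SDK version. wait_for_job is a hypothetical helper that accepts any zero-argument callable returning the status, so it does not depend on the SDK itself.

```python
import time

# Assumed terminal states; verify against the SDK version you use.
TERMINAL_STATES = {"Completed", "Failed", "Canceled"}


def wait_for_job(get_status, poll_seconds=30.0, timeout_seconds=3600.0):
    """Poll get_status() until it returns a terminal state or time runs out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job still in state {status!r} after timeout")
        time.sleep(poll_seconds)
```

In this notebook it could be called as wait_for_job(lambda: ml_client.jobs.get(job_name).status) before downloading the output.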

Download job’s output#

[ ]:
output_name = "output_path"
download_path = "output"

aml_helper.download_job_output(ml_client, job_name=quickstart_job.name, output_name=output_name, download_path=download_path)

downloaded_folder = Path(download_path) / "named-outputs" / output_name
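
The plotting cells that follow expect specific PNG files inside the downloaded folder. As a small sketch, the hypothetical helper missing_outputs lists any expected files that are absent, which can make a failed or partial download easier to diagnose. The file names are the ones this notebook displays.

```python
from pathlib import Path

# Plot files that the notebook displays after the download.
EXPECTED_FILES = (
    "pareto_non_embedding_params_vs_onnx_latency.png",
    "pareto_non_embedding_params_vs_onnx_memory.png",
    "pareto_onnx_latency_vs_onnx_memory.png",
)


def missing_outputs(folder, expected=EXPECTED_FILES):
    """Return the expected file names that do not exist in folder."""
    folder = Path(folder)
    return [name for name in expected if not (folder / name).is_file()]
```

Calling missing_outputs(downloaded_folder) returns an empty list when every plot is present.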

Show Pareto Frontiers#

[9]:
param_vs_latency_img = Image(filename=downloaded_folder / "pareto_non_embedding_params_vs_onnx_latency.png")
display(param_vs_latency_img)
[Figure: Pareto frontier, non-embedding parameters vs. ONNX latency]
[10]:
param_vs_memory_img = Image(filename=downloaded_folder / "pareto_non_embedding_params_vs_onnx_memory.png")
display(param_vs_memory_img)
[Figure: Pareto frontier, non-embedding parameters vs. ONNX memory]
[11]:
latency_vs_memory_img = Image(filename=downloaded_folder / "pareto_onnx_latency_vs_onnx_memory.png")
display(latency_vs_memory_img)
[Figure: Pareto frontier, ONNX latency vs. ONNX memory]

Show search state of the last iteration#

[11]:
df = nb_helper.get_search_csv(downloaded_folder)
df = df[['archid', 'non_embedding_params', 'onnx_latency', 'onnx_memory', 'is_pareto']]
df[(df['onnx_latency'] < 0.1) & (df['is_pareto'] == True)]
[11]:
    archid                                         non_embedding_params  onnx_latency  onnx_memory  is_pareto
1   gpt2_2ce523ffef6587e0a9790c173a89d6fd25f1b9b6             7769472.0      0.068997    54.867023       True
5   gpt2_e24cd64a53d6a9be4a5cf2116daee38e0763947c             9791616.0      0.087752    57.236445       True
7   gpt2_ad522de7b54d7ad73231ccab52e7f8b17c444dd8             3154688.0      0.027303    45.336156       True
8   gpt2_3b76e4046ead432cab16d17d0a3d500efc0249c5             1963840.0      0.025981    13.966668       True
9   gpt2_56b3029ce65d7c75585e97c67a3162b6b34fd453             9462656.0      0.077058    61.318761       True
12  gpt2_af14be0b4e8f8a29e744cd81c503419200e05c3a             2564480.0      0.024791    43.084689       True
13  gpt2_08a67612ee38cccd05c47a45e137a8e82921c3c0             5356160.0      0.055600    34.976172       True
15  gpt2_1638d1cbba003004298c9f1fec1885e1c3e724ac             1841408.0      0.015650    29.566623       True
17  gpt2_a9736f6b29e56c1026b452ede20adc7b9f488ada            10346304.0      0.077892    72.751252       True
19  gpt2_35dc3c85cfeb893d23399a0d0c087ae53f718323             1782528.0      0.018778    15.950404       True
23  gpt2_49d99f06ad70afece175610b82e4b9f69a367d07             1185408.0      0.015044    27.064179       True
29  gpt2_6dcd8caf6477c3554cefa5a9629f2778e31582e8             6409472.0      0.054197    52.364760       True
31  gpt2_adcecc4a73970b5ceaf61c5a20c6e2fc914bebbf             1683328.0      0.018393    12.884486       True
33  gpt2_ea6fcf50e01df27f5065286f64d3d4fd086ad1f6             6908352.0      0.062050    40.857306       True
34  gpt2_3df377a7c2f287c0396798d69587be4ce80eb065             9476096.0      0.082323    58.701959       True
35  gpt2_5825b8567da483715267609a45897741217c7358             9194496.0      0.069051    73.742070       True
38  gpt2_1b55c333a06a31b8c3f5dee0be477bff60961b27             7109376.0      0.062521    44.316129       True
42  gpt2_6f1494e13898235ea0abfee60f651e972ad7c15f             8138432.0      0.071064    56.273517       True
44  gpt2_05bb7d11f19a7864283f4a4c48d205a71bbc0cca             5295232.0      0.048272    61.571509       True
45  gpt2_88d96623f6fba1aef736f1eebfde662ebd2dc91c            10352192.0      0.086744    64.712472       True
52  gpt2_cc745fafec97887cfbd9583cd89c276805121d14             8575680.0      0.069086    65.999776       True
53  gpt2_54a4da9b9011aa8b7cabd16c81d7ec60e1fa4bec             6503936.0      0.052899    55.414674       True
62  gpt2_1c4ef81c171f066ea6e51256051fb83af5ce9ee3            12931200.0      0.098794    90.671401       True
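
The is_pareto column marks architectures that no other architecture dominates. A point is Pareto-optimal when no other point is at least as good on every objective and strictly better on at least one. To illustrate the idea only (this is not Archai's implementation), the sketch below flags Pareto-optimal points under the assumption that every objective is minimized; an objective that should be maximized can be negated first. Which direction each column uses in this search is not stated here. dominates and pareto_mask are hypothetical names, and the sample points are invented.

```python
def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_mask(points):
    """Return one boolean per point: True if no other point dominates it."""
    return [
        not any(dominates(other, point) for other in points)
        for point in points
    ]


# Toy (latency, memory) pairs, invented for illustration.
toy_points = [(0.02, 40.0), (0.03, 20.0), (0.03, 45.0), (0.05, 10.0)]
print(pareto_mask(toy_points))  # [True, True, False, True]
```

The third point is dropped because the first one is both faster and smaller. The pairwise comparison is quadratic in the number of points, which is fine for a table of this size.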