AI & ML Academy - Azure ML Platform
Welcome to the AI & ML Academy (AIA) - ML Platform!
Overview
Azure Machine Learning (Azure ML, or AML) Services is a platform, not just a notebook for running ML models. Azure ML Service gives enterprise users the ability to train, test, and deploy their models across a range of environments to support their machine learning applications. The goal is to run machine learning models as applications in an evergreen production environment, not solely as one-time experiments.
The purpose of this module is to outline the different run-time environments. We’ve created a decision matrix and plotted on it the Azure services applicable to Artificial Intelligence and Machine Learning. The matrix has two dimensions (scenarios): Compute Type and ML Lifecycle. Its cells define the options available across the Azure platform for deploying and running machine learning models. The module also defines where each ML service resides and, more importantly, the use cases, tools, personas, frameworks, and languages that support it.
The first scenario of Azure ML is the Compute Type. Azure ML is a cross-platform environment that spans Windows, Linux, and Azure. These run-time environments fall into two categories: Hybrid or Cloud. A Hybrid environment is a physical or virtual environment that runs on a workstation or on a set of servers in a data center or with a cloud-hosting provider. For this discussion, the Cloud is Azure running as a PaaS or SaaS service, although the concepts are not limited to Azure.
The second scenario of Azure ML is the ML Lifecycle. Resource requirements differ depending on where the model is in the lifecycle. The two stages of the ML lifecycle are Training/Testing and Inference (we’ve simplified the lifecycle to reduce complexity for this discussion).
Here is the decision matrix outlining the scenarios and the resulting run-time environments. The matrix has four cells, and the environments are named Developer, Hosted, Unmanaged, and Managed. The main purpose of this matrix is to serve as a learning tool and table of contents for this learning resource.
ML Lifecycle | Compute Type: Hybrid | Compute Type: Cloud |
Training/Testing | Developer Environment | Hosted Environment |
Inference | Unmanaged Environment | Managed Environment |
Here are the definitions of the environments in this matrix. (For simplicity, we’ve defined four. We realize there are other combinations, such as Multi-cloud, but we are focusing on the most prevalent environments.)
- Developer Environment – This is an environment for a Pro-Developer Data Scientist to train and test their models. These models can be run on your local machine, on physical or virtual servers, or in an IaaS environment in the cloud. The Data Scientist builds out the environment from scratch (OS, Language, Framework, and IDE). This provides versatility but requires a high degree of maintenance to reproduce your model. The mindset of “it runs on my computer” won’t be acceptable, so an audit trail will be required at release time.
- Unmanaged Environment – This environment is an extension of the Developer Environment, migrated to a production environment that is fully managed by the end user. This gives the end user all the versatility of the open-source ecosystem and supports a best-of-breed approach. A typical end user would be an ML Engineer who has a model and a set of artifacts ready to deploy to production.
- Hosted Environment – This environment uses a common runtime environment in terms of OS, Languages, Frameworks, and IDEs. The hosted environment is set up and supported by Azure. The most common configuration is provided to ensure versatility while reducing setup time for users. A typical user for this environment is a Data Scientist who may experiment in model development but needs to hand off their code to experts to deploy into production. A great example of this is Azure ML Notebooks.
- Managed Environment – This environment is similar to a PaaS or SaaS Azure service, where the developer isn’t concerned with the setup of the environment or the availability of the service. Their main concern is whether the environment can scale to handle the scoring traffic. Managed environments begin the transition from Data Scientist tooling to low-code options that streamline adoption by Citizen Developers. Managed environments are not typically used by Pro-Developers, since they prefer to build their own through the Software Development Lifecycle.
The recommended migration path across lifecycles is indicated by the color of the cells and runs down each column of the matrix. A Hybrid Environment will typically start with the Developer Environment for training/testing and then deploy to an Unmanaged Environment. This Unmanaged Environment can be hosted in Azure. There is some crossover, but the software versions you need to replicate from the Developer Environment will likely not match those found in the Managed Environment. The same holds true for the Cloud Environment and the migration from Hosted to Managed.
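The decision matrix and its migration paths can be written down as a small lookup. The following is only an illustrative sketch in Python: the names `ComputeType`, `Lifecycle`, `environment_for`, and `migration_path` are hypothetical and are not part of any Azure SDK.

```python
from enum import Enum


class ComputeType(Enum):
    HYBRID = "Hybrid"
    CLOUD = "Cloud"


class Lifecycle(Enum):
    TRAINING_TESTING = "Training/Testing"
    INFERENCE = "Inference"


# The four cells of the decision matrix, keyed by (lifecycle, compute type).
DECISION_MATRIX = {
    (Lifecycle.TRAINING_TESTING, ComputeType.HYBRID): "Developer Environment",
    (Lifecycle.TRAINING_TESTING, ComputeType.CLOUD): "Hosted Environment",
    (Lifecycle.INFERENCE, ComputeType.HYBRID): "Unmanaged Environment",
    (Lifecycle.INFERENCE, ComputeType.CLOUD): "Managed Environment",
}


def environment_for(lifecycle, compute_type):
    """Look up the run-time environment for one cell of the matrix."""
    return DECISION_MATRIX[(lifecycle, compute_type)]


def migration_path(compute_type):
    """Return the training environment followed by the inference
    environment in the same column of the matrix."""
    return [environment_for(stage, compute_type) for stage in Lifecycle]
```

For example, `migration_path(ComputeType.HYBRID)` yields the Developer Environment followed by the Unmanaged Environment, matching the Hybrid column described above.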
Developer Environment
This is an environment for a Pro-Developer Data Scientist to train and test their models. These models can be run on your local machine, on physical or virtual servers, or in an IaaS environment in the cloud. The Data Scientist builds out the environment from scratch (OS, Language, Framework, and IDE). This provides versatility but requires a high degree of maintenance to reproduce your model. The mindset of “it runs on my computer” won’t be acceptable, so an audit trail will be required at release time.
For example, a Pro-Developer Data Scientist who wants to run an experiment on their local PC in the WSL environment can set up that environment with the necessary requirements file for the ML experiment. They can use VS Code to execute code on an ad-hoc basis for training and exploratory data analysis. If their local machine doesn’t have enough horsepower, they can migrate the environment to an Azure Data Science VM, where the additional GPUs will reduce training run time. Likewise, they can use a Linux VM directly and create the environment from scratch to match their local machine configuration. The typical use case is physical, virtual, or IaaS compute. This compute environment is attached to Azure ML Service through the Python SDK to track and monitor experiments and to register models as required.
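Because a self-built environment has to be reproducible elsewhere, it can help to capture what is actually installed alongside each experiment. The sketch below is one possible way to do that using only the Python standard library. The function name `snapshot_environment` is hypothetical, and the snapshot is a local aid only: it does not replace the tracking that the Azure ML Python SDK provides.

```python
import json
import platform
import sys
from importlib import metadata


def snapshot_environment():
    """Collect OS, Python, and installed package versions as a plain dict."""
    packages = {}
    for dist in metadata.distributions():
        meta = dist.metadata
        # Skip distributions whose metadata is missing or has no name.
        name = meta.get("Name") if meta is not None else None
        if name:
            packages[name] = meta.get("Version")
    return {
        "os": platform.platform(),
        "python": sys.version.split()[0],
        "packages": dict(sorted(packages.items())),
    }
```

Calling `json.dumps(snapshot_environment(), indent=2)` produces a record that can be stored next to the model artifacts as part of an audit trail.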
Here are the Azure Services that support AI & ML workloads
- Local
- Remote Azure VM (VM, ACI, AKS)
- ML.NET
Resource | Level | Training Assets URL |
Local | 100 | Fundamentals of machine learning in the cloud |
Local | 200 | Enhance your Azure ML experience with the VS Code extension |
Local | 300 | Train an image classification model with Azure ML |
Remote Azure VM | 100 | Supercharge your Azure ML development with VS Code |
Remote Azure VM | 200 | Azure ML in a Day |
Remote Azure VM | 300 | Machine Learning Cheat Sheet |
ML.NET | 100 | ML.NET Comparison Cheat Sheet |
ML.NET | 200 | On .NET Live – Adding Machine Learning to your .NET Apps with ML .NET |
ML.NET | 300 | ML.NET tutorials |
Unmanaged Environment
This environment is an extension of the Developer Environment, migrated to a production environment that is fully managed by the end user. This gives the end user all the versatility of the open-source ecosystem and supports a best-of-breed approach. A typical end user would be an ML Engineer who has a model and a set of artifacts ready to deploy to production.
After model selection, the ML Engineer uses the Azure CLI to deploy the model and environment to a Linux VM in Azure. The Azure VM runs as the scoring engine (inference) for production applications and/or as an MVP application to monitor model performance.
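To make the idea of a scoring engine concrete, here is a minimal sketch of a scoring script. It follows the `init()` / `run()` entry-script convention used by Azure ML deployments, but the model itself is a hypothetical stand-in (hard-coded linear weights), and the JSON payload shape is an assumption made for illustration.

```python
import json

model = None  # populated once by init()


def init():
    """Load the model once when the scoring process starts."""
    global model
    # Hypothetical stand-in: a real script would deserialize a registered
    # model artifact here instead of hard-coding weights.
    model = {"weights": [0.5, -0.25], "bias": 1.0}


def run(raw_data):
    """Score a JSON payload of the assumed form {"data": [[x1, x2], ...]}."""
    try:
        rows = json.loads(raw_data)["data"]
        predictions = [
            sum(w * x for w, x in zip(model["weights"], row)) + model["bias"]
            for row in rows
        ]
        return {"predictions": predictions}
    except (KeyError, TypeError, ValueError) as exc:
        # Return the error to the caller instead of crashing the service.
        return {"error": str(exc)}
```

Loading the model in `init()` rather than in `run()` keeps the per-request cost down, since the scoring process pays the load time only once.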
Here are the Azure Services that support AI & ML workloads
- Local Web Service (Docker Image)
- Remote VM (Azure VM, Kubernetes, AKS)
- Windows ML
Resource | Level | Training Assets URL |
Local Web Service | 100 | How do you deploy a machine learning model as a web service in Azure? |
Local Web Service | 200 | Use a custom container to deploy a model to an online endpoint |
Local Web Service | 300 | Register a model and deploy locally |
Remote VM | 100 | Model deployment and inferencing with Azure ML |
Remote VM | 200 | ONNX and Azure ML |
Remote VM | 300 | Deploying a web service to Azure Kubernetes Service (AKS) |
Windows ML | 100 | Windows AI |
Windows ML | 200 | Image Classification with ML.NET and Windows Machine Learning |
Windows ML | 300 | Windows Machine Learning Samples |
Hosted Environment
This environment uses a common runtime environment in terms of OS, Languages, Frameworks, and IDEs. The hosted environment is created and supported by Azure. The most common configuration is provided to ensure versatility while reducing setup time for users. A typical user for this environment is a Data Scientist who may experiment in model development but needs to hand off their code to experts to deploy into production. A great example of this is Azure ML Notebooks.
These environments are created with a preconfigured set of machine learning frameworks, languages, packages, and tools to keep setup time minimal for end users. They are typically for a Data Scientist who wants to build machine learning models but isn’t an expert in configuration and infrastructure. The simplicity of the environment is a benefit, but the flexibility to build a multitude of models for accuracy may be constrained depending on how these environments are configured. We highly recommend reviewing the configuration so you fully understand the options.
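One way to review a preconfigured environment is to compare the package versions a project expects with what the hosted runtime actually provides. The sketch below assumes exact version pins and uses only the standard library. The function name `compare_with_runtime` is hypothetical, and the package names in the usage note are placeholders.

```python
from importlib import metadata


def compare_with_runtime(required):
    """Compare required package versions with the current runtime.

    `required` maps package names to exact version strings. The result maps
    each mismatched package to a (required, installed) pair, where installed
    is None when the package is not present at all.
    """
    mismatches = {}
    for name, wanted in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches
```

Running `compare_with_runtime({"some-framework": "1.2.3"})` in a hosted notebook returns an empty dict when the runtime matches and otherwise lists each package that differs or is missing.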
Here are the Azure Services that support AI & ML workloads
- DSVM
- Synapse Analytics Spark Pools
- Synapse Analytics Workspace
- Azure Machine Learning Notebooks
- Azure Machine Learning Attached Compute
- Azure Databricks
- HDInsight
Resource | Level | Training Assets URL |
DSVM | 100 | Data Science VM overview (DSVM) |
DSVM | 200 | Data Science Tools on DSVM |
DSVM | 300 | Samples on DSVM |
Synapse Analytics Workspace | 100 | Machine Learning Experiences in Azure Synapse |
Synapse Analytics Workspace | 200 | Azure AI Services in Azure Synapse Analytics |
Synapse Analytics Spark Pools | 100 | Synapse Analytics Spark Pools overview |
Synapse Analytics Spark Pools | 200 | Build a machine learning app with Apache Spark MLlib and Azure Synapse Analytics |
Synapse Analytics Spark Pools | 300 | SynapseML running on Synapse Analytics Spark Pools |
Azure ML Notebooks | 100 | Azure ML Studio Notebooks |
Azure ML Notebooks | 200 | Run Jupyter notebooks in your workspace |
Azure ML Notebooks | 300 | Image Classification using Notebooks in Azure ML |
Azure ML Compute | 100 | Training machine learning models at scale with Azure ML |
Azure ML Compute | 200 | What is an Azure ML compute instance? |
Azure ML Compute | 300 | Azure ML Training Compute Targets |
Databricks | 100 | Azure Databricks overview |
Databricks | 200 | Deploy models for inference and prediction |
Databricks | 300 | Model training examples |
Managed Environment
This environment is similar to a PaaS or SaaS Azure service, where the developer isn’t concerned with the setup of the environment or the availability of the service. Their main concern is whether the environment can scale to handle the scoring traffic. Managed environments begin the transition from Data Scientist tooling to low-code options that streamline adoption by Citizen Developers. Managed environments are not typically used by Pro-Developers, since they prefer to build their own through the Software Development Lifecycle.
These are pre-built environments with varying degrees of model portability. Their purpose is to reduce machine learning model development and deployment time. For example, a Database Developer might run a batch inference script (T-SQL) to score new data inside the database platform, and a Data Scientist who needs to wear multiple hats can use the Azure ML Service for end-to-end training and scoring.
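The batch-scoring pattern mentioned above can be sketched in a few lines. Note the substitutions: this example uses Python with the built-in `sqlite3` module as a stand-in for the database platform, so it is neither T-SQL nor SQL Server Machine Learning Services. The table `new_data`, its columns, and the `score` function are all hypothetical.

```python
import sqlite3


def score(feature_a, feature_b):
    """Hypothetical stand-in model: a fixed linear formula."""
    return 0.5 * feature_a - 0.25 * feature_b + 1.0


def score_new_rows(conn, batch_size=1000):
    """Score rows whose prediction is still NULL and write the results back.

    Returns the total number of rows scored.
    """
    total = 0
    while True:
        rows = conn.execute(
            "SELECT id, feature_a, feature_b FROM new_data "
            "WHERE prediction IS NULL LIMIT ?",
            (batch_size,),
        ).fetchall()
        if not rows:
            return total
        conn.executemany(
            "UPDATE new_data SET prediction = ? WHERE id = ?",
            [(score(a, b), row_id) for row_id, a, b in rows],
        )
        conn.commit()
        total += len(rows)


# Usage with an in-memory database and three unscored rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE new_data ("
    "id INTEGER PRIMARY KEY, feature_a REAL, feature_b REAL, prediction REAL)"
)
conn.executemany(
    "INSERT INTO new_data (feature_a, feature_b) VALUES (?, ?)",
    [(2.0, 4.0), (4.0, 0.0), (0.0, 0.0)],
)
scored = score_new_rows(conn, batch_size=2)
```

Processing in fixed-size batches keeps memory use bounded regardless of how much new data has accumulated, and selecting only rows with a NULL prediction makes the job safe to rerun.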
Here are the Azure Services that support AI & ML workloads
- Applied AI
- SQL Server Machine Learning Service
- Azure SQL Managed Instance
- Azure ML Inference Cluster
- Azure ML Attached Compute
- Azure ML Managed Endpoints
- Azure Batch
- SQL Edge
- IoT Edge
Resource | Level | Training Assets URL |
Applied AI | 100 | Azure Applied AI Services overview |
Applied AI | 200 | Prebuilt AI models in Customer Applications |
Applied AI | 300 | Create a Document Intelligence Logic Apps workflow |
SQL Server ML Svc | 100 | What is SQL Server Machine Learning Services with Python and R? |
Azure SQL Managed Instance | 100 | Machine Learning Services in Azure SQL Managed Instance |
Azure SQL Managed Instance | 200 | Key Differences between Managed Instance and SQL Server |
Azure SQL Managed Instance | 300 | Linear Regression tutorial for Managed Instance |
Azure ML Inference Compute | 100 | Model deployment and inferencing |
Azure ML Inference Compute | 200 | Compute targets for inferencing |
Azure ML Inference Compute | 300 | Attach AKS to Azure ML workspace |
Azure ML Managed Endpoint | 100 | Online endpoints and deployments for real-time inference |
Azure ML Managed Endpoint | 200 | Deploy and score model with online endpoint |
SQL Edge | 100 | What is Azure SQL Edge |
SQL Edge | 200 | ONNX in SQL Edge |
SQL Edge | 300 | Deploy and make predictions with an ONNX model |