
Image classification

This directory provides examples and best practices for building image classification systems. Our goal is to enable users to quickly and easily train high-accuracy classifiers on their own datasets. We provide example notebooks with pre-set default parameters that have been shown to work well on a variety of datasets. We also include extensive documentation of common pitfalls and best practices. Additionally, we show how Azure, Microsoft’s cloud computing platform, can be used to speed up training on large datasets or to deploy models as web services.

[Figure: examples of image classification (single object) and image classification (multiple objects)]

We recommend using PyTorch as the deep learning platform for its ease of use, simplicity when debugging, and popularity in the data science community. For computer vision functionality, we also rely heavily on fast.ai, a PyTorch data science library that comes with rich deep learning features and extensive documentation. We highly recommend watching the 2019 fast.ai lecture series videos to understand the underlying technology. Fast.ai’s documentation is also a valuable resource.

Frequently asked questions

Answers to Frequently Asked Questions such as “How many images do I need to train a model?” or “How to annotate images?” can be found in the FAQ.md file.

Notebooks

We provide several notebooks to show how image classification algorithms are designed, evaluated and operationalized. Notebooks starting with 0 are intended to be run sequentially, as there are dependencies between them. These notebooks contain introductory “required” material. Notebooks starting with 1 can be considered optional and contain more complex and specialized topics.
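
The naming convention above can be applied programmatically, for example to list the required notebooks in the order they should be run. The following is a minimal sketch under stated assumptions: the helper names `find_notebooks` and `split_notebooks` are hypothetical and not part of this repository, and the only rule encoded is the one described above (prefix 0 is required and sequential, prefix 1 is optional).

```python
from pathlib import Path


def find_notebooks(directory="."):
    """Return the file names of all .ipynb files directly inside `directory`."""
    return [p.name for p in Path(directory).glob("*.ipynb")]


def split_notebooks(names):
    """Split notebook names into required (prefix 0) and optional (prefix 1).

    Both lists are sorted by name, so the required list is in run order.
    Names with any other prefix are ignored by this helper.
    """
    required = sorted(n for n in names if n.startswith("0"))
    optional = sorted(n for n in names if n.startswith("1"))
    return required, optional


if __name__ == "__main__":
    required, optional = split_notebooks(find_notebooks("."))
    print("Run in this order:", required)
    print("Optional:", optional)
```

Sorting by file name is enough here because the two-digit prefix encodes the intended order.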

While all notebooks can be executed on Windows, we have found that fast.ai is much faster on Linux. Additionally, using a GPU dramatically improves training speed. We suggest using an Azure Data Science Virtual Machine with a V100 GPU (instructions, price table).
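
Before starting a long training run, it can help to confirm which operating system you are on and whether PyTorch can see a GPU. The snippet below is a small sketch, not code from the notebooks; the function name `describe_training_environment` is hypothetical. It relies only on `platform.system()` from the standard library and on `torch.cuda.is_available()` and `torch.cuda.get_device_name()` from PyTorch, and it degrades gracefully if PyTorch is not installed.

```python
import platform


def describe_training_environment():
    """Report the OS name and whether PyTorch can see a CUDA-capable GPU."""
    info = {
        "os": platform.system(),   # e.g. "Linux" or "Windows"
        "torch_installed": False,
        "gpu_available": False,
        "gpu_name": None,
    }
    try:
        import torch
    except ImportError:
        # PyTorch is missing, so GPU availability cannot be checked.
        return info

    info["torch_installed"] = True
    if torch.cuda.is_available():
        info["gpu_available"] = True
        info["gpu_name"] = torch.cuda.get_device_name(0)  # first visible GPU
    return info


if __name__ == "__main__":
    print(describe_training_environment())
```

If `gpu_available` is `False` on a machine that has a GPU, the usual suspects are a CPU-only PyTorch build or a missing driver.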

We have also found that some browsers do not render Jupyter widgets correctly. If you have issues, try using an alternative browser, such as Edge or Chrome.

| Notebook name | Description |
| --- | --- |
| 00_webcam.ipynb | Demonstrates inference on an image from your computer’s webcam using a pre-trained model. |
| 01_training_introduction.ipynb | Introduces some of the basic concepts around model training and evaluation. |
| 02_multilabel_classification.ipynb | Introduces multi-label classification and the key differences between training multi-label and single-label classification models. |
| 03_training_accuracy_vs_speed.ipynb | Trains a model with high accuracy vs. one with fast inference speed. Use this to train on your own datasets! |
| 10_image_annotation.ipynb | A simple UI to annotate images. |
| 11_exploring_hyperparameters.ipynb | Finds optimal model parameters using grid search. |
| 12_hard_negative_sampling.ipynb | Demonstrates how to use hard negatives to improve your model performance. |
| 20_azure_workspace_setup.ipynb | Sets up your Azure resources and Azure Machine Learning workspace. |
| 21_deployment_on_azure_container_instances.ipynb | Deploys a trained model exposed on a REST API using Azure Container Instances (ACI). |
| 22_deployment_on_azure_kubernetes_service.ipynb | Deploys a trained model exposed on a REST API using the Azure Kubernetes Service (AKS). |
| 23_aci_aks_web_service_testing.ipynb | Tests the deployed models on either ACI or AKS. |
| 24_exploring_hyperparameters_on_azureml.ipynb | Performs highly parallel parameter sweeping using AzureML’s HyperDrive. |
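
For readers unfamiliar with grid search, the idea behind the hyperparameter notebooks listed above can be sketched in a few lines: evaluate every combination of candidate values and keep the best one. This is an illustrative sketch only, not the code used in the notebooks. The names `grid_search` and `toy_evaluate`, the parameter names, and the candidate values are all assumptions, and `toy_evaluate` is a stand-in for training a model and returning its validation accuracy.

```python
from itertools import product


def grid_search(param_grid, evaluate):
    """Evaluate every combination in `param_grid`; return (best_params, best_score).

    `param_grid` maps a parameter name to a list of candidate values.
    `evaluate` takes a dict of parameters and returns a score (higher is better).
    """
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_evaluate(params):
    """Hypothetical stand-in for 'train a model and return validation accuracy'.

    The score peaks at learning_rate=1e-3 and epochs=10 by construction.
    """
    return -abs(params["learning_rate"] - 1e-3) - 0.001 * abs(params["epochs"] - 10)


# Hypothetical search space: 3 learning rates x 2 epoch counts = 6 runs.
grid = {"learning_rate": [1e-4, 1e-3, 1e-2], "epochs": [5, 10]}
best_params, best_score = grid_search(grid, toy_evaluate)
print(best_params)
```

The number of runs grows multiplicatively with each added parameter, which is why large sweeps benefit from running the combinations in parallel.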

Azure-enhanced notebooks

Azure products and services are used in certain notebooks to enhance the efficiency of developing classification systems at scale.

To run these notebooks, users need an Azure subscription, or they can use Azure for free.

The Azure products featured in the notebooks include: