Supported Models
Windows ML CLI has validated a set of models for compatibility across all
Execution Providers (EPs); see the full Model Accuracy Report.
winml-cli supports a wide range of model architectures and tasks. This page
lists what's validated and how to discover model support.
Discovery Commands
# Browse the curated catalog (64 validated models)
uv run winml catalog
# Filter by task
uv run winml catalog -t image-classification
# Check if a specific model is supported
uv run winml inspect -m microsoft/resnet-50
# List all known tasks
uv run winml inspect --list-tasks
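To browse several tasks in one pass, the catalog filter above can be wrapped in a small loop. This is a minimal sketch, not part of winml-cli: `winml_cli` and `catalog_for_tasks` are hypothetical helper names introduced here, and the only assumption about the CLI is the `catalog -t <task>` invocation shown above.

```shell
# Hypothetical wrapper so the invocation lives in one place.
winml_cli() {
    uv run winml "$@"
}

# Print the catalog entries for each task name passed as an argument.
# Stops at the first task for which the catalog command fails.
catalog_for_tasks() {
    for task in "$@"; do
        printf '== %s ==\n' "$task"
        winml_cli catalog -t "$task" || return 1
    done
}
```

For example, `catalog_for_tasks image-classification fill-mask` would print both filtered listings, each under its own header line.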
Supported Tasks
winml-cli recognizes 35 task types across vision, NLP, audio, and multimodal domains. Of these, 16 have dedicated inference classes; the remainder are supported via the generic task fallback.
Vision
| Task | Example Models |
| --- | --- |
| image-classification | ResNet, ConvNeXt, ViT, Swin |
| image-segmentation | Segformer, Mask2Former |
| semantic-segmentation | Segformer |
| object-detection | DETR, YOLOS, Table-Transformer |
| depth-estimation | Depth Anything, ZoeDepth |
| image-feature-extraction | DINOv2, ViT |
| zero-shot-image-classification | CLIP, SigLIP |
NLP
| Task | Example Models |
| --- | --- |
| text-classification | BERT, RoBERTa, XLM-RoBERTa |
| token-classification | BERT, RoBERTa (NER) |
| question-answering | BERT, RoBERTa |
| fill-mask | BERT, RoBERTa |
| feature-extraction | BGE, BERT, all-MiniLM |
| text-generation | Qwen3 (composite) |
| text2text-generation | T5, BART, Marian |
Audio
| Task | Example Models |
| --- | --- |
| automatic-speech-recognition | Whisper |
| audio-classification | Wav2Vec2 |
Multimodal
| Task | Example Models |
| --- | --- |
| zero-shot-image-classification | CLIP (text + vision) |
| image-to-text | VisionEncoderDecoder |
| visual-question-answering | BLIP |
Validated Model Catalog
The following models have been validated end-to-end with EP compatibility
testing. Use winml catalog to browse the full list interactively.
Image Classification
| Model | Architecture |
| --- | --- |
| apple/mobilevit-small | MobileViT |
| dima806/fairface_age_image_detection | ViT |
| facebook/convnext-tiny-224 | ConvNeXt |
| google/vit-base-patch16-224 | ViT |
| microsoft/resnet-18 | ResNet |
| microsoft/resnet-50 | ResNet |
| microsoft/swin-large-patch4-window7-224 | Swin |
| rizvandwiki/gender-classification | ViT |

| Model | Architecture |
| --- | --- |
| facebook/dino-vitb16 | ViT |
| facebook/dino-vits16 | ViT |
| facebook/dinov2-small | DINOv2 |
| google/vit-base-patch16-224-in21k | ViT |

| Model | Architecture |
| --- | --- |
| BAAI/bge-base-en-v1.5 | BERT |
| BAAI/bge-m3 | XLM-RoBERTa |
| BAAI/bge-small-en-v1.5 | BERT |
| google-bert/bert-base-multilingual-cased | BERT |
| Intel/bert-base-uncased-mrpc | BERT |
| laion/CLIP-ViT-B-32-laion2B-s34B-b79K | CLIP |
| openai/clip-vit-base-patch16 | CLIP |
| openai/clip-vit-base-patch32 | CLIP |
| sentence-transformers/all-MiniLM-L6-v2 | BERT |
| sentence-transformers/all-mpnet-base-v2 | MPNet |
| sentence-transformers/multi-qa-mpnet-base-dot-v1 | MPNet |
| sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | BERT |
Sentence Similarity
| Model | Architecture |
| --- | --- |
| BAAI/bge-base-en-v1.5 | BERT |
| BAAI/bge-large-en-v1.5 | BERT |
| BAAI/bge-m3 | XLM-RoBERTa |
| BAAI/bge-small-en-v1.5 | BERT |
| sentence-transformers/all-MiniLM-L6-v2 | BERT |
| sentence-transformers/all-mpnet-base-v2 | MPNet |
| sentence-transformers/multi-qa-mpnet-base-dot-v1 | MPNet |
| sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | BERT |
| sentence-transformers/paraphrase-multilingual-mpnet-base-v2 | XLM-RoBERTa |
Fill-Mask
| Model | Architecture |
| --- | --- |
| distilbert/distilbert-base-uncased | DistilBERT |
| FacebookAI/roberta-base | RoBERTa |
| FacebookAI/roberta-large | RoBERTa |
| FacebookAI/xlm-roberta-base | XLM-RoBERTa |
| google-bert/bert-base-multilingual-cased | BERT |
| google-bert/bert-base-multilingual-uncased | BERT |
| google-bert/bert-base-uncased | BERT |
Text Classification
| Model | Architecture |
| --- | --- |
| cardiffnlp/twitter-roberta-base-sentiment-latest | RoBERTa |
| distilbert/distilbert-base-uncased-finetuned-sst-2-english | DistilBERT |
| Intel/bert-base-uncased-mrpc | BERT |
| ProsusAI/finbert | BERT |
Token Classification
| Model | Architecture |
| --- | --- |
| Babelscape/wikineural-multilingual-ner | BERT |
| dbmdz/bert-large-cased-finetuned-conll03-english | BERT |
| dslim/bert-base-NER | BERT |
| Isotonic/distilbert_finetuned_ai4privacy_v2 | DistilBERT |
| w11wo/indonesian-roberta-base-posp-tagger | RoBERTa |
Question Answering
| Model | Architecture |
| --- | --- |
| deepset/bert-large-uncased-whole-word-masking-squad2 | BERT |
| deepset/roberta-base-squad2 | RoBERTa |
| deepset/tinyroberta-squad2 | RoBERTa |
| distilbert/distilbert-base-cased-distilled-squad | DistilBERT |
| distilbert/distilbert-base-uncased-distilled-squad | DistilBERT |
| google-bert/bert-large-uncased-whole-word-masking-finetuned-squad | BERT |
Zero-Shot Classification
| Model | Architecture |
| --- | --- |
| joeddav/xlm-roberta-large-xnli | XLM-RoBERTa |
Zero-Shot Image Classification
| Model | Architecture |
| --- | --- |
| openai/clip-vit-base-patch16 | CLIP |
Image Segmentation
| Model | Architecture |
| --- | --- |
| mattmdjaga/segformer_b2_clothes | Segformer |
| nvidia/segformer-b1-finetuned-ade-512-512 | Segformer |
| nvidia/segformer-b2-finetuned-ade-512-512 | Segformer |
| nvidia/segformer-b5-finetuned-ade-640-640 | Segformer |
Image-to-Text
| Model | Architecture |
| --- | --- |
| microsoft/trocr-base-handwritten | VisionEncoderDecoder |
| microsoft/trocr-base-printed | VisionEncoderDecoder |
| microsoft/trocr-large-handwritten | VisionEncoderDecoder |
Execution Provider Compatibility
Each validated model is tested against available EPs:
| EP | Alias | Devices | Notes |
| --- | --- | --- | --- |
| NvTensorRTRTXExecutionProvider | nvtensorrtrtx, nv_tensorrt_rtx | GPU | NVIDIA TensorRT-RTX; NVIDIA GPU with TensorRT runtime |
| CUDAExecutionProvider | cuda | GPU | NVIDIA CUDA; any CUDA-capable GPU |
| MIGraphXExecutionProvider | migraphx | GPU | AMD ROCm MIGraphX |
| QNNExecutionProvider | qnn | NPU, GPU | Qualcomm Snapdragon; bundled in ORT |
| OpenVINOExecutionProvider | openvino | NPU, GPU, CPU | Intel hardware |
| DmlExecutionProvider | dml | GPU | DirectML; any DirectX 12 GPU |
| CPUExecutionProvider | cpu | CPU | Always available |
| VitisAIExecutionProvider | vitisai | NPU | AMD/Xilinx |
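The alias column can be read as a simple lookup from short name to full provider name. The sketch below encodes that mapping as a shell function for use in your own scripts. `resolve_ep` is a hypothetical helper name, and the function only mirrors the aliases listed above; it is not how winml-cli resolves aliases internally.

```shell
# Map an EP alias (as listed in this page's EP table) to its full provider name.
# Unknown aliases print a message to stderr and return a non-zero status.
resolve_ep() {
    case "$1" in
        nvtensorrtrtx|nv_tensorrt_rtx) echo "NvTensorRTRTXExecutionProvider" ;;
        cuda)     echo "CUDAExecutionProvider" ;;
        migraphx) echo "MIGraphXExecutionProvider" ;;
        qnn)      echo "QNNExecutionProvider" ;;
        openvino) echo "OpenVINOExecutionProvider" ;;
        dml)      echo "DmlExecutionProvider" ;;
        cpu)      echo "CPUExecutionProvider" ;;
        vitisai)  echo "VitisAIExecutionProvider" ;;
        *)        echo "unknown EP alias: $1" >&2; return 1 ;;
    esac
}
```

Calling `resolve_ep dml` prints the DirectML provider name, which can be handy when a script accepts short aliases but needs to log the full name.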
Adding Unsupported Models
If your model architecture isn't in the catalog, winml-cli may still support it
through auto-detection:
# Try inspecting first
uv run winml inspect -m your-org/your-model
# If "Status: Supported", proceed normally
uv run winml build -m your-org/your-model -d auto -o output/
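The inspect-then-build flow above could be scripted so the build only runs when inspection succeeds. This is a sketch under assumptions: `winml_cli` and `build_if_supported` are hypothetical helper names, and the check assumes that `winml inspect` prints a line containing the text "Status: Supported", as the comment above suggests. Adjust the match if the actual output differs.

```shell
# Hypothetical wrapper so the invocation lives in one place.
winml_cli() {
    uv run winml "$@"
}

# Build a model only when inspect reports it as supported.
# Usage: build_if_supported <model-id> <output-dir>
build_if_supported() {
    model=$1
    outdir=$2
    # Assumption: inspect output contains the literal text "Status: Supported".
    if winml_cli inspect -m "$model" | grep -q "Status: Supported"; then
        winml_cli build -m "$model" -d auto -o "$outdir"
    else
        echo "skipping $model: not reported as supported" >&2
        return 1
    fi
}
```

For example, `build_if_supported your-org/your-model output/` runs the same two commands as above, but skips the build and returns a non-zero status when the status line is absent.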
For truly custom architectures, use --trust-remote-code to allow execution of
model code from the Hugging Face Hub.
See also