Speech Commands Model Gallery

This gallery is a collection of audio keyword spotter models trained with the technique described in the tutorial on training an audio keyword spotter using the Speech Commands Dataset.

These models use different neural network architectures and sizes to trade off accuracy against speed. The plot below shows each model's accuracy (how often the most confident prediction is correct) against its speed (milliseconds per audio prediction on a Raspberry Pi 3 device). Click and drag to pan around. Zoom in and out with your mouse's scroll wheel. Click on a model to go to its gallery page, where you can download that model.
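
The two metrics described above can be measured roughly as follows. This is a minimal sketch, not the gallery's actual benchmarking code: the `evaluate` function and the `predict` callable are hypothetical names introduced for illustration, and `predict` is assumed to take one audio sample and return a list of per-class confidence scores.

```python
import time
from typing import Callable, Sequence, Tuple


def evaluate(
    predict: Callable[[object], Sequence[float]],  # hypothetical model wrapper
    samples: Sequence[object],
    labels: Sequence[int],
) -> Tuple[float, float]:
    """Return (top-1 accuracy, mean milliseconds per prediction)."""
    if not samples or len(samples) != len(labels):
        raise ValueError("samples and labels must be non-empty and equal in length")

    correct = 0
    elapsed = 0.0
    for sample, label in zip(samples, labels):
        # Time only the model call, not the bookkeeping around it.
        start = time.perf_counter()
        scores = predict(sample)
        elapsed += time.perf_counter() - start

        # The most confident prediction is the index of the highest score.
        best = max(range(len(scores)), key=scores.__getitem__)
        correct += int(best == label)

    accuracy = correct / len(samples)
    msec_per_prediction = 1000.0 * elapsed / len(samples)
    return accuracy, msec_per_prediction
```

Timings gathered this way depend on the machine that runs the code, so numbers from a desktop are not comparable to the Raspberry Pi 3 figures in the gallery.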

Architecture Accuracy msec/frame Model name