Get started with speech
Microsoft’s speech services
Speech Services, part of Azure Cognitive Services, unifies speech-to-text, text-to-speech, and speech translation under a single Azure subscription. You can speech-enable your applications, tools, and devices with the Speech SDK, the Speech Devices SDK, or the REST APIs.
Speech-to-text
Speech-to-text from Azure Speech Services enables real-time transcription of audio streams into text that your applications, tools, or devices can consume, display, and act on as command input. The service is powered by the same recognition technology that Microsoft uses for Cortana and Office products, and it works with the translation and text-to-speech services.
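As a rough illustration of consuming speech-to-text through the REST APIs, the sketch below only assembles the URL and headers for a short-audio recognition request; it does not send anything. The endpoint path, the header names, the audio format, and the helper name build_recognition_request are assumptions made for this example, so check them against the current REST API reference before relying on them.

```python
from urllib.parse import urlencode


def build_recognition_request(region, subscription_key, language="en-US"):
    """Return (url, headers) for a short-audio recognition call.

    Assumption: the regional endpoint and header names below match the
    speech-to-text REST API for short audio; verify before use.
    """
    base = (
        f"https://{region}.stt.speech.microsoft.com"
        "/speech/recognition/conversation/cognitiveservices/v1"
    )
    url = base + "?" + urlencode({"language": language})
    headers = {
        # Assumed authentication header carrying the Azure subscription key.
        "Ocp-Apim-Subscription-Key": subscription_key,
        # Assumed audio format: 16 kHz PCM in a WAV container.
        "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
        "Accept": "application/json",
    }
    return url, headers


if __name__ == "__main__":
    # Placeholder region and key; replace with real values.
    url, headers = build_recognition_request("westus", "YOUR_KEY")
    print(url)
```

To actually transcribe audio, the WAV bytes would be sent as the POST body with any HTTP client, for example urllib.request from the standard library.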
Text-to-speech
Text-to-speech from Azure Speech Services enables your applications, tools, or devices to convert text into natural, human-like synthesized speech. Choose from standard and neural voices, or create a custom voice unique to your product or brand.
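Text-to-speech requests are commonly expressed as SSML (Speech Synthesis Markup Language), where the voice is selected by name. The sketch below builds such a document with the Python standard library. The helper name build_ssml and the default voice name "en-US-JennyNeural" are assumptions for illustration; substitute a voice that is available to your subscription.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text, voice="en-US-JennyNeural", lang="en-US"):
    """Return an SSML string asking `voice` to speak `text`.

    Assumption: the voice name is only an example of a neural voice.
    """
    speak = ET.Element(
        "speak",
        {"version": "1.0", "xmlns": SSML_NS, "xml:lang": lang},
    )
    voice_el = ET.SubElement(speak, "voice", {"name": voice})
    # ElementTree escapes characters such as & and < when serializing.
    voice_el.text = text
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("Hello & welcome"))
```

Building the markup with an XML library rather than string formatting keeps user-supplied text from breaking the document when it contains reserved characters.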
See also:
Get started with Custom Speech service
What is Custom Speech?
Access Custom Speech portal
Get started with a device and speech
Get the Speech Devices SDK and find suitable development kits.
Testing a speech platform device
The signal paths and architecture used for testing a Microsoft Windows Speech Platform device are described below:
Capture streams represent audio signals acquired by the integrated microphone(s) and pre-processed for use by a speech recognition engine or keyword spotter. Render streams represent audio signals destined for playback through device speakers or playback accessories; they also provide the reference that the echo cancellation algorithm needs.
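To make the relationship between the two streams concrete, the sketch below uses the render stream as the reference for a textbook normalized least-mean-squares (NLMS) adaptive filter that subtracts the estimated speaker echo from the capture stream. This is a generic illustration, not the algorithm used by the Windows Speech Platform; the function name, tap count, and step size are assumptions chosen for the example.

```python
import random


def nlms_echo_cancel(capture, render, taps=8, mu=0.5, eps=1e-8):
    """Return the capture stream with the estimated render echo removed.

    capture: microphone samples (echo plus any near-end speech)
    render:  samples sent to the speaker, used as the reference
    """
    weights = [0.0] * taps   # adaptive estimate of the echo path
    history = [0.0] * taps   # recent render samples, newest first
    residual = []
    for mic, ref in zip(capture, render):
        history = [ref] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        error = mic - echo_estimate          # echo-reduced output sample
        power = sum(x * x for x in history) + eps
        # NLMS update: step toward the echo path, normalized by input power.
        weights = [w + mu * error * x / power
                   for w, x in zip(weights, history)]
        residual.append(error)
    return residual


if __name__ == "__main__":
    rng = random.Random(0)
    render = [rng.uniform(-1, 1) for _ in range(2000)]
    # Simulated echo: the render signal delayed by 2 samples, scaled by 0.6.
    capture = [0.0, 0.0] + [0.6 * r for r in render[:-2]]
    cleaned = nlms_echo_cancel(capture, render)
    print(sum(x * x for x in cleaned[-200:]))
```

In this simulation the capture stream contains only echo, so the residual energy should shrink as the filter adapts; with real near-end speech present, the residual would carry that speech forward to the recognizer.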