Azure Speech Managed Identity Setup
Configure Azure Speech for managed identity correctly by using the resource-specific endpoint, the right RBAC roles, and the matching Admin Settings fields.
Speech is where managed identity setup gets unusually specific. The most important rule is simple: the regional endpoint that works with keys is not enough for managed identity. You need the resource-specific custom-subdomain endpoint so Azure can evaluate RBAC against the correct Speech resource.
Use the resource-specific endpoint
Managed identity needs the custom-subdomain Speech endpoint, not the shared regional gateway endpoint used by key-based authentication.
Grant RBAC on the Speech resource
Start with Cognitive Services Speech User and add Cognitive Services Speech Contributor only if the specific transcription flow still needs it.
Fill the matching admin fields
Endpoint, region, locale, authentication type, and resource ID all need to align with the Speech resource you are actually authorizing.
Test with a real audio workflow
Validate with upload, transcription, and optional text-to-speech scenarios rather than stopping at a configuration save.
The endpoint format is the root-cause difference
With key authentication, the regional endpoint can infer the target resource from the key. With managed identity, Azure needs the hostname itself to identify the specific Speech resource so it can evaluate RBAC. That is why the custom subdomain is mandatory here.
Authentication Methods: Regional vs. Resource-Specific Endpoints
Regional Endpoint (Shared Gateway)
Endpoint format: https://<region>.api.cognitive.microsoft.com
- Example: https://eastus2.api.cognitive.microsoft.com
- This is a shared endpoint for all Speech resources in that Azure region
- Acts as a gateway that routes requests to individual Speech resources
Resource-Specific Endpoint (Custom Subdomain)
Endpoint format: https://<resource-name>.cognitiveservices.azure.com
- Example: https://simplechat6-dev-speech.cognitiveservices.azure.com
- This is a unique endpoint dedicated to your specific Speech resource
- Requires custom subdomain to be enabled on the resource
Why Regional Endpoint Works with Key but NOT Managed Identity
Key-Based Authentication ✅ Works with Regional Endpoint
When using subscription key authentication:
POST https://eastus2.api.cognitive.microsoft.com/speechtotext/transcriptions:transcribe
Headers:
Ocp-Apim-Subscription-Key: abc123def456...
Why it works:
- The subscription key directly identifies your specific Speech resource
- The regional gateway uses the key to look up which resource it belongs to
- The request is automatically routed to your resource
- Authorization succeeds because the key proves ownership
Managed Identity (AAD Token) ❌ Fails with Regional Endpoint
When using managed identity authentication:
POST https://eastus2.api.cognitive.microsoft.com/speechtotext/transcriptions:transcribe
Headers:
Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGc...
Why it fails (returns 400 BadRequest):
- The Bearer token proves your App Service identity to Azure AD
- The token does NOT specify which Speech resource you want to access
- The regional gateway cannot determine:
- Which specific Speech resource you’re authorized for
- Whether your managed identity has RBAC roles on that resource
- Result: The gateway rejects the request with 400 BadRequest
Managed Identity ✅ Works with Resource-Specific Endpoint
When using managed identity with custom subdomain:
POST https://simplechat6-dev-speech.cognitiveservices.azure.com/speechtotext/transcriptions:transcribe
Headers:
Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGc...
Why it works:
- The hostname itself identifies your specific Speech resource
- Azure validates your managed identity Bearer token against that resource’s RBAC
- If your App Service MI has the Cognitive Services Speech User role → authorized
- The request proceeds to your dedicated Speech resource instance
For some transcription operations, you may also need Cognitive Services Speech Contributor. Start with Speech User, then add Speech Contributor if transcription still fails after endpoint and identity configuration are correct.
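You can observe this difference directly from a shell. Below is a minimal sketch, assuming an az CLI login whose identity holds a Speech role on the resource; the batch transcription list call is used only as a cheap authenticated probe, and the resource name is illustrative:

```bash
# AAD access token for the Cognitive Services data plane
TOKEN=$(az account get-access-token \
  --resource https://cognitiveservices.azure.com \
  --query accessToken -o tsv)

# Same token against the regional gateway: no way to map it to one resource
curl -s -o /dev/null -w "regional:          %{http_code}\n" \
  "https://eastus2.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions" \
  -H "Authorization: Bearer $TOKEN"

# Same token against the custom subdomain: the hostname names the resource
curl -s -o /dev/null -w "resource-specific: %{http_code}\n" \
  "https://simplechat6-dev-speech.cognitiveservices.azure.com/speechtotext/v3.2/transcriptions" \
  -H "Authorization: Bearer $TOKEN"
```

Per the behavior above, the regional call fails (400) while the resource-specific call succeeds once RBAC is in place.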
Required Setup for Managed Identity
Prerequisites
- Azure Speech Service resource created in your subscription
- System-assigned or user-assigned managed identity on your App Service
- RBAC role assignments on the Speech resource
Step 1: Turn On the Custom Domain on the Speech Resource
Why needed: By default, Speech resources use the regional endpoint and do NOT have custom subdomains. Managed identity requires the resource-specific endpoint.
Azure portal walkthrough
- Go to the Azure portal and open your Azure AI Speech resource.
- In the left pane under Resource Management, select Networking.
- Open the Firewalls and virtual networks tab.
- Select Generate Custom Domain Name.
- Enter a globally unique custom domain name. The final endpoint will look like https://<custom-name>.cognitiveservices.azure.com.
- Select Save.
- After the update finishes, open Keys and Endpoint and confirm the resource endpoint now starts with https://<custom-name>.cognitiveservices.azure.com.
Important notes:
- Custom subdomain name must be globally unique across Azure
- Usually use the same name as your resource: <resource-name>
- One-way operation: cannot be disabled once enabled
- Microsoft Learn recommends trying the change on a test resource first if the production Speech resource already has many Speech Studio models or projects
Azure CLI alternative
If you prefer CLI instead of the portal:
az account set --subscription <subscription-id>
az cognitiveservices account update \
--name <speech-resource-name> \
--resource-group <resource-group-name> \
--custom-domain <speech-resource-name>
Example:
az account set --subscription <subscription-id>
az cognitiveservices account update \
--name simplechat6-dev-speech \
--resource-group sc-simplechat6-dev-rg \
--custom-domain simplechat6-dev-speech
Verify the custom domain is enabled
Portal verification:
- Open the Speech resource.
- Go to Keys and Endpoint.
- Confirm the endpoint now starts with https://<custom-name>.cognitiveservices.azure.com instead of https://<region>.api.cognitive.microsoft.com.
CLI verification:
az cognitiveservices account show \
--name <speech-resource-name> \
--resource-group <resource-group-name> \
--query "{customSubDomainName:properties.customSubDomainName, endpoint:properties.endpoint}"
Expected output:
{
"customSubDomainName": "simplechat6-dev-speech",
"endpoint": "https://simplechat6-dev-speech.cognitiveservices.azure.com/"
}
Step 2: Assign RBAC Roles to Managed Identity
Grant your App Service managed identity the necessary roles on the Speech resource:
# Get the Speech resource ID
SPEECH_RESOURCE_ID=$(az cognitiveservices account show \
--name <speech-resource-name> \
--resource-group <resource-group-name> \
--query id -o tsv)
# Get the App Service managed identity principal ID
MI_PRINCIPAL_ID=$(az webapp identity show \
--name <app-service-name> \
--resource-group <resource-group-name> \
--query principalId -o tsv)
# Assign Cognitive Services Speech User role (baseline data-plane access)
az role assignment create \
--assignee $MI_PRINCIPAL_ID \
--role "Cognitive Services Speech User" \
--scope $SPEECH_RESOURCE_ID
# Assign Cognitive Services Speech Contributor role (if transcription operations still require it)
az role assignment create \
--assignee $MI_PRINCIPAL_ID \
--role "Cognitive Services Speech Contributor" \
--scope $SPEECH_RESOURCE_ID
Verify role assignments:
az role assignment list \
--assignee $MI_PRINCIPAL_ID \
--scope $SPEECH_RESOURCE_ID \
-o table
Step 3: Configure Admin Settings
In the Admin Settings → Search & Extract → Multimedia Support section:
- Use the Setup Guide button on the AI Voice Conversations card if you want an in-app walkthrough while filling the Speech fields.
| Setting | Value | Example |
|---|---|---|
| Enable Audio File Support | ✅ Checked | |
| Enable Speech-to-Text Input | Optional | |
| Enable Text-to-Speech | Optional | |
| Speech Service Endpoint | Resource-specific endpoint (with custom subdomain) | https://simplechat6-dev-speech.cognitiveservices.azure.com |
| Speech Service Location | Azure region | eastus2 |
| Speech Service Locale | Language locale for transcription | en-US |
| Authentication Type | Managed Identity | |
| Speech Subscription ID | Optional helper for building the ARM resource ID in the Admin UI | 12345678-1234-1234-1234-123456789abc |
| Speech Resource Group | Optional helper for building the ARM resource ID in the Admin UI | rg-speech-prod |
| Speech Resource Name | Optional helper for building the ARM resource ID in the Admin UI | my-speech-resource |
| Speech Service Key | (Leave empty when using MI) | |
| Speech Resource ID | Required when using managed identity for text-to-speech | /subscriptions/.../providers/Microsoft.CognitiveServices/accounts/<speech-resource-name> |
Critical:
- Endpoint must be the resource-specific URL (custom subdomain)
- Do NOT use the regional endpoint for managed identity
- If you have not created the custom domain yet, use the Azure portal walkthrough in Step 1 before saving the Speech endpoint in Admin Settings
- Remove the trailing slash from the endpoint: ✅ https://..azure.com ❌ https://..azure.com/
- If text-to-speech is enabled with managed identity, set the full Speech Resource ID in Admin Settings
- If you do not know the full resource ID, the Admin Settings page can build it from Subscription ID, Resource Group, and Speech Resource Name (a scripted equivalent is sketched after this list)
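If you prefer to assemble these values by script, here is a minimal sketch; all variable names and placeholder values are illustrative, not settings the app reads:

```bash
# Illustrative placeholders - substitute your own values
SUBSCRIPTION_ID="<subscription-id>"
RESOURCE_GROUP="<resource-group-name>"
SPEECH_RESOURCE_NAME="<speech-resource-name>"

# Resource-specific endpoint, with any trailing slash stripped
SPEECH_ENDPOINT="https://${SPEECH_RESOURCE_NAME}.cognitiveservices.azure.com"
SPEECH_ENDPOINT="${SPEECH_ENDPOINT%/}"

# Full ARM resource ID, required for text-to-speech with managed identity
SPEECH_RESOURCE_ID="/subscriptions/${SUBSCRIPTION_ID}/resourceGroups/${RESOURCE_GROUP}/providers/Microsoft.CognitiveServices/accounts/${SPEECH_RESOURCE_NAME}"

echo "Endpoint:    ${SPEECH_ENDPOINT}"
echo "Resource ID: ${SPEECH_RESOURCE_ID}"
```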
Step 4: Test Audio Upload
- Upload a short WAV or MP3 file
- Monitor application logs for transcription progress
- Expected log output:
File size: 1677804 bytes
Produced 1 WAV chunks: ['/tmp/tmp_chunk_000.wav']
[Debug] Transcribing WAV chunk: /tmp/tmp_chunk_000.wav
[Debug] Speech config obtained successfully
[Debug] Received 5 phrases
Creating 3 transcript pages
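To isolate Speech configuration from the rest of the app, you can also drive one transcription from a shell. This sketch targets the fast transcription REST API; the api-version value and multipart field names are assumptions based on the 2024-11-15 fast transcription API, so adjust them to match your deployment:

```bash
# Token for the Cognitive Services data plane (runs under your az CLI identity,
# not the App Service managed identity - grant your identity the same role to test)
TOKEN=$(az account get-access-token \
  --resource https://cognitiveservices.azure.com \
  --query accessToken -o tsv)

curl -sS -X POST \
  "https://<resource-name>.cognitiveservices.azure.com/speechtotext/transcriptions:transcribe?api-version=2024-11-15" \
  -H "Authorization: Bearer $TOKEN" \
  -F 'definition={"locales":["en-US"]}' \
  -F 'audio=@sample.wav'
```

A JSON transcript back confirms the endpoint and RBAC wiring; a 400 or 401 points back to Steps 1 and 2.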
Troubleshooting
Error: NameResolutionError - Failed to resolve hostname
Symptom: Failed to resolve 'simplechat6-dev-speech.cognitiveservices.azure.com'
Cause: Custom subdomain not enabled on Speech resource
Solution: Enable custom subdomain using Step 1 above
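A quick DNS check confirms whether the subdomain exists yet:

```bash
# NXDOMAIN here means the custom subdomain from Step 1 was never created
nslookup <custom-name>.cognitiveservices.azure.com
```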
Error: 400 BadRequest when using MI with regional endpoint
Symptom: 400 Client Error: BadRequest for url: https://eastus2.api.cognitive.microsoft.com/speechtotext/transcriptions:transcribe
Cause: Managed identity requires resource-specific endpoint, not regional
Solution: Update Admin Settings endpoint to use https://<resource-name>.cognitiveservices.azure.com
Error: 401 Authentication error with MI
Symptom: WebSocket upgrade failed: Authentication error (401)
Cause: Missing RBAC role assignments
Solution: Assign required roles using Step 2 above
Error: Text-to-speech fails with MI but transcription works
Symptom: Audio uploads or speech-to-text input succeed, but /api/chat/tts fails when authentication type is Managed Identity.
Cause: Text-to-speech managed identity also requires the Speech Resource ID in addition to the custom-domain endpoint and region.
Solution: Populate Speech Resource ID in Admin Settings and verify the App Service managed identity has the required RBAC role(s).
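To fetch the exact resource ID for pasting into Admin Settings:

```bash
az cognitiveservices account show \
  --name <speech-resource-name> \
  --resource-group <resource-group-name> \
  --query id -o tsv
```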
Key auth works but MI fails
Diagnosis checklist (a scripted pass over these checks follows the list):
- Custom subdomain enabled on Speech resource?
- Admin Settings endpoint is resource-specific (not regional)?
- Managed identity has RBAC roles on Speech resource?
- Authentication Type set to “Managed Identity” in Admin Settings?
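A scripted pass over the same checks, assuming $MI_PRINCIPAL_ID and $SPEECH_RESOURCE_ID from Step 2:

```bash
# Checks 1-2: custom subdomain present and endpoint resource-specific
az cognitiveservices account show \
  --name <speech-resource-name> \
  --resource-group <resource-group-name> \
  --query "{customSubDomainName:properties.customSubDomainName, endpoint:properties.endpoint}"

# Check 3: role names the managed identity holds on the Speech resource
az role assignment list \
  --assignee $MI_PRINCIPAL_ID \
  --scope $SPEECH_RESOURCE_ID \
  --query "[].roleDefinitionName" -o tsv
```

Check 4 (Authentication Type) lives in Admin Settings and has no CLI equivalent.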
Summary
| Authentication Method | Endpoint Type | Example | Works? |
|---|---|---|---|
| Key | Regional | https://eastus2.api.cognitive.microsoft.com | ✅ Yes |
| Key | Resource-specific | https://simplechat6-dev-speech.cognitiveservices.azure.com | ✅ Yes |
| Managed Identity | Regional | https://eastus2.api.cognitive.microsoft.com | ❌ No (400 BadRequest) |
| Managed Identity | Resource-specific | https://simplechat6-dev-speech.cognitiveservices.azure.com | ✅ Yes (with custom subdomain) |
Key takeaway: Managed identity for Azure Cognitive Services data-plane operations requires:
- Custom subdomain enabled on the resource
- Resource-specific endpoint configured in your application
- RBAC roles assigned to the managed identity at the resource scope