Azure Speech Managed Identity Setup
Configure Azure Speech for managed identity correctly by using the resource-specific endpoint, the right RBAC roles, and the matching Admin Settings fields.
Speech is where managed identity setup gets unusually specific. The most important rule is simple: the regional endpoint that works with keys is not enough for managed identity. You need the resource-specific custom-subdomain endpoint so Azure can evaluate RBAC against the correct Speech resource.
Use the resource-specific endpoint
Managed identity needs the custom-subdomain Speech endpoint, not the shared regional gateway endpoint used by key-based authentication.
Grant RBAC on the Speech resource
Start with Cognitive Services Speech User and add Cognitive Services Speech Contributor only if the specific transcription flow still needs it.
Fill the matching admin fields
Endpoint, region, locale, authentication type, and resource ID all need to align with the Speech resource you are actually authorizing.
Test with a real audio workflow
Validate with upload, transcription, and optional text-to-speech scenarios rather than stopping at a configuration save.
The endpoint format is the root-cause difference
With key authentication, the regional endpoint can infer the target resource from the key. With managed identity, Azure needs the hostname itself to identify the specific Speech resource so it can evaluate RBAC. That is why the custom subdomain is mandatory here.
Authentication Methods: Regional vs. Resource-Specific Endpoints
Regional Endpoint (Shared Gateway)
Endpoint format: https://<region>.api.cognitive.microsoft.com
- Example: https://eastus2.api.cognitive.microsoft.com
- This is a shared endpoint for all Speech resources in that Azure region
- Acts as a gateway that routes requests to individual Speech resources
Resource-Specific Endpoint (Custom Subdomain)
Endpoint format: https://<resource-name>.cognitiveservices.azure.com
- Example: https://simplechat6-dev-speech.cognitiveservices.azure.com
- This is a unique endpoint dedicated to your specific Speech resource
- Requires custom subdomain to be enabled on the resource
Why Regional Endpoint Works with Key but NOT Managed Identity
Key-Based Authentication ✅ Works with Regional Endpoint
When using subscription key authentication:
POST https://eastus2.api.cognitive.microsoft.com/speechtotext/transcriptions:transcribe
Headers:
Ocp-Apim-Subscription-Key: abc123def456...
Why it works:
- The subscription key directly identifies your specific Speech resource
- The regional gateway uses the key to look up which resource it belongs to
- The request is automatically routed to your resource
- Authorization succeeds because the key proves ownership
Managed Identity (AAD Token) ❌ Fails with Regional Endpoint
When using managed identity authentication:
POST https://eastus2.api.cognitive.microsoft.com/speechtotext/transcriptions:transcribe
Headers:
Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGc...
Why it fails (returns 400 BadRequest):
- The Bearer token proves your App Service identity to Azure AD
- The token does NOT specify which Speech resource you want to access
- The regional gateway cannot determine:
- Which specific Speech resource you’re authorized for
- Whether your managed identity has RBAC roles on that resource
- Result: The gateway rejects the request with 400 BadRequest
Managed Identity ✅ Works with Resource-Specific Endpoint
When using managed identity with custom subdomain:
POST https://simplechat6-dev-speech.cognitiveservices.azure.com/speechtotext/transcriptions:transcribe
Headers:
Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGc...
Why it works:
- The hostname itself identifies your specific Speech resource
- Azure validates your managed identity Bearer token against that resource’s RBAC
- If your App Service MI has the Cognitive Services Speech User role → authorized
- The request proceeds to your dedicated Speech resource instance
For some transcription operations, you may also need Cognitive Services Speech Contributor. Start with Speech User, then add Speech Contributor if transcription still fails after endpoint and identity configuration are correct.
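You can observe this difference directly from a shell. Below is a minimal sketch, assuming an az CLI login whose identity holds a Speech role on the resource; the batch transcription list call is used only as a cheap authenticated probe, and the resource name is illustrative:

```bash
# AAD access token for the Cognitive Services data plane
TOKEN=$(az account get-access-token \
  --resource https://cognitiveservices.azure.com \
  --query accessToken -o tsv)

# Same token against the regional gateway: no way to map it to one resource
curl -s -o /dev/null -w "regional:          %{http_code}\n" \
  "https://eastus2.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions" \
  -H "Authorization: Bearer $TOKEN"

# Same token against the custom subdomain: the hostname names the resource
curl -s -o /dev/null -w "resource-specific: %{http_code}\n" \
  "https://simplechat6-dev-speech.cognitiveservices.azure.com/speechtotext/v3.2/transcriptions" \
  -H "Authorization: Bearer $TOKEN"
```

Per the behavior above, the regional call fails (400) while the resource-specific call succeeds once RBAC is in place.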
Required Setup for Managed Identity
Prerequisites
- Azure Speech Service resource created in your subscription
- System-assigned or user-assigned managed identity on your App Service
- RBAC role assignments on the Speech resource
Step 1: Turn On the Custom Domain on the Speech Resource
Why needed: By default, Speech resources use the regional endpoint and do NOT have custom subdomains. Managed identity requires the resource-specific endpoint.
Azure portal walkthrough
- Go to the Azure portal and open your Azure AI Speech resource.
- In the left pane under Resource Management, select Networking.
- Open the Firewalls and virtual networks tab.
- Select Generate Custom Domain Name.
- Enter a globally unique custom domain name. The final endpoint will look like https://<custom-name>.cognitiveservices.azure.com.
- Select Save.
- After the update finishes, open Keys and Endpoint and confirm the resource endpoint now starts with https://<custom-name>.cognitiveservices.azure.com.
Important notes:
- Custom subdomain name must be globally unique across Azure
- Usually use the same name as your resource: <resource-name>
- One-way operation: cannot be disabled once enabled
- Microsoft Learn recommends trying the change on a test resource first if the production Speech resource already has many Speech Studio models or projects
Azure CLI alternative
If you prefer CLI instead of the portal:
az account set --subscription <subscription-id>
az cognitiveservices account update \
--name <speech-resource-name> \
--resource-group <resource-group-name> \
--custom-domain <speech-resource-name>
Example:
az account set --subscription <subscription-id>
az cognitiveservices account update \
--name simplechat6-dev-speech \
--resource-group sc-simplechat6-dev-rg \
--custom-domain simplechat6-dev-speech
Verify the custom domain is enabled
Portal verification:
- Open the Speech resource.
- Go to Keys and Endpoint.
- Confirm the endpoint now starts with https://<custom-name>.cognitiveservices.azure.com instead of https://<region>.api.cognitive.microsoft.com.
CLI verification:
az cognitiveservices account show \
--name <speech-resource-name> \
--resource-group <resource-group-name> \
--query "{customSubDomainName:properties.customSubDomainName, endpoint:properties.endpoint}"
Expected output:
{
"customSubDomainName": "simplechat6-dev-speech",
"endpoint": "https://simplechat6-dev-speech.cognitiveservices.azure.com/"
}
Step 2: Assign RBAC Roles to Managed Identity
Grant your App Service managed identity the necessary roles on the Speech resource:
# Get the Speech resource ID
SPEECH_RESOURCE_ID=$(az cognitiveservices account show \
--name <speech-resource-name> \
--resource-group <resource-group-name> \
--query id -o tsv)
# Get the App Service managed identity principal ID
MI_PRINCIPAL_ID=$(az webapp identity show \
--name <app-service-name> \
--resource-group <resource-group-name> \
--query principalId -o tsv)
# Assign Cognitive Services Speech User role (baseline data-plane access)
az role assignment create \
--assignee $MI_PRINCIPAL_ID \
--role "Cognitive Services Speech User" \
--scope $SPEECH_RESOURCE_ID
# Assign Cognitive Services Speech Contributor role (if transcription operations still require it)
az role assignment create \
--assignee $MI_PRINCIPAL_ID \
--role "Cognitive Services Speech Contributor" \
--scope $SPEECH_RESOURCE_ID
Verify role assignments:
az role assignment list \
--assignee $MI_PRINCIPAL_ID \
--scope $SPEECH_RESOURCE_ID \
-o table
Step 3: Configure Admin Settings
In the Admin Settings → Search & Extract → Multimedia Support section:
- Use the Setup Guide button on the AI Voice Conversations card if you want an in-app walkthrough while filling the Speech fields.
| Setting | Value | Example |
|---|---|---|
| Enable Audio File Support | ✅ Checked | |
| Enable Speech-to-Text Input | Optional | |
| Enable Text-to-Speech | Optional | |
| Speech Service Endpoint | Resource-specific endpoint (with custom subdomain) | https://simplechat6-dev-speech.cognitiveservices.azure.com |
| Speech Service Location | Azure region | eastus2 |
| Speech Service Locale | Language locale for transcription | en-US |
| Authentication Type | Managed Identity | |
| Speech Subscription ID | Optional helper for building the ARM resource ID in the Admin UI | 12345678-1234-1234-1234-123456789abc |
| Speech Resource Group | Optional helper for building the ARM resource ID in the Admin UI | rg-speech-prod |
| Speech Resource Name | Optional helper for building the ARM resource ID in the Admin UI | my-speech-resource |
| Speech Service Key | (Leave empty when using MI) | |
| Speech Resource ID | Required when using managed identity for text-to-speech | /subscriptions/.../providers/Microsoft.CognitiveServices/accounts/<speech-resource-name> |
Critical:
- Endpoint must be the resource-specific URL (custom subdomain)
- Do NOT use the regional endpoint for managed identity
- If you have not created the custom domain yet, use the Azure portal walkthrough in Step 1 before saving the Speech endpoint in Admin Settings
- Remove the trailing slash from the endpoint: ✅ https://..azure.com ❌ https://..azure.com/
- If text-to-speech is enabled with managed identity, set the full Speech Resource ID in Admin Settings
- If you do not know the full resource ID, the Admin Settings page can build it from Subscription ID, Resource Group, and Speech Resource Name (a scripted equivalent is sketched after this list)
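If you prefer to assemble these values by script, here is a minimal sketch; all variable names and placeholder values are illustrative, not settings the app reads:

```bash
# Illustrative placeholders - substitute your own values
SUBSCRIPTION_ID="<subscription-id>"
RESOURCE_GROUP="<resource-group-name>"
SPEECH_RESOURCE_NAME="<speech-resource-name>"

# Resource-specific endpoint, with any trailing slash stripped
SPEECH_ENDPOINT="https://${SPEECH_RESOURCE_NAME}.cognitiveservices.azure.com"
SPEECH_ENDPOINT="${SPEECH_ENDPOINT%/}"

# Full ARM resource ID, required for text-to-speech with managed identity
SPEECH_RESOURCE_ID="/subscriptions/${SUBSCRIPTION_ID}/resourceGroups/${RESOURCE_GROUP}/providers/Microsoft.CognitiveServices/accounts/${SPEECH_RESOURCE_NAME}"

echo "Endpoint:    ${SPEECH_ENDPOINT}"
echo "Resource ID: ${SPEECH_RESOURCE_ID}"
```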
Step 4: Test Audio Upload
- Upload a short WAV or MP3 file
- Monitor application logs for transcription progress
- Expected log output:
File size: 1677804 bytes
Produced 1 WAV chunks: ['/tmp/tmp_chunk_000.wav']
[Debug] Transcribing WAV chunk: /tmp/tmp_chunk_000.wav
[Debug] Speech config obtained successfully
[Debug] Received 5 phrases
Creating 3 transcript pages
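To isolate Speech configuration from the rest of the app, you can also drive one transcription from a shell. This sketch targets the fast transcription REST API; the api-version value and multipart field names are assumptions based on the 2024-11-15 fast transcription API, so adjust them to match your deployment:

```bash
# Token for the Cognitive Services data plane (runs under your az CLI identity,
# not the App Service managed identity - grant your identity the same role to test)
TOKEN=$(az account get-access-token \
  --resource https://cognitiveservices.azure.com \
  --query accessToken -o tsv)

curl -sS -X POST \
  "https://<resource-name>.cognitiveservices.azure.com/speechtotext/transcriptions:transcribe?api-version=2024-11-15" \
  -H "Authorization: Bearer $TOKEN" \
  -F 'definition={"locales":["en-US"]}' \
  -F 'audio=@sample.wav'
```

A JSON transcript back confirms the endpoint and RBAC wiring; a 400 or 401 points back to Steps 1 and 2.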
Troubleshooting
Error: NameResolutionError - Failed to resolve hostname
Symptom: Failed to resolve 'simplechat6-dev-speech.cognitiveservices.azure.com'
Cause: Custom subdomain not enabled on Speech resource
Solution: Enable custom subdomain using Step 1 above
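A quick DNS check confirms whether the subdomain exists yet:

```bash
# NXDOMAIN here means the custom subdomain from Step 1 was never created
nslookup <custom-name>.cognitiveservices.azure.com
```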
Error: 400 BadRequest when using MI with regional endpoint
Symptom: 400 Client Error: BadRequest for url: https://eastus2.api.cognitive.microsoft.com/speechtotext/transcriptions:transcribe
Cause: Managed identity requires resource-specific endpoint, not regional
Solution: Update Admin Settings endpoint to use https://<resource-name>.cognitiveservices.azure.com
Error: 401 Authentication error with MI
Symptom: WebSocket upgrade failed: Authentication error (401)
Cause: Missing RBAC role assignments
Solution: Assign required roles using Step 2 above
Error: Text-to-speech fails with MI but transcription works
Symptom: Audio uploads or speech-to-text input succeed, but /api/chat/tts fails when authentication type is Managed Identity.
Cause: Text-to-speech managed identity also requires the Speech Resource ID in addition to the custom-domain endpoint and region.
Solution: Populate Speech Resource ID in Admin Settings and verify the App Service managed identity has the required RBAC role(s).
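To fetch the exact resource ID for pasting into Admin Settings:

```bash
az cognitiveservices account show \
  --name <speech-resource-name> \
  --resource-group <resource-group-name> \
  --query id -o tsv
```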
Key auth works but MI fails
Diagnosis checklist (a scripted pass over these checks follows the list):
- Custom subdomain enabled on Speech resource?
- Admin Settings endpoint is resource-specific (not regional)?
- Managed identity has RBAC roles on Speech resource?
- Authentication Type set to “Managed Identity” in Admin Settings?
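A scripted pass over the same checks, assuming $MI_PRINCIPAL_ID and $SPEECH_RESOURCE_ID from Step 2:

```bash
# Checks 1-2: custom subdomain present and endpoint resource-specific
az cognitiveservices account show \
  --name <speech-resource-name> \
  --resource-group <resource-group-name> \
  --query "{customSubDomainName:properties.customSubDomainName, endpoint:properties.endpoint}"

# Check 3: role names the managed identity holds on the Speech resource
az role assignment list \
  --assignee $MI_PRINCIPAL_ID \
  --scope $SPEECH_RESOURCE_ID \
  --query "[].roleDefinitionName" -o tsv
```

Check 4 (Authentication Type) lives in Admin Settings and has no CLI equivalent.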
Summary
| Authentication Method | Endpoint Type | Example | Works? |
|---|---|---|---|
| Key | Regional | https://eastus2.api.cognitive.microsoft.com | ✅ Yes |
| Key | Resource-specific | https://simplechat6-dev-speech.cognitiveservices.azure.com | ✅ Yes |
| Managed Identity | Regional | https://eastus2.api.cognitive.microsoft.com | ❌ No (400 BadRequest) |
| Managed Identity | Resource-specific | https://simplechat6-dev-speech.cognitiveservices.azure.com | ✅ Yes (with custom subdomain) |
Key takeaway: Managed identity for Azure Cognitive Services data-plane operations requires:
- Custom subdomain enabled on the resource
- Resource-specific endpoint configured in your application
- RBAC roles assigned to the managed identity at the resource scope