Task 01 - Incorporate speech-to-text transcription and call compliance (20 minutes)
Introduction
Azure OpenAI is an important part of the Azure AI Services landscape, but there are several other services that we can use to support Contoso Suites, including the Azure AI Services Speech service. Understanding and transcribing speech is a critical part of a modern call center analytics system. We can use the Speech service to enable automated transcription, and then submit those results to Azure OpenAI and our GPT-4o deployment to determine whether a call is in compliance.
You will build a transcription engine and a compliance checker with three simple rules:
- Does the call contain vulgarity?
- If the call needs an indicator that Contoso Suites is recording it, did that indicator happen? An example of this is when you hear the message “This call may be monitored or recorded for quality assurance and training purposes” on telephone calls.
- If users wish to check whether a call is actually relevant to Contoso Suites, was the call relevant? Contoso Suites deals with the hotel and travel industry.
Although we are performing a fairly simple set of compliance checks, these could be more sophisticated and serve to automate business processes such as rating call center agents based on a fixed set of criteria or indicating the calls that might require further manual review.
Description
In this task, you will extend the existing Streamlit dashboard to include speech-to-text transcription and call compliance activities.
Success Criteria
- You are able to generate a transcript of an audio file.
- You are able to perform a compliance check of an audio transcription.
- You are able to deploy the application to Azure App Services and have the application continue to function as expected.
Learning Resources
- Speech service overview
- Quickstart: Hear and speak with chat models in the Azure AI Studio playground
- Quickstart: Recognize and convert speech to text
Key Tasks
01: Review call center page
In the Streamlit dashboard, open the page 4_Call_Center.py
. Begin by running the dashboard and reviewing how the Call Center page operates. The Contoso Suites team has implemented a simple file upload feature and provided you with a sample audio file for testing. That sample file is in src/data/audio
and is entitled 01_Customer_Service_Call.wav
. Use that file to generate a test transcript and ensure you understand how the code on the page works.
If you are using a GitHub Codespaces instance, you can right-click on the
01_Customer_Service_Call.wav
file and select Download to save a local copy of this file.
02: Update transcription request code
The Contoso Suites development team tried to implement transcription functionality but were unable to get the code to work. They would like you to complete the create_transcription_request()
function. There are three TODO
items in the function.
Expand this section to view the solution
The code to complete the create_transcription_request()
function is as follows:
# Subscribe to the events fired by the conversation transcriber
transcriber.transcribed.connect(handle_final_result)
transcriber.session_started.connect(lambda evt: print(f'SESSION STARTED: {evt}'))
transcriber.session_stopped.connect(lambda evt: print(f'SESSION STOPPED {evt}'))
transcriber.canceled.connect(lambda evt: print(f'CANCELED {evt}'))
# stop continuous transcription on either session stopped or canceled events
transcriber.session_stopped.connect(stop_cb)
transcriber.canceled.connect(stop_cb)
transcriber.start_transcribing_async()
# Read the whole wave files at once and stream it to sdk
_, wav_data = wavfile.read(audio_file)
stream.write(wav_data.tobytes())
stream.close()
while not done:
time.sleep(.5)
transcriber.stop_transcribing_async()
This code satisfies all of the TODO
blocks and should go immediately after the stop_cb()
function and before return all_results
.
03. Determine call compliance
Fill in the is_call_in_compliance()
function. They would like to perform one check every time: determine whether there was vulgarity on the call. There are two optional checks that need to happen if the user selects the appropriate checkbox. If the user checks include_recording_message
, we want check if the caller was aware that the call is being recorded. If is_relevant_to_topic
is active, we want to ask if the call is relevant to the hotel and resort industry. After setting up a system prompt which handles these mandatory and conditional checks, call the make_open_ai_chat_request()
function and return its message contents.
Expand this section to view the solution
The complete is_call_in_compliance()
is as follows:
@st.cache_data
def is_call_in_compliance(call_contents, include_recording_message, is_relevant_to_topic):
"""Analyze a call for relevance and compliance."""
joined_call_contents = ' '.join(call_contents)
if include_recording_message:
include_recording_message_text = "2. Was the caller aware that the call was being recorded?"
else:
include_recording_message_text = ""
if is_relevant_to_topic:
is_relevant_to_topic_text = "3. Was the call relevant to the hotel and resort industry?"
else:
is_relevant_to_topic_text = ""
system = f"""
You are an automated analysis system for Contoso Suites.
Contoso Suites is a luxury hotel and resort chain with locations
in a variety of Caribbean nations and territories.
You are analyzing a call for relevance and compliance.
You will only answer the following questions based on the call contents:
1. Was there vulgarity on the call?
{include_recording_message_text}
{is_relevant_to_topic_text}
"""
response = make_azure_openai_chat_request(system, joined_call_contents)
return response.choices[0].message.content
04. Test and deploy
After filling in these code segments, re-run the application and ensure that you can generate a transcript of the sample call audio and can generate a compliance check. You should get a response that the call contains no vulgarity, that the user is aware of the call being recorded, and that the call does pertain to the hotel and resort industry.
Then, deploy the application and ensure that your it still behaves correctly as an App Service.