Task 03 - Implement vector search of audio transcriptions and summaries (30 minutes)
Introduction
After seeing you generate a call transcript and summary in the prior tasks of this exercise, the Contoso Suites development team is excited at the prospect of saving these transcripts and then searching through them. They would like to use Cosmos DB to store the transcript and then perform semantic search–that is, searching by concepts or meanings rather than specific words. The primary benefit to semantic search is that requesters can retrieve documents based on the meaning of terms rather than exact term match. This will return documents that the requester might have intended but did not include any of the precise words that appeared in the request. Further, we will search on the abstractive summarization of the transcript, shortening the transcript to a segment of approximately two sentences. Searching the summaries rather than full texts will reduce the number of false positive results when using a vector comparison technique such as cosine similarity for search.
Description
In this task, you will implement functionality to save the text of a call transcription, as well as its embedding (via Azure OpenAI), in a Cosmos DB container. You will then make use of Cosmos DB’s vector similarity capability to perform semantic search in the existing Streamlit application.
Success Criteria
- You are saving transcripts and abstractive summaries in Cosmos DB, as well as an embedding of the abstractive summary.
- You are able to perform a search on the Call Center Search page, returning call transcripts based on a text query.
Learning Resources
- Index and query vectors in Azure Cosmos DB for NoSQL in Python
- Tutorial: Explore Azure OpenAI Service embeddings and document search
- cosmos-vector-aoai
- azure-cosmos-db-vector-search-openai-python
Key Tasks
01: Create call transcripts container
Create a new Cosmos DB container in the ContosoSuites
database. Name this container CallTranscripts
. It should have a partition key of /call_id
and the same vector embedding policy that you used in Exercise 03: a container vector policy using the cosine
distance function, based on a float32
data type, and 1536
dimensions, using an index type of diskANN
. Designate the field for storing vector data as request_vector
.
Expand this section to view the solution
Container vector policies and vector indexing policies must be defined at the time of container creation. In order to create a container, perform the following steps:
- In the Azure portal, navigate to your Cosmos DB resource.
- Select Data Explorer in the left-hand menu.
- On the Data Explorer page, select New Container
- In the New Container dialog:
- Select Use existing under Database id and select the ContosoSuites database from the dropdown list.
- Enter
CallTranscripts
into the Container id box. - Enter
/call_id
into the Partition key box. - Expand the Container Vectory Policy section of the dialog, select Add vector embedding, and then enter the following values into the specified fields:
- Path: Enter “/request_vector”.
- Data type: Select float32.
- Distance function: Select cosine.
- Dimensions: Enter 1536. This is based on the number of dimensions generated by the
ada-text-embedding-002
model in Azure OpenAI. - Index type: Select diskANN. Given the number of dimensions being specified, 1536, the
flat
index type will not work, as it only supports a maximum of 505 dimensions for vectors. ThequantizedFlat
index could also be used here.diskANN
is a more efficient index type, but given the amount of data we are working with in this lab, you likely will not notice any difference in performance.
- Select OK to create the container.
02: Make embedding request
Open the 4_Call_Center.py
file in the ContosoSuitesDashboard
folder. Complete the function make_azure_openai_embedding_request()
, accepting a string of text and creating an embedding based on it. Be sure to use the embedding_deployment_name
secret, as the GPT-4o model deployment will not be appropriate for generating embeddings.
Expand this section to view the solution
The completed version of the make_azure_openai_embedding_request()
function is as follows:
def make_azure_openai_embedding_request(text):
"""Create and return a new embedding request. Key assumptions:
- Azure OpenAI endpoint, key, and deployment name stored in Streamlit secrets."""
token_provider = get_bearer_token_provider(
DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)
aoai_endpoint = st.secrets["aoai"]["endpoint"]
aoai_embedding_deployment_name = st.secrets["aoai"]["embedding_deployment_name"]
client = openai.AzureOpenAI(
azure_ad_token_provider=token_provider,
api_version="2024-06-01",
azure_endpoint = aoai_endpoint
)
# Create and return a new embedding request
return client.embeddings.create(
model=aoai_embedding_deployment_name,
input=text
)
If the code shows some errors on the DefaultAzureCredential
or get_bearer_token_provider
you may need to add this line at the top of the file:
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
03: Generate embeddings
Complete the function generate_embeddings_for_call_contents()
. This function should normalize the incoming text before calling make_azure_openai_embedding_request()
. Then, you should return the actual embeddings–not just the entire response object!.
Expand this section to view the solution
The completed version of the generate_embeddings_for_call_contents()
function is as follows:
def generate_embeddings_for_call_contents(call_contents):
"""Generate embeddings for call contents. Key assumptions:
- Call contents is a single string.
- Azure OpenAI endpoint, key, and deployment name stored in Streamlit secrets."""
# Normalize the text for tokenization
normalized_content = normalize_text(call_contents)
# Call make_azure_openai_embedding_request() with the normalized content
response = make_azure_openai_embedding_request(normalized_content)
return response.data[0].embedding
04: Save transcript
Complete the function save_transcript_to_cosmos_db()
. This function should create a new record in the CallTranscripts
container.
Expand this section to view the solution
The completed version of the save_transcript_to_cosmos_db()
function is as follows:
def save_transcript_to_cosmos_db(transcript_item):
"""Save embeddings to Cosmos DB vector store. Key assumptions:
- transcript_item is a JSON object containing call_id (int),
call_transcript (string), and request_vector (list).
- Cosmos DB endpoint, client_id, and database name stored in Streamlit secrets."""
cosmos_client_id = st.secrets["cosmos"]["client_id"]
cosmos_credentials = DefaultAzureCredential(managed_identity_client_id=cosmos_client_id)
cosmos_endpoint = st.secrets["cosmos"]["endpoint"]
cosmos_database_name = st.secrets["cosmos"]["database_name"]
cosmos_container_name = "CallTranscripts"
# Create a CosmosClient
client = CosmosClient(url=cosmos_endpoint, credential=cosmos_credentials)
# Load the Cosmos database and container
database = client.get_database_client(cosmos_database_name)
container = database.get_container_client(cosmos_container_name)
# Insert the call transcript
container.create_item(body=transcript_item)
05: Update call center search page
Open the 5_Call_Center_search.py
file. Replace the make_azure_openai_embedding_request()
stub function with the function you created in 4_Call_Center.py
. The function call will be exactly the same between these two files.
Expand this section to view the solution
The completed version of the make_azure_openai_embedding_request()
function is as follows:
def make_azure_openai_embedding_request(text):
"""Create and return a new embedding request. Key assumptions:
- Azure OpenAI endpoint, key, and deployment name stored in Streamlit secrets."""
token_provider = get_bearer_token_provider(
DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)
aoai_endpoint = st.secrets["aoai"]["endpoint"]
aoai_embedding_deployment_name = st.secrets["aoai"]["embedding_deployment_name"]
client = openai.AzureOpenAI(
azure_ad_token_provider=token_provider,
api_version="2024-06-01",
azure_endpoint = aoai_endpoint
)
# Create and return a new embedding request
return client.embeddings.create(
model=aoai_embedding_deployment_name,
input=text
)
06: Make Cosmos DB vector search request
Complete the make_cosmos_db_vector_search_request()
function. It should perform a VectorDistance()
query and return up to max_results
results with a minimum similarity score of minimum_similarity_score
. The columns it should return include id
, call_id
, call_transcript
, abstractive_summary
, and the results of VectorDistance()
as SimilarityScore
.
Expand this section to view the solution
The completed version of the make_cosmos_db_vector_search_request()
function is as follows:
def make_cosmos_db_vector_search_request(query_embedding, max_results=5,minimum_similarity_score=0.5):
"""Create and return a new vector search request. Key assumptions:
- Query embedding is a list of floats based on a search string.
- Cosmos DB endpoint, client_id, and database name stored in Streamlit secrets."""
cosmos_client_id = st.secrets["cosmos"]["client_id"]
cosmos_credentials = DefaultAzureCredential(managed_identity_client_id=cosmos_client_id)
cosmos_endpoint = st.secrets["cosmos"]["endpoint"]
cosmos_database_name = st.secrets["cosmos"]["database_name"]
cosmos_container_name = "CallTranscripts"
# Create a CosmosClient
client = CosmosClient(url=cosmos_endpoint, credential=cosmos_credentials)
# Load the Cosmos database and container
database = client.get_database_client(cosmos_database_name)
container = database.get_container_client(cosmos_container_name)
results = container.query_items(
query=f"""
SELECT TOP {max_results}
c.id,
c.call_id,
c.call_transcript,
c.abstractive_summary,
VectorDistance(c.request_vector, @request_vector) AS SimilarityScore
FROM c
WHERE
VectorDistance(c.request_vector, @request_vector) > {minimum_similarity_score}
ORDER BY
VectorDistance(c.request_vector, @request_vector)
""",
parameters=[
{"name": "@request_vector", "value": query_embedding}
],
enable_cross_partition_query=True
)
# Create and return a new vector search request
return results
07: Test and deploy
After filling in these code segments, re-run the application and navigate to the Call Center page. Ensure that you can still generate a transcript of the sample call audio. Then, run the function to save your transcript into Cosmos DB. After that, switch to the Call Center Search page and use the term “Airport Gateway hotel” for your search. Ensure that you get one result back. Then, increase the Minimum Similarity Score slider to a point higher than your similarity score and ensure you get zero records back.
Then, deploy the application and ensure that the functionality behaves as expected as an App Service.