Task 02 - Summarize and analyze call transcriptions (40 minutes)
Introduction
The intent of text summarization is to shorten content that users consider too long to read. AI-based summarization focuses on highlighting the most important aspects of the text. Azure AI Services provide several mechanisms for summarizing text. The Language service offers extractive and abstractive summarization functions, both of which condense text or documents into key sentences.
- Extractive summarization: Creates a summary by extracting sentences that collectively represent the most important or relevant information within the original content.
- Abstractive summarization: Produces a summary by generating summarized sentences from the document that capture the main idea.
In addition to the summarization capabilities of the Azure AI Services Language service, the Azure OpenAI Service also supports query-based summarization, leveraging the power of large language models. The service uses a generative completion model to produce a summary from text, and the model relies on natural language instructions to identify the requested task and the skill required, a process known as prompt engineering.
In this exercise, you will create an Azure AI Services Language service and compare the outputs of its extractive and abstractive summarization functions to the output of Azure OpenAI’s query-based text summarization.
You will then take the analysis a step further, extracting named entities and performing sentiment analysis and opinion mining over the same call transcript.
- Sentiment analysis and opinion mining are features that help you gauge what people think of your brand or topic by mining text for clues about positive or negative sentiment, and they can associate that sentiment with specific aspects of the text.
Description
In this task, you use an Azure AI Services Language service to summarize the transcript of a customer call generated on the 4_Call_Center.py page of your Streamlit application. You will write code to create summaries of a transcribed call recording using three different Azure AI Services capabilities and compare the results from each. Then, you will analyze sentiment and perform opinion mining against the transcript.
Success Criteria
- You are able to generate extractive and abstractive summaries of a call transcript using the Azure AI Services Language service.
- You are able to generate a summary of a call transcript using your GPT-4o model deployment.
- You are able to perform sentiment analysis and opinion mining using the Azure AI Services Language service.
- You are able to deploy the application to Azure App Services and have the application continue to function as expected.
Learning Resources
- What is Azure AI Language?
- What is document and conversation summarization?
- How to use document summarization
- Quickstart: Using extractive summarization
- Quickstart: Get started using GPT-35-Turbo and GPT-4 with Azure OpenAI Service
- Query-based document summarization
- What is sentiment analysis and opinion mining?
- How to: Use Sentiment analysis and Opinion Mining
- Quickstart: Sentiment analysis and opinion mining
Key Tasks
Note: The Quickstart guides in the Learning Resources section above show examples of each of the following techniques.
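The solution snippets below assume the following imports, which are most likely already present near the top of 4_Call_Center.py in the starter code (the summarization actions require azure-ai-textanalytics 5.3.0 or later); they are listed here only for reference:
# Imports assumed by the solution snippets in this task.
import json
import streamlit as st
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import (
    TextAnalyticsClient,
    ExtractiveSummaryAction,
    AbstractiveSummaryAction
)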
01: Generate extractive summary
Fill in the contents of generate_extractive_summary(). To do so, you will need to create a TextAnalyticsClient and perform an ExtractiveSummaryAction(). Extract at most the two most important sentences from the transcript. Generate the extractive summary and return a JSON object that includes the extractive summary as a string named call-summary.
Expand this section to view the solution
The code to complete the generate_extractive_summary() function is as follows:
# Create a TextAnalyticsClient, connecting it to your Language service endpoint.
client = TextAnalyticsClient(language_endpoint, AzureKeyCredential(language_key))

# Call the begin_analyze_actions method on your client, passing in the joined
# call_contents as an array and an ExtractiveSummaryAction with a max_sentence_count of 2.
poller = client.begin_analyze_actions(
    [joined_call_contents],
    actions=[
        ExtractiveSummaryAction(max_sentence_count=2)
    ]
)

# Extract the summary sentences and merge them into a single summary string.
for result in poller.result():
    summary_result = result[0]
    if summary_result.is_error:
        st.error(f'Extractive summary resulted in an error with code "{summary_result.code}" and message "{summary_result.message}"')
        return ''

    extractive_summary = " ".join([sentence.text for sentence in summary_result.sentences])

# Return the summary as a JSON object in the shape {"call-summary": extractive_summary}
return {"call-summary": extractive_summary}
This code should replace the return "This is a placeholder result. Fill in with real extractive summary." line of code.
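As a quick sanity check while developing, you can render the returned dictionary directly in Streamlit. This snippet is purely illustrative: the page already contains its own display logic, and the exact signature of generate_extractive_summary() (shown here taking a hypothetical call_contents list of transcript segments) comes from the starter code.
# Hypothetical usage; adjust to match the function signature in 4_Call_Center.py.
extractive = generate_extractive_summary(call_contents)
st.json(extractive)  # displays, for example, {"call-summary": "..."}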
02: Generate abstractive summary
Fill in the contents of generate_abstractive_summary(). To do so, you will need to create a TextAnalyticsClient and perform an AbstractiveSummaryAction(), with the requirement that it create at most two sentences based on the transcript. Generate the abstractive summary and return a JSON object that includes the abstractive summary as a string named call-summary.
Expand this section to view the solution
The code to complete the generate_abstractive_summary() function is as follows:
# Create a TextAnalyticsClient, connecting it to your Language service endpoint.
client = TextAnalyticsClient(language_endpoint, AzureKeyCredential(language_key))

# Call the begin_analyze_actions method on your client, passing in the joined
# call_contents as an array and an AbstractiveSummaryAction with a sentence_count of 2.
poller = client.begin_analyze_actions(
    [joined_call_contents],
    actions=[
        AbstractiveSummaryAction(sentence_count=2)
    ]
)

# Extract the summary sentences and merge them into a single summary string.
for result in poller.result():
    summary_result = result[0]
    if summary_result.is_error:
        st.error(f'Abstractive summary resulted in an error with code "{summary_result.code}" and message "{summary_result.message}"')
        return ''

    abstractive_summary = " ".join([summary.text for summary in summary_result.summaries])

# Return the summary as a JSON object in the shape {"call-summary": abstractive_summary}
return {"call-summary": abstractive_summary}
This code should replace the return "This is a placeholder result. Fill in with real abstractive summary." line of code.
03: Generate query-based summary
Fill in the contents of generate_query_based_summary(). This will execute against the GPT-4o model deployment. You should create a system prompt that asks for a five-word summary and labels it “call-title”, then asks for a two-sentence summary and labels it “call-summary”. Finally, the prompt should instruct the model to output the results in JSON format.
Expand this section to view the solution
The code to complete the generate_query_based_summary() function is as follows:
# Write a system prompt that instructs the large language model to:
# - Generate a short (5 word) summary from the call transcript.
# - Create a two-sentence summary of the call transcript.
# - Output the response in JSON format, with the short summary
# labeled 'call-title' and the longer summary labeled 'call-summary.'
system = """
Write a five-word summary and label it as call-title.
Write a two-sentence summary and label it as call-summary.
Output the results in JSON format.
"""
# Call make_azure_openai_chat_request().
response = make_azure_openai_chat_request(system, joined_call_contents)
# Return the summary.
return response.choices[0].message.content
This code should replace the return "This is a placeholder result. Fill in with real query-based summary." line of code.
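The make_azure_openai_chat_request() helper is part of the starter code rather than something you write in this task. Its exact implementation may differ, but a minimal sketch, assuming the openai package's AzureOpenAI client and that the endpoint, key, and GPT-4o deployment name are already loaded into aoai_endpoint, aoai_key, and aoai_deployment_name (hypothetical names), looks like this:
# Hypothetical sketch of the helper used above; the starter code's version may differ.
from openai import AzureOpenAI

def make_azure_openai_chat_request(system, call_contents):
    # Create an Azure OpenAI client (assumption: key-based authentication).
    client = AzureOpenAI(
        azure_endpoint=aoai_endpoint,
        api_key=aoai_key,
        api_version="2024-06-01"
    )
    # Send the system prompt plus the joined call transcript as the user message.
    return client.chat.completions.create(
        model=aoai_deployment_name,
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": call_contents}
        ]
    )
Note that the system prompt only asks the model to emit JSON; it does not force it. If the response is occasionally wrapped in extra text or markdown fences, parse it defensively (for example, with json.loads() inside a try/except) before displaying it.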
04: Perform sentiment analysis and opinion mining
Fill in the contents of create_sentiment_analysis_and_opinion_mining_request(). To do so, you will need to create a TextAnalyticsClient and then execute the analyze_sentiment() method. The output you will receive is a JSON document in the following shape:
{
    "sentiment": document_sentiment,
    "sentiment-scores": {
        "positive": document_positive_score_as_two_decimal_float,
        "neutral": document_neutral_score_as_two_decimal_float,
        "negative": document_negative_score_as_two_decimal_float
    },
    "sentences": [
        {
            "text": sentence_text,
            "sentiment": document_sentiment,
            "sentiment-scores": {
                "positive": document_positive_score_as_two_decimal_float,
                "neutral": document_neutral_score_as_two_decimal_float,
                "negative": document_negative_score_as_two_decimal_float
            },
            "mined_opinions": [
                {
                    "target-sentiment": opinion_sentiment,
                    "target-text": opinion_target,
                    "target-scores": {
                        "positive": document_positive_score_as_two_decimal_float,
                        "neutral": document_neutral_score_as_two_decimal_float,
                        "negative": document_negative_score_as_two_decimal_float
                    },
                    "assessments": [
                        {
                            "assessment-sentiment": assessment_sentiment,
                            "assessment-text": assessment_text,
                            "assessment-scores": {
                                "positive": document_positive_score_as_two_decimal_float,
                                "negative": document_negative_score_as_two_decimal_float
                            }
                        }
                    ]
                }
            ]
        }
    ]
}
Keeping this document shape in mind, create a new dictionary named sentiment. For each valid document you receive, set the sentiment field to the document's overall sentiment, and set the sentiment-scores field to a dictionary containing the document's positive, neutral, and negative confidence scores.
Then, for each sentence in the document, build a similar structure, tracking the sentence text, its sentiment, and its sentiment scores (positive, neutral, and negative).
Continue in this vein with mined opinions, tracking, for each mined opinion in a sentence, the target text, the target sentiment, and the sentiment scores (positive and negative).
For each mined opinion, also collect its assessments, tracking the assessment text, sentiment, and sentiment scores (positive and negative). Collect all of this in the sentiment dictionary and return it as the output of the function.
Expand this section to view the solution
The code to complete the create_sentiment_analysis_and_opinion_mining_request() function is as follows:
# Create a Text Analytics Client
client = TextAnalyticsClient(language_endpoint, AzureKeyCredential(language_key))
# Analyze sentiment of call transcript, enabling opinion mining.
result = client.analyze_sentiment([joined_call_contents], show_opinion_mining=True)
# Retrieve all document results that are not an error.
doc_result = [doc for doc in result if not doc.is_error]
# The output format is a JSON document with the shape:
# {
#     "sentiment": document_sentiment,
#     "sentiment-scores": {
#         "positive": document_positive_score_as_two_decimal_float,
#         "neutral": document_neutral_score_as_two_decimal_float,
#         "negative": document_negative_score_as_two_decimal_float
#     },
#     "sentences": [
#         {
#             "text": sentence_text,
#             "sentiment": document_sentiment,
#             "sentiment-scores": {
#                 "positive": document_positive_score_as_two_decimal_float,
#                 "neutral": document_neutral_score_as_two_decimal_float,
#                 "negative": document_negative_score_as_two_decimal_float
#             },
#             "mined_opinions": [
#                 {
#                     "target-sentiment": opinion_sentiment,
#                     "target-text": opinion_target,
#                     "target-scores": {
#                         "positive": document_positive_score_as_two_decimal_float,
#                         "neutral": document_neutral_score_as_two_decimal_float,
#                         "negative": document_negative_score_as_two_decimal_float
#                     },
#                     "assessments": [
#                         {
#                             "assessment-sentiment": assessment_sentiment,
#                             "assessment-text": assessment_text,
#                             "assessment-scores": {
#                                 "positive": document_positive_score_as_two_decimal_float,
#                                 "negative": document_negative_score_as_two_decimal_float
#                             }
#                         }
#                     ]
#                 }
#             ]
#         }
#     ]
# }
sentiment = {}

# Assign the correct values to the JSON object.
for document in doc_result:
    sentiment["sentiment"] = document.sentiment
    sentiment["sentiment-scores"] = {
        "positive": document.confidence_scores.positive,
        "neutral": document.confidence_scores.neutral,
        "negative": document.confidence_scores.negative
    }

    sentences = []
    for s in document.sentences:
        sentence = {}
        sentence["text"] = s.text
        sentence["sentiment"] = s.sentiment
        sentence["sentiment-scores"] = {
            "positive": s.confidence_scores.positive,
            "neutral": s.confidence_scores.neutral,
            "negative": s.confidence_scores.negative
        }

        mined_opinions = []
        for mined_opinion in s.mined_opinions:
            opinion = {}
            opinion["target-text"] = mined_opinion.target.text
            opinion["target-sentiment"] = mined_opinion.target.sentiment
            opinion["sentiment-scores"] = {
                "positive": mined_opinion.target.confidence_scores.positive,
                "negative": mined_opinion.target.confidence_scores.negative,
            }

            opinion_assessments = []
            for assessment in mined_opinion.assessments:
                opinion_assessment = {}
                opinion_assessment["text"] = assessment.text
                opinion_assessment["sentiment"] = assessment.sentiment
                opinion_assessment["sentiment-scores"] = {
                    "positive": assessment.confidence_scores.positive,
                    "negative": assessment.confidence_scores.negative
                }
                opinion_assessments.append(opinion_assessment)

            opinion["assessments"] = opinion_assessments
            mined_opinions.append(opinion)

        sentence["mined_opinions"] = mined_opinions
        sentences.append(sentence)

    sentiment["sentences"] = sentences

return sentiment
This code should replace the return "This is a placeholder result. Fill in with real sentiment analysis." line of code.
05: Test and deploy
After filling in these code segments, re-run the application and navigate to the Call Center page. Ensure that you can generate a transcript of the sample call audio. Then, run the extractive, abstractive, and Azure OpenAI summaries. After that, run the sentiment and opinions check. For each operation, you should get back valid results relating to the call transcript.
Then, deploy the application and ensure that the functionality continues to behave as expected when running in Azure App Service.