Task 02 - Extract insights using Azure AI Language (30 minutes)
Introduction
Azure AI Language is a cloud-based service that provides Natural Language Processing (NLP) features for understanding and analyzing text. In the last task, you used the Azure AI Language service for text summarization. You will take the analysis a step further in this task, extracting named entities and performing sentiment analysis and opinion mining over the same call transcript.
- Sentiment analysis and opinion mining help you assess what people think of your brand or a topic by mining text for clues about positive or negative sentiment and associating that sentiment with specific aspects of the text.
- Named entity recognition (NER) categorizes entities (words or phrases) in unstructured text across several predefined category groups. Examples of these category groups include people, events, places, and dates. A minimal sketch showing both capabilities follows this list.
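Both capabilities are exposed through the `TextAnalyticsClient` in the Azure AI Language SDK for Python. The following is a minimal sketch only, assuming the `azure-ai-textanalytics` package and placeholder endpoint and key values; the exercise code reads its endpoint and key from the project's own configuration:

```python
# Minimal sketch: the endpoint and key below are placeholders, not values from the exercise.
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

client = TextAnalyticsClient(
    "https://<your-language-resource>.cognitiveservices.azure.com/",
    AzureKeyCredential("<your-language-key>"))

documents = ["The agent was friendly, but the wait time was far too long."]

# Sentiment analysis with opinion mining enabled.
sentiment_result = client.analyze_sentiment(documents, show_opinion_mining=True)[0]
print(sentiment_result.sentiment, sentiment_result.confidence_scores)

# Named entity recognition over the same text.
entity_result = client.recognize_entities(documents)[0]
for entity in entity_result.entities:
    print(entity.text, entity.category, entity.confidence_score)
```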
Description
In this task, you will use your existing Azure AI Services Language service to extract insights from a call transcript. You will start by analyzing the sentiment of the transcript and performing opinion mining against the text. Next, you will extract named entities from the call.
The key tasks are as follows:
- Use your Azure AI Services Language service to perform sentiment analysis and opinion mining on a transcript.
- Complete the `create_sentiment_analysis_and_opinion_mining_request()` function by implementing the TODO items within it.
- Complete the implementation of the “Analyze call sentiment and perform opinion mining” section of the `main()` function by finishing the TODO items.
- Upload the `03_Customer_Service_Call.wav` file to generate a transcription, then analyze the sentiment of the call and perform opinion mining against it.
- Finish implementing the `create_named_entity_extraction_request()` function by completing the TODO items within it.
- Complete the implementation of the “Extract named entities” section of the `main()` function.
- Extract named entities from the `03_Customer_Service_Call.wav` file transcript.
Success criteria
- You have completed the `create_sentiment_analysis_and_opinion_mining_request()` function and are able to output the sentiment analysis and opinion mining results to the Streamlit dashboard in JSON format.
- You have completed the `create_named_entity_extraction_request()` function and are able to output the list of named entities detected in the call transcript to the Streamlit dashboard in JSON format.
Learning resources
- What is Azure AI Language?
- What is sentiment analysis and opinion mining?
- How to: Use Sentiment analysis and Opinion Mining
- Quickstart: Sentiment analysis and opinion mining
- What is Named Entity Recognition (NER) in Azure AI Language?
- Quickstart: Detecting named entities (NER)
- st.button
- st.spinner
- st.success
- Streamlit Session State
Solution
Expand this section to view the solution
- The `create_sentiment_analysis_and_opinion_mining_request()` function in the `2_Call_Center.py` file uses the Azure AI Language service's `analyze_sentiment([joined_call_contents], show_opinion_mining=True)` method on the `TextAnalyticsClient` to analyze sentiment. By setting the `show_opinion_mining` parameter to `True`, you can instruct the client to also perform opinion mining while analyzing the text. The code to implement this function is as follows:

```python
def create_sentiment_analysis_and_opinion_mining_request(call_contents):
    # The call_contents parameter is formatted as a list of strings. Join them together with spaces to pass in as a single document.
    joined_call_contents = ' '.join(call_contents)

    # Create a Text Analytics Client
    client = TextAnalyticsClient(language_endpoint, AzureKeyCredential(language_key))

    # Analyze sentiment of call transcript, enabling opinion mining.
    result = client.analyze_sentiment([joined_call_contents], show_opinion_mining=True)

    # Retrieve all document results that are not an error.
    doc_result = [doc for doc in result if not doc.is_error]

    # The output format is a JSON document with the shape:
    # {
    #     "sentiment": document_sentiment,
    #     "sentiment-scores": {
    #         "positive": document_positive_score_as_two_decimal_float,
    #         "neutral": document_neutral_score_as_two_decimal_float,
    #         "negative": document_negative_score_as_two_decimal_float
    #     },
    #     "sentences": [
    #         {
    #             "text": sentence_text,
    #             "sentiment": document_sentiment,
    #             "sentiment-scores": {
    #                 "positive": document_positive_score_as_two_decimal_float,
    #                 "neutral": document_neutral_score_as_two_decimal_float,
    #                 "negative": document_negative_score_as_two_decimal_float
    #             },
    #             "mined_opinions": [
    #                 {
    #                     "target-sentiment": opinion_sentiment,
    #                     "target-text": opinion_target,
    #                     "target-scores": {
    #                         "positive": document_positive_score_as_two_decimal_float,
    #                         "neutral": document_neutral_score_as_two_decimal_float,
    #                         "negative": document_negative_score_as_two_decimal_float
    #                     },
    #                     "assessments": [
    #                         {
    #                             "assessment-sentiment": assessment_sentiment,
    #                             "assessment-text": assessment_text,
    #                             "assessment-scores": {
    #                                 "positive": document_positive_score_as_two_decimal_float,
    #                                 "negative": document_negative_score_as_two_decimal_float
    #                             }
    #                         }
    #                     ]
    #                 }
    #             ]
    #         }
    #     ]
    # }
    sentiment = {}

    # Assign the correct values to the JSON object.
    for document in doc_result:
        sentiment["sentiment"] = document.sentiment
        sentiment["sentiment-scores"] = {
            "positive": document.confidence_scores.positive,
            "neutral": document.confidence_scores.neutral,
            "negative": document.confidence_scores.negative
        }

        sentences = []
        for s in document.sentences:
            sentence = {}
            sentence["text"] = s.text
            sentence["sentiment"] = s.sentiment
            sentence["sentiment-scores"] = {
                "positive": s.confidence_scores.positive,
                "neutral": s.confidence_scores.neutral,
                "negative": s.confidence_scores.negative
            }

            mined_opinions = []
            for mined_opinion in s.mined_opinions:
                opinion = {}
                opinion["target-text"] = mined_opinion.target.text
                opinion["target-sentiment"] = mined_opinion.target.sentiment
                opinion["sentiment-scores"] = {
                    "positive": mined_opinion.target.confidence_scores.positive,
                    "negative": mined_opinion.target.confidence_scores.negative,
                }

                opinion_assessments = []
                for assessment in mined_opinion.assessments:
                    opinion_assessment = {}
                    opinion_assessment["text"] = assessment.text
                    opinion_assessment["sentiment"] = assessment.sentiment
                    opinion_assessment["sentiment-scores"] = {
                        "positive": assessment.confidence_scores.positive,
                        "negative": assessment.confidence_scores.negative
                    }
                    opinion_assessments.append(opinion_assessment)

                opinion["assessments"] = opinion_assessments
                mined_opinions.append(opinion)

            sentence["mined_opinions"] = mined_opinions
            sentences.append(sentence)

        sentiment["sentences"] = sentences

    return sentiment
```
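If you want to spot-check the completed function outside of the Streamlit app, a quick usage example looks like the following. The sample transcript here is hypothetical; in the app, the input comes from the file transcription results stored in session state:

```python
import json

# Hypothetical transcript lines standing in for the real transcription results.
sample_transcript = [
    "Thanks for calling. How can I help you today?",
    "My order arrived late and the box was damaged, but the support agent was very helpful."
]

sentiment = create_sentiment_analysis_and_opinion_mining_request(sample_transcript)
print(json.dumps(sentiment, indent=2))
```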
- The code to implement the “Analyze call sentiment and perform opinion mining” section of the `main()` function is as follows:

```python
if st.button("Analyze sentiment and mine opinions"):
    # Set call_contents to file_transcription_results. If it is empty, write out an error message for the user.
    if 'file_transcription_results' in st.session_state:
        # Use st.spinner() to wrap the sentiment analysis process.
        with st.spinner("Analyzing transcript sentiment and mining opinions..."):
            # Call the create_sentiment_analysis_and_opinion_mining_request function and set its results to a variable named sentiment_and_mined_opinions.
            sentiment_and_mined_opinions = create_sentiment_analysis_and_opinion_mining_request(st.session_state.file_transcription_results)

            # Save the sentiment_and_mined_opinions value to session state.
            st.session_state.sentiment_and_mined_opinions = sentiment_and_mined_opinions

            # Call st.success() to indicate that the sentiment analysis process is complete.
            if sentiment_and_mined_opinions is not None:
                st.success("Sentiment analysis and opinion mining complete!")
    else:
        st.error("Please upload an audio file or record a call before attempting to analyze sentiment and mine opinions.")

# Write the sentiment_and_mined_opinions value to the Streamlit dashboard.
if 'sentiment_and_mined_opinions' in st.session_state:
    st.write(st.session_state.sentiment_and_mined_opinions)
```
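Saving the results into `st.session_state` is what keeps them on screen: Streamlit re-runs the script from top to bottom on every interaction, so any value held only in a local variable disappears on the next button click. The stripped-down sketch below, with a hypothetical key name, illustrates the pattern the section above relies on:

```python
import streamlit as st

if st.button("Run analysis"):
    # Hypothetical placeholder result; stored in session state so it survives re-runs.
    st.session_state.analysis_result = {"sentiment": "positive"}

# On later re-runs the button returns False, but the stored value is still available.
if 'analysis_result' in st.session_state:
    st.write(st.session_state.analysis_result)
```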
- Extracting named entities from text using the `create_named_entity_extraction_request()` function is also accomplished using the `TextAnalyticsClient`. The code to implement this function is as follows:

```python
def create_named_entity_extraction_request(call_contents):
    # The call_contents parameter is formatted as a list of strings. Join them together with spaces to pass in as a single document.
    joined_call_contents = ' '.join(call_contents)

    # Create a Text Analytics Client
    client = TextAnalyticsClient(language_endpoint, AzureKeyCredential(language_key))

    # Recognize entities within the call transcript
    result = client.recognize_entities(documents=[joined_call_contents])[0]

    # Create named_entity list as a JSON array
    named_entities = []

    # Add each extracted named entity to the named_entity array.
    for entity in result.entities:
        named_entities.append({
            "text": entity.text,
            "category": entity.category,
            "subcategory": entity.subcategory,
            "length": entity.length,
            "offset": entity.offset,
            "confidence-score": entity.confidence_score
        })

    return named_entities
```
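As with the sentiment function, you can sanity-check the completed function with a hypothetical transcript before running it against a real transcription:

```python
import json

# Hypothetical transcript lines; real input comes from the uploaded call's transcription.
sample_transcript = [
    "Hi, this is Maria calling from Contoso Insurance.",
    "I'm following up on the claim I filed on March 3rd in Seattle."
]

named_entities = create_named_entity_extraction_request(sample_transcript)
print(json.dumps(named_entities, indent=2))
```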
- The code to implement the “Extract named entities” section of the `main()` function is as follows:

```python
if st.button("Extract named entities"):
    # Set call_contents to file_transcription_results. If it is empty, write out an error message for the user.
    if 'file_transcription_results' in st.session_state:
        # Use st.spinner() to wrap the named entity extraction process.
        with st.spinner("Extracting named entities..."):
            # Call the create_named_entity_extraction_request function and set its results to a variable named named_entities.
            named_entities = create_named_entity_extraction_request(st.session_state.file_transcription_results)

            # Save the named_entities value to session state.
            st.session_state.named_entities = named_entities

            # Call st.success() to indicate that the named entity extraction process is complete.
            if named_entities is not None:
                st.success("Named entity extraction complete!")
    else:
        st.error("Please upload an audio file or record a call before attempting to extract named entities.")

# Write the named_entities value to the Streamlit dashboard.
if 'named_entities' in st.session_state:
    st.write(st.session_state.named_entities)
```
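`st.write` renders Python dictionaries and lists as an interactive JSON viewer, which satisfies the success criteria. If you prefer to be explicit about the format, `st.json` is an equivalent option here; this is an optional variation, not part of the exercise's starter code:

```python
# Optional variation: render the stored entities explicitly as JSON.
if 'named_entities' in st.session_state:
    st.json(st.session_state.named_entities)
```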