Task 02 - Extract insights using Azure AI Language (30 minutes)
Introduction
Azure AI Language is a cloud-based service that provides Natural Language Processing (NLP) features for understanding and analyzing text. In the last task, you used the Azure AI Language service for text summarization. You will take the analysis a step further in this task, extracting named entities and performing sentiment analysis and opinion mining over the same call transcript.
- Sentiment analysis and opinion mining help you assess what people think of your brand or a topic by mining text for clues about positive or negative sentiment and associating that sentiment with specific aspects of the text.
- Named entity recognition (NER) categorizes entities (words or phrases) in unstructured text across several predefined category groups. Examples of these category groups include people, events, places, and dates. A minimal sketch showing both capabilities follows this list.
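Both capabilities are exposed through the `TextAnalyticsClient` in the Azure AI Language SDK for Python. The following is a minimal sketch only, assuming the `azure-ai-textanalytics` package and placeholder endpoint and key values; the exercise code reads its endpoint and key from the project's own configuration:

```python
# Minimal sketch: the endpoint and key below are placeholders, not values from the exercise.
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

client = TextAnalyticsClient(
    "https://<your-language-resource>.cognitiveservices.azure.com/",
    AzureKeyCredential("<your-language-key>"))

documents = ["The agent was friendly, but the wait time was far too long."]

# Sentiment analysis with opinion mining enabled.
sentiment_result = client.analyze_sentiment(documents, show_opinion_mining=True)[0]
print(sentiment_result.sentiment, sentiment_result.confidence_scores)

# Named entity recognition over the same text.
entity_result = client.recognize_entities(documents)[0]
for entity in entity_result.entities:
    print(entity.text, entity.category, entity.confidence_score)
```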
Description
In this task, you will use your existing Azure AI Services Language service to extract insights from a call transcript. You will start by analyzing the sentiment of the transcript and performing opinion mining against the text. Next, you will extract named entities from the call.
The key tasks are as follows:
- Use your Azure AI Services Language service to perform sentiment analysis and opinion mining on a transcript.
- Complete the `create_sentiment_analysis_and_opinion_mining_request()` function by implementing the TODO items within it.
- Complete the implementation of the “Analyze call sentiment and perform opinion mining” section of the `main()` function by finishing the TODO items.
- Upload the `03_Customer_Service_Call.wav` file to generate a transcription, then analyze the sentiment of the call and perform opinion mining against it.
- Finish implementing the `create_named_entity_extraction_request()` function by completing the TODO items within it.
- Complete the implementation of the “Extract named entities” section of the `main()` function.
- Extract named entities from the `03_Customer_Service_Call.wav` file transcript.
Success criteria
- You have completed the `create_sentiment_analysis_and_opinion_mining_request()` function and are able to output the sentiment analysis and opinion mining results to the Streamlit dashboard in JSON format.
- You have completed the `create_named_entity_extraction_request()` function and are able to output the list of named entities detected in the call transcript to the Streamlit dashboard in JSON format.
Learning resources
- What is Azure AI Language?
- What is sentiment analysis and opinion mining?
- How to: Use Sentiment analysis and Opinion Mining
- Quickstart: Sentiment analysis and opinion mining
- What is Named Entity Recognition (NER) in Azure AI Language?
- Quickstart: Detecting named entities (NER)
- st.button
- st.spinner
- st.success
- Streamlit Session State
Solution
Expand this section to view the solution
- The `create_sentiment_analysis_and_opinion_mining_request()` function in the `2_Call_Center.py` file uses the Azure AI Language service's `analyze_sentiment([joined_call_contents], show_opinion_mining=True)` method on the `TextAnalyticsClient` to analyze sentiment. By setting the `show_opinion_mining` parameter to `True`, you can instruct the client to also perform opinion mining while analyzing the text. The code to implement this function is as follows:

```python
def create_sentiment_analysis_and_opinion_mining_request(call_contents):
    # The call_contents parameter is formatted as a list of strings. Join them together with spaces to pass in as a single document.
    joined_call_contents = ' '.join(call_contents)

    # Create a Text Analytics Client
    client = TextAnalyticsClient(language_endpoint, AzureKeyCredential(language_key))

    # Analyze sentiment of call transcript, enabling opinion mining.
    result = client.analyze_sentiment([joined_call_contents], show_opinion_mining=True)

    # Retrieve all document results that are not an error.
    doc_result = [doc for doc in result if not doc.is_error]

    # The output format is a JSON document with the shape:
    # {
    #     "sentiment": document_sentiment,
    #     "sentiment-scores": {
    #         "positive": document_positive_score_as_two_decimal_float,
    #         "neutral": document_neutral_score_as_two_decimal_float,
    #         "negative": document_negative_score_as_two_decimal_float
    #     },
    #     "sentences": [
    #         {
    #             "text": sentence_text,
    #             "sentiment": document_sentiment,
    #             "sentiment-scores": {
    #                 "positive": document_positive_score_as_two_decimal_float,
    #                 "neutral": document_neutral_score_as_two_decimal_float,
    #                 "negative": document_negative_score_as_two_decimal_float
    #             },
    #             "mined_opinions": [
    #                 {
    #                     "target-sentiment": opinion_sentiment,
    #                     "target-text": opinion_target,
    #                     "target-scores": {
    #                         "positive": document_positive_score_as_two_decimal_float,
    #                         "neutral": document_neutral_score_as_two_decimal_float,
    #                         "negative": document_negative_score_as_two_decimal_float
    #                     },
    #                     "assessments": [
    #                         {
    #                             "assessment-sentiment": assessment_sentiment,
    #                             "assessment-text": assessment_text,
    #                             "assessment-scores": {
    #                                 "positive": document_positive_score_as_two_decimal_float,
    #                                 "negative": document_negative_score_as_two_decimal_float
    #                             }
    #                         }
    #                     ]
    #                 }
    #             ]
    #         }
    #     ]
    # }
    sentiment = {}

    # Assign the correct values to the JSON object.
    for document in doc_result:
        sentiment["sentiment"] = document.sentiment
        sentiment["sentiment-scores"] = {
            "positive": document.confidence_scores.positive,
            "neutral": document.confidence_scores.neutral,
            "negative": document.confidence_scores.negative
        }

        sentences = []
        for s in document.sentences:
            sentence = {}
            sentence["text"] = s.text
            sentence["sentiment"] = s.sentiment
            sentence["sentiment-scores"] = {
                "positive": s.confidence_scores.positive,
                "neutral": s.confidence_scores.neutral,
                "negative": s.confidence_scores.negative
            }

            mined_opinions = []
            for mined_opinion in s.mined_opinions:
                opinion = {}
                opinion["target-text"] = mined_opinion.target.text
                opinion["target-sentiment"] = mined_opinion.target.sentiment
                opinion["sentiment-scores"] = {
                    "positive": mined_opinion.target.confidence_scores.positive,
                    "negative": mined_opinion.target.confidence_scores.negative,
                }

                opinion_assessments = []
                for assessment in mined_opinion.assessments:
                    opinion_assessment = {}
                    opinion_assessment["text"] = assessment.text
                    opinion_assessment["sentiment"] = assessment.sentiment
                    opinion_assessment["sentiment-scores"] = {
                        "positive": assessment.confidence_scores.positive,
                        "negative": assessment.confidence_scores.negative
                    }
                    opinion_assessments.append(opinion_assessment)

                opinion["assessments"] = opinion_assessments
                mined_opinions.append(opinion)

            sentence["mined_opinions"] = mined_opinions
            sentences.append(sentence)

        sentiment["sentences"] = sentences

    return sentiment
```
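If you want to spot-check the completed function outside of the Streamlit app, a quick usage example looks like the following. The sample transcript here is hypothetical; in the app, the input comes from the file transcription results stored in session state:

```python
import json

# Hypothetical transcript lines standing in for the real transcription results.
sample_transcript = [
    "Thanks for calling. How can I help you today?",
    "My order arrived late and the box was damaged, but the support agent was very helpful."
]

sentiment = create_sentiment_analysis_and_opinion_mining_request(sample_transcript)
print(json.dumps(sentiment, indent=2))
```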
- The code to implement the “Analyze call sentiment and perform opinion mining” section of the `main()` function is as follows:

```python
if st.button("Analyze sentiment and mine opinions"):
    # Set call_contents to file_transcription_results. If it is empty, write out an error message for the user.
    if 'file_transcription_results' in st.session_state:
        # Use st.spinner() to wrap the sentiment analysis process.
        with st.spinner("Analyzing transcript sentiment and mining opinions..."):
            # Call the create_sentiment_analysis_and_opinion_mining_request function and set its results to a variable named sentiment_and_mined_opinions.
            sentiment_and_mined_opinions = create_sentiment_analysis_and_opinion_mining_request(st.session_state.file_transcription_results)

            # Save the sentiment_and_mined_opinions value to session state.
            st.session_state.sentiment_and_mined_opinions = sentiment_and_mined_opinions

            # Call st.success() to indicate that the sentiment analysis process is complete.
            if sentiment_and_mined_opinions is not None:
                st.success("Sentiment analysis and opinion mining complete!")
    else:
        st.error("Please upload an audio file or record a call before attempting to analyze sentiment and mine opinions.")

# Write the sentiment_and_mined_opinions value to the Streamlit dashboard.
if 'sentiment_and_mined_opinions' in st.session_state:
    st.write(st.session_state.sentiment_and_mined_opinions)
```
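Saving the results into `st.session_state` is what keeps them on screen: Streamlit re-runs the script from top to bottom on every interaction, so any value held only in a local variable disappears on the next button click. The stripped-down sketch below, with a hypothetical key name, illustrates the pattern the section above relies on:

```python
import streamlit as st

if st.button("Run analysis"):
    # Hypothetical placeholder result; stored in session state so it survives re-runs.
    st.session_state.analysis_result = {"sentiment": "positive"}

# On later re-runs the button returns False, but the stored value is still available.
if 'analysis_result' in st.session_state:
    st.write(st.session_state.analysis_result)
```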
- Extracting named entities from text using the `create_named_entity_extraction_request()` function is also accomplished using the `TextAnalyticsClient`. The code to implement this function is as follows:

```python
def create_named_entity_extraction_request(call_contents):
    # The call_contents parameter is formatted as a list of strings. Join them together with spaces to pass in as a single document.
    joined_call_contents = ' '.join(call_contents)

    # Create a Text Analytics Client
    client = TextAnalyticsClient(language_endpoint, AzureKeyCredential(language_key))

    # Recognize entities within the call transcript
    result = client.recognize_entities(documents=[joined_call_contents])[0]

    # Create named_entity list as a JSON array
    named_entities = []

    # Add each extracted named entity to the named_entity array.
    for entity in result.entities:
        named_entities.append({
            "text": entity.text,
            "category": entity.category,
            "subcategory": entity.subcategory,
            "length": entity.length,
            "offset": entity.offset,
            "confidence-score": entity.confidence_score
        })

    return named_entities
```
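As with the sentiment function, you can sanity-check the completed function with a hypothetical transcript before running it against a real transcription:

```python
import json

# Hypothetical transcript lines; real input comes from the uploaded call's transcription.
sample_transcript = [
    "Hi, this is Maria calling from Contoso Insurance.",
    "I'm following up on the claim I filed on March 3rd in Seattle."
]

named_entities = create_named_entity_extraction_request(sample_transcript)
print(json.dumps(named_entities, indent=2))
```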
- The code to implement the “Extract named entities” section of the `main()` function is as follows:

```python
if st.button("Extract named entities"):
    # Set call_contents to file_transcription_results. If it is empty, write out an error message for the user.
    if 'file_transcription_results' in st.session_state:
        # Use st.spinner() to wrap the named entity extraction process.
        with st.spinner("Extracting named entities..."):
            # Call the create_named_entity_extraction_request function and set its results to a variable named named_entities.
            named_entities = create_named_entity_extraction_request(st.session_state.file_transcription_results)

            # Save the named_entities value to session state.
            st.session_state.named_entities = named_entities

            # Call st.success() to indicate that the named entity extraction process is complete.
            if named_entities is not None:
                st.success("Named entity extraction complete!")
    else:
        st.error("Please upload an audio file or record a call before attempting to extract named entities.")

# Write the named_entities value to the Streamlit dashboard.
if 'named_entities' in st.session_state:
    st.write(st.session_state.named_entities)
```
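`st.write` renders Python dictionaries and lists as an interactive JSON viewer, which satisfies the success criteria. If you prefer to be explicit about the format, `st.json` is an equivalent option here; this is an optional variation, not part of the exercise's starter code:

```python
# Optional variation: render the stored entities explicitly as JSON.
if 'named_entities' in st.session_state:
    st.json(st.session_state.named_entities)
```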