Skip to main content Link Menu Expand (external link) Document Search Copy Copied

Task 01 - Prepare a frequently asked questions dataset (10 minutes)

Introduction

The ChatGPT series of models available in Azure OpenAI has been trained on a wide variety of datasets online, but these datasets tend to be publicly available. For private and proprietary datasets, we will need a different approach than to expect ChatGPT has the available information. The first task in exercise 2 is to prepare a dataset that we can use in subsequent tasks and exercises.

Description

In this task, you will load data that Contoso Suites staff has provided to you into Azure Blob Storage. This data contains a summary, in JSON format, of several of their resort properties and hotels located on the resorts. This is an example of the type of data the company would like to use to enhance chat results, so they would like you to incorporate this data into the Azure OpenAI proof of concept. You may find these files in the /src/data folder for the repository.

The key tasks are as follows:

  1. Import the data from Resorts.txt into a container named contoso-suites.
  2. Import the data from Hotels.txt into a container named contoso-suites.

Success Criteria

  • You have created an Azure Blob Storage container and uploaded the Resorts and Hotels data files.
  • You are able to view the files in Azure Blob Storage.

Solution

Expand this section to view the solution
  • Make sure you use the storage account you created in exercise 1, as the storage account must be in the same region as Azure AI Search.
  • Navigate to the storage account in the Azure portal.
  • Select the Containers option from the Data storage menu.
  • Create a new container using the + Container option. Name the container contoso-suites.
  • Inside the “contoso-suites” container, select the Upload option and choose each text file.
  • The files do not need to be in separate folders in the blob storage container.