Challenge 3: Ingest from Cloud


Introduction

Great job! You now have a central landing zone for your different data sources.

Caladan currently stores policy data in Azure Cosmos DB.

They also store case, death, and recovery metrics for five sample countries in Azure SQL DB. Caladan would now like to extract the data from these systems into the data lake. This will set the stage for incorporating the additional on-premises country data hosted on the Azure VM.
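
Whichever ingestion technology the team chooses, the end state is the same: records from the source systems landed as files in the data lake. The sketch below shows one possible landing step in Python, assuming the landing zone from the previous challenge is an Azure Data Lake Storage Gen2 account; the account, filesystem, and path names are placeholders, not values provisioned in the lab.

```python
# Minimal sketch: land an extracted dataset as a file in the data lake.
# Assumes: pip install azure-storage-file-datalake azure-identity
# The account, filesystem, and path names below are placeholders.
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

ACCOUNT_URL = "https://<storage-account>.dfs.core.windows.net"  # placeholder
FILE_SYSTEM = "raw"                                             # landing container
TARGET_PATH = "caladan/policies/policies.json"                  # folder layout is a team decision

service = DataLakeServiceClient(account_url=ACCOUNT_URL,
                                credential=DefaultAzureCredential())
file_system = service.get_file_system_client(FILE_SYSTEM)
file_client = file_system.get_file_client(TARGET_PATH)

# `extracted_bytes` would come from one of the source queries shown later.
extracted_bytes = b'[{"example": "document"}]'
file_client.upload_data(extracted_bytes, overwrite=True)
```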

Description

The team is free during WhatTheHack to choose the solutions that best fit Caladan's needs. However, the team must be able to explain the thought process behind its decisions to its coach.

At present, encryption and access control are not requirements for the data.

The team will find the following resources in the WhatTheHack lab subscription.

Caladan Resources

Caladan has one Azure SQL DB with metric data from five countries and a document collection in Cosmos DB with all of the policy data. The team will focus on these resources for this challenge.

Access keys for Cosmos DB are available from within the Azure portal.
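
For reference, a minimal Python sketch of reading documents with one of those keys is shown below, using the azure-cosmos SDK; the account URL, key, database, and container names are placeholders, not the actual lab values.

```python
# Minimal sketch: enumerate policy documents using a Cosmos DB account key.
# Assumes: pip install azure-cosmos
# Account URL, key, database, and container names are placeholders.
from azure.cosmos import CosmosClient

ACCOUNT_URL = "https://<cosmos-account>.documents.azure.com:443/"  # from the portal
ACCOUNT_KEY = "<primary-key-from-portal>"                          # from the Keys blade

client = CosmosClient(ACCOUNT_URL, credential=ACCOUNT_KEY)
database = client.get_database_client("<database>")
container = database.get_container_client("<policy-container>")

# Iterate the policy documents; in a real extract these would be written
# out to the data lake rather than printed.
for doc in container.read_all_items():
    print(doc["id"])
```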

The team’s coach can provide credentials for the SQL database. They are also available on the CloudLabs homepage.

Note: Each team member should add their client IP address to the SQL server firewall if they plan to connect from their own machine via a tool such as Azure Data Studio.

Alternatively, the team may set the Azure Active Directory admin for the SQL server to one of the provided WhatTheHack accounts.
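
Once access is configured, connectivity can be checked with a short query. The sketch below assumes SQL authentication with the provided credentials via pyodbc; the server, database, credential, and table names are placeholders for the values supplied by the coach.

```python
# Minimal sketch: query the metrics database once the client IP is allowed.
# Assumes: pip install pyodbc, plus the "ODBC Driver 18 for SQL Server".
# Server, database, credentials, and table name are placeholders.
import pyodbc

conn_str = (
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=tcp:<sql-server>.database.windows.net,1433;"
    "Database=<database>;"
    "Uid=<username>;Pwd=<password>;"
    "Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30;"
)

with pyodbc.connect(conn_str) as conn:
    cursor = conn.cursor()
    cursor.execute("SELECT TOP (5) * FROM <metrics-table>")  # replace with a real table name
    for row in cursor.fetchall():
        print(row)
```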

Success Criteria

Tips

When selecting a technology to ingest the datasets for this challenge, the team should consider whether that same technology can be reused later, for example to ingest the on-premises country data from the Azure VM.

Learning Resources

Ramp Up

This challenge focuses on the Extract portion of ETL and ELT workloads, and the Ingestion and Storage stages of modern data warehousing.

Dive In