Modern Data Warehouse - COVID-19

Introduction

In this hack, students build a fully functional modern data pipeline using COVID-19 data, then calculate case growth against the policies enacted by different governments. Collecting, organizing, and drawing inferences from different data sources is something most data practitioners need to do at some point in their careers. This hack teaches how to do this in Azure.
Note: This lab is best done over at least three days. It is very in-depth and will challenge most students.

Learning Objectives

In this hack you will make a recommendation to a fictional government on which COVID-19 mitigation policies it should enact, based on collecting, cleaning, correlating, and examining available data sets.

The technical learning objectives:

  1. Provision a Data Lake.
  2. Land data in the Data Lake from cloud resources (relational and Cosmos DB).
  3. Land data in the Data Lake from on-premises resources (an Azure VM is used to simulate an on-premises store).
  4. Create Data Pipelines to merge the datasets into a usable format.
  5. Define Star Schemas and create a Data Warehouse.
  6. Enact version control and administrative approval for all pull requests within GitHub.
  7. Perform calculations on Fact tables.
  8. Enable unit tests.
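
To make objectives 5, 7, and 8 more concrete, here is a minimal Python sketch of a growth calculation over a fact table. It is not the hack's solution. The `FactCases` row layout, the `country_key` column, the `daily_growth` helper, and the sample numbers are all hypothetical and chosen only for illustration. In the hack itself this logic would more likely live in the data warehouse than in application code.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FactCases:
    """One row of a hypothetical fact table: cumulative cases per country per day."""
    country_key: int  # foreign key into a hypothetical country dimension
    day: date         # stands in for a foreign key into a date dimension
    confirmed: int    # cumulative confirmed cases


def daily_growth(rows):
    """Return {(country_key, day): growth_rate} relative to the previous row.

    growth_rate = (current - previous) / previous, where "previous" is the
    preceding row for the same country after sorting by day. Rows whose
    previous value is zero are skipped to avoid division by zero.
    """
    growth = {}
    previous = {}  # country_key -> most recent row seen for that country
    for row in sorted(rows, key=lambda r: (r.country_key, r.day)):
        prev = previous.get(row.country_key)
        if prev is not None and prev.confirmed > 0:
            growth[(row.country_key, row.day)] = (
                (row.confirmed - prev.confirmed) / prev.confirmed
            )
        previous[row.country_key] = row
    return growth


# Made-up sample data, for illustration only.
sample = [
    FactCases(1, date(2020, 3, 1), 100),
    FactCases(1, date(2020, 3, 2), 150),
    FactCases(1, date(2020, 3, 3), 180),
]
result = daily_growth(sample)
```

Because `daily_growth` is a pure function, a unit test can feed it a handful of hand-built rows and assert on the returned rates, which is the kind of check objective 8 has in mind.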

Challenges

Prerequisites

Repository Contents

Contributors

Microsoft Government Team :