Exercise 05: Implement the Medallion Architecture using Azure Databricks (Bronze, Silver and Gold layers)
This exercise implements the Medallion Architecture in Azure Databricks, focusing on its Bronze, Silver, and Gold layers.
The Medallion Architecture is a data processing paradigm where raw data is ingested into the Bronze layer, transformed and curated in the Silver layer, and then aggregated and analyzed in the Gold layer.
In this exercise, Azure Databricks is the platform that runs the transformations: it reads data from the Bronze layer, converts it to Delta format, and stores it in the Silver layer for further refinement and analysis.
Table of contents
- 1. Read from Bronze layer, convert the data into delta format and write to the Silver layer
- 2. Verify that delta files were written to the silver medallion container
- 3. Read from the Silver layer, create a data model (Facts, Dimensions) on the data lake from Azure Databricks, and write data into the Gold layer as external tables (Z-ordered, with ADLS Gen2 as the Gold layer)
- 4. Verify that delta files were written to the gold medallion container
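Before working through the steps, the overall flow listed above can be sketched in PySpark. This is a minimal sketch under stated assumptions, not the exercise's actual notebook code: the function names, the `parquet` source format for Bronze, and the example storage account, table, and column names are all hypothetical. The `spark` argument is assumed to be the session that a Databricks notebook provides.

```python
def abfss_path(container: str, account: str, relative: str = "") -> str:
    """Build an ADLS Gen2 URI for a medallion container (hypothetical helper)."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{relative.lstrip('/')}"


def bronze_to_silver(spark, bronze_path: str, silver_path: str,
                     source_format: str = "parquet") -> None:
    """Read raw files from Bronze and rewrite them in Delta format in Silver.

    The source format is an assumption; use whatever the Bronze files actually are.
    """
    df = spark.read.format(source_format).load(bronze_path)
    df.write.format("delta").mode("overwrite").save(silver_path)


def silver_to_gold(spark, silver_path: str, gold_path: str,
                   table_name: str, zorder_column: str) -> None:
    """Write a Silver Delta table to Gold, register it as an external table,
    and Z-order it on one column."""
    df = spark.read.format("delta").load(silver_path)
    # A real data model would reshape df into fact and dimension tables here.
    df.write.format("delta").mode("overwrite").save(gold_path)
    # An explicit LOCATION makes the table external: dropping it keeps the files.
    spark.sql(
        f"CREATE TABLE IF NOT EXISTS {table_name} USING DELTA LOCATION '{gold_path}'"
    )
    spark.sql(f"OPTIMIZE {table_name} ZORDER BY ({zorder_column})")


# Example call from a Databricks notebook (all names are placeholders):
# account = "mystorageaccount"
# bronze_to_silver(spark,
#                  abfss_path("bronze", account, "sales/orders"),
#                  abfss_path("silver", account, "sales/orders"))
# silver_to_gold(spark,
#                abfss_path("silver", account, "sales/orders"),
#                abfss_path("gold", account, "fact_orders"),
#                table_name="fact_orders", zorder_column="order_date")
```

The session is passed in as a parameter rather than created inside the functions, so the same code runs unchanged in a notebook, where `spark` already exists, and can be exercised with a stand-in object elsewhere.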