Challenge 01 - Bring your data to the OneLake


Introduction

As you observed at the end of setup in Challenge 0, you have been provided with anonymized historical patient data. The CSV file contains a set of conditions (or features) recorded for these patients and an indication of whether each patient had a heart attack. The first step in working with this data is to bring it to Fabric’s OneLake, where you will be able to use it for your analysis and to train a machine learning model.

Description

Your task in this challenge is to make your data available as a delta table in the Fabric OneLake. To do so, you must create a shortcut in Fabric to the heart.csv file in your Azure storage account, then load the data into the lakehouse.

To load the data into the lakehouse, you will use a Spark notebook. Open Notebook 1, which you uploaded to your Fabric workspace in Challenge 0. You will find more guidance and helpful links there. Additionally, see the end of this challenge for documentation links on how to create a shortcut in Fabric.

NOTE: If you skipped the Azure setup and are completing this challenge using only Fabric, you will not need to create a shortcut. Instead, upload the CSV file to your lakehouse and follow the instructions in Notebook 1.

Notebook sections:

  1. Read the .csv file into a dataframe in the notebook
  2. Write the dataframe to the lakehouse as a delta table
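
The two notebook sections above can be sketched in PySpark as follows. This is a minimal sketch, not the contents of Notebook 1: the function names, the `Files/heart.csv` path, and the `heart` table name are assumptions for illustration, and `spark` is the session object that a Fabric notebook normally provides.

```python
def read_csv_to_dataframe(spark, csv_path):
    """Read a CSV file with a header row into a Spark DataFrame."""
    return (
        spark.read.format("csv")
        .option("header", "true")       # first row holds the column names
        .option("inferSchema", "true")  # let Spark guess the column types
        .load(csv_path)
    )


def write_dataframe_as_delta(df, table_name):
    """Save the DataFrame as a delta table, replacing any existing table."""
    df.write.format("delta").mode("overwrite").saveAsTable(table_name)


# Hypothetical usage inside a Fabric notebook with a lakehouse attached:
# df = read_csv_to_dataframe(spark, "Files/heart.csv")  # assumed path
# write_dataframe_as_delta(df, "heart")                 # assumed table name
```

Overwrite mode is used here so the cell can be re-run without failing on an existing table. If the CSV column names contain spaces or similar characters, they may need renaming before the write succeeds.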

By the end of this challenge, you should understand and know how to use Fabric shortcuts, Spark notebooks, and delta tables.

Success Criteria

Verify that the heart.csv data is now saved as a delta table in the lakehouse, in the same workspace where the notebooks are stored. Verify that you are able to load the table back into the notebook.
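
One way to check both criteria from the notebook is to load the table by name and compare its row count with the source DataFrame. The sketch below assumes a table named `heart` and a `spark` session supplied by the Fabric notebook; the helper names are hypothetical.

```python
def load_table(spark, table_name):
    """Load a lakehouse table back into a Spark DataFrame by name."""
    return spark.read.table(table_name)


def row_counts_match(source_df, table_df):
    """Return True when both DataFrames hold the same number of rows."""
    return source_df.count() == table_df.count()


# Hypothetical usage inside a Fabric notebook:
# table_df = load_table(spark, "heart")  # assumed table name
# table_df.show(5)                       # preview a few rows
# print(row_counts_match(df, table_df))  # df is the DataFrame read from the CSV
```

A matching row count is only a quick sanity check; it does not confirm that column types or values survived the write unchanged.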

Learning Resources

Microsoft Fabric Lakehouse

ADLS Shortcuts in Fabric

Refer to Notebook 1 for more helpful links

Tips

Advanced Challenges (Optional)

Interested in seeing the shortcut’s low latency in action?

Find another dataset of interest to you, save it in the same folder as heart.csv in your Azure storage account, and watch the new file appear in Fabric. Explore the differences between files/shortcuts and actual tables in Fabric, and what is needed to keep your data up to date at each stage.