Gold Layer
The Gold Layer is typically useful but not always required. In a traditional medallion architecture this layer would be the only layer available for analytics. In an RTI Medallion architecture both the Silver Layer and the Gold Layer are available for analytics.
There are several reasons that may drive you to utilize a Gold Layer:
As the time period of your queries grows, individual records become unnecessary. For instance, if I want to look at the temperature of my IoT devices over the last year, I'll want an aggregate view (hourly, daily, etc.); the individual records are not helpful in this case. With a materialized view in a KQL Database you can pre-compute this aggregate and improve query performance.
The Eventhouse is sized based on the amount of hot cache you are using. In most situations the raw telemetry can be kept for a shorter period and the aggregated data for a longer one, reducing the total amount of hot cache used by the Eventhouse. For example, I may keep 30 days of the Silver table and one year of an hourly aggregate.
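As a sketch of that split, the caching policies might look like the following (the Silver table name IoTSilver is an assumption for illustration; IoT_Hourly_MV matches the hourly view created later in this post):

```kusto
// Keep only 30 days of raw Silver telemetry in hot cache...
.alter table IoTSilver policy caching hot = 30d

// ...but a full year of the hourly aggregate.
.alter materialized-view IoT_Hourly_MV policy caching hot = 365d
```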
If you have a dashboard with tiles that show the latest value for each device, it can be useful to create an MV that returns just the latest value. This allows the tiles to refresh almost instantly, which is typically the expectation for this type of visual.
There are a few common mistakes that you want to avoid at this layer:
Just because the customer commonly wants 5 minute, 30 minute, 1 hour, and daily aggregates doesn't mean that you need to create a materialized view for each.
For example, in the above scenario you can approach the solution in this manner
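A hedged sketch (reusing the table and column names from the examples later in this post): materialize only the finest 5-minute grain, and keep sum() and count() rather than avg() so coarser bins can be recombined exactly at query time. The view name IoT_5Min_MV is an assumption.

```kusto
// Single MV at the finest (5-minute) grain; sums and counts
// recombine exactly into any coarser bin at query time.
.create async materialized-view with (backfill, effectiveDateTime=datetime(2024-06-01)) IoT_5Min_MV on materialized-view IoTSilverDedupMV
{
    IoTSilverDedupMV
    | summarize sum_Temp = sum(Temp), sum_Humidity = sum(Humidity), cnt = count()
        by DeviceId, bin(EnqueuedTimeUTC, 5m)
}
```

The 30-minute, hourly, or daily views are then just query-time roll-ups, e.g. `IoT_5Min_MV | summarize avg_Temp = sum(sum_Temp) / sum(cnt) by DeviceId, bin(EnqueuedTimeUTC, 1d)`.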
So now I’ve made it easy to pull the most common aggregates but only created a single materialized view.
One of the most powerful features of a materialized view is its ability to handle late-arriving data. But remember that late-arriving records must be reconciled with the existing MV to recalculate the aggregates. This is typically not an issue, but if a lot of incoming data falls outside the caching period of the materialized view it can lead to a backlog. You can handle this by filtering out very old records before they reach the view, or by limiting how far back the view reconciles data (for example with the view's lookback property).
We'll continue the IoT example, where we have a true dedup MV called IoTSilverDedupMV. Now we want to create an MV that computes hourly aggregates and backfills the data to June 2024.
.create async materialized-view with(backfill, effectiveDateTime=datetime(2024-06-01)) IoT_Hourly_MV on materialized-view IoTSilverDedupMV {
IoTSilverDedupMV
| summarize avg_Battery_Level = avg(BatteryLevel), avg_Temp = avg(Temp), avg_Humidity = avg(Humidity) by DeviceId, Model, SerialNumber, bin(EnqueuedTimeUTC, 1h)
}
Once this is finished you’ll have an aggregate materialized view that aggregates data as it arrives and handles late arriving data.
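When a dashboard can tolerate slightly stale results, the materialized_view() function lets you query only the already-materialized part of the view, which is typically faster; a sketch, using the view and columns created above:

```kusto
// Query only the materialized part, accepting results up to 10 minutes stale.
// Note: averaging hourly averages is approximate unless every hour
// has the same record count.
materialized_view('IoT_Hourly_MV', 10m)
| where EnqueuedTimeUTC > ago(30d)
| summarize avg(avg_Temp) by DeviceId, bin(EnqueuedTimeUTC, 1d)
```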
As mentioned in the Silver Layer, if you create your "dedup" materialized view using something like arg_max() of the ingestion time, then you cannot create an aggregate MV on top of that data. In that case there are two common options:
Option 1: Aggregate the data at query time instead of pre-aggregation
Option 2: Utilize a scheduled job (for example Fabric Notebook) to aggregate the data
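For Option 1, a sketch of query-time aggregation directly over the dedup view (names as in the earlier examples):

```kusto
// Aggregate at query time instead of maintaining a second MV.
IoTSilverDedupMV
| where EnqueuedTimeUTC > ago(30d)
| summarize avg_Temp = avg(Temp), avg_Humidity = avg(Humidity)
    by DeviceId, bin(EnqueuedTimeUTC, 1h)
```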
Here is an example command to create a latest value view with backfill.
.create async materialized-view with(backfill, effectiveDateTime=datetime(2024-06-01)) IoT_LatestValue_MV on materialized-view IoTSilverDedupMV {
IoTSilverDedupMV
| summarize arg_max(EnqueuedTimeUTC, *) by DeviceId
}
At this stage you have a queryable Silver Layer along with an aggregated Gold Layer. This allows you to run ad-hoc queries and build visuals that span long periods of time, while still drilling down to specific records.