Hello @kazinahian, Azure Databricks offers several options for building ETL (Extract, Transform, Load) data pipelines, ranging from low-code to more code-centric approaches:
Delta Live Tables
Delta Live Tables (DLT) is a declarative framework for bu...
Hi @NathanSundarara, regarding your current approach, here are potential solutions and considerations. Deduplication: implement deduplication strategies within your DLT pipeline. For example:
clicksDedupDf = (
spark.readStream.table("LIVE.rawCl...
Hi @ChristianRRL, as a first quick check, could you please create a PySpark DataFrame with the _metadata and _rescued_data columns, query the DataFrame to confirm you can see those columns, and then create a view from that DataFrame?
Hello @guangyi, I am getting back to you with some insights.
Regarding your first question about checkpointing:
You can manually check the checkpointing location of your stream table. The checkpoints of your Delta Live Tables are under Storage locatio...