Databricks Community

Phani1 · ‎05-11-2023

Hi Team,

Could you please recommend best practices for implementing Delta Live Tables?

Regards,

Phanindra

Ryan_Chynoweth · ‎05-12-2023

Hi Phani, what exactly are you looking for with best practices?

At a high level:

Always provide an external storage location (S3, ADLS, GCS) for your pipeline
Use autoscaling!
Python imports can be leveraged to reuse code

With regard to providing a storage location: if you put all your pipelines under a common storage location, it is also easier to read all the associated event logs for pipeline monitoring.
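To make the storage and autoscaling points concrete, here is a minimal sketch of building pipeline settings in plain Python. The pipeline name, bucket path, worker counts, and both helper functions are hypothetical. The `storage`, `clusters`, and `autoscale` fields follow the pipeline settings JSON as I understand it. The `system/events` subfolder is where I would expect the event log for a pipeline with an explicit storage location, so verify that in your workspace.

```python
import json


def build_pipeline_settings(name, storage_root, min_workers=1, max_workers=5):
    """Build a settings dict for one pipeline (hypothetical helper).

    Every pipeline gets its own folder under a shared storage root,
    so all event logs end up under one common parent path.
    """
    storage = f"{storage_root.rstrip('/')}/{name}"
    return {
        "name": name,
        "storage": storage,
        "clusters": [
            {
                "label": "default",
                # Autoscaling bounds; the numbers here are placeholders.
                "autoscale": {
                    "min_workers": min_workers,
                    "max_workers": max_workers,
                    "mode": "ENHANCED",
                },
            }
        ],
    }


def event_log_path(settings):
    """Assumed event log location under the pipeline's storage path."""
    return f"{settings['storage']}/system/events"


# Hypothetical bucket and pipeline name, for illustration only.
settings = build_pipeline_settings("orders_pipeline", "s3://my-bucket/dlt-pipelines/")
print(json.dumps(settings, indent=2))
print(event_log_path(settings))
```

With a layout like this, a monitoring notebook could loop over the pipeline folders under the shared root and load each event log as a Delta table, for example with `spark.read.format("delta").load(path)`.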



