Databricks Community

Phani1 · ‎05-11-2023

Hi Team,

Could you please recommend best practices for implementing Delta Live Tables?

Regards,

Phanindra

Ryan_Chynoweth · ‎05-12-2023

Hi Phani, what exactly are you looking for with best practices?

At a high level:

Always provide an external storage location (S3, ADLS, GCS) for your pipeline
Use Auto Scaling!
Python imports can be leveraged to reuse code
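
As a rough sketch of the code-reuse point: a helper kept in a shared module can be imported by several pipeline source files. The module name `shared_rules.py`, the function `not_null_rules`, and the table and column names are all hypothetical. The DLT usage is left as a comment because it only runs inside a pipeline.

```python
# shared_rules.py (hypothetical module name), importable from any pipeline source file

def not_null_rules(columns):
    """Build a {rule_name: SQL constraint} dict for the given column names."""
    return {f"{col}_not_null": f"{col} IS NOT NULL" for col in columns}


# In a pipeline source file (only runs inside a DLT pipeline, so shown as a comment;
# table and column names are made up for illustration):
#
#   import dlt
#   from shared_rules import not_null_rules
#
#   @dlt.table
#   @dlt.expect_all(not_null_rules(["order_id", "customer_id"]))
#   def orders_clean():
#       return dlt.read("orders_raw")
```

Keeping the helper free of Spark and DLT dependencies means it can also be unit tested outside a pipeline.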

With regard to the storage location: if you put all your pipelines under a common storage location, it is also easier to read all the associated event logs for pipeline monitoring.
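
A minimal sketch of the monitoring idea, assuming each pipeline's storage location is a subfolder of one common root and that the event log is a Delta table under `system/events` inside that location (verify the layout in your own workspace). The root path, pipeline folder names, and both function names are made up for illustration.

```python
import posixpath


def event_log_paths(common_root, pipeline_names):
    """Map each pipeline folder name to its assumed event log path."""
    root = common_root.rstrip("/")
    return {
        name: posixpath.join(root, name, "system", "events")
        for name in pipeline_names
    }


def read_event_logs(spark, paths):
    """Union the event logs into one DataFrame (needs a live SparkSession)."""
    # Imported here so the path helper above works without pyspark installed.
    from functools import reduce
    from pyspark.sql import functions as F

    frames = [
        spark.read.format("delta").load(path).withColumn("pipeline_folder", F.lit(name))
        for name, path in paths.items()
    ]
    return reduce(lambda left, right: left.unionByName(right), frames)


# Example with a hypothetical bucket and pipeline names:
paths = event_log_paths("s3://my-bucket/dlt/", ["sales", "inventory"])
```

In a notebook, `read_event_logs(spark, paths)` would then give a single DataFrame to filter by `event_type` or `level` across all pipelines.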

