What are best practices for designing a large-scale data engineering pipeline on Databricks for real

Suheb — Mon, 17 Nov 2025 06:39:52 GMT

How do you design a scalable, reliable pipeline that handles both fast/continuous data and slower bulk data in the same system?

Re: What are best practices for designing a large-scale data engineering pipeline on Databricks for

Coffee77 — Mon, 17 Nov 2025 07:52:01 GMT

That's a very generic question 🙂 As a starting point, the Databricks well-architected framework covers the general rules and best practices: https://docs.databricks.com/aws/en/lakehouse-architecture/well-architected Take a closer look at the operational excellence, reliability, and performance efficiency pillars. Beyond that, try to adopt a medallion architecture to organize your data logically (https://www.databricks.com/glossary/medallion-architecture) and use Unity Catalog to centrally control and govern it.
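To make the medallion idea a bit more concrete for the "fast plus bulk data in one system" case, here is a minimal sketch in plain Python. It is not Databricks code: the function names (`to_bronze`, `to_silver`, `to_gold`) and the record fields (`id`, `customer`, `amount`, `ts`) are made up for illustration, and plain lists stand in for what would normally be tables. The point is only that both the streaming and the batch source land in the same raw (bronze) layer, and that cleaning and aggregation happen once, downstream.

```python
from collections import defaultdict

def to_bronze(bronze, records, source):
    """Bronze: append raw records unchanged, tagged with where they came from."""
    for rec in records:
        bronze.append({**rec, "_source": source})

def to_silver(bronze):
    """Silver: drop malformed rows and keep the latest version of each id."""
    latest = {}
    for rec in bronze:
        if rec.get("id") is None or rec.get("amount") is None:
            continue  # malformed row, kept in bronze but not promoted
        prev = latest.get(rec["id"])
        if prev is None or rec["ts"] >= prev["ts"]:
            latest[rec["id"]] = rec
    return list(latest.values())

def to_gold(silver):
    """Gold: business-level aggregate, here total amount per customer."""
    totals = defaultdict(float)
    for rec in silver:
        totals[rec["customer"]] += rec["amount"]
    return dict(totals)

bronze = []

# Hypothetical micro-batch from the fast/continuous source.
to_bronze(bronze, [
    {"id": 1, "customer": "a", "amount": 10.0, "ts": 1},
], source="stream")

# Hypothetical bulk load that corrects id 1 and includes one bad row.
to_bronze(bronze, [
    {"id": 1, "customer": "a", "amount": 12.0, "ts": 2},
    {"id": 2, "customer": "b", "amount": 5.0, "ts": 1},
    {"id": 3, "customer": "b", "amount": None, "ts": 1},
], source="batch")

silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Because bronze only ever appends, a late bulk correction and an earlier streamed event for the same id can coexist there, and the silver step decides which one wins. On Databricks the same shape would usually be built with tables and streaming/batch jobs rather than in-memory lists, so treat this as a picture of the data flow, not an implementation.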