As enterprises race toward cloud-native data platforms, modernising legacy ETL pipelines remains one of the most persistent bottlenecks. For organizations that have relied on SQL Server Integration Services (SSIS) for years, rewriting hundreds of pac...
Hi everyone, I just published a new article on Medium. It explores an important topic: designing reliable data pipelines in Databricks. Many pipelines fail not because of code, but because of design decisions made early in development. In ...
Microsoft announced the retirement plan for the Azure Databricks Standard tier. This is vital information for organizations still on the Standard tier: the change represents a fundamental architectural realignment that organizations must navigate with precis...
I've created an Azure Resource Graph query that identifies all Standard tier Databricks workspaces in your environment (assuming you have read access): https://github.com/cjpluta/azretirementqueries/blob/main/queries/databricks-standard.kql
Hi all. If you've ever manually promoted resources from dev to prod on Databricks — copying notebooks, updating configs, hoping nothing breaks — this post is for you. I've been building a CI/CD setup for a Speech-to-Text pipeline on Databricks, and I w...
Hi,
Great question! Databricks Asset Bundles (DABs) are the recommended approach for CI/CD on Databricks. Here is a comprehensive walkthrough.
WHAT ARE DATABRICKS ASSET BUNDLES?
DABs let you define your Databricks resources (jobs, pipelines, dashboar...
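To make the bundle concept concrete, here is a minimal sketch of a `databricks.yml` — all names, hosts, and paths below are placeholders, not taken from the original post:

```yaml
# databricks.yml — minimal bundle sketch (placeholder names throughout)
bundle:
  name: my_etl_bundle

targets:
  dev:
    mode: development
    workspace:
      host: https://adb-1234567890.12.azuredatabricks.net  # placeholder host
  prod:
    mode: production

resources:
  jobs:
    nightly_etl:
      name: nightly-etl
      tasks:
        - task_key: run_notebook
          notebook_task:
            notebook_path: ./notebooks/etl.py
```

From there, `databricks bundle validate` checks the config and `databricks bundle deploy -t dev` deploys everything to the chosen target — no manual copying of notebooks between workspaces.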
Works for any event-driven workload: IoT alerts, e-commerce flash sales, financial market close processing.

Goal

In this project, I needed to start Databricks jobs on an irregular basis, driven entirely by timestamps stored in PostgreSQL rather than by...
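A minimal sketch of the timestamp-driven trigger logic described above — pure Python with hypothetical names; in the actual post the timestamps come from PostgreSQL and the jobs are started via the Databricks Jobs API:

```python
from datetime import datetime

def next_trigger(event_times, now):
    """Return the earliest event timestamp strictly after `now`, or None.

    `event_times` stands in for the rows the orchestrator would read from
    PostgreSQL (e.g. SELECT run_at FROM job_schedule WHERE run_at > now();
    table and column names here are hypothetical).
    """
    future = [t for t in event_times if t > now]
    return min(future) if future else None

# The self-rescheduling loop would then:
#   1. call next_trigger(...) against the PostgreSQL table,
#   2. sleep until that timestamp,
#   3. start the Databricks job (Jobs API / databricks-sdk), and
#   4. schedule its own next run - the "self-rescheduling" part of the pattern.
```

For example, with events at 2026-01-01 and 2026-06-01 and `now` in March 2026, `next_trigger` returns the June timestamp; once all events are in the past it returns `None` and the orchestrator can idle.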
@PiotrPustola -- The self-rescheduling orchestrator pattern is a really elegant solution for event-driven workloads that depend on externally managed timestamps. A few thoughts and additions that might help you and others who land on this article:
AD...
Databricks Community Fellows February 2026 Recap
The Databricks Community Fellows are internal Brickster experts who volunteer their time to help customers succeed by answering questions in the Databricks Community forums.
This month: 92 customer que...
If you’ve ever needed to maintain historical truth in a data warehouse, you’ve likely bumped into Slowly Changing Dimensions (SCD)—specifically Type 2. In SCD2, we keep every version of a record as it changes over time, so analysis can answer questio...
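The core SCD2 move — close the current version of a changed record and append a new one — can be sketched in plain Python. This is an illustration only, not the article's implementation; the `city` attribute and all names are made up:

```python
from datetime import date

def apply_scd2(dim_rows, incoming, effective_date, tracked="city"):
    """Apply SCD Type 2 semantics: when a tracked attribute changes,
    close the current version and append a new one, keeping full history.

    dim_rows: existing dimension rows as dicts with keys
              id, <tracked attr>, valid_from, valid_to, is_current.
    incoming: new source rows as dicts with keys id and <tracked attr>.
    """
    current = {r["id"]: r for r in dim_rows if r["is_current"]}
    result = list(dim_rows)              # shares row dicts; closed rows are mutated in place
    for rec in incoming:
        old = current.get(rec["id"])
        if old is not None and old[tracked] == rec[tracked]:
            continue                     # unchanged: nothing to do
        if old is not None:              # changed: close the old version
            old["valid_to"] = effective_date
            old["is_current"] = False
        result.append({"id": rec["id"], tracked: rec[tracked],
                       "valid_from": effective_date, "valid_to": None,
                       "is_current": True})
    return result
```

On a lakehouse the same row-level logic typically maps onto a Delta `MERGE` (or a change-feed-based pipeline), but the versioning rules are exactly these.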
Discussed eliminating the BI & Metrics Tax using Databricks Metric Views here. The semantic layer is a core component of the lakehouse with Metric Views. The modern stack is moving toward AI data experiences where organizations ask questions instead of build...
In today's data-driven world, trust is currency—and that trust starts with quality data governed by strong principles. For one of our clients, where we're on a mission to build intelligent enterprises with AI, data isn't just an asset—it's a responsib...
Why Legacy BI Is Reaching Its Limits, And What Comes Next

I have always believed that the original goal of digitalization was to make data available and then find better ways to analyze it. For the past two decades, Business Intelligence has followed ...
This article continues a technical deep dive into building large-scale Lakehouse architectures.The original platform processed billions of records across multiple markets and operated under PCI-DSS compliance requirements — a significant engineering ...
One of the most interesting shifts in the Databricks ecosystem is Lakebase.

For years, data architectures have enforced clear boundaries:
OLTP → operational databases
OLAP → analytical platforms
ETL → bridging the gap

While familiar, this model often crea...
Exactly. As I'm sure you are aware, there is already a great "sync" from Delta -> Postgres, but coming soon is a seamless way to do the opposite (yes, even simpler than the version that was in private preview). If I had a moving visual...
Discussed eliminating the BI & Metrics Tax using Databricks Metric Views here. Organizations also face an older, more persistent tax — the Ingestion Tax. To ingest data from a source like Salesforce or SQL Server into your Lakehouse, you typically stit...
The Hidden Cost of Scaling the Lakehouse

Over the past few years, many organizations have successfully migrated to Databricks to modernize their data platforms. The Lakehouse architecture has enabled them to unify data engineering, analytics, and AI o...
@Saurabh2406 this is such a rich article with so many practical takeaways! Congrats! I faced similar challenges in one of my recent projects, and I spent some time building a nice dashboard (using the system.billing tables) that helped us trac...
Recently I have been creating some "self-reminder" videos to help my poor long-term memory, and maybe to help others. They cover the internals of DataFrames: how partitions relate to jobs, stages, shuffles, and tasks, and how transformations or a...
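The lazy-evaluation idea behind those internals can be shown with a toy model: transformations only record a plan, and nothing runs until an action executes it. This is not real Spark — just a self-contained analogy with made-up names:

```python
class LazyFrame:
    """Toy model of Spark's lazy evaluation. Transformations (map, filter)
    build up a plan; the action (collect) executes the whole plan at once."""

    def __init__(self, data, plan=None):
        self.data = data
        self.plan = plan or []          # recorded transformations (the "lineage")

    def map(self, fn):                  # transformation: lazy, returns a new frame
        return LazyFrame(self.data, self.plan + [("map", fn)])

    def filter(self, pred):             # transformation: lazy
        return LazyFrame(self.data, self.plan + [("filter", pred)])

    def collect(self):                  # action: only now does any work happen
        rows = list(self.data)
        for op, fn in self.plan:
            rows = [fn(r) for r in rows] if op == "map" else [r for r in rows if fn(r)]
        return rows

# Building the chain computes nothing; collect() runs map then filter:
out = LazyFrame(range(5)).map(lambda x: x * 2).filter(lambda x: x > 4).collect()
# out == [6, 8]
```

Real Spark additionally optimizes the recorded plan and splits execution into stages at shuffle boundaries — which is exactly what the videos walk through.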