Community Articles
Dive into a collaborative space where members like YOU can exchange knowledge, tips, and best practices. Join the conversation today and unlock a wealth of collective wisdom to enhance your experience and drive success.

Forum Posts

Kirankumarbs
by Contributor III
  • 1352 Views
  • 9 replies
  • 6 kudos

Streaming Failure Models: Why "It Didn't Crash" Is the Worst Outcome

Most Databricks streaming failures don't look dramatic. No cluster termination. No red wall of errors. The UI says RUNNING, and your customers start reporting nonsense. I wrote about the incident that changed how we think about streaming jobs on share...

Latest Reply
mderela
Contributor
  • 6 kudos

Completely agree, production war stories are worth more than any documentation. I’ve eaten enough teeth on production data lake issues to write my own chapter on what can go wrong, whether that’s deploying Databricks in financial institutions or bein...

8 More Replies
balajij8
by Contributor III
  • 857 Views
  • 0 replies
  • 3 kudos

Databricks Multi Table Transactions - All Data or Nothing

Databricks introduces multi-table transactions, allowing operations across multiple Delta tables to execute as a single atomic unit. Delta Lake has provided ACID guarantees at the table level, but ensuring atomicity across multiple tables previously ...
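The all-or-nothing behaviour described in the teaser can be illustrated without Databricks. The sketch below is not Databricks syntax; it uses Python's built-in `sqlite3` module as a stand-in, and the `orders` and `inventory` tables and the `place_order` helper are hypothetical names chosen for the example. It only shows what "atomic across two tables" means: either both writes land or neither does.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, sku TEXT, qty INTEGER)")
    conn.execute("CREATE TABLE inventory (sku TEXT PRIMARY KEY, qty INTEGER)")
    conn.execute("INSERT INTO inventory VALUES ('widget', 5)")
    conn.commit()
    return conn


def place_order(conn: sqlite3.Connection, order_id: int, sku: str, qty: int) -> None:
    """Write to both tables as one unit: both changes commit or neither does."""
    # Used as a context manager, the connection commits on success and
    # rolls back every statement in the block if an exception is raised.
    with conn:
        conn.execute(
            "INSERT INTO orders (id, sku, qty) VALUES (?, ?, ?)",
            (order_id, sku, qty),
        )
        cur = conn.execute(
            "UPDATE inventory SET qty = qty - ? WHERE sku = ? AND qty >= ?",
            (qty, sku, qty),
        )
        if cur.rowcount != 1:
            # Raising here also undoes the INSERT into orders above.
            raise ValueError("insufficient stock")
```

If `place_order` fails on the inventory update, the order row inserted a moment earlier disappears with the rollback, so readers never see an order without its matching stock change.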

Kirankumarbs
by Contributor III
  • 632 Views
  • 1 reply
  • 2 kudos

Multi-Task on a Shared Cluster — Why That's Also Not Enough

Part 2 of 3: Databricks Streaming Architecture. The instinct after Part 1 was obvious. If running eight queries in one task means one failure can hide while others keep running, split them into multiple tasks. Separate concerns. Give each component it...
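The hidden-failure problem named here (one query dies while the others keep the task alive) can be sketched as a small fail-fast check. This is a minimal illustration, not the author's code: `FakeQuery`, `stopped_queries`, and `assert_all_running` are hypothetical names, and the stand-in class only mirrors the `name` and `isActive` attributes that PySpark's `StreamingQuery` exposes.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class FakeQuery:
    """Stand-in for a streaming query handle (hypothetical, for the demo)."""
    name: str
    isActive: bool


def stopped_queries(queries: Iterable[FakeQuery]) -> List[str]:
    """Return the names of queries that are no longer active."""
    return [q.name for q in queries if not q.isActive]


def assert_all_running(queries: Iterable[FakeQuery]) -> None:
    """Raise as soon as any query has stopped, so the task fails loudly."""
    stopped = stopped_queries(queries)
    if stopped:
        raise RuntimeError("streaming queries stopped: " + ", ".join(stopped))
```

Called periodically from the driver, a check like this turns a silently stopped query into a task failure that alerting and retries can see.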

Latest Reply
Kirankumarbs
Contributor III
  • 2 kudos

Part 1: Streaming Failure Models: Why "It Didn't Crash" Is the Worst Outcome
Part 3: One Cluster per Task — Proven, Ready, and Waiting

venkat_k
by New Contributor II
  • 625 Views
  • 0 replies
  • 1 kudos

Enterprise Data Platform Architecture on Azure with Databricks

Hi everyone,
I recently wrote an article on designing an enterprise-scale data platform architecture using Azure and Databricks.
The article covers:
  • End-to-end architecture for enterprise data platforms
  • Data ingestion using Azure Data Factory and Kaf...

MoJaMa
by Databricks Employee
  • 1220 Views
  • 0 replies
  • 4 kudos

One Policy to Mask Them All: ABAC + VARIANT in Unity Catalog

Databricks ABAC lets you apply a single schema-level policy across columns of any data type — no more managing one mask function per type. Here's how to use the VARIANT data type to make it work. If you've implemented column masking in Unity Catalog,...

Kirankumarbs
by Contributor III
  • 438 Views
  • 0 replies
  • 1 kudos

One Cluster per Task — Proven, Ready, and Waiting

Part 3 of 3: Databricks Streaming Architecture. By the end of Part 1 & Part 2, we knew what the real answer was. We just hadn’t committed to it yet. Not because it wouldn’t work. We tested it. We documented it. The code was ready. The answer was one clu...

nikhilmohod-nm
by New Contributor III
  • 1609 Views
  • 0 replies
  • 2 kudos

Building a Hybrid Lakehouse: Strategic Use of Apache Hudi and Delta Lake in Databricks

Apache Hudi and Delta Lake are built for different workloads. Hudi is optimised for high-frequency writes; Delta Lake is built for fast, reliable reads. Using one format across the entire data platform forces an unnecessary trade-off: high ingestion c...

Dhyaneshbab2026
by New Contributor II
  • 1104 Views
  • 0 replies
  • 2 kudos

From SSIS to Databricks: Accelerating ETL Modernization with AI-Powered Utility

As enterprises race toward cloud-native data platforms, modernising legacy ETL pipelines remains one of the most persistent bottlenecks. For organizations that have relied on SQL Server Integration Services (SSIS) for years, rewriting hundreds of pac...

Brahmareddy
by Esteemed Contributor II
  • 496 Views
  • 0 replies
  • 4 kudos

Why Pipeline Design Matters in Databricks

Hi everyone, I just published a new article on Medium. It explores an important topic: designing reliable data pipelines in Databricks. Many pipelines fail not because of code, but because of design decisions made early in development. In ...

balajij8
by Contributor III
  • 5744 Views
  • 6 replies
  • 8 kudos

The End of an Era - Azure Databricks is Retiring the Standard Tier

Microsoft announced the retirement plan for the Azure Databricks Standard tier. This is vital information for organizations still on the Standard tier. It represents a fundamental architectural realignment that organizations must navigate with precis...

Latest Reply
cjpluta
New Contributor II
  • 8 kudos

I've created an Azure Resource Graph query that identifies all Standard tier Databricks workspaces in your environment (assuming you have read access): https://github.com/cjpluta/azretirementqueries/blob/main/queries/databricks-standard.kql

5 More Replies
Ale_Armillotta
by Valued Contributor II
  • 6371 Views
  • 3 replies
  • 6 kudos

Resolved! CI/CD on Databricks with Asset Bundles (DABs) and GitHub Actions

Hi all. If you've ever manually promoted resources from dev to prod on Databricks (copying notebooks, updating configs, hoping nothing breaks), this post is for you. I've been building a CI/CD setup for a Speech-to-Text pipeline on Databricks, and I w...

Community Articles
CICD
DABs
GitHub
Latest Reply
SteveOstrowski
Databricks Employee
  • 6 kudos

Hi, Great question! Databricks Asset Bundles (DABs) are the recommended approach for CI/CD on Databricks. Here is a comprehensive walkthrough. WHAT ARE DATABRICKS ASSET BUNDLES? DABs let you define your Databricks resources (jobs, pipelines, dashboar...

2 More Replies
PiotrPustola
by Databricks Partner
  • 1498 Views
  • 2 replies
  • 2 kudos

Orchestrating Irregular Databricks Jobs from external source Timestamps

Works for any event-driven workload: IoT alerts, e-commerce flash sales, financial market close processing.

Goal

In this project, I needed to start Databricks jobs on an irregular basis, driven entirely by timestamps stored in PostgreSQL rather than by...
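The core scheduling decision in a timestamp-driven setup like this can be sketched in a few lines. The example below is an assumption-laden illustration, not the article's implementation: `next_run` and `seconds_until` are hypothetical helpers, and the timestamps are passed in as a plain list rather than read from PostgreSQL.

```python
from datetime import datetime
from typing import Iterable, Optional


def next_run(timestamps: Iterable[datetime], now: datetime) -> Optional[datetime]:
    """Return the earliest timestamp strictly after `now`, or None if there is none."""
    future = [t for t in timestamps if t > now]
    return min(future) if future else None


def seconds_until(target: datetime, now: datetime) -> float:
    """Seconds to wait before `target`; never negative."""
    return max(0.0, (target - now).total_seconds())
```

An orchestrator could call `next_run` after each job finishes, wait `seconds_until` the result, and then trigger the next job, stopping when `next_run` returns `None`.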

Latest Reply
SteveOstrowski
Databricks Employee
  • 2 kudos

@PiotrPustola -- The self-rescheduling orchestrator pattern is a really elegant solution for event-driven workloads that depend on externally managed timestamps. A few thoughts and additions that might help you and others who land on this article: AD...

1 More Reply
Prosenjeet33
by New Contributor III
  • 2376 Views
  • 0 replies
  • 1 kudos

Building a Production‑Style SCD Type 2 Dimension on Delta Lake — Using Databricks Community Edition

If you’ve ever needed to maintain historical truth in a data warehouse, you’ve likely bumped into Slowly Changing Dimensions (SCD)—specifically Type 2. In SCD2, we keep every version of a record as it changes over time, so analysis can answer questio...
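The "keep every version" rule of SCD Type 2 can be shown with plain Python before any Delta Lake specifics. This is a minimal sketch under stated assumptions: the `DimRow` shape (`key`, `value`, `valid_from`, `valid_to`) and the `apply_change` helper are hypothetical, and an open-ended `valid_to` of `None` is used to mark the current version.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class DimRow:
    key: str
    value: str
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version

    @property
    def is_current(self) -> bool:
        return self.valid_to is None


def apply_change(rows: List[DimRow], key: str, new_value: str, effective: date) -> List[DimRow]:
    """Close the current version of `key` (if any) and append the new version."""
    out: List[DimRow] = []
    for row in rows:
        if row.key == key and row.is_current:
            if row.value == new_value:
                return list(rows)  # nothing changed, keep history as is
            out.append(replace(row, valid_to=effective))  # expire the old version
        else:
            out.append(row)
    out.append(DimRow(key, new_value, effective))  # new current version
    return out
```

For example, changing a customer's city from "Berlin" to "Munich" leaves two rows: the Berlin row with its `valid_to` set to the change date, and a new open-ended Munich row.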
