Community Articles
Dive into a collaborative space where members like YOU can exchange knowledge, tips, and best practices. Join the conversation today and unlock a wealth of collective wisdom to enhance your experience and drive success.

Forum Posts

nikhilmohod-nm
by New Contributor III
  • 1340 Views
  • 0 replies
  • 2 kudos

Building a Hybrid Lakehouse: Strategic Use of Apache Hudi and Delta Lake in Databricks

Apache Hudi and Delta Lake are built for different workloads. Hudi is optimised for high-frequency writes; Delta Lake is built for fast, reliable reads. Using one format across the entire data platform forces an unnecessary trade-off: high ingestion c...

Dhyaneshbab2026
by New Contributor II
  • 906 Views
  • 0 replies
  • 2 kudos

From SSIS to Databricks: Accelerating ETL Modernization with AI-Powered Utility

As enterprises race toward cloud-native data platforms, modernising legacy ETL pipelines remains one of the most persistent bottlenecks. For organizations that have relied on SQL Server Integration Services (SSIS) for years, rewriting hundreds of pac...

Brahmareddy
by Esteemed Contributor
  • 463 Views
  • 0 replies
  • 4 kudos

Why Pipeline Design Matters in Databricks

Hi everyone, I just published a new article on Medium. It explores an important topic: designing reliable data pipelines in Databricks. Many pipelines fail not because of code, but because of design decisions made early in development. In ...

balajij8
by Contributor III
  • 4950 Views
  • 6 replies
  • 8 kudos

The End of an Era - Azure Databricks is Retiring the Standard Tier

Microsoft has announced the retirement plan for the Azure Databricks Standard tier. This is vital information for organizations still on the Standard tier. It represents a fundamental architectural realignment that organizations must navigate with precis...

Latest Reply
cjpluta
New Contributor II
  • 8 kudos

I've created an Azure Resource Graph query that identifies all Standard tier Databricks workspaces in your environment (assuming you have read access): https://github.com/cjpluta/azretirementqueries/blob/main/queries/databricks-standard.kql

5 More Replies
Ale_Armillotta
by Valued Contributor II
  • 5002 Views
  • 3 replies
  • 6 kudos

Resolved! CI/CD on Databricks with Asset Bundles (DABs) and GitHub Actions

Hi all. If you've ever manually promoted resources from dev to prod on Databricks (copying notebooks, updating configs, hoping nothing breaks), this post is for you. I've been building a CI/CD setup for a Speech-to-Text pipeline on Databricks, and I w...

Community Articles
CICD
DABs
GitHub
Latest Reply
SteveOstrowski
Databricks Employee
  • 6 kudos

Hi, Great question! Databricks Asset Bundles (DABs) are the recommended approach for CI/CD on Databricks. Here is a comprehensive walkthrough. WHAT ARE DATABRICKS ASSET BUNDLES? DABs let you define your Databricks resources (jobs, pipelines, dashboar...

2 More Replies
PiotrPustola
by Databricks Partner
  • 1333 Views
  • 2 replies
  • 2 kudos

Orchestrating Irregular Databricks Jobs from External Source Timestamps

Works for any event-driven workload: IoT alerts, e-commerce flash sales, financial market close processing. Goal: In this project, I needed to start Databricks jobs on an irregular basis, driven entirely by timestamps stored in PostgreSQL rather than by...

Latest Reply
SteveOstrowski
Databricks Employee
  • 2 kudos

@PiotrPustola -- The self-rescheduling orchestrator pattern is a really elegant solution for event-driven workloads that depend on externally managed timestamps. A few thoughts and additions that might help you and others who land on this article: AD...

1 More Replies
Prosenjeet33
by New Contributor III
  • 1892 Views
  • 0 replies
  • 1 kudos

Building a Production‑Style SCD Type 2 Dimension on Delta Lake — Using Databricks Community Edition

If you’ve ever needed to maintain historical truth in a data warehouse, you’ve likely bumped into Slowly Changing Dimensions (SCD)—specifically Type 2. In SCD2, we keep every version of a record as it changes over time, so analysis can answer questio...
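As a rough illustration of the SCD Type 2 idea in this teaser (keep every version of a record, so past states can still be queried), here is a minimal plain-Python sketch. It is not the article's Delta Lake implementation; the `DimRow` type and the `apply_change` and `value_as_of` helpers are hypothetical names invented for this example, and validity intervals are assumed to be half-open (`valid_from` inclusive, `valid_to` exclusive).

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class DimRow:
    """One version of a dimension record (hypothetical shape)."""
    key: str
    value: str
    valid_from: date
    valid_to: Optional[date] = None  # None means "current version"


def apply_change(rows: List[DimRow], key: str, new_value: str, as_of: date) -> List[DimRow]:
    """Close the current version for `key` and append a new one (SCD2-style)."""
    out = []
    for r in rows:
        if r.key == key and r.valid_to is None:
            if r.value == new_value:
                return list(rows)  # no change, history stays as it is
            out.append(replace(r, valid_to=as_of))  # expire the old version
        else:
            out.append(r)
    out.append(DimRow(key, new_value, as_of))  # new current version
    return out


def value_as_of(rows: List[DimRow], key: str, when: date) -> Optional[str]:
    """Return the value that was valid for `key` on the given date."""
    for r in rows:
        if r.key == key and r.valid_from <= when and (r.valid_to is None or when < r.valid_to):
            return r.value
    return None


# Usage: a customer moves city; both versions remain queryable.
history = apply_change([], "c1", "Berlin", date(2024, 1, 1))
history = apply_change(history, "c1", "Munich", date(2024, 6, 1))
```

On a Delta table, the same expire-then-insert step is usually expressed as a merge against the dimension; the sketch above only shows the bookkeeping.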

balajij8
by Contributor III
  • 1359 Views
  • 0 replies
  • 1 kudos

Databricks Metric Views - Moving Towards Business Semantics

I discussed eliminating the BI & Metrics Tax using Databricks Metric Views here. With Metric Views, the semantic layer is a core component of the lakehouse. The modern stack is moving toward AI data experiences where organizations ask questions instead of build...

Nidhi_Patni
by Databricks Partner
  • 3460 Views
  • 3 replies
  • 5 kudos

How We Built Robust Data Governance at Scale

In today's data-driven world, trust is currency, and that trust starts with quality data governed by strong principles. For one of our clients, where we're on a mission to build intelligent enterprises with AI, data isn't just an asset; it's a responsib...

Latest Reply
Garethcb
New Contributor II
  • 5 kudos

I cannot seem to find the Databricks Classification API?

2 More Replies
Saurabh2406
by Contributor
  • 399 Views
  • 0 replies
  • 1 kudos

Legacy BI to an Agentic Lakehouse in 90 Days - Building Autonomous AI Analytics on Databricks 2026

Why Legacy BI Is Reaching Its Limits, and What Comes Next. I have always believed that the original goal of digitalization was to make data available and then find better ways to analyze it. For the past two decades, Business Intelligence has followed ...

wesleyfelipe
by Contributor
  • 366 Views
  • 0 replies
  • 1 kudos

Building PCI-Compliant Lakehouses: Governance Challenges Then and Modern Solutions

This article continues a technical deep dive into building large-scale Lakehouse architectures. The original platform processed billions of records across multiple markets and operated under PCI-DSS compliance requirements, a significant engineering ...

balajij8
by Contributor III
  • 295 Views
  • 0 replies
  • 2 kudos

Databricks Lakeflow - Orchestration Layer Is Moving to Where It Belongs

I discussed eliminating the BI & Metrics Tax using Databricks Metric Views here. Organizations also face an older, more persistent tax: the Ingestion Tax. To ingest data from a source like Salesforce or SQL Server into your Lakehouse, you typically stit...

Saurabh2406
by Contributor
  • 1525 Views
  • 4 replies
  • 4 kudos

Resolved! Designing a Cost-Efficient Databricks Lakehouse, Performance Tuning and Optimization Best Practices

The Hidden Cost of Scaling the Lakehouse. Over the past few years, many organizations have successfully migrated to Databricks to modernize their data platforms. The Lakehouse architecture has enabled them to unify data engineering, analytics, and AI o...

Latest Reply
wesleyfelipe
Contributor
  • 4 kudos

@Saurabh2406 this is such a rich article and has so many practical takeaways! Congrats! I faced similar challenges in one of my last projects, and I was able to spend some time building a nice dashboard (using the system.billing tables) that helped us trac...

3 More Replies
Coffee77
by Honored Contributor II
  • 549 Views
  • 3 replies
  • 6 kudos

🇪🇸 Why the DataFrame Is the Most Important Data Object in Distributed Processing

In this video, created as a reminder for my poor long-term memory, I explain in simple terms: what a DataFrame is; how it is distributed across partitions; how it runs on a cluster (driver and workers); what happens in a shuffle; the relationship between...

Latest Reply
Coffee77
Honored Contributor II
  • 6 kudos

Recently, I have been creating some "self-reminder" videos to help my poor long-term memory and maybe to help others. Understand the internals of DataFrames, how partitions relate to jobs, stages, shuffles and tasks, and how transformations or a...

2 More Replies