Community Articles
Dive into a collaborative space where members like YOU can exchange knowledge, tips, and best practices. Join the conversation today and unlock a wealth of collective wisdom to enhance your experience and drive success.

Forum Posts

PiotrPustola
by Databricks Partner
  • 1086 Views
  • 2 replies
  • 2 kudos

Orchestrating Irregular Databricks Jobs from External Source Timestamps

Works for any event-driven workload: IoT alerts, e-commerce flash sales, financial market close processing. Goal: In this project, I needed to start Databricks jobs on an irregular basis, driven entirely by timestamps stored in PostgreSQL rather than by...
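The full pattern isn't shown in this preview, but the timestamp-driven scheduling step can be sketched in plain Python. This is a minimal sketch under assumptions: the function names are illustrative, and the in-memory timestamp list stands in for rows the article reads from PostgreSQL before triggering jobs via the Databricks Jobs API.

```python
from datetime import datetime, timezone

def seconds_until(next_run: datetime, now: datetime) -> float:
    """Delay before the target job should fire; never negative."""
    return max((next_run - now).total_seconds(), 0.0)

def plan_next_trigger(timestamps, now):
    """Pick the earliest stored timestamp still in the future.

    Returns None when nothing is scheduled, which would be the signal
    for a self-rescheduling orchestrator to back off and re-check later.
    """
    future = [ts for ts in timestamps if ts > now]
    return min(future) if future else None

now = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
stored = [
    datetime(2026, 1, 1, 11, 0, tzinfo=timezone.utc),   # already past, skipped
    datetime(2026, 1, 1, 12, 30, tzinfo=timezone.utc),  # next event
    datetime(2026, 1, 2, 9, 0, tzinfo=timezone.utc),
]
nxt = plan_next_trigger(stored, now)
print(nxt, seconds_until(nxt, now))  # 1800 s until the 12:30 trigger
```

In the real orchestrator, the job would sleep (or schedule its own next run) for that many seconds, trigger the target job, then loop.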

Latest Reply
SteveOstrowski
Databricks Employee
  • 2 kudos

@PiotrPustola -- The self-rescheduling orchestrator pattern is a really elegant solution for event-driven workloads that depend on externally managed timestamps. A few thoughts and additions that might help you and others who land on this article: AD...

1 More Replies
Prosenjeet33
by New Contributor III
  • 1367 Views
  • 0 replies
  • 1 kudos

Building a Production‑Style SCD Type 2 Dimension on Delta Lake — Using Databricks Community Edition

If you’ve ever needed to maintain historical truth in a data warehouse, you’ve likely bumped into Slowly Changing Dimensions (SCD)—specifically Type 2. In SCD2, we keep every version of a record as it changes over time, so analysis can answer questio...
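The snippet is cut off, but the core SCD Type 2 mechanics it describes can be sketched in plain Python. On Delta Lake this is typically done with a `MERGE`; the field names below (`is_current`, `start_date`, `end_date`) are illustrative, not necessarily the article's schema.

```python
from datetime import date

def apply_scd2(dimension, changes, key, today):
    """Close changed current rows and append new versions (SCD Type 2).

    Mutates superseded rows in place; fine for a sketch, not for Spark.
    """
    current = {r[key]: r for r in dimension if r["is_current"]}
    out = list(dimension)
    for change in changes:
        old = current.get(change[key])
        if old and old["attrs"] == change["attrs"]:
            continue  # nothing changed, keep the current version as-is
        if old:  # expire the superseded version
            old["is_current"] = False
            old["end_date"] = today
        out.append({key: change[key], "attrs": change["attrs"],
                    "start_date": today, "end_date": None, "is_current": True})
    return out

dim = [{"customer_id": 1, "attrs": {"city": "Oslo"},
        "start_date": date(2025, 1, 1), "end_date": None, "is_current": True}]
updated = apply_scd2(dim, [{"customer_id": 1, "attrs": {"city": "Bergen"}}],
                     "customer_id", date(2026, 1, 1))
# The Oslo row is closed with an end_date; a new current Bergen row is appended.
```

Both versions survive, so analysis can ask "what did this customer look like on a given date" by filtering on the validity interval.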

balajij8
by Contributor III
  • 1138 Views
  • 0 replies
  • 1 kudos

Databricks Metric Views - Moving Towards Business Semantics

Discussed the elimination of the BI & Metrics Tax using Databricks Metric Views here. The semantic layer is a core component of the lakehouse, and Metric Views provide it. The modern stack is moving toward AI data experiences where organizations ask questions instead of build...

Nidhi_Patni
by Databricks Partner
  • 3077 Views
  • 3 replies
  • 5 kudos

How We Built Robust Data Governance at Scale

In today's data-driven world, trust is currency—and that trust starts with quality data governed by strong principles. For one of our clients, where we're on a mission to build intelligent enterprises with AI, data isn't just an asset—it's a responsib...

Latest Reply
Garethcb
New Contributor II
  • 5 kudos

I cannot seem to find the Databricks Classification API?

2 More Replies
Saurabh2406
by Contributor
  • 330 Views
  • 0 replies
  • 1 kudos

Legacy BI to an Agentic Lakehouse in 90 Days: Building Autonomous AI Analytics on Databricks 2026

Why Legacy BI Is Reaching Its Limits, and What Comes Next: I have always believed that the original goal of digitalization was to make data available and then find better ways to analyze it. For the past two decades, Business Intelligence has followed ...

wesleyfelipe
by Contributor
  • 312 Views
  • 0 replies
  • 1 kudos

Building PCI-Compliant Lakehouses: Governance Challenges Then and Modern Solutions

This article continues a technical deep dive into building large-scale Lakehouse architectures. The original platform processed billions of records across multiple markets and operated under PCI-DSS compliance requirements — a significant engineering ...

balajij8
by Contributor III
  • 252 Views
  • 0 replies
  • 2 kudos

Databricks Lakeflow: The Orchestration Layer Is Moving to Where It Belongs

Discussed the elimination of the BI & Metrics Tax using Databricks Metric Views here. Organizations also face an older, more persistent tax — the Ingestion Tax. To ingest data from a source like Salesforce or SQL Server into your Lakehouse, you typically stit...

Saurabh2406
by Contributor
  • 1224 Views
  • 4 replies
  • 4 kudos

Resolved! Designing a Cost-Efficient Databricks Lakehouse, Performance Tuning and Optimization Best Practices

The Hidden Cost of Scaling the Lakehouse: Over the past few years, many organizations have successfully migrated to Databricks to modernize their data platforms. The Lakehouse architecture has enabled them to unify data engineering, analytics, and AI o...

Latest Reply
wesleyfelipe
Contributor
  • 4 kudos

@Saurabh2406 this is such a rich article with so many practical takeaways, congrats! I faced similar challenges in one of my recent projects, and I spent some time building a nice dashboard (using the system.billing tables) that helped us trac...

3 More Replies
Coffee77
by Honored Contributor II
  • 448 Views
  • 3 replies
  • 6 kudos

🇪🇸 Why the DataFrame Is the Most Important Data Object in Distributed Processing

In this video, created as a reminder for my poor long-term memory, I explain in simple terms: what a DataFrame is, how it is distributed across partitions, how it runs on a cluster (driver and workers), what happens in a shuffle, and the relationship between...
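As a tiny illustration of the shuffle mechanics the video covers: for a wide transformation like groupBy, Spark assigns each row to a reducer-side partition by hashing its key, so all rows sharing a key end up on the same worker. This pure-Python sketch is only an analogy, not Spark's actual implementation.

```python
from collections import defaultdict

def hash_partition(rows, key, num_partitions):
    """Bucket rows by hash(key) % num_partitions, as a shuffle does."""
    parts = defaultdict(list)
    for row in rows:
        parts[hash(row[key]) % num_partitions].append(row)
    return parts

rows = [{"k": 1, "v": "a"}, {"k": 5, "v": "b"},
        {"k": 1, "v": "c"}, {"k": 2, "v": "d"}]
parts = hash_partition(rows, "k", 4)
# Every row with the same key lands in the same partition bucket,
# which is why a shuffle must move data between executors.
```

Because the target partition depends only on the key, a downstream aggregation can finish each group locally without any further data movement.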

Latest Reply
Coffee77
Honored Contributor II
  • 6 kudos

Recently, I have been creating some "self-reminder" videos to help my poor long-term memory, and maybe to help others. They cover the internals of DataFrames: how partitions relate to jobs, stages, shuffles, and tasks, and how transformations or a...

2 More Replies
BijuThottathil
by New Contributor III
  • 254 Views
  • 0 replies
  • 0 kudos

🚀 LDP Tax Pipeline — Spark Declarative Pipelines on macOS (Without Databricks)

Excited to share my latest hands-on implementation of a LakeFlow Declarative Pipeline (LDP) built locally using Apache Spark 4.1 Declarative Pipelines — running entirely on ...

wesleyfelipe
by Contributor
  • 281 Views
  • 0 replies
  • 1 kudos

Scaling SCD on Databricks: Then vs Now

Between 2019 and 2021, we built a large-scale lakehouse on Databricks supporting multi-market payments processing (7B+ transactions/year). If ingestion was complex (covered in Part 1), the Silver layer was even more interesting. Implementing SCD Type 1...

wesleyfelipe
by Contributor
  • 345 Views
  • 0 replies
  • 2 kudos

Is Zerobus the Future of Ingestion on Databricks? Lessons from a 7B+ Transaction Platform

Between 2019 and 2021, we built a multi-market payments data platform on Databricks that now processes more than 7 billion transactions per year across seven markets. Ingestion was by far the most operationally complex layer. To support MongoDB CDC str...
