Community Articles
Dive into a collaborative space where members like YOU can exchange knowledge, tips, and best practices. Join the conversation today and unlock a wealth of collective wisdom to enhance your experience and drive success.

Forum Posts

PiotrPustola
by New Contributor II
  • 196 Views
  • 2 replies
  • 1 kudos

Orchestrating Irregular Databricks Jobs from external source Timestamps

Works for any event-driven workload: IoT alerts, e-commerce flash sales, financial market close processing. Goal: In this project, I needed to start Databricks jobs on an irregular basis, driven entirely by timestamps stored in PostgreSQL rather than by...
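
As a rough illustration of the idea in this excerpt (not the author's implementation, which is not shown here), the core scheduling step can be sketched as "pick the earliest future timestamp and wait until then". The function name `next_run_delay` and the sample timestamps are hypothetical; in a real setup the timestamps would presumably be read from PostgreSQL and the wait would end in a job trigger.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional


def next_run_delay(
    timestamps: Iterable[datetime], now: datetime
) -> Optional[timedelta]:
    """Return the wait until the earliest timestamp after `now`, or None if none remain."""
    upcoming = [ts for ts in timestamps if ts > now]
    if not upcoming:
        return None
    return min(upcoming) - now


# Hypothetical timestamps, standing in for rows read from an external table.
now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
schedule = [
    datetime(2025, 1, 1, 9, 30, tzinfo=timezone.utc),   # already in the past, skipped
    datetime(2025, 1, 1, 12, 45, tzinfo=timezone.utc),  # earliest future timestamp
    datetime(2025, 1, 2, 8, 0, tzinfo=timezone.utc),
]

delay = next_run_delay(schedule, now)
print(delay)
```

Passing `now` explicitly keeps the function deterministic and easy to test; a caller would typically supply `datetime.now(timezone.utc)`.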

Latest Reply
SteveOstrowski
Databricks Employee
  • 1 kudos

@PiotrPustola -- The self-rescheduling orchestrator pattern is a really elegant solution for event-driven workloads that depend on externally managed timestamps. A few thoughts and additions that might help you and others who land on this article: AD...

  • 1 kudos
1 More Replies
Ale_Armillotta
by Contributor
  • 316 Views
  • 2 replies
  • 5 kudos

CI/CD on Databricks with Asset Bundles (DABs) and GitHub Actions

Hi all. If you've ever manually promoted resources from dev to prod on Databricks (copying notebooks, updating configs, hoping nothing breaks), this post is for you. I've been building a CI/CD setup for a Speech-to-Text pipeline on Databricks, and I w...

Community Articles
CICD
DABs
GitHub
Latest Reply
Ale_Armillotta
Contributor
  • 5 kudos

I've also recorded a YouTube tutorial in case anyone needs support: https://youtu.be/kStRXqCznHA

  • 5 kudos
1 More Replies
Kirankumarbs
by New Contributor III
  • 113 Views
  • 3 replies
  • 3 kudos

Streaming Failure Models: Why "It Didn't Crash" Is the Worst Outcome

Most Databricks streaming failures don't look dramatic. No cluster termination. No red wall of errors. The UI says RUNNING, and your customers start reporting nonsense. I wrote about the incident that changed how we think about streaming jobs on share...

Latest Reply
Kirankumarbs
New Contributor III
  • 3 kudos

I completed Part 2 as well: "Multi-Task on a Shared Cluster — Why That's Also Not Enough". An interesting read! Thanks for reading, and happy to learn and share!

  • 3 kudos
2 More Replies
Nidhi_Patni
by New Contributor III
  • 2344 Views
  • 3 replies
  • 5 kudos

How We Built Robust Data Governance at Scale

In today's data-driven world, trust is currency, and that trust starts with quality data governed by strong principles. For one of our clients, where we're on a mission to build intelligent enterprises with AI, data isn't just an asset; it's a responsib...

Latest Reply
Garethcb
New Contributor
  • 5 kudos

I cannot seem to find the Databricks Classification API?

  • 5 kudos
2 More Replies
JstelaBR
by New Contributor III
  • 172 Views
  • 1 reply
  • 3 kudos

Lakebase & the Evolution of Data Architectures

One of the most interesting shifts in the Databricks ecosystem is Lakebase. For years, data architectures have enforced clear boundaries:
  • OLTP → Operational databases
  • OLAP → Analytical platforms
  • ETL → Bridging the gap
While familiar, this model often crea...

Latest Reply
MoJaMa
Databricks Employee
  • 3 kudos

Exactly. As I'm sure you are aware there is already a great "sync" from Delta -> Postgres but coming soon is gonna be a seamless way to do the opposite (Yes, even simpler than the version of this that was in private preview). If I had a moving visual...

  • 3 kudos
Saurabh2406
by Contributor
  • 275 Views
  • 4 replies
  • 4 kudos

Resolved! Designing a Cost-Efficient Databricks Lakehouse, Performance Tuning and Optimization Best Practices

The Hidden Cost of Scaling the Lakehouse
Over the past few years, many organizations have successfully migrated to Databricks to modernize their data platforms. The Lakehouse architecture has enabled them to unify data engineering, analytics, and AI o...

Latest Reply
wesleyfelipe
New Contributor III
  • 4 kudos

@Saurabh2406 this is such a rich article, with so many practical takeaways! Congrats! I faced similar challenges in one of my last projects, and I was able to spend some time building a nice dashboard (using the system.billing tables) that helped us trac...

  • 4 kudos
3 More Replies
Coffee77
by Honored Contributor II
  • 110 Views
  • 3 replies
  • 6 kudos

🇪🇸 Why the DataFrame Is the Most Important Data Object in Distributed Processing

In this video, created as a reminder for my poor long-term memory, I explain in simple terms:
  • What a DataFrame is
  • How it is distributed across partitions
  • How it runs on a cluster (driver and workers)
  • What happens in a shuffle
  • The relationship between...

Latest Reply
Coffee77
Honored Contributor II
  • 6 kudos

Recently, I have been creating some "self-reminder" videos to help my poor long-term memory and maybe to help others. Understand the internals of DataFrames, how partitions relate to jobs, stages, shuffles and tasks, and how transformations or a...

  • 6 kudos
2 More Replies
BijuThottathil
by New Contributor III
  • 72 Views
  • 0 replies
  • 0 kudos

🚀 LDP Tax Pipeline — Spark Declarative Pipelines on macOS (Without Databricks)

Excited to share my latest hands-on implementation of a LakeFlow Declarative Pipeline (LDP) built locally using Apache Spark 4.1 Declarative Pipelines, running entirely on ...

wesleyfelipe
by New Contributor III
  • 99 Views
  • 0 replies
  • 1 kudos

Scaling SCD on Databricks: Then vs Now

Between 2019 and 2021, we built a large-scale lakehouse on Databricks supporting multi-market payments processing (7B+ transactions/year). If ingestion was complex (covered in Part 1), the Silver layer was even more interesting. Implementing SCD Type 1...
