Community Articles
Dive into a collaborative space where members like YOU can exchange knowledge, tips, and best practices. Join the conversation today and unlock a wealth of collective wisdom to enhance your experience and drive success.

Forum Posts

Saurabh2406
by Contributor
  • 830 Views
  • 4 replies
  • 4 kudos

Resolved! Designing a Cost-Efficient Databricks Lakehouse, Performance Tuning and Optimization Best Practices

The Hidden Cost of Scaling the Lakehouse: Over the past few years, many organizations have successfully migrated to Databricks to modernize their data platforms. The Lakehouse architecture has enabled them to unify data engineering, analytics, and AI o...

Latest Reply
wesleyfelipe
Contributor
  • 4 kudos

@Saurabh2406 this is such a rich article with so many practical takeaways! Congrats! I faced similar challenges in one of my last projects, and I was able to spend some time building a nice dashboard (using the system.billing tables) that helped us trac...

3 More Replies
Coffee77
by Honored Contributor II
  • 338 Views
  • 3 replies
  • 6 kudos

🇪🇸 Why the DataFrame Is the Most Important Data Object in Distributed Processing

In this video, created as a reminder for my poor long-term memory, I explain in simple terms: what a DataFrame is, how it is distributed across partitions, how it runs on a cluster (driver and workers), what happens in a shuffle, the relationship between...

Latest Reply
Coffee77
Honored Contributor II
  • 6 kudos

Recently, I have been creating some "self-reminder" videos to help my poor long-term memory, and maybe to help others. They cover the internals of DataFrames, how partitions relate to jobs, stages, shuffles, and tasks, and how transformations or a...

2 More Replies
BijuThottathil
by New Contributor III
  • 219 Views
  • 0 replies
  • 0 kudos

🚀 LDP Tax Pipeline — Spark Declarative Pipelines on macOS (Without Databricks)

Excited to share my latest hands-on implementation of a LakeFlow Declarative Pipeline (LDP) built locally using Apache Spark 4.1 Declarative Pipelines, running entirely on ...

wesleyfelipe
by Contributor
  • 227 Views
  • 0 replies
  • 1 kudos

Scaling SCD on Databricks: Then vs Now

Between 2019 and 2021, we built a large-scale lakehouse on Databricks supporting multi-market payments processing (7B+ transactions/year). If ingestion was complex (covered in Part 1), the Silver layer was even more interesting. Implementing SCD Type 1...

wesleyfelipe
by Contributor
  • 275 Views
  • 0 replies
  • 2 kudos

Is Zerobus the Future of Ingestion on Databricks? Lessons from a 7B+ Transaction Platform

Between 2019 and 2021, we built a multi-market payments data platform on Databricks that now processes more than 7 billion transactions per year across seven markets. Ingestion was by far the most operationally complex layer. To support MongoDB CDC str...

emma_s
by Databricks Employee
  • 852 Views
  • 0 replies
  • 3 kudos

Level up your AI Assistant Experience with Agent Skills

Did you know the Databricks Assistant now supports Agent Skills? If your team has common, repeatable workflows, this feature is one you need to explore. Skills provide the Databricks Assistant with a specific set of instructions to handle tasks custo...

Nidhi_Patni
by Databricks Partner
  • 709 Views
  • 2 replies
  • 1 kudos

How I Reduced Databricks Costs in a High-Volume Financial Data Platform

In financial services, data never sleeps. Trades flow in every second. Risk calculations refresh continuously. Regulatory reports demand precision. BI dashboards serve business users who expect sub-second responses. And behind all of that? A massive ...

Community Articles
Performance Optimization
Latest Reply
Kirankumarbs
Contributor
  • 1 kudos

@Nidhi_Patni Thanks for the production-level information! @wesleyfelipe > I have one question: what was the trade-off between using traditional partitioning with Z-ordering versus liquid clustering? Traditional partitioning with Z-ordering is kind of the o...

1 More Replies
murtadha_s
by Databricks Partner
  • 416 Views
  • 2 replies
  • 1 kudos

Resolved! Custom asset bundles file name

Hi, is there a way to give the asset bundle file a custom name and pass that to databricks bundle deploy? Right now I must use databricks.yml, so my question is whether there is a way to pass a custom file name. Note that I don't want to embed a file...

Latest Reply
Kirankumarbs
Contributor
  • 1 kudos

Hi @murtadha_s The simple answer is no, but I'd like to understand the issue you're facing so I can see if there's anything I can help with. For example, this is how I'm using it in my production application, and it's working quite well. We're handlin...

1 More Replies
Ajay-Pandey
by Databricks MVP
  • 1306 Views
  • 1 replies
  • 1 kudos

Terraform support for AI/BI dashboards

AI/BI dashboards can now be managed through Terraform. Dashboard using the serialized_dashboard attribute: data "databricks_sql_warehouse" "starter" { name = "Starter Warehouse" } resource "databricks_dashboard" "dashboard" { display_name = "...

Latest Reply
gandalfthearab
New Contributor II
  • 1 kudos

If the content of `file_path` has changed, Terraform detects no changes. It would be better to have it check the MD5 hash of the file so that resource updates are picked up.
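
To make the suggestion in this reply concrete, here is a minimal standalone sketch of hash-based change detection, written in Python. It does not show how the Terraform provider behaves or is implemented. The function names `file_md5` and `has_changed`, and the idea of comparing against a previously recorded digest, are assumptions made purely for illustration.

```python
import hashlib
from pathlib import Path
from typing import Optional


def file_md5(path: Path) -> str:
    """Return the hex MD5 digest of the file's bytes, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def has_changed(path: Path, last_applied_md5: Optional[str]) -> bool:
    """True when no digest was recorded yet or the current digest differs."""
    return last_applied_md5 is None or file_md5(path) != last_applied_md5
```

A tool that records the digest at apply time and calls something like `has_changed` on the next run would notice edits to the file even when the path string itself stays the same.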

JstelaBR
by Databricks Partner
  • 131 Views
  • 0 replies
  • 2 kudos

Scaling Databricks Pipelines with Templates & ADF Orchestration

In a Databricks project integrating multiple legacy systems, one recurring challenge was maintaining development consistency as pipelines and team size grew. Pipeline divergence tends to emerge quickly: • Different ingestion approaches • Inconsistent tr...
