Community Articles
Dive into a collaborative space where members like YOU can exchange knowledge, tips, and best practices. Join the conversation today and unlock a wealth of collective wisdom to enhance your experience and drive success.

Forum Posts

Ashwin_DSA
by Databricks Employee
  • 267 Views
  • 0 replies
  • 2 kudos

Databricks Multi-Table Transactions - Part 2

In Part 1, we covered why multi-table transactions matter. Now let's build one. We'll create the tables from the claim wrap-up scenario, load sample P&C insurance data, and walk through what happens when the wrap-up succeeds, when it fails, and when...

Kirankumarbs
by Contributor
  • 248 Views
  • 0 replies
  • 1 kudos

One Cluster per Task — Proven, Ready, and Waiting

Part 3 of 3: Databricks Streaming Architecture. By the end of Part 1 & Part 2, we knew what the real answer was. We just hadn’t committed to it yet. Not because it wouldn’t work. We tested it. We documented it. The code was ready. The answer was one clu...

Ale_Armillotta
by Valued Contributor II
  • 2109 Views
  • 3 replies
  • 6 kudos

Resolved! CI/CD on Databricks with Asset Bundles (DABs) and GitHub Actions

Hi all. If you've ever manually promoted resources from dev to prod on Databricks — copying notebooks, updating configs, hoping nothing breaks — this post is for you. I've been building a CI/CD setup for a Speech-to-Text pipeline on Databricks, and I w...

Community Articles
CICD
DABs
GitHub
Latest Reply
SteveOstrowski
Databricks Employee
  • 6 kudos

Hi, great question! Databricks Asset Bundles (DABs) are the recommended approach for CI/CD on Databricks. Here is a comprehensive walkthrough. WHAT ARE DATABRICKS ASSET BUNDLES? DABs let you define your Databricks resources (jobs, pipelines, dashboar...
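To make the reply above concrete, here is a minimal sketch of what a `databricks.yml` bundle definition might look like. This is an illustrative fragment, not the author's actual setup: the bundle name, target hosts, job name, and notebook path are all hypothetical placeholders.

```yaml
# databricks.yml — minimal bundle sketch (all names and paths are illustrative)
bundle:
  name: stt_pipeline

targets:
  dev:
    mode: development
    workspace:
      host: https://<your-dev-workspace>.cloud.databricks.com
  prod:
    mode: production
    workspace:
      host: https://<your-prod-workspace>.cloud.databricks.com

resources:
  jobs:
    transcribe_job:
      name: transcribe-audio
      tasks:
        - task_key: transcribe
          notebook_task:
            notebook_path: ./notebooks/transcribe.py
```

With a file like this at the project root, `databricks bundle deploy -t dev` deploys the job to the dev target, and the same command with `-t prod` promotes it, which is what replaces the manual copy-and-hope workflow described in the post.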

AbhaySingh
by Databricks Employee
  • 2504 Views
  • 0 replies
  • 1 kudos

Delta Lake 4.0 in the Real World

Delta Lake 4.0 is the next major open-source release aligned with Spark 4.x, adding first-class Variant for semi-structured data, safer Type Widening, improved DROP FEATURE, better transaction log handling, and a new multi-engine story via Delta Kern...
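The Variant support mentioned in the excerpt can be sketched in Spark SQL. This is a hedged illustration of the feature as exposed in Spark 4.x / Delta Lake 4.0, not an example from the article itself; the table and column names are made up.

```sql
-- Create a Delta table with a VARIANT column for semi-structured payloads
CREATE TABLE events (id BIGINT, payload VARIANT) USING DELTA;

-- Ingest raw JSON without committing to a rigid schema up front
INSERT INTO events VALUES (1, PARSE_JSON('{"device": "sensor-7", "temp": 21.5}'));

-- Extract a typed field at query time
SELECT id, VARIANT_GET(payload, '$.temp', 'double') AS temp FROM events;
```

The design point is that the JSON payload stays schema-flexible at write time while reads can still pull out strongly typed fields.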

kanikvijay9
by Contributor
  • 1301 Views
  • 2 replies
  • 10 kudos

Optimizing Delta Table Writes for Massive Datasets in Databricks

Problem Statement: In one of my recent projects, I faced a significant challenge: writing a huge dataset of 11,582,763,212 rows and 2,068 columns to a Databricks managed Delta table. The initial write operation took 22.4 hours using the following setup:...
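The excerpt is truncated before the author's actual fix, but one common first lever for slow, massive Delta writes is enabling optimized writes and auto compaction via table properties. This is a generic sketch with an illustrative table name, not necessarily the tuning the post lands on.

```sql
-- Enable optimized writes and auto compaction on an existing Delta table
-- (coalesces small files at write time and compacts them afterwards)
ALTER TABLE big_table SET TBLPROPERTIES (
  'delta.autoOptimize.optimizeWrite' = 'true',
  'delta.autoOptimize.autoCompact'   = 'true'
);
```

For a table this wide (2,068 columns), partition sizing and shuffle configuration matter at least as much as file layout, which is the kind of trade-off the replies below discuss.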

Latest Reply
kanikvijay9
Contributor
  • 10 kudos

Hey @Louis_Frolio, thank you for the thoughtful feedback and great suggestions! A few clarifications: AQE is already enabled in my setup, and it definitely helped reduce shuffle overhead during the write. Regarding Column Pruning, in this case, the fina...
