by dener • New Contributor
- 370 Views
- 1 reply
- 0 kudos
I am experiencing performance issues when loading a table with 50 million rows into Delta Lake on AWS using Databricks. Despite successfully handling other, larger tables, this specific table/process takes hours and doesn't finish. Here's the command...
Latest Reply
Thank you for your question! To optimize your Delta Lake write process:
Disable Overhead Options: Avoid overwriteSchema and mergeSchema unless necessary. Use:
df.write.format("delta").mode("overwrite").save(sink)
Increase Parallelism: Use repartition...
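A minimal sketch of the suggested pattern (the repartition count of 200, the source path, and the sink path are illustrative assumptions, not from the post; df and sink stand in for the poster's DataFrame and target):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # already provided on Databricks
df = spark.read.format("parquet").load("/path/to/source")  # hypothetical source
sink = "/path/to/delta/table"  # hypothetical target path

# Repartition so the write is spread across more tasks; tune the
# count to the cluster's core count and the data volume.
(df.repartition(200)
   .write
   .format("delta")
   .mode("overwrite")
   .save(sink))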
- 561 Views
- 5 replies
- 3 kudos
Hi Databricks Community, I'm encountering an issue with watermarks in Delta Live Tables that's causing data loss in my streaming pipeline. Let me explain my specific problem. Current Situation: I've implemented watermarks for stateful processing in my De...
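For context, a watermarked stateful aggregation in a DLT pipeline typically looks like the sketch below (the events source table, event_time column, and the time thresholds are illustrative assumptions, not taken from the post). Rows arriving later than the watermark are discarded by stateful operators, which is the usual mechanism behind this kind of data loss:

import dlt
from pyspark.sql import functions as F

@dlt.table
def events_by_window():
    return (
        dlt.read_stream("events")  # hypothetical upstream streaming table
        # Events more than 10 minutes late are dropped from stateful aggregations
        .withWatermark("event_time", "10 minutes")
        .groupBy(F.window("event_time", "5 minutes"))
        .count()
    )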
Latest Reply
Dear @VZLA, @Walter_C, I wanted to take a moment to express my sincere gratitude for your incredibly detailed explanation and thoughtful suggestions. Your guidance has been immensely valuable and has provided us with a clear path forward in addressi...
4 More Replies