- 1796 Views
- 1 replies
- 1 kudos
When I first started working with Databricks, I was genuinely impressed by its potential. The seamless integration with Delta Lake, the power of PySpark, and the ability to process massive datasets at incredible speeds were truly impactful. Over tim...
Latest Reply
1. Try to remove cache() and persist() from the DataFrame operations in the code base.
2. Avoid driver operations like collect() and take() entirely: they bring data from the executors back to the driver, which incurs heavy network I/O overhead.
3. Av...
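To make point 2 concrete, here is a minimal pure-Python sketch of why collecting to the driver is costly. It is a toy model, not Spark code: the `partitions` list, the function names, and the row counts are all hypothetical stand-ins for executors and network transfer. The first function mimics a collect()-style pattern (every row travels to the driver), and the second aggregates inside each partition first so only small partial results travel.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a distributed dataset:
# one list of (key, value) rows per executor.
Partition = List[Tuple[str, int]]


def collect_then_sum(partitions: List[Partition]) -> Tuple[Dict[str, int], int]:
    """collect()-style: ship every row to the driver, then aggregate there."""
    driver_rows = [row for part in partitions for row in part]
    totals: Dict[str, int] = {}
    for key, value in driver_rows:
        totals[key] = totals.get(key, 0) + value
    # Second element: number of rows that had to reach the driver.
    return totals, len(driver_rows)


def sum_on_executors(partitions: List[Partition]) -> Tuple[Dict[str, int], int]:
    """Aggregate inside each partition; only partial sums reach the driver."""
    partials: List[Dict[str, int]] = []
    for part in partitions:
        local: Dict[str, int] = {}
        for key, value in part:
            local[key] = local.get(key, 0) + value
        partials.append(local)

    # Only one entry per key per partition is shipped to the driver.
    shipped = sum(len(local) for local in partials)

    totals: Dict[str, int] = {}
    for local in partials:
        for key, value in local.items():
            totals[key] = totals.get(key, 0) + value
    return totals, shipped


# Made-up sample data: two "executors" holding three rows each.
partitions = [
    [("a", 1), ("b", 2), ("a", 3)],
    [("b", 4), ("a", 5), ("b", 6)],
]

if __name__ == "__main__":
    print(collect_then_sum(partitions))
    print(sum_on_executors(partitions))
```

Both functions return the same totals, but the second moves fewer records to the driver, and the gap grows with the number of rows per key. In PySpark the second pattern corresponds roughly to expressing the aggregation with groupBy().agg() on the DataFrame rather than calling collect() and looping in Python.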
- 754 Views
- 0 replies
- 0 kudos
Hi,
I hope you're doing well. My name is Prasanna. C, Digital Marketing Strategist at Express Analytics, a company that understands consumer behavior and provides analytics solutions and services to businesses.
Express Analytics primarily offers...
- 1794 Views
- 2 replies
- 1 kudos
I spent some time figuring out how to use automatic liquid clustering with DLT pipelines. I hope this helps you as well.
Enable Predictive Optimization
Use this code:
# Enabling Automatic Liquid Clustering on a new table
@dlt.table(cluster_by_auto=Tr...
Latest Reply
Hi @Addy0_, thanks for sharing how to set it for an existing table. Unfortunately, I think ALTER cannot be used with materialized views and streaming tables defined in DLT pipelines. I was looking for something similar to @dlt.table(cluster_by_auto=True, ...