Data Engineering

Streaming delta table - Performance with incremental refresh

Fnazar
New Contributor II

Hi Team,

We are hitting performance issues with streaming live Delta tables, specifically when processing large tables of more than 10 million rows.
What workarounds are available for loading such large tables into streaming live tables?
Also, if we can use PARTITION BY, please help me with the syntax.

Thanks

1 REPLY

Priyanka_Biswas
Databricks Employee

Hi @Fnazar 

Streaming ingestion often leaves you with many small files, which makes reads inefficient. Use Delta Lake's OPTIMIZE command to compact them into larger files, and ZORDER BY to colocate related information in the same set of files. Z-ordering is particularly useful for columns that are often queried together.
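As a rough illustration of the compaction step, the sketch below only builds the OPTIMIZE statement as a string so that it can be inspected before it is run. The helper name `build_optimize_sql` and the table and column names are hypothetical, not something from the original question.

```python
from typing import Sequence


def build_optimize_sql(table: str, zorder_cols: Sequence[str] = ()) -> str:
    """Return an OPTIMIZE statement, optionally with a ZORDER BY clause."""
    sql = f"OPTIMIZE {table}"
    if zorder_cols:
        # ZORDER BY takes a parenthesised, comma-separated column list.
        sql += f" ZORDER BY ({', '.join(zorder_cols)})"
    return sql


# Hypothetical table and column names, for illustration only.
statement = build_optimize_sql(
    "my_catalog.my_schema.events", ["customer_id", "event_type"]
)
print(statement)

# On a cluster where Delta Lake is available, you would then run:
# spark.sql(statement)
```

Compaction like this is usually scheduled as a periodic job rather than run after every micro-batch, since it rewrites files.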

For partitioning, select a column that distributes the data evenly across partitions. Common choices include dates (for time-based data) or a categorical column whose values are well balanced.

When creating or writing to a Delta table, you can specify partitioning with the PARTITIONED BY clause in SQL or the partitionBy method in the DataFrame API. For instance, to partition by a date column: df.write.format("delta").partitionBy("date_column").save("/mnt/delta/my_table")

This command creates one partition in the Delta table for each distinct value of date_column.
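Since the question concerns streaming loads, here is a minimal sketch of the same partitioning applied to a Structured Streaming write. The function name, paths, and checkpoint location are assumptions made for illustration. The function expects an existing streaming DataFrame and does not import PySpark itself.

```python
from typing import Any


def start_partitioned_delta_stream(
    stream_df: Any,        # a streaming DataFrame, e.g. from spark.readStream
    table_path: str,       # hypothetical target location of the Delta table
    checkpoint_path: str,  # hypothetical checkpoint directory for this query
    partition_col: str,    # column to partition by, e.g. "date_column"
) -> Any:
    """Start an append-mode streaming write to a partitioned Delta table."""
    return (
        stream_df.writeStream
        .format("delta")
        .outputMode("append")
        .partitionBy(partition_col)
        .option("checkpointLocation", checkpoint_path)
        .start(table_path)
    )
```

A call might look like start_partitioned_delta_stream(df, "/mnt/delta/my_table", "/mnt/delta/_checkpoints/my_table", "date_column"). In a Delta Live Tables pipeline, the Python @dlt.table decorator takes a partition_cols argument for the same purpose. Check the reference for your runtime before relying on it.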

If you're ingesting streaming data into Delta Lake, consider using Auto Loader for efficient and incremental processing of new data.
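For reference, an Auto Loader read might look like the sketch below. It assumes a Databricks runtime (the cloudFiles source is not part of open-source Spark), JSON input files, and hypothetical source and schema paths. The function name is made up for this example.

```python
from typing import Any


def read_with_auto_loader(
    spark: Any,                 # an active SparkSession on Databricks
    source_path: str,           # hypothetical directory where new files land
    schema_path: str,           # hypothetical location for the inferred schema
    file_format: str = "json",  # assumed input format
) -> Any:
    """Return a streaming DataFrame that incrementally picks up new files."""
    return (
        spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", file_format)
        .option("cloudFiles.schemaLocation", schema_path)
        .load(source_path)
    )
```

The returned DataFrame can then be written to a Delta table with writeStream, so each run processes only files that have arrived since the last checkpoint.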

https://docs.delta.io/latest/best-practices.html

https://docs.databricks.com/en/sql/language-manual/delta-optimize.html

 
