Data Engineering

I'm using the Redshift data source to load data into Spark SQL DataFrames. However, I'm not seeing predicate pushdown for my queries that run on Redshift. Is that expected?

sajith_appukutt
Honored Contributor II

I expected the optimizer to push filter operations down to Redshift. Instead, the entire dataset is being loaded from Redshift.

1 ACCEPTED SOLUTION

sajith_appukutt
Honored Contributor II

The Spark driver for Redshift pushes the following operators down into Redshift:

  • Filter
  • Project
  • Sort
  • Limit
  • Aggregation
  • Join

However, pushdown does not currently support expressions that operate on dates and timestamps, so a filter on a date or timestamp column is applied in Spark after the data has been loaded. If you need this, please add a feature request via https://docs.databricks.com/resources/ideas.html
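As a possible workaround for date and timestamp filters, here is a minimal sketch that puts the filter into the SQL text passed through the data source's `query` option, so Redshift evaluates it before unloading. Everything concrete in it is an assumption for illustration: the helper names, the JDBC URL, the S3 temp directory, the IAM role, and the `sales` table with its `sale_date` column are all hypothetical. The format name `"redshift"` assumes Databricks Runtime; other builds of the connector may need the full class name instead.

```python
def redshift_query_options(jdbc_url, tempdir, iam_role, query):
    """Build reader options that ask Redshift to run `query` itself.

    Uses the `query` option instead of `dbtable`, so any WHERE clause in
    the SQL text is evaluated inside Redshift rather than in Spark.
    """
    return {
        "url": jdbc_url,
        "tempdir": tempdir,
        "aws_iam_role": iam_role,
        "query": query,
    }


def load_with_manual_pushdown(spark, options):
    """Load a DataFrame using an existing SparkSession (`spark`).

    Assumes the Redshift data source is registered as "redshift".
    """
    return spark.read.format("redshift").options(**options).load()


# Hypothetical connection details and table; replace with your own.
opts = redshift_query_options(
    jdbc_url="jdbc:redshift://example-host:5439/dev?user=USER&password=PASS",
    tempdir="s3a://example-bucket/tmp/",
    iam_role="arn:aws:iam::123456789012:role/example-role",
    query="SELECT id, amount FROM sales WHERE sale_date >= '2021-01-01'",
)

# In a notebook with a live SparkSession:
# df = load_with_manual_pushdown(spark, opts)
# df.explain()  # inspect the physical plan to see what was sent to Redshift
```

Calling `df.explain()` on the original `dbtable`-based DataFrame is also a reasonable way to check which filters were actually pushed down and which were left for Spark to apply.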
