Databricks SQL Warehouse, Tableau and spark.driver.maxResultSize error

Jpeterson
New Contributor III

I'm attempting to create a Tableau extract on Tableau Server with a connection to a Databricks large SQL warehouse. The extract process fails with a spark.driver.maxResultSize error.

Using a Databricks interactive cluster in the Data Science & Engineering workspace, I can edit the Spark config to change spark.driver.maxResultSize and resolve the error.
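For context, on an interactive cluster that change is just a line in the cluster's Spark config (under Advanced options), along these lines; the 64g value below is only an illustration, not a recommendation:

spark.driver.maxResultSize 64g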

Is there a way to change spark.driver.maxResultSize on a Databricks SQL warehouse?

Is there a way to reduce the size of the data collected? The full table selected is only 987 MB, but when the SQL warehouse reads, collects, and sends the data to Tableau, the collect step produces more than 32 GB of serialized results.

Any other ideas on how to solve this? I have a number of Tableau extract processes that suffer from this spark.driver.maxResultSize error.

Driver error message:

java.lang.RuntimeException: [Simba][Hardy] (35) Error from server: error code: '0' error message: 'Error running query: org.apache.spark.SparkException: Job aborted due to stage failure: Total size of serialized results of 196 tasks (32.8 GB) is bigger than spark.driver.maxResultSize (32.0 GB)'.

4 REPLIES

karthik_p
Esteemed Contributor

@Josh Peterson did you try adding it under the data access configuration for SQL warehouses? A few config parameters are supported there. See Data access configuration | Databricks on AWS.
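For reference, that Data Access Configuration box takes one Spark key-value pair per line, so adding it would look roughly like this (64g is just a placeholder value):

spark.driver.maxResultSize 64g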

Jpeterson
New Contributor III

I've tried adding that config under data access, but it doesn't appear to be a supported key.

Error message:

Line 1 "spark.driver.maxResultSize": Illegal key

dvm
New Contributor II

Hi,

SQL warehouses share the same Spark configuration. Ask the workspace administrator to add the Spark config under the SQL admin settings (SQL Workspace).

Instructions at https://docs.databricks.com/sql/admin/sql-configuration-parameters.html
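For reference, the parameters on that page are set with plain SET statements at the session level (or globally by an admin in the SQL warehouse settings), for example something like:

SET ANSI_MODE = true;
SET USE_CACHED_RESULT = false;

These are only examples of supported parameters, not a workaround for spark.driver.maxResultSize.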

Jpeterson
New Contributor III

spark.driver.maxResultSize isn't a supported parameter on SQL warehouses: https://docs.databricks.com/sql/language-manual/sql-ref-parameters.html

Maybe this could be added to the roadmap, or BI integrations that issue SELECT * or full collects could be handled differently on SQL warehouses?
