
GC Driver Error

aschiff
Contributor II

I am using a Databricks cluster to connect to a Tableau workbook through the JDBC connector. My Tableau workbook has been unable to load because resources are not available through the data connection. When I looked at the driver log for my cluster, I saw Full GC (Ergonomics) errors and Full GC Allocation errors. How do I resolve this? I've tried increasing the storage of my driver and worker by changing them in my cluster configuration, but that didn't fix it.
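To gauge how often the driver hits a full collection, the driver log can be tallied by GC cause. The following is a minimal Python sketch, assuming the log has been downloaded as text and uses the usual JVM `[Full GC (cause) ...` line shape. The function name `count_full_gc` and the sample lines are made up for illustration; they are not copied from a real log.

```python
import re
from collections import Counter

# Matches the cause inside "[Full GC (<cause>)" in a JVM GC log line.
FULL_GC = re.compile(r"\[Full GC \(([^)]+)\)")

def count_full_gc(lines):
    """Count full-GC events per cause (hypothetical helper)."""
    causes = Counter()
    for line in lines:
        match = FULL_GC.search(line)
        if match:
            causes[match.group(1)] += 1
    return causes

# Made-up, abbreviated sample lines for illustration only.
sample = [
    "[Full GC (Ergonomics) ...]",
    "[GC (Allocation Failure) ...]",       # minor GC, not counted
    "[Full GC (Allocation Failure) ...]",
    "[Full GC (Ergonomics) ...]",
]

for cause, count in count_full_gc(sample).items():
    print(cause, count)
```

A steadily growing count over a short window suggests the driver heap is under sustained pressure rather than seeing an occasional collection.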

32 REPLIES

I am officially lost. After attempting the above strategy, I went offline for about an hour and came back to see that the Tableau workbook had loaded successfully and that beast of a CASE query is in the SQL tab in the Spark UI. Furthermore, there are queries against tables I don't recall querying. They involve tables I never looked at or queried in Databricks or Tableau.

aschiff
Contributor II

I recreated the problematic workbook, connecting to the same cluster and using the same data, with its three sheets/charts, and all of them loaded properly. I then went to Databricks to look at the SQL tab in the Spark UI to find the queries, but nothing loaded there (and I waited for it to). So I restarted my cluster and refreshed my workbook (big mistake). It struggled to load again. I restarted the cluster again and turned on Photon acceleration.

Here are the queries for each sheet:

Sheet 1 that works fine:

SELECT `salesforce_export_1_explorium_15sept2022`.`Contact_ID_18_digit` AS `contact_id_18_digit`,

`salesforce_export_1_explorium_15sept2022`.`Emails` AS `emails`,

`salesforce_export_1_explorium_15sept2022`.`Professional_email` AS `professional_email`,

`salesforce_export_1_explorium_15sept2022_professional_email_val`.`Status` AS `status`

FROM `default`.`salesforce_export_1_explorium_15sept2022` `salesforce_export_1_explorium_15sept2022`

JOIN `default`.`salesforce_export_1_explorium_15sept2022_professional_email_validation` `salesforce_export_1_explorium_15sept2022_professional_email_val` ON (`salesforce_export_1_explorium_15sept2022`.`Professional_email` = `salesforce_export_1_explorium_15sept2022_professional_email_val`.`Email`)

WHERE (CASE WHEN ((`salesforce_export_1_explorium_15sept2022_professional_email_val`.`Status` IN ('valid')) OR (`salesforce_export_1_explorium_15sept2022_professional_email_val`.`Status` IS NULL)) THEN false ELSE true END)

GROUP BY 1, 2, 3, 4
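The WHERE clause in the sheet 1 query can look opaque, so here is a minimal Python sketch of what it filters. It uses an in-memory SQLite table as a stand-in for the Databricks validation table (an assumption; the rows are made up, and `0`/`1` replace `false`/`true`). It suggests that the generated CASE keeps only rows whose `Status` is neither `'valid'` nor NULL, which matches a plain `Status <> 'valid'` comparison.

```python
import sqlite3

# In-memory stand-in for the validation table; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE validation (Email TEXT, Status TEXT)")
conn.executemany(
    "INSERT INTO validation VALUES (?, ?)",
    [
        ("a@example.com", "valid"),
        ("b@example.com", "invalid"),
        ("c@example.com", None),
        ("d@example.com", "unknown"),
    ],
)

# Same shape as the Tableau-generated filter, with 0/1 in place of false/true.
case_form = """
    SELECT Email FROM validation
    WHERE (CASE WHEN ((Status IN ('valid')) OR (Status IS NULL)) THEN 0 ELSE 1 END)
    ORDER BY Email
"""
# Hand-written comparison; a NULL Status never satisfies <>, so those rows drop out too.
simple_form = "SELECT Email FROM validation WHERE Status <> 'valid' ORDER BY Email"

case_rows = [row[0] for row in conn.execute(case_form)]
simple_rows = [row[0] for row in conn.execute(simple_form)]
print(case_rows)
print(simple_rows)
```

Both statements return the same two addresses on this sample, so the filter itself is unlikely to be the expensive part of the query.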

Sheet 2 that works fine but has a really messy query:

SELECT

(CASE WHEN ((CASE

WHEN (((CASE

WHEN ((CASE

WHEN (0 IS NULL) THEN NULL

WHEN 0 < 1 THEN INSTR( `salesforce_export_1_explorium_15sept2022`.`Emails`, '}' )

WHEN 0 = INSTR( SUBSTRING(`salesforce_export_1_explorium_15sept2022`.`Emails`,CAST(0 AS INT),CAST(LENGTH(`salesforce_export_1_explorium_15sept2022`.`Emails`) - (0) + 1 AS INT)), '}' ) THEN 0

ELSE INSTR( SUBSTRING(`salesforce_export_1_explorium_15sept2022`.`Emails`,CAST(0 AS INT),CAST(LENGTH(`salesforce_export_1_explorium_15sept2022`.`Emails`) - (0) + 1 AS INT)), '}' ) + 0 - 1

END) IS NULL) THEN NULL

The above pattern of WHEN clauses and INSTR calls repeats, making the query too long to copy and paste here.
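The innermost CASE in that fragment appears to be a generated "find substring starting at position N" helper, called with a start position of 0. The following Python sketch mirrors its four branches under that reading; `instr`, `substring`, and `find_from` are hypothetical names, and 1-based positions are assumed as in Spark SQL.

```python
def instr(text, needle):
    """1-based position of needle in text, 0 if absent, None if either input is NULL."""
    if text is None or needle is None:
        return None
    return text.find(needle) + 1

def substring(text, start, length):
    """1-based substring; only called here with start >= 1."""
    return text[start - 1:start - 1 + length]

def find_from(text, needle, start):
    """Mirror of the generated CASE: find needle in text at or after position start."""
    if start is None:                      # WHEN (start IS NULL) THEN NULL
        return None
    if start < 1:                          # WHEN start < 1 THEN INSTR(text, needle)
        return instr(text, needle)
    if text is None:
        return None
    tail = substring(text, start, len(text) - start + 1)
    pos = instr(tail, needle)
    if pos == 0:                           # WHEN 0 = INSTR(...) THEN 0
        return 0
    return pos + start - 1                 # ELSE INSTR(...) + start - 1

print(find_from("{a@b.com}", "}", 0))
```

With a start of 0, only the second branch can fire, so each repetition of this CASE reduces to a single `INSTR(Emails, '}')` call. The length of the query comes from the generated nesting, not from additional logic.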

I am unable to get the query for the third, troublesome sheet. I think I may have seen it in the SQL tab in the Spark UI when I originally recreated the workbook, before the "big mistake" of restarting the cluster, but I can't find it now. So, as we discussed, I created a post on the Tableau community about finding the SQL query for a sheet in a workbook: https://community.tableau.com/s/question/0D58b0000ACAwyOCQT/how-to-extract-sql-query-from-a-specific...

Attached is the SQL query data, with the blue rectangles expanded, for the query in sheet 2.

Logically, the troublesome query is probably similar to portions of the sheet 1 query from my previous message. My best guess is as follows:

SELECT secondEmailAddress FROM `default`.`salesforce_export_1_explorium_15sept2022` `salesforce_export_1_explorium_15sept2022`

JOIN `default`.`salesforce_export_1_explorium_15sept2022_professional_email_validation` `salesforce_export_1_explorium_15sept2022_professional_email_val` ON (`salesforce_export_1_explorium_15sept2022`.`Professional_email` = `salesforce_export_1_explorium_15sept2022_professional_email_val`.`Email`)

WHERE (CASE WHEN ((`salesforce_export_1_explorium_15sept2022_professional_email_val`.`Status` IN ('valid')) OR (`salesforce_export_1_explorium_15sept2022_professional_email_val`.`Status` IS NULL)) THEN false ELSE true END) AND secondEmailAddress IS NOT null

Only 447 records (email addresses) were returned. This is all I can tell you about the logic of the query; I still don't have the specific query itself.
