09-23-2022 12:43 PM
I am using a cluster in Databricks to connect to a Tableau workbook through the JDBC connector. My Tableau workbook has been unable to load because resources are not available through the data connection. When I look at the driver log for my cluster, I see Full GC (Ergonomics) and Full GC (Allocation Failure) entries. How do I resolve this? I've tried increasing the storage of the driver and workers by changing them in my cluster configuration, but that didn't fix it.
10-03-2022 10:33 AM
Thanks for the files.
(1) In the "Databricks Shell - Details for Query" file you sent, expand the blocks in the query graph and check whether any of them report a data spill.
(2) Can you please share the query that ran into this problem? How much data are you running it on? Have you partitioned that data in a custom way? (See the sketch at the end of this post for a quick way to check.)
Side note: I did see that one of the stages is doing a count distinct, and I wanted to draw your attention to the approx_count_distinct SQL function (example below).
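To make both of those concrete: you can check the table size and partitioning with something like the following (a rough sketch, assuming the table behind the workbook is a Delta table; swap in your actual schema and table name):
-- shows sizeInBytes, numFiles, and partitionColumns for a Delta table
DESCRIBE DETAIL your_schema.your_table;
And the approximate distinct count would look something like this (placeholder names, not your actual columns):
-- approximate distinct count instead of an exact COUNT(DISTINCT ...); much cheaper at scale
SELECT approx_count_distinct(some_id_column) AS approx_distinct_ids
FROM your_schema.your_table;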
10-03-2022 12:41 PM
I tried to get these answers for you, but all the Spark UI tabs are coming up blank/default again. I've tried restarting my cluster to solve the issue like before, but no dice. Is it possible for me to give you permission to look at my cluster?
10-03-2022 03:09 PM
Usually it is blank because the page is still loading, so give it a moment to come up. Also, the cluster needs to be running and active, and the job has to have been run on it for anything to show up in the Spark UI.
10-03-2022 03:12 PM
Also, can you run the command in Tableau that caused the issue and then look at the cluster to identify the job number associated with that run? I can see the SQL queries in your Stages tab, but they are cut off, so I would like to see the full queries that produce this type of error.
10-03-2022 05:18 PM
I thought the query was select * from default.salesforce_export_1_explorium_15sept2022, but it is a lot more. Attached is the 878-page Word document it took to copy and paste the full query. The table I am running select * from is only 700 KB. The job IDs associated with this ridiculous query are 293, 294, and 295.
10-03-2022 05:18 PM
And here is a picture of the query details with the boxes expanded to show whether a data spill happened.
10-06-2022 11:51 AM
Did this JDBC connection from Databricks to Tableau Online work with a simple query?
Example of simple:
SELECT 'Hello World';
10-06-2022 12:29 PM
I am not sure how to test queries over a Tableau Online-to-Databricks connection, but I was able to run a different Tableau workbook attached to the same cluster, and the query below showed up in the SQL tab of the Spark UI.
SELECT (CASE WHEN (`delta_table_campaign`.`Name` IN ('CRE_MAT_2023', 'CRE_MAT_2023_Email_Had_Prior_Communication', 'CRE_MAT_2023_Email_NO_Prior_Communication')) THEN 'CRE_MAT_2023 campaign group' ELSE `delta_table_campaign`.`Name` END) AS `name__group_`,
COUNT(DISTINCT `delta_table_contact`.`Id`) AS `usr_calculation_496170019756503042_ok`
FROM `salesforce_raw`.`delta_table_contact` `delta_table_contact`
JOIN `salesforce_raw`.`delta_table_campaignmember` `delta_table_campaignmember` ON (`delta_table_contact`.`Id` = `delta_table_campaignmember`.`ContactId`)
JOIN `salesforce_raw`.`delta_table_campaign` `delta_table_campaign` ON (`delta_table_campaignmember`.`CampaignId` = `delta_table_campaign`.`Id`)
WHERE ((NOT (`delta_table_contact`.`LaunchLeads_Status__c` IS NULL))
  AND ((CASE WHEN (`delta_table_campaign`.`Name` IN ('CRE_MAT_2023', 'CRE_MAT_2023_Email_Had_Prior_Communication', 'CRE_MAT_2023_Email_NO_Prior_Communication')) THEN 'CRE_MAT_2023 campaign group' ELSE `delta_table_campaign`.`Name` END)
    IN ('CRE_MAT_Q1_2023', 'CRE_MAT_Q1_2023_PART_1_MF', 'CRE_MAT_Q1_2023_PART_2_TEXT', 'CRE_MAT_Q3_2024_OTHER', 'CRE_MAT_Q3_2024_OTHER_PART_1', 'CRE_MAT_Q3_2024_OTHER_PART_2')))
GROUP BY 1
10-10-2022 12:18 PM
I know you can't figure out exactly what the SQL query is until Tableau gives you a way to download it from Tableau Online, but for the query you just shared, the one you think is the problem query, did you test it in Databricks to see if it runs without an error?
My thought is: if it runs on Databricks directly but not through Tableau, then the Tableau support team needs to help you with this problem. If it doesn't work on our side, then we can look at the Spark UI for that specific run to try to identify the problem. From what you sent for the other queries, it seems Photon is helping with those joins, so when you run it on our side, please make sure to use the Photon-enabled cluster you have been using so we can troubleshoot this further.
10-10-2022 02:58 PM
I will try what you suggested and see what happens. We are currently working on other projects while all this Databricks/Tableau stuff is happening, and we have figured out how to get the information from this query without running it in Tableau. Obviously I want to avoid this problem in the future, so I would like to solve it regardless. I tried recreating the problematic sheet again and got stuck loading as soon as I applied a filter. The query below was the last one to run in the Databricks Spark UI; it returns all the "statuses" before I then filtered on them, which caused the issue.
SELECT `salesforce_export_1_explorium_15sept2022_professional_email_val`.`Status` AS `status`
FROM `default`.`salesforce_export_1_explorium_15sept2022_professional_email_validation` `salesforce_export_1_explorium_15sept2022_professional_email_val`
GROUP BY 1
ORDER BY `status` ASC
Below is a picture of a failed query. This is the really long CASE WHEN / INSTR query I talked about earlier; I wonder if you can click in to see the full query.
If you can't see the whole thing, here is some of what I was able to copy and paste (the query keeps going on like this for what seems like forever):
SELECT
(CASE WHEN ((CASE
WHEN (((CASE
WHEN ((CASE
WHEN (0 IS NULL) THEN NULL
WHEN 0 < 1 THEN INSTR( `salesforce_export_1_explorium_15sept2022`.`Emails`, '}' )
WHEN 0 = INSTR( SUBSTRING(`salesforce_export_1_explorium_15sept2022`.`Emails`,CAST(0 AS INT),CAST(LENGTH(`salesforce_export_1_explorium_15sept2022`.`Emails`) - (0) + 1 AS INT)), '}' ) THEN 0
ELSE INSTR( SUBSTRING(`salesforce_export_1_explorium_15sept2022`.`Emails`,CAST(0 AS INT),CAST(LENGTH(`salesforce_export_1_explorium_15sept2022`.`Emails`) - (0) + 1 AS INT)), '}' ) + 0 - 1
END) IS NULL) THEN NULL
WHEN (CASE
WHEN (0 IS NULL) THEN NULL
WHEN 0 < 1 THEN INSTR( `salesforce_export_1_explorium_15sept2022`.`Emails`, '}' )
WHEN 0 = INSTR( SUBSTRING(`salesforce_export_1_explorium_15sept2022`.`Emails`,CAST(0 AS INT),CAST(LENGTH(`salesforce_export_1_explorium_15sept2022`.`Emails`) - (0) + 1 AS INT)), '}' ) THEN 0
ELSE INSTR( SUBSTRING(`salesforce_export_1_explorium_15sept2022`.`Emails`,CAST(0 AS INT),CAST(LENGTH(`salesforce_export_1_explorium_15sept2022`.`Emails`) - (0) + 1 AS INT)), '}' ) + 0 - 1
END) < 1 THEN INSTR( `salesforce_export_1_explorium_15sept2022`.`Emails`, 'address' )
WHEN 0 = INSTR( SUBSTRING(`salesforce_export_1_explorium_15sept2022`.`Emails`,CAST((CASE
WHEN (0 IS NULL) THEN NULL
WHEN 0 < 1 THEN INSTR( `salesforce_export_1_explorium_15sept2022`.`Emails`, '}' )
WHEN 0 = INSTR( SUBSTRING(`salesforce_export_1_explorium_15sept2022`.`Emails`,CAST(0 AS INT),CAST(LENGTH(`salesforce_export_1_explorium_15sept2022`.`Emails`) - (0) + 1 AS INT)), '}' ) THEN 0
ELSE INSTR( SUBSTRING(`salesforce_export_1_explorium_15sept2022`.`Emails`,CAST(0 AS INT),CAST(LENGTH(`salesforce_export_1_explorium_15sept2022`.`Emails`) - (0) + 1 AS INT)), '}' ) + 0 - 1
END) AS INT),CAST(LENGTH(`salesforce_export_1_explorium_15sept2022`.`Emails`) - ((CASE
WHEN (0 IS NULL) THEN NULL
WHEN 0 < 1 THEN INSTR( `salesforce_export_1_explorium_15sept2022`.`Emails`, '}' )
WHEN 0 = INSTR( SUBSTRING(`salesforce_export_1_explorium_15sept2022`.`Emails`,CAST(0 AS INT),CAST(LENGTH(`salesforce_export_1_explorium_15sept2022`.`Emails`) - (0) + 1 AS INT)), '}' ) THEN 0
ELSE INSTR( SUBSTRING(`salesforce_export_1_explorium_15sept2022`.`Emails`,CAST(0 AS INT),CAST(LENGTH(`salesforce_export_1_explorium_15sept2022`.`Emails`) - (0) + 1 AS INT)), '}' ) + 0 - 1
END)) + 1 AS INT)), 'address' ) THEN 0
ELSE INSTR( SUBSTRING(`salesforce_export_1_explorium_15sept2022`.`Emails`,CAST((CASE
WHEN (0 IS NULL) THEN NULL
WHEN 0 < 1 THEN INSTR( `salesforce_export_1_explorium_15sept2022`.`Emails`, '}' )
WHEN 0 = INSTR( SUBSTRING(`salesforce_export_1_explorium_15sept2022`.`Emails`,CAST(0 AS INT),CAST(LENGTH(`salesforce_export_1_explorium_15sept2022`.`Emails`) - (0) + 1 AS INT)), '}' ) THEN 0
ELSE INSTR( SUBSTRING(`salesforce_export_1_explorium_15sept2022`.`Emails`,CAST(0 AS INT),CAST(LENGTH(`salesforce_export_1_explorium_15sept2022`.`Emails`) - (0) + 1 AS INT)), '}' ) + 0 - 1
END) AS INT),CAST(LENGTH(`salesforce_export_1_explorium_15sept2022`.`Emails`) - ((CASE
WHEN (0 IS NULL) THEN NULL
WHEN 0 < 1 THEN INSTR( `salesforce_export_1_explorium_15sept2022`.`Emails`, '}' )
WHEN 0 = INSTR( SUBSTRING(`salesforce_export_1_explorium_15sept2022`.`Emails`,CAST(0 AS INT),CAST(LENGTH(`salesforce_export_1_explorium_15sept2022`.`Emails`) - (0) + 1 AS INT)), '}' ) THEN 0
ELSE INSTR( SUBSTRING(`salesforce_export_1_explorium_15sept2022`.`Emails`,CAST(0 AS INT),CAST(LENGTH(`salesforce_export_1_explorium_15sept2022`.`Emails`) - (0) + 1 AS INT)), '}' ) + 0 - 1
END)) + 1 AS INT)), 'address' ) + (CASE
WHEN (0 IS NULL) THEN NULL
WHEN 0 < 1 THEN INSTR( `salesforce_export_1_explorium_15sept2022`.`Emails`, '}' )
WHEN 0 = INSTR( SUBSTRING(`salesforce_export_1_explorium_15sept2022`.`Emails`,CAST(0 AS INT),CAST(LENGTH(`salesforce_export_1_explorium_15sept2022`.`Emails`) - (0) + 1 AS INT)), '}' ) THEN 0
ELSE INSTR( SUBSTRING(`salesforce_export_1_explorium_15sept2022`.`Emails`,CAST(0 AS INT),CAST(LENGTH(`salesforce_export_1_explorium_15sept2022`.`Emails`) - (0) + 1 AS INT)), '}' ) + 0 - 1
END) - 1
END) + 10) IS NULL) THEN NULL
WHEN ((CASE
WHEN ((CASE
WHEN (0 IS NULL) THEN NULL
WHEN 0 < 1 THEN INSTR( `salesforce_export_1_explorium_15sept2022`.`Emails`, '}' )
WHEN 0 = INSTR( SUBSTRING(`salesforce_export_1_explorium_15sept2022`.`Emails`,CAST(0 AS INT),CAST(LENGTH(`salesforce_export_1_explorium_15sept2022`.`Emails`) - (0) + 1 AS INT)), '}' ) THEN 0
ELSE INSTR( SUBSTRING(`salesforce_export_1_explorium_15sept2022`.`Emails`,CAST(0 AS INT),CAST(LENGTH(`salesforce_export_1_explorium_15sept2022`.`Emails`) - (0) + 1 AS INT)), '}' ) + 0 - 1
END) IS NULL) THEN NULL
WHEN (CASE
WHEN (0 IS NULL) THEN NULL
WHEN 0 < 1 THEN INSTR( `salesforce_export_1_explorium_15sept2022`.`Emails`, '}' )
WHEN 0 = INSTR( SUBSTRING(`salesforce_export_1_explorium_15sept2022`.`Emails`,CAST(0 AS INT),CAST(LENGTH(`salesforce_export_1_explorium_15sept2022`.`Emails`) - (0) + 1 AS INT)), '}' ) THEN 0
10-11-2022 01:00 PM
It's hilarious to me that that simple SELECT query turned into that beast just from a filter button in Tableau. So just to confirm: this CASE beast does not run on Databricks when you copy and paste it into a notebook on the cluster you use for Tableau Online?
10-11-2022 01:01 PM
Or just run the cell as SQL.
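For example, in a Databricks notebook you could paste it into a cell like this (just a sketch; I'm using the shorter status query from earlier as the body, but the same approach works for the long one):
%sql
-- run the Tableau-generated query directly against the same cluster
SELECT `Status` AS `status`
FROM `default`.`salesforce_export_1_explorium_15sept2022_professional_email_validation`
GROUP BY 1
ORDER BY `status` ASC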
10-12-2022 03:52 PM
So, looking at this more closely, this looks like a Tableau problem. Going from that nice SELECT statement to this mess looks like a bug on Tableau's side. I mean, look at the query: "WHEN (0 IS NULL) THEN NULL" and then "WHEN 0 < 1 THEN INSTR( `salesforce_export_1_explorium_15sept2022`.`Emails`, '}' )". When is 0 not less than 1? Why does a filter need a CASE statement like that? You see these strange portions repeated throughout that SQL beast like a broken record.
I'm not sure why a plain WHERE clause wouldn't work, so here is my workaround for Tableau: change the SQL query so it creates a table that does the filtering, and then read that table into Tableau.
CREATE TABLE add_new_tablename_here AS
SELECT `salesforce_export_1_explorium_15sept2022_professional_email_val`.`Status` AS `status`
FROM `default`.`salesforce_export_1_explorium_15sept2022_professional_email_validation` `salesforce_export_1_explorium_15sept2022_professional_email_val`
WHERE your_filter_condition_goes_here
GROUP BY 1
ORDER BY `status` ASC
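Filled in, it would look something like this (the new table name and status values here are just made-up placeholders to show the shape, since I don't know your actual values):
CREATE TABLE status_filtered_for_tableau AS
SELECT `salesforce_export_1_explorium_15sept2022_professional_email_val`.`Status` AS `status`
FROM `default`.`salesforce_export_1_explorium_15sept2022_professional_email_validation` `salesforce_export_1_explorium_15sept2022_professional_email_val`
WHERE `salesforce_export_1_explorium_15sept2022_professional_email_val`.`Status` IN ('valid', 'invalid')
GROUP BY 1
ORDER BY `status` ASC
Note that the WHERE filter has to come before GROUP BY and ORDER BY.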
I hope this helps.
10-12-2022 04:47 PM
Yeah, this is very weird. Earlier today I was able to load the Tableau sheet just fine and also run that beast of a CASE query in a Databricks notebook. But then a few hours ago the Tableau sheets weren't loading and everything was messed up again. I will look into trying your solution and let you know what happens, fingers crossed.
10-13-2022 01:13 PM
I tried your strategy of doing the status filtering and creating the table in Databricks, then connecting to the new table in Tableau with the same cluster. Everything went well when connecting to the table in Tableau and making the sheet, but when I hit publish workbook and gave the workbook a name, the continuous loading happened again and led to a "Resource temporarily unavailable" error. This certainly seems to be a Tableau issue; at least we isolated the problem, right? 🙂