05-23-2022 08:48 AM
Heya,
I'm having an issue with extract creation from a Delta Lake table. Tableau freezes on "Rows retrieved: X" and doesn't progress.
I actually succeeded in creating the first extract but saw I was missing a column. I went ahead and did a full rewrite -
events_df.write.mode('overwrite').option("overwriteSchema", "true").partitionBy('created_date').format(write_format).save(save_path)
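For reference, here's a rough, commented sketch of that full overwrite as it ran; the values of write_format and save_path below are just placeholders for what's defined earlier in my notebook:

# Placeholders only - the real values are set earlier in the notebook
write_format = 'delta'
save_path = '/mnt/datalake/events'  # illustrative path, not the real one

# Full table rewrite: replaces all data and lets the new column change the schema
(events_df.write
    .mode('overwrite')                    # replace existing data
    .option('overwriteSchema', 'true')    # allow the schema change (the added column)
    .partitionBy('created_date')          # keep the same partitioning
    .format(write_format)
    .save(save_path))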
Since then, trying to create/refresh the extract just gets stuck on an arbitrary number of rows.
In Databricks SQL and notebooks, I can read the table fine...
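For example, a quick sanity check like this (save_path as above, a placeholder) returns the expected row count without any problem:

# Notebook sanity check - the table reads back fine
spark.read.format('delta').load(save_path).count()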
I've tried refreshing the extract, deleting it and creating it anew, restarting my PC and doing it again... Nothing seems to work.
Any idea on what might be causing this / how to solve it?
05-24-2022 01:38 AM
Hi @Amit Steiner , can you share which versions you are seeing the issue on?
05-25-2022 01:14 AM
Hey 🙂
I'm using:
05-24-2022 02:44 PM
@Amit Steiner what is the size of the table? Do you see any error, or does Tableau freeze without one? I believe this is more of a Tableau-related issue than a Databricks one.
What version of Tableau are you using? What is the connector version?
Are you facing this problem only while reading from Databricks? Have you tried taking the dataset to your local machine and accessing it with Tableau? This will help eliminate Databricks and show whether the issue is with Databricks or Tableau.
Additionally, there are multiple variables that will affect the amount of time it takes to complete an extract. Take a look at the following link on how to optimize it.
https://help.tableau.com/current/server/en-us/perf_optimize_extracts.htm
05-25-2022 01:50 AM
Hey 🙂
What gave me pause was that the extract only failed after my rewrite. It might have been random, but I wanted to understand whether I did something wrong, since the troubleshooting documentation is still lean.
I'm using:
Table size is 2.3M rows over 20 columns.
Yes, other connectors (Redshift, Athena, GA) are working as expected.
I haven't tried downloading the data and accessing it locally. How does that work? Can Tableau read Delta Lake files locally, or do I need to convert them to another format? Also, how do I do it? I can't find any documentation on how to download a Databricks table as a single file for Tableau to read.
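My best guess so far (a rough sketch only; the table path and output location are placeholders, and I haven't verified this is the intended approach) would be to collapse the table into a single CSV under /FileStore so it can be downloaded and opened in Tableau as a text file:

# Rough sketch - paths are placeholders, not a verified recommendation
df = spark.read.format('delta').load('/mnt/datalake/events')  # assumed table location

# coalesce(1) produces a single CSV part file; fine for ~2.3M rows, not for huge tables
(df.coalesce(1)
   .write.mode('overwrite')
   .option('header', 'true')
   .format('csv')
   .save('/FileStore/exports/events_csv'))

# The part-*.csv file under /FileStore/exports/events_csv could then be downloaded
# from the workspace and opened in Tableau as a text file.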
05-25-2022 02:15 AM
I just tried the extract again now, and it worked. I'm seriously confused as to why it didn't work two days ago, no matter what I tried. Do you have any suggestions on how to investigate which part of the process broke down?
I also believe it was Tableau, but I'd like to know for sure.
05-26-2022 07:49 AM
I would recommend checking the Simba driver logs. I'm not aware of any logs that you can check from the Tableau side.
06-14-2022 09:24 AM
Hi @Amit Steiner , we haven't heard from you since the last response from @Prabakar Ammeappin , and I was checking back to see whether you have found a resolution yet. If you have a solution, please share it with the community, as it can be helpful to others. Otherwise, we will respond with more details and try to help.
06-14-2022 10:47 PM
Hey @Kaniz Fatma, unfortunately, I don't have anything new to share. I haven't had the chance to check the logs yet as the tasks keep piling up... I think it's safe to say this is a problem on Tableau's end, as the issue didn't manifest again with Databricks but did come up for my colleagues with different data sources on different occasions. My only recommendation would be to restart either your own machine or, if possible, the cloud machine if you encounter the same issue there.