Tableau extract creation frozen

amits
New Contributor III

Heya,

I'm having an issue with extract creation from a Delta lake table. Tableau is frozen on "Rows retrieved: X" for too long.

I actually succeeded in creating the first extract but saw I was missing a column. I went ahead and did a full rewrite -

events_df.write.mode('overwrite').option("overwriteSchema", "true").partitionBy('created_date').format(write_format).save(save_path)

Since then, trying to create/refresh the extract just gets stuck on an arbitrary number of rows.

In Databricks SQL and notebooks, I can read the table fine...

I've tried refreshing the extract, deleting it and creating it anew, restarting my PC and doing it again... Nothing seems to work.

Any idea on what might be causing this / how to solve it?

8 REPLIES

Kaniz
Community Manager

Hi @Amit Steiner, can you share the versions you are seeing the issue with?

amits
New Contributor III

Hey 🙂

I'm using:

  • Tableau Desktop 2022.1.1 64-bit, professional edition
  • Databricks Runtime 10.4 LTS for ETL and data writes
  • "Classic" small SQL endpoint for Tableau integration

Prabakar
Esteemed Contributor III

@Amit Steiner, what is the size of the table? Do you see any error, or does Tableau freeze without any error? I believe this is more of a Tableau-related issue than a Databricks one.

What is the version of Tableau that you are using? What is the connector version?

Are you facing this problem only while reading from Databricks? Have you tried copying the dataset to your local machine and accessing it from Tableau? This will help to rule out Databricks and show whether the issue lies with Databricks or Tableau.

Additionally, there are multiple variables that will affect the amount of time it takes to complete an extract. Take a look at the following link on how to optimize it.

https://help.tableau.com/current/server/en-us/perf_optimize_extracts.htm

amits
New Contributor III

Hey 🙂

What gave me pause was that the extract only failed after my rewrite. It might've been random, but I wanted to understand if I did something wrong as the troubleshooting section is still lean.

I'm using:

  • Tableau Desktop 2022.1.1 64-bit, professional edition
  • Databricks Runtime 10.4 LTS for ETL and data writes
  • "Classic" small SQL endpoint for Tableau integration
  • Connector version - Simba Spark ODBC Driver for Windows version 10.00.22000.01

Table size is 2.3M rows over 20 columns.

Yes, other connectors (Redshift, Athena, GA) are working as expected.

I haven't tried downloading the data and accessing it locally. How does that work? Can Tableau read Delta lake files locally, or do I need to convert them to another format? Also, how do I do it? I can't find any documentation on how to download a Databricks table as a single file for Tableau to read.
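One possible way to run the local test suggested above is to pull the table through a Python DB-API cursor and write it to a single CSV file, which Tableau can open as a text file. This is only a sketch: `export_to_csv` is a hypothetical helper, and it assumes you can open a DB-API connection to the SQL endpoint (for example with `pyodbc` and the same ODBC driver Tableau uses).

```python
import csv


def export_to_csv(cursor, query, out_path, batch_size=50_000):
    """Run `query` on a DB-API cursor and stream the result into one CSV file.

    Returns the number of data rows written (header not counted).
    """
    cursor.execute(query)
    # DB-API: the first item of each description entry is the column name.
    header = [col[0] for col in cursor.description]
    rows_written = 0
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        while True:
            # Fetch in batches so the whole table never sits in memory at once.
            batch = cursor.fetchmany(batch_size)
            if not batch:
                break
            writer.writerows(batch)
            rows_written += len(batch)
    return rows_written


# Hypothetical usage; the DSN and table name are assumptions:
#   import pyodbc
#   conn = pyodbc.connect("DSN=Databricks", autocommit=True)
#   export_to_csv(conn.cursor(), "SELECT * FROM events", "events.csv")
```

If the export finishes and Tableau builds an extract from the CSV without trouble, that points away from the data itself and towards the live connection between Tableau and the endpoint.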

amits
New Contributor III

I just tried to extract again now, and it worked. I'm seriously confused as to why it wasn't the case two days ago, no matter what I tried. Do you have any suggestions on how to investigate this to see which part of the process broke down?

I also believe it was Tableau, but I'd like to know for sure.
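One way to narrow this down the next time it happens would be to run the same query outside Tableau and time each batch of rows. The sketch below is an assumption about how to do that, not an official procedure: `probe_fetch` is a hypothetical helper that works with any DB-API cursor, for example one opened with `pyodbc` on the same ODBC data source.

```python
import time


def probe_fetch(cursor, query, batch_size=100_000, clock=time.perf_counter):
    """Fetch all rows in batches and record how long each batch took.

    Returns a list of (total_rows_so_far, seconds) tuples. The first entry
    has total_rows_so_far == 0 and holds the time spent in execute().
    """
    start = clock()
    cursor.execute(query)
    timings = [(0, clock() - start)]
    total = 0
    while True:
        t0 = clock()
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        total += len(batch)
        timings.append((total, clock() - t0))
    return timings


# Hypothetical usage; the DSN and table name are assumptions:
#   import pyodbc
#   conn = pyodbc.connect("DSN=Databricks", autocommit=True)
#   for rows, secs in probe_fetch(conn.cursor(), "SELECT * FROM events"):
#       print(f"{rows:>10} rows  {secs:8.2f}s")
```

If the script also stalls at some row count, the endpoint or the driver is the more likely suspect. If it runs through all rows while Tableau still hangs, the problem is more likely on Tableau's side.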

Prabakar
Esteemed Contributor III

I would recommend checking the Simba driver logs. I am not aware of the logs that you can check from Tableau.
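As a rough starting point for going through those logs, a small script can count warning and error lines and show the first few. This is a sketch under assumptions: it presumes driver logging has been enabled and writes plain-text `.log` files containing level keywords such as `ERROR` or `WARN`; the log directory, the file pattern and the exact line format are not taken from the driver documentation, so adjust them to what you actually find.

```python
import re
from collections import Counter
from pathlib import Path

# Assumed level keywords; the real driver log format may differ.
LEVEL_RE = re.compile(r"\b(ERROR|FATAL|WARN(?:ING)?)\b")


def scan_logs(log_dir, pattern="*.log", max_examples=5):
    """Count lines per log level and collect the first few matching lines.

    Returns (counts, examples), where examples is a list of
    (file_name, line_number, line_text) tuples.
    """
    counts = Counter()
    examples = []
    for path in sorted(Path(log_dir).glob(pattern)):
        with path.open(encoding="utf-8", errors="replace") as f:
            for line_no, line in enumerate(f, start=1):
                match = LEVEL_RE.search(line)
                if not match:
                    continue
                counts[match.group(1)] += 1
                if len(examples) < max_examples:
                    examples.append((path.name, line_no, line.rstrip()))
    return counts, examples
```

Comparing the timestamps of the first reported errors with the moment the "Rows retrieved" counter stopped moving may show whether the driver noticed a problem at all.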

Kaniz
Community Manager

Hi @Amit Steiner, we haven’t heard from you on the last response from @Prabakar Ammeappin, and I was checking back to see if you have a resolution yet. If you have any solution, please share it with the community as it can be helpful to others. Otherwise, we will respond with more details and try to help.

amits
New Contributor III

Hey @Kaniz Fatma​, unfortunately, I don't have anything new to share. I didn't have the chance to check the logs yet as the tasks keep on piling up... I think it's safe to say this is a problem on Tableau's end, as this issue didn't manifest again with Databricks but did manifest for my colleagues with different data sources on different occasions. My only recommendation would be to try to restart either your own machine or, if possible, the cloud machine if you encounter the same issues there.
