
PowerBI "Token expired while fetching results: TEAuthTokenExpired."

pauloquantile
New Contributor III

Hi everyone,

We are at the moment stumbling upon a big challenge with loading data into PowerBI. I need some advice!

To give a bit of context: we introduced Databricks instead of Azure Synapse for a client of ours. We are currently busy moving all the PowerBI reports to read from Databricks instead of Azure Synapse. Everything was fine and working well with smaller datasets. But for the larger and most important ones we run into the "Token expired while fetching results: TEAuthTokenExpired." error. This occurs when refreshing the data both online and locally.

- All the PowerBI datasets are authenticated using OAuth.

- Some of the datasets need to load more than 350m records. I know, this isn't a best practice. But loading this data from Azure Synapse was possible, even though it would take more than 5 hours. We are now focusing on just replacing the data source with Databricks and will optimize the PowerBI reports later.

- Is it somehow possible to increase the token duration when fetching results via PowerBI?

 

I am not able to find a working solution on the Databricks forum and the internet, please help me out! 

pauloquantile_0-1693824616520.png

 

9 REPLIES

Kaniz_Fatma
Community Manager

Hi @pauloquantile , The "Token expired while fetching results: TEAuthTokenExpired" error you're encountering is likely due to the token's lifetime. As per the Databricks documentation, the lifetime of an Azure AD passthrough token is one hour. When a command is sent to the cluster that takes longer than one hour, it fails if an ADLS resource is accessed after the one-hour mark.

Unfortunately, it is not possible to increase the lifetime of an Azure AD passthrough token. 

Here are some potential solutions:

1. Rewrite your queries: Ensure no single command takes longer than an hour to complete.

2. Use OAuth token: As per Databricks documentation, you can use OAuth tokens to authenticate to both account-level APIs and workspace-level APIs. The access token will expire in one hour, and you must request a new OAuth access token after the expiration.
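For a client you control yourself (a script calling the APIs, not the PowerBI connector), the "request a new token after expiration" step could look roughly like the sketch below. It is only an illustration: `fetch` is a hypothetical callable that you would supply to obtain a token and its lifetime, and the 5-minute safety margin is an assumption. Whether the PowerBI connector refreshes its token during a long-running fetch is not something this thread confirms.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class CachedToken:
    value: str
    expires_at: float  # epoch seconds at which the token stops being valid


class RefreshingToken:
    """Hands out a token and fetches a new one shortly before it expires.

    `fetch` is a hypothetical callable returning (token, lifetime_seconds);
    how the token is actually obtained is outside this sketch.
    """

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],
        skew_seconds: float = 300.0,  # assumed safety margin before expiry
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch
        self._skew = skew_seconds
        self._clock = clock
        self._cached: Optional[CachedToken] = None

    def get(self) -> str:
        now = self._clock()
        # Refresh when there is no token yet or it is within the margin of expiring.
        if self._cached is None or now >= self._cached.expires_at - self._skew:
            value, lifetime = self._fetch()
            self._cached = CachedToken(value, now + lifetime)
        return self._cached.value
```

Note that this only helps when the work is split into several requests. A single request that runs past the token's lifetime would still fail, which matches the error described above.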

Hi @Kaniz_Fatma ,

 

Thank you for the response! The problem isn't in the queries, it is just the volume of the data. Loading the data for some reason just takes hours and hours. 

 

Where can I find some of the resources or tutorials about the OAuth token?

I am curious whether requesting a new access token would help us in this case, since the refresh fails partway through, e.g. while loading 200m records. If 100m records have already been loaded at that point, I expect a new token might cause problems because the state of the load could be lost?

 

Thanks!

Kaniz_Fatma
Community Manager

Hi @pauloquantile , For information about OAuth tokens, you can refer to the following resources:

- The document "Authentication with Google ID tokens" provides information on how to authenticate to Databricks REST APIs using Google ID tokens, which are based on the OAuth 2.0 protocol. It also provides details on how to create the required service accounts and generate tokens for these accounts.

You can find more details [here](https://docs.databricks.com/dev-tools/authentication-google-id.html).

- The document "Troubleshoot access tokens" provides information on how to troubleshoot errors you may encounter when getting access tokens and how to validate access tokens. It also provides code snippets to decode and validate an Azure AD access token which is based on OAuth 2.0 protocol.

You can find more details [here](https://docs.databricks.com/dev-tools/troubleshoot-aad-token.html).
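As a rough idea of what decoding such a token involves, here is a minimal sketch that reads the expiry claim from a JWT-formatted access token using only the Python standard library. It is not the snippet from the linked document. It assumes the token is a standard three-part JWT with an `exp` claim, and it does NOT verify the signature, so it is only suitable for inspecting how long a token has left. The function names are made up for this example.

```python
import base64
import json
import time
from typing import Optional


def jwt_claims(token: str) -> dict:
    """Decode the payload of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated parts")
    payload = parts[1]
    # JWTs strip base64 padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def seconds_until_expiry(token: str, now: Optional[float] = None) -> float:
    """Return how many seconds remain before the token's `exp` claim."""
    current = time.time() if now is None else now
    return jwt_claims(token)["exp"] - current
```

Calling `seconds_until_expiry(token)` just before a refresh starts gives a quick indication of whether the token can outlive the load.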

Please note that requesting a new access token may not necessarily help with the data loading issue. The volume of data being loaded and the state of the loading process are more likely to be influenced by factors such as the efficiency of your queries, the capacity of your system, and your data management practices. You may need to look into optimizing these aspects to improve the data loading time.

Thank you! I will take a look at it tomorrow.

I just can't imagine that Databricks isn't doing something about this problem, because we can't be the only ones who want to load a lot of data!

I know it isn't a best practice, and we would like to aggregate more in the DWH, but selling a product to a client that can't load data for more than an hour is a serious risk for the product if you ask me.

Hi @pauloquantile , 

I completely understand your concerns, and you're right that loading large volumes of data efficiently is crucial for our clients. Rest assured, Databricks is actively working on addressing this challenge. While it's not a best practice to load extensive data directly into Databricks, we continuously improve our solutions to enhance data loading capabilities and optimize performance.

Your feedback is invaluable and helps us prioritize and develop solutions that align with your needs. We're committed to providing a robust and reliable product, and your input is essential in achieving that goal.

Please feel free to reach out if you have any further questions or concerns. We appreciate your trust in Databricks and are dedicated to ensuring your success.

pauloquantile
New Contributor III

Currently our solution to this problem is using a Personal Access Token as the authentication method. I ran into the problem that when the dataset is scheduled via PowerBI, it reverts to OAuth authentication. I am still checking whether the problem persists.

 

If this works, we will use OAuth for the datasets that take less than an hour and the token for datasets > 1 hour. I am also looking into whether I can generate a personal access token via a service principal instead of a user.
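For the service principal idea, a sketch of what I have in mind is below. It is untested against a real workspace and rests on assumptions: that the workspace Token API endpoint `/api/2.0/token/create` accepts `lifetime_seconds` and `comment`, and that calling it with a bearer token belonging to the service principal creates the personal access token for that principal. The workspace URL, bearer token, and function name are placeholders. The sketch only builds the request and does not send it.

```python
import json
import urllib.request


def build_token_create_request(
    workspace_url: str,
    bearer_token: str,
    lifetime_seconds: int,
    comment: str,
) -> urllib.request.Request:
    """Build (but do not send) a request to create a personal access token.

    Assumption: `bearer_token` is a token for the service principal, so the
    resulting personal access token would belong to that principal.
    """
    body = json.dumps(
        {"lifetime_seconds": lifetime_seconds, "comment": comment}
    ).encode("utf-8")
    return urllib.request.Request(
        url=workspace_url.rstrip("/") + "/api/2.0/token/create",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {bearer_token}",
            "Content-Type": "application/json",
        },
    )
```

Sending it would be a matter of `urllib.request.urlopen(req)` and reading the JSON response. I expect the new token in a `token_value` field, but please check that against the current Databricks documentation before relying on it.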

@pauloquantile Hi, Paulo. We do have 350+ million records and I am facing the same issue. Is there any workaround for this??

I mentioned it in the thread on 09/07/2023!

viralpatel
New Contributor II

@Kaniz_Fatma Is there any further update from Databricks that could be helpful here, or is what @pauloquantile mentioned the only workaround?
