cancel
Showing results forย 
Search instead forย 
Did you mean:ย 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

Query text truncated for queries longer than 153,596 characters

mangosta
New Contributor II

Hi, 

When using the `query_history.list` function of the python SDK workspace client the queries that have more than 153,596 characters are truncated.

I could not find anywhere in the documentation this limit so I wanted to know if this is documented somewhere and if it is configurable.

Thanks,

Fede

1 ACCEPTED SOLUTION

Accepted Solutions

brockb
Databricks Employee
Databricks Employee

Hi @mangosta , I did some testing internally and was able to replicate the behavior you described. 

The query text limit is a limitation not of the SDK or the API, but rather of the backing system table `system.query.history`. More information on this system table can be seen in the Query History System Table.

There is a limit imposed on the query text size due to storage limitations which maps to the System table column `statement_text`. Due to differing compression characteristics of the query text, there is no guarantee that the limit will always be 153,596.

I have requested internally that we update the docs on this `system.query.history` system table to clarify this.

Thanks for your question.

View solution in original post

5 REPLIES 5

Miguel_Suarez
Databricks Employee
Databricks Employee

Hi @mangosta,

Our 'Notebook outputs and results' Document references the limit of truncated query rows at 60,000 (ref: https://docs.gcp.databricks.com/en/notebooks/notebook-outputs.html#:~:text=If%20the%20data%20returne...), but this seems to be the only document mentioning a limit. I hope this helps!

Hi Miguel, 

Thanks for the reply but I am referring to the length of the query, not the returned rows. When I use the ListQueries API '/api/2.0/sql/queries' the query text appears to be truncated if the queries are longer than `153,596` characters.

So I want to know if that is documented somewhere and if there is any setting to change it.

Thanks,

Fede

brockb
Databricks Employee
Databricks Employee

Hi @mangosta , I did some testing internally and was able to replicate the behavior you described. 

The query text limit is a limitation not of the SDK or the API, but rather of the backing system table `system.query.history`. More information on this system table can be seen in the Query History System Table.

There is a limit imposed on the query text size due to storage limitations which maps to the System table column `statement_text`. Due to differing compression characteristics of the query text, there is no guarantee that the limit will always be 153,596.

I have requested internally that we update the docs on this `system.query.history` system table to clarify this.

Thanks for your question.

mangosta
New Contributor II

Thanks a lot @brockb ! Please let me know when the documentation related to the Query History System Table is updated so I can get all the information and we can plan on our side how to address the problem. 

Regards,

Fede

brockb
Databricks Employee
Databricks Employee

Hi @mangosta , the System Tables docs have been updated with additional context on the `statement_text` here: https://docs.databricks.com/en/admin/system-tables/query-history.html#using-the-query-history-table

Thanks again for letting us know.

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you wonโ€™t want to miss the chance to attend and share knowledge.

If there isnโ€™t a group near you, start one and help create a community that brings people together.

Request a New Group