Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How to avoid trimming in EXPLAIN?

vr
Contributor

I am looking at the EXPLAIN EXTENDED plan for a statement.

In the == Physical Plan == section, I go down to the FileScan node and see a lot of ellipses, like this:

                     +- FileScan parquet schema.table[Time#8459,TagName#8460,Value#8461,Quality#8462,day#8466,isLate#8467] Batched: true, DataFilters: [isnotnull(TagName#8460), isnotnull(Quality#8462), isnotnull(Value#8461), isnotnull(Time#8459), (..., Format: Parquet, Location: PreparedDeltaFileIndex(1 paths)[mcfs-abfss://t-125a3c9d-90a3-46dc-a577-196577aff13d+abc-masked..., PartitionFilters: [isnotnull(day#8466), (cast(day#8466 as timestamp) >= 2022-11-19 00:00:00), (cast(day#8466 as tim..., PushedFilters: [IsNotNull(TagName), IsNotNull(Quality), IsNotNull(Value), IsNotNull(Time), EqualTo(TagName,FRO_P..., ReadSchema: struct<Time:timestamp,TagName:string,Value:double,Quality:int>

How can I see the full description of this node, without trimming? I am particularly interested in the PartitionFilters section.

Accepted Solution

-werners-
Esteemed Contributor III

You can try using FORMATTED as the display option; it does not truncate. However, it will probably not display everything you want.

There are also the following parameters:

spark.sql.maxMetadataStringLength

spark.sql.maxPlanStringLength

I'd try changing the default values and see if it works.

https://spark.apache.org/docs/latest/configuration.html
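
As a rough sketch of the two suggestions above, the statements below could be run in a notebook SQL cell. The table name `my_schema.my_table` and the filter are hypothetical placeholders, and the value 1000 is an arbitrary choice (it only needs to exceed the longest metadata string in the plan). This assumes a Spark version that accepts spark.sql.maxMetadataStringLength as a session-level setting.

```sql
-- Option 1: FORMATTED prints a plan outline followed by per-node details.
EXPLAIN FORMATTED
SELECT Time, TagName, Value, Quality
FROM my_schema.my_table            -- hypothetical table name
WHERE day >= '2022-11-19';         -- hypothetical filter

-- Option 2: raise the limit on metadata strings for this session,
-- then re-run EXTENDED so fields such as PartitionFilters are less likely to be cut off.
SET spark.sql.maxMetadataStringLength = 1000;   -- arbitrary illustrative value

EXPLAIN EXTENDED
SELECT Time, TagName, Value, Quality
FROM my_schema.my_table
WHERE day >= '2022-11-19';
```

If a field still ends in "...", increasing the value further and re-running the EXPLAIN is the obvious next step.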


5 REPLIES

Anonymous
Not applicable

Hi @Vladimir Ryabtsev

Great to meet you, and thanks for your question!

Let's see if your peers in the community have an answer to your question first. Otherwise, Bricksters will get back to you soon.

Thanks

vr
Contributor

Indeed, FORMATTED gave better results out of the box.

As you predicted, it did not show everything, but increasing spark.sql.maxMetadataStringLength helped with EXTENDED!

I did not quite get the meaning of spark.sql.maxPlanStringLength; on my cluster it defaults to a strange number, "2147483632b".

UmaMahesh1
Honored Contributor III

Hi @Vladimir Ryabtsev

Glad you found a solution.

The trailing "b" on that strange number is a byte-size unit suffix, not a count of bits; the value itself is 2147483632.

maxPlanStringLength sets the maximum number of characters (default = 2147483632) that Spark will output for a plan string. Anything beyond that is truncated. The default is large enough that the full plan is printed with FORMATTED and the other options. The setting exists because very long plan strings can sometimes cause OutOfMemory errors in the driver node or UI processes; in that case, lowering the value is the remedy.
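
For illustration, the current value can be inspected with a bare SET, and lowered for the session if plan strings really are causing memory pressure. This is a sketch only: the value 8192 is an arbitrary example, not a recommendation, and whether the setting can be changed at session level on a given cluster is an assumption.

```sql
-- Show the value the session is currently using (e.g. 2147483632b).
SET spark.sql.maxPlanStringLength;

-- Assumption: only needed if very long plan strings cause OutOfMemory errors.
-- Plans longer than this many characters will then be truncated.
SET spark.sql.maxPlanStringLength = 8192;   -- arbitrary illustrative value
```

For the original problem of trimmed PartitionFilters, this setting does not need to change; the per-field trimming is governed by spark.sql.maxMetadataStringLength.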

Cheers.

Uma Mahesh D

SS2
Valued Contributor

I also faced the same issue.
