Hive Catalog DDL, describe extended returns "... n more fields" when detailing a many column array<struct<

Anonymous
Not applicable

I am using the Hackolade data modelling tool to reverse engineer deployed databases, together with their table and view definitions, over a cluster connection.

Some of our tables contain large multi-column structs, and these are only partially described because a character or column limit is reached.

It appears that the data_type column returned by the Hive Catalog DDL commands describe extended/formatted is restricted to a character limit (1000 characters?). As a result, large structs with many columns are only partially defined and are closed with "... n more fields>>".

Is it possible to change the configuration of the Databricks embedded Hive so that these structs are fully defined?

1 REPLY

Anonymous
Not applicable

Yes, this should be possible, although the limit is most likely applied by Spark rather than by the Hive metastore.

The "... n more fields" suffix is the placeholder Spark emits when it converts a struct type to a string and the struct has more fields than the Spark SQL setting "spark.sql.debug.maxToStringFields" allows. The default is 25 fields, so the cut-off is a field count rather than a character count. "describe extended" and "describe formatted" appear to use that shortened string for the data_type column.

To raise the limit for the current session, you can run a SQL command such as:

SET spark.sql.debug.maxToStringFields = 1000;

To apply it to every session on a cluster, add the same key and value to the cluster's Spark config under the advanced options. An init script should not be needed for this.

Once the value is larger than the number of fields in your widest struct, "describe extended" should return the full struct definition. Since Hackolade connects through a cluster, the cluster-level setting is probably the one that matters for it, but that is worth confirming on your side.
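
Whichever setting ends up controlling the limit, it can help to check which columns are still being cut off after a configuration change. The following is a minimal sketch in plain Python, assuming you already have the (col_name, data_type) pairs from a "describe extended" result in hand. The helper names and the sample rows are made up for illustration, and the pattern is based only on the "... n more fields" text quoted in the question.

```python
import re

# Placeholder text quoted in the question: "... n more fields".
# The exact spacing is an assumption, so whitespace is matched loosely.
_PLACEHOLDER = re.compile(r"\.\.\.\s*(\d+)\s+more fields")


def omitted_field_count(data_type: str) -> int:
    """Return how many struct fields a data_type string reports as omitted.

    Nested structs can each carry their own placeholder, so all matches
    are summed.
    """
    return sum(int(n) for n in _PLACEHOLDER.findall(data_type))


def truncated_columns(rows):
    """Map column name -> omitted field count, for truncated columns only."""
    result = {}
    for col_name, data_type in rows:
        omitted = omitted_field_count(data_type)
        if omitted:
            result[col_name] = omitted
    return result


# Hypothetical rows, shaped like the first two columns of a describe result.
sample_rows = [
    ("id", "bigint"),
    ("events", "array<struct<a:int,b:string,... 40 more fields>>"),
]

print(truncated_columns(sample_rows))  # {'events': 40}
```

An empty result after changing the configuration suggests that every struct is now fully described.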
