Machine Learning

Hive Catalog DDL: describe extended returns "... n more fields" when detailing a many-column array<struct<

Anonymous
Not applicable

I am using the Hackolade data modelling tool to reverse engineer (via a cluster connection) deployed databases and their table and view definitions.

Some of our tables contain large multi-column structs, and these are only partially described because a character or column limit is reached.

It appears that the data_type column returned by the Hive Catalog DDL commands describe extended/formatted is restricted to a character limit (1000 characters?), so large many-column structs are only partially defined and end with "... n more fields".

Is it possible to change the configuration of the Databricks embedded Hive so that these structs are fully defined?
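To check which columns are affected, the truncation marker can be detected in the returned data_type strings. This is a minimal sketch that assumes the marker has exactly the form quoted above ("... n more fields", with n a number); the helper name truncated_field_count is hypothetical.

```python
import re

# Marker as quoted in the post: "... n more fields", where n is a number.
_TRUNCATION = re.compile(r"\.\.\. (\d+) more fields")


def truncated_field_count(data_type: str) -> int:
    """Return how many struct fields were elided from a data_type string (0 if none)."""
    # A nested type may contain several markers, so sum every match.
    return sum(int(n) for n in _TRUNCATION.findall(data_type))
```

Any column for which this returns a non-zero value has an incomplete type definition and should not be trusted for reverse engineering.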

1 REPLY

Anonymous
Not applicable

Yes, it is possible to configure the Hive Catalog in Databricks to return full descriptions of tables with large multi-column structs.

One way to achieve this is to increase the Hive configuration property "hive.metastore.client.record.max.field.length", which determines the maximum length of a field description returned by the Hive metastore. The default is 4000 characters; raising it to, say, 10000 or 20000 allows more complete descriptions of large struct fields.

To set this property in Databricks, add it to the cluster's Spark config (in the cluster settings, under Advanced options) rather than running a CLI command: "databricks configure" only sets up authentication for the Databricks CLI and does not set Hive properties. Hadoop and Hive client properties in the Spark config take the spark.hadoop. prefix, so the entry would look like this:

spark.hadoop.hive.metastore.client.record.max.field.length 20000

Then restart the cluster so that the setting applies to the Hive metastore client running on it.

Once you have configured the Hive Catalog in this way, you should be able to retrieve full descriptions of tables with large multi-column structs using the "describe extended" command.
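If the output still ends with "... n more fields", a different setting may be worth checking. That placeholder is the one Spark itself emits when a schema string has more fields than spark.sql.debug.maxToStringFields allows (default 25), so the limit may be a field count rather than a character count. The sketch below assumes a notebook where a SparkSession is available as spark and that DESCRIBE returns col_name and data_type columns; the function name describe_untruncated and the value 10000 are illustrative only, and I have not verified this on every Databricks runtime.

```python
def describe_untruncated(spark, table_name, max_fields=10000):
    """Raise Spark's to-string field limit, then return (col_name, data_type) pairs."""
    # spark.sql.debug.maxToStringFields defaults to 25; fields beyond the limit
    # are replaced by the "... N more fields" placeholder.
    spark.conf.set("spark.sql.debug.maxToStringFields", str(max_fields))
    rows = spark.sql(f"DESCRIBE TABLE EXTENDED {table_name}").collect()
    return [(row["col_name"], row["data_type"]) for row in rows]
```

A tool such as Hackolade opens its own session over the cluster connection, so in that case the same key and value would more likely need to go into the cluster's Spark config than into a notebook call.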
