Databricks Community

pantelis_mare · ‎07-22-2022

Hello community!

I have a table with a column that is an array of structs with a very long schema.

Writing the table works fine. However, when I create a view based on this table and try to query the view, I get the error:

org.apache.spark.SparkException: Cannot recognize hive type string: array<struct<hitNumber:bigint....,latencyTracking:struct<pageLoadSample:bigint,pageLoadTime:bigint,pageDownloadTime:bigin, column: hits, db: test, table: test

As you can see, the actual schema has been truncated at the end, just before the column name.
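A minimal sketch of how one might check a type string for this kind of truncation: it counts angle brackets to see whether every `array<`/`struct<` is closed, and compares the length against a limit. The 4000-character limit is an assumption (older metastore schemas commonly store the column type in a VARCHAR(4000) field), and the helper names are hypothetical, not part of Spark or Hive.

```python
# Assumption: the metastore stores the column type in a field limited to
# roughly 4000 characters. Adjust if your metastore schema differs.
ASSUMED_TYPE_NAME_LIMIT = 4000


def brackets_balanced(type_string: str) -> bool:
    """Return True if every '<' in the type string has a matching '>'."""
    depth = 0
    for ch in type_string:
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
            if depth < 0:  # a '>' appeared before its '<'
                return False
    return depth == 0


def exceeds_limit(type_string: str, limit: int = ASSUMED_TYPE_NAME_LIMIT) -> bool:
    """Return True if the type string is longer than the assumed limit."""
    return len(type_string) > limit


# A complete type string is balanced; a cut-off one is not.
complete = "array<struct<hitNumber:bigint,pageLoadTime:bigint>>"
cut_off = "array<struct<hitNumber:bigint,latencyTracking:struct<pageLoadSample:bigint,pageDownloadTime:bigin"
print(brackets_balanced(complete), brackets_balanced(cut_off))
```

In PySpark, the full type string for the column could probably be obtained with something like `df.schema["hits"].dataType.simpleString()` and passed to `exceeds_limit`.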

The question is: is there a configuration on the size of the hive type string I could play with?

Thank you in advance,

-werners- · ‎07-25-2022

Which version of the Hive metastore are you using? There are known issues with large metadata in metastore versions < 2.3.0.
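As a rough illustration of that version check, here is a small sketch that compares a metastore version string against the 2.3.0 threshold mentioned above. The function names are hypothetical, and the sketch assumes purely numeric dotted versions such as "0.13.0".

```python
def parse_version(version: str) -> tuple:
    """Turn '0.13.0' into (0, 13, 0), padding short versions with zeros."""
    parts = [int(p) for p in version.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_below_threshold(version: str, threshold: str = "2.3.0") -> bool:
    """Return True if the metastore version is older than the threshold."""
    return parse_version(version) < parse_version(threshold)


print(is_below_threshold("0.13.0"))
print(is_below_threshold("2.3.0"))
```

On a cluster, the configured version can usually be read with `spark.conf.get("spark.sql.hive.metastore.version")`, assuming that setting is exposed in your environment.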

pantelis_mare · ‎07-25-2022

Hello @Werner Stinckens !

I see your point. I just checked and we are still on 0.13.0 (the default one).

Is there any official documentation from Azure Databricks on how to do that?

Related to this, but not solved yet

-werners- · ‎07-25-2022

https://docs.microsoft.com/en-us/azure/databricks/data/metastores/external-hive-metastore

That's the only doc I know about.

pantelis_mare · ‎07-26-2022

Thanks a lot @Werner Stinckens !

I also came across this documentation. The question is whether you can upgrade your current internal metastore, and unfortunately it says nothing about that 😑