02-17-2022 07:22 AM
I've been trying to use the `HiveMetaStoreClient` class in Scala to extract some metadata from the Databricks internal metastore, without success. I'm currently using the 7.3 LTS runtime.
The error seems to be related to some kind of inconsistency between Client and Hive Metastore Schema versions, but I can't immediately tell why. By running the following Scala snippet:
%scala
import org.apache.hadoop.hive.conf.HiveConf
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient

// Default configuration, picked up from the cluster's classpath
val hiveConf = new HiveConf()
val client = new HiveMetaStoreClient(hiveConf)

// Fails with the MetaException shown below
client.getTable("my_db", "my_table")
I get the following MetaException:
Caused by: javax.jdo.JDOException: Exception thrown when executing query : SELECT DISTINCT 'org.apache.hadoop.hive.metastore.model.MTable' AS NUCLEUS_TYPE,A0.CREATE_TIME,A0.LAST_ACCESS_TIME,A0.OWNER,A0.RETENTION,A0.IS_REWRITE_ENABLED,A0.TBL_NAME,A0.TBL_TYPE,A0.TBL_ID FROM TBLS A0 LEFT OUTER JOIN DBS B0 ON A0.DB_ID = B0.DB_ID WHERE A0.TBL_NAME = ? AND B0.`NAME` = ?
NestedThrowables:
java.sql.SQLSyntaxErrorException: (conn=18979393) Unknown column 'A0.IS_REWRITE_ENABLED' in 'field list'
Does anyone know what could be the issue here?
Runtime: 7.3 LTS
Conf:
spark.sql.hive.metastore.version: 0.13.0
02-17-2022 08:27 AM
Hello, @Lucas Cardozo! My name is Piper, and I'm a moderator for Databricks. Welcome! It's nice to meet you. Thank you for your question. Let's give it a while for the members to respond to your question. We'll circle back around later if we need to. 🙂
02-23-2022 08:35 AM
Hi, @Kaniz Fatma! Thanks for the reply 🙂
This SQL statement is generated by the `HiveMetaStoreClient`, and these columns are from the Databricks internal metastore (metastore_db).
Do you know if there are any updates to the Databricks internal metastore schema that could make it somehow incompatible with the `HiveMetaStoreClient`? It looks like the client expects this column to exist, but it seems that it doesn't 🤔
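One way to check this suspicion is to compare the metastore version the cluster is configured for with the version of the Hive classes that the notebook actually loads. The snippet below is only a sketch: it assumes a Databricks notebook where `spark` is predefined, it assumes `org.apache.hive.common.util.HiveVersionInfo` is available on the classpath, and `majorMinor` is a hypothetical helper written for this example.

```scala
import org.apache.hive.common.util.HiveVersionInfo

// Hypothetical helper: reduce a version string such as "2.3.7" to "2.3".
def majorMinor(version: String): String =
  version.split("\\.").take(2).mkString(".")

// Version Spark is configured to use when it talks to the metastore itself.
val configuredVersion = spark.conf.get("spark.sql.hive.metastore.version")

// Version of the Hive classes on the notebook classpath (assumed to be present).
// This is the code that `new HiveMetaStoreClient(new HiveConf())` runs.
val classpathVersion = HiveVersionInfo.getVersion

if (majorMinor(configuredVersion) != majorMinor(classpathVersion)) {
  println(s"Mismatch: configured for $configuredVersion, classpath has $classpathVersion")
} else {
  println(s"Both report $configuredVersion")
}
```

If the two differ, the hand-built client may be generating SQL for a newer schema (one that includes `IS_REWRITE_ENABLED`) against a metastore database laid out for 0.13.0, which would be consistent with the error above.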
03-03-2022 07:04 AM
@Lucas Cardozo maybe this can help: https://stackoverflow.com/questions/63782822/setting-up-azure-sql-external-metastore-for-azure-datab...
03-04-2022 09:28 AM
Thanks for the reference, @Atanu Sarkar.
It seems a little odd to me that I'd need to change the internal Databricks metastore table to add a column expected by the default Scala client. I'm afraid this could cause issues with other users/jobs in the same workspace.
I can't seem to find more information about the internal Databricks Metastore besides this single sentence in the documentation that could help me understand what the issue is. 😕
03-15-2022 10:21 PM
@Lucas Cardozo Basically, the metastore is backed by the database you have, so you can modify it as needed. But yes, I agree that this may break other jobs that depend on it, so you may want to cross-check with the users who are using the same table. This link may also be useful for learning about the Databricks metastore.
https://www.confessionsofadataguy.com/hive-metastore-in-databricks-what-to-know/
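If the goal is only to read table metadata, an option that avoids altering the metastore schema is to go through Spark's own catalog API, which uses the Hive client that Spark loads for the configured `spark.sql.hive.metastore.version`. This is a minimal sketch rather than a confirmed fix for the error above. It assumes a Databricks notebook where `spark` is predefined, and `my_db` and `my_table` are the placeholder names from the question.

```scala
// Assumes a Databricks notebook, where `spark` is the predefined SparkSession.
// "my_db" and "my_table" are the placeholder names from the original question.

// Basic table metadata through Spark's catalog API.
val table = spark.catalog.getTable("my_db", "my_table")
println(s"${table.database}.${table.name} (type: ${table.tableType})")

// Column names, data types, and partition flags.
spark.catalog.listColumns("my_db", "my_table").show(truncate = false)

// Storage details such as location, provider, and table properties.
spark.sql("DESCRIBE TABLE EXTENDED my_db.my_table").show(100, truncate = false)
```

This does not expose everything the raw `HiveMetaStoreClient` returns, but it stays on the code path that Spark already uses for the metastore.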