Problems with HiveMetastoreClient and internal Databricks Metastore.

lecardozo
New Contributor II

I've been trying to use the `HiveMetaStoreClient` class in Scala to extract some metadata from the Databricks internal metastore, without success. I'm currently using the 7.3 LTS runtime.

The error seems to be related to an inconsistency between the client and Hive metastore schema versions, but I can't immediately tell why. Running the following Scala snippet:

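%scala

import org.apache.hadoop.hive.conf.HiveConf
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient

// Build a client from the default Hive configuration found on the classpath
val hiveConf = new HiveConf()
val client = new HiveMetaStoreClient(hiveConf)
// Fetch the table definition; this is the call that fails
client.getTable("my_db", "my_table")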

I get the following MetaException:

Caused by: javax.jdo.JDOException: Exception thrown when executing query : SELECT DISTINCT 'org.apache.hadoop.hive.metastore.model.MTable' AS NUCLEUS_TYPE,A0.CREATE_TIME,A0.LAST_ACCESS_TIME,A0.OWNER,A0.RETENTION,A0.IS_REWRITE_ENABLED,A0.TBL_NAME,A0.TBL_TYPE,A0.TBL_ID FROM TBLS A0 LEFT OUTER JOIN DBS B0 ON A0.DB_ID = B0.DB_ID WHERE A0.TBL_NAME = ? AND B0.`NAME` = ?
NestedThrowables:
java.sql.SQLSyntaxErrorException: (conn=18979393) Unknown column 'A0.IS_REWRITE_ENABLED' in 'field list'

Does anyone know what could be the issue here?

Runtime: 7.3 LTS

Conf:

spark.sql.hive.metastore.version: 0.13.0

Accepted Solutions

Atanu
Esteemed Contributor

@Lucas Cardozo, the metastore is backed by the database you have, so you can modify it as needed. I agree, though, that such a change may break other jobs that depend on it, so you may want to cross-check with the users who are using the same table. This link may also be useful for learning about the Databricks metastore:

https://www.confessionsofadataguy.com/hive-metastore-in-databricks-what-to-know/


6 REPLIES

Anonymous
Not applicable

Hello, @Lucas Cardozo! My name is Piper, and I'm a moderator for Databricks. Welcome! It's nice to meet you. Thank you for your question. Let's give the members a while to respond. We'll circle back later if we need to. 🙂

Kaniz_Fatma
Community Manager

Hi @Lucas Cardozo, the error here is a SQLSyntaxErrorException, which suggests a problem with the SQL statement.

It states that there is an unknown column named 'A0.IS_REWRITE_ENABLED' in the 'field list'.

Can you please check the column names once again?

lecardozo
New Contributor II

Hi, @Kaniz Fatma! Thanks for the reply 🙂

This SQL statement is generated by the `HiveMetaStoreClient` itself, and the columns it refers to belong to the Databricks internal metastore (metastore_db).

Do you know if there are any updates to the Databricks Internal Metastore schema that could make it somehow incompatible with the `HiveMetastoreClient`? It looks like the client expects this column to exist, but it seems that it doesn't 🤔
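To make the mismatch concrete, here is a minimal sketch in plain Scala that compares the columns the generated query selects from `TBLS` with the columns a deployed schema provides. The select list is copied from the error message above. The `deployedTblsColumns` set is a hypothetical stand-in, not the real Databricks schema, and the object and function names are made up for illustration.

```scala
object MetastoreColumnCheck {
  // Select list copied from the generated query in the error message.
  val selectList: String =
    "A0.CREATE_TIME,A0.LAST_ACCESS_TIME,A0.OWNER,A0.RETENTION," +
      "A0.IS_REWRITE_ENABLED,A0.TBL_NAME,A0.TBL_TYPE,A0.TBL_ID"

  // ASSUMPTION: hypothetical column list for TBLS in the deployed schema.
  // Replace it with the columns actually present in your metastore database.
  val deployedTblsColumns: Set[String] = Set(
    "TBL_ID", "CREATE_TIME", "DB_ID", "LAST_ACCESS_TIME",
    "OWNER", "RETENTION", "TBL_NAME", "TBL_TYPE")

  // Strip the "A0." table alias to get bare column names, in query order.
  def expectedColumns(select: String): Seq[String] =
    select.split(",").map(_.trim.stripPrefix("A0.")).toSeq

  // Columns the client asks for that the deployed table does not have.
  def missingColumns(expected: Seq[String], deployed: Set[String]): Seq[String] =
    expected.filterNot(deployed.contains)

  def main(args: Array[String]): Unit = {
    val missing = missingColumns(expectedColumns(selectList), deployedTblsColumns)
    println(s"Missing from TBLS: ${missing.mkString(", ")}")
  }
}
```

If the only column reported as missing is the one named in the exception, that points to a client built for a newer metastore schema than the one deployed, rather than to a typo in the query.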

Atanu
Esteemed Contributor

lecardozo
New Contributor II

Thanks for the reference, @Atanu Sarkar.

It seems a little odd to me that I'd need to change the internal Databricks metastore tables to add a column expected by the default Scala client. I'm afraid this could cause issues with other users/jobs in the same workspace.

I can't seem to find more information about the internal Databricks Metastore besides this single sentence in the documentation that could help me understand what the issue is. 😕
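If the goal is only to read table and column metadata, one possible route that leaves the metastore schema untouched is Spark's own catalog API instead of a hand-built `HiveMetaStoreClient`. As far as I understand, Spark talks to the metastore through a client loaded for the configured `spark.sql.hive.metastore.version`, so it should not hit the same column mismatch. The sketch below assumes that only names and types are needed; the `CatalogMetadata` object and its function names are hypothetical helpers, not part of any library.

```scala
import org.apache.spark.sql.SparkSession

object CatalogMetadata {
  // Returns (column name, data type) pairs for a table,
  // read through Spark's catalog API rather than a raw Hive client.
  def describeColumns(spark: SparkSession, db: String, table: String): Seq[(String, String)] =
    spark.catalog.listColumns(db, table).collect().map(c => (c.name, c.dataType)).toSeq

  // Returns the table type string reported by Spark's catalog for the table.
  def tableType(spark: SparkSession, db: String, table: String): String =
    spark.catalog.getTable(db, table).tableType
}
```

In a notebook, where `spark` is already defined, this could be called as `CatalogMetadata.describeColumns(spark, "my_db", "my_table")`. For storage-level details, `spark.sql("DESCRIBE TABLE EXTENDED my_db.my_table")` is another option.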
