Data Engineering
Error Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient - while trying to create database

Karthig
New Contributor III

Hello All,

I get the following error while trying to create a database:

org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

Scripts used (each in its own %sql cell):

%sql
USE hive_metastore;
CREATE DATABASE XYZ;

%sql
CREATE DATABASE XYZ;

%sql
CREATE DATABASE hive_metastore.XYZ;

The warehouse appears to be in the Started state:

[screenshot: SQL warehouse status]

Cluster details

[screenshot: cluster details]

13 Replies

Anonymous
Not applicable

Hi @Karthigesan Vijayakumar​ 

Great to meet you, and thanks for your question! 

Let's see if your peers in the community have an answer to your question first. Otherwise, Bricksters will get back to you soon.

Thanks.

Mentens
New Contributor II

I have run into the same issue. Any update on this?

addy
New Contributor III

Facing the same error. Any updates on this?

mroy
New Contributor III

This issue is getting worse: it's happening more often, and persisting for longer periods of time. It's getting harder & harder to work around it.

Please do something. The error is clearly not on the customers' side.

mroy
New Contributor III

Same issue:

Starting about one month ago, we've been getting these errors on jobs/workflows that have been running successfully for years, without any code change.

No idea what is causing it, or how to fix it, but it seems like adding a sleep/pause at the beginning of the notebook is helping... so maybe something is taking a while to initialize on the cluster. 🤷‍♂️

Jobs are running on DBR 10.4 LTS (includes Apache Spark 3.2.1, Scala 2.12).

In our case, it's hard to debug, because we're using pyspark, and:

/databricks/spark/python/lib/py4j-0.10.9.1-src.zip/py4j/java_gateway.py in __call__(self, *args)
   1302 
   1303         answer = self.gateway_client.send_command(command)
-> 1304         return_value = get_return_value(
   1305             answer, self.gateway_client, self.target_id, self.name)
   1306 
 
/databricks/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
    121                 # Hide where the exception came from that shows a non-Pythonic
    122                 # JVM exception message.
--> 123                 raise converted from None
    124             else:
    125                 raise
 
AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

Vijaykumarj
New Contributor III

Please let us know if you found a solution for this issue.

mroy
New Contributor III

Still no solution. Pausing the script is only a stopgap measure.

The issue is on Databricks' side, there's nothing we can do about it, and it seems to be getting worse.

@Vidula Khanna​: Any feedback from Databricks on this?

mroy
New Contributor III

@Vijay Kumar J​: So far, adding a sleep/pause at the top of the notebook has been the only thing that works:

# Sleep/Pause for 2 minutes, to give the Hive Catalog time to initialize.
import time
time.sleep(120)

It has reduced the errors by 99%. We still get them occasionally, so maybe a longer pause would be enough to handle that last 1%.

Vijaykumarj
New Contributor III

Tried that, but got the same error. I also tried creating the database manually via the UI, and got the same error:

HTTP ERROR: 500

Problem accessing /import/new-table. Reason:

com.databricks.backend.common.rpc.SparkDriverExceptions$SQLExecutionException: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

Mentens
New Contributor II

Our issue is resolved: it was caused by a firewall that was blocking certain commands.

Srividya1
New Contributor II

Got the same issue with both the UI and notebooks. Tried the sleep/pause at the top of the notebook, but it didn't work. Please let me know if you found any other solution for this issue.

mroy
New Contributor III

Alright, we've implemented a workaround for this, and so far it's been working very well:

  • First, we created a reusable notebook to wait until Hive has been initialized (see code below).
  • We then execute this notebook using the %run command at the top of any notebook which is encountering the Hive issue.

Here is the code:

import time
 
retries = 0
max_retries = 10
while True:
  try:
    # Use this table to check if Hive is ready, since it's very small & all in 1 file
    table("database.small_table")
    break
  except Exception as e:
    if retries == max_retries:
      raise e
      
    retries += 1
    print(f"Hive is not initialized yet. Retrying in 60 seconds. (Retry #{retries})")
    time.sleep(60)
    
print("Hive is initialized!")

And here is what the output looks like:

[screenshot: retry loop output]
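For anyone who wants to test this logic off-cluster: the same loop can be factored into a helper with an injectable probe, so the `table(...)` call (or any other cheap metastore-backed call) can be swapped for a stub. A sketch; the `probe` and `sleep` parameters are my own additions for illustration:

```python
import time

def wait_for_hive(probe, max_retries=10, delay=60, sleep=time.sleep):
    """Poll `probe` until it succeeds, re-raising its last exception after
    `max_retries` failed attempts. Returns the number of retries needed.
    `probe` stands in for a cheap metastore-backed call such as
    lambda: table("database.small_table")."""
    retries = 0
    while True:
        try:
            probe()
            return retries
        except Exception:
            if retries == max_retries:
                raise
            retries += 1
            print(f"Hive is not initialized yet. Retrying in {delay} seconds. (Retry #{retries})")
            sleep(delay)
```

With `sleep=lambda s: None` and a stub probe, the retry behavior can be verified in a plain Python session, without a cluster.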

mroy
New Contributor III

Alright, good news! We've had one job fail after the 10 maximum retries, and it ended up producing a much more complete stack trace than the single `AnalysisException` we typically get.

tl;dr: It seems like the underlying issue (in our case, at least) is too many connections to the Hive metastore, which is basically a MariaDB instance hosted by Databricks. This answer provides some context behind the error.

The full stack trace is attached below, where we can see that the originating exception (at the bottom) is a SQLException due to too many connections, thrown from the org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol class. Here is the snippet:

Caused by: java.sql.SQLException: Too many connections
	at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.authentication(AbstractConnectProtocol.java:856)
	at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.handleConnectionPhases(AbstractConnectProtocol.java:777)
	at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connect(AbstractConnectProtocol.java:451)
	at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connectWithoutProxy(AbstractConnectProtocol.java:1103)

This, along with the original HiveMetaStoreClient exception, pretty much confirms that the root cause of the issue is indeed too many connections to the Hive metastore (the MariaDB instance).
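Given that root cause, a fixed 60-second retry interval means every waiting job hits the metastore again on the same cadence. When many clusters compete for the same connection pool, exponential backoff with jitter spreads the reconnect attempts out. A generic sketch (the helper name and defaults are mine, not anything from Databricks):

```python
import random
import time

def with_backoff(fn, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call `fn`, retrying on any exception with exponential backoff plus
    jitter; intended for transient 'Too many connections' style failures.
    Re-raises the last exception once retries are exhausted."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception as exc:
            if attempt == max_retries:
                raise
            # 1s, 2s, 4s, ... plus up to 1s of random jitter, so concurrent
            # clients don't all reconnect at the same instant.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
            print(f"Attempt {attempt + 1} failed ({exc}); retrying in {delay:.1f}s")
            sleep(delay)
```

For example, `with_backoff(lambda: spark.sql("CREATE DATABASE XYZ"))` would wrap the failing statement from the original post.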
