10-27-2022 04:55 AM
Hello All,
I get the following error while trying to create a database: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
Scripts used:
%sql
USE hive_metastore;
CREATE DATABASE XYZ;
%sql
CREATE DATABASE XYZ;
%sql
CREATE DATABASE hive_metastore.XYZ;
The warehouse appears to be in a started state.
 
Cluster details
 
11-27-2022 10:50 PM
Hi @Karthigesan Vijayakumar
Great to meet you, and thanks for your question!
Let's see if your peers in the community have an answer to your question first. Otherwise, Bricksters will get back to you soon.
Thanks.
12-14-2022 02:54 AM
I have run into the same issue, any update on this issue?
02-02-2023 10:38 AM
Facing the same error. Any updates on this?
02-18-2023 09:01 AM
This issue is getting worse: it's happening more often, and persisting for longer periods of time. It's getting harder & harder to work around it.
Please do something. The error is clearly not on the customers' side.
12-14-2022 08:10 AM
Same issue:
Starting about one month ago, we've been getting these errors on jobs/workflows that have been running successfully for years, without any code change.
No idea what is causing it, or how to fix it, but adding a sleep/pause at the beginning of the notebook seems to help... so maybe something is taking a while to initialize on the cluster. 🤷
Jobs are running on DBR 10.4 LTS (includes Apache Spark 3.2.1, Scala 2.12).
In our case, it's hard to debug, because we're using pyspark, and:
/databricks/spark/python/lib/py4j-0.10.9.1-src.zip/py4j/java_gateway.py in __call__(self, *args)
   1302 
   1303         answer = self.gateway_client.send_command(command)
-> 1304         return_value = get_return_value(
   1305             answer, self.gateway_client, self.target_id, self.name)
   1306 
 
/databricks/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
    121                 # Hide where the exception came from that shows a non-Pythonic
    122                 # JVM exception message.
--> 123                 raise converted from None
    124             else:
    125                 raise
 
AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
12-20-2022 02:47 AM
Please let us know if you found a solution for this issue.
02-18-2023 09:06 AM
Still no solution. Pausing the script is only a stopgap measure.
The issue is on Databricks' side, there's nothing we can do about it, and it seems to be getting worse.
@Vidula Khanna: Any feedback from Databricks on this?
12-20-2022 07:51 AM
@Vijay Kumar J: So far, adding a sleep/pause at the top of the notebook has been the only thing that works:
# Sleep/Pause for 2 minutes, to give the Hive Catalog time to initialize.
import time
time.sleep(120)
It has reduced the errors by 99%. We still get them occasionally, so maybe a longer pause would be enough to handle that last 1%.
12-20-2022 07:13 PM
I tried that, but I get the same error. I also tried creating it manually using the UI, with the same error:
HTTP ERROR: 500
Problem accessing /import/new-table. Reason:
com.databricks.backend.common.rpc.SparkDriverExceptions$SQLExecutionException: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
12-21-2022 02:08 AM
Our issue is resolved: it was caused by a firewall that was blocking us from running certain commands.
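If a firewall is a suspect, one quick check is whether the cluster can open a TCP connection to the metastore database at all. The following is a minimal sketch using only the Python standard library; `can_reach` is a hypothetical helper name, and the host and port to test are assumptions you would take from your own cluster logs (the traces later in this thread show port 3306).

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, resets and timeouts.
        return False
```

From a notebook cell you could call, for example, `can_reach("<your-metastore-host>", 3306)`. A `False` result only shows that the TCP connection failed; it does not say whether a firewall, DNS, or the server itself is responsible.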
01-03-2023 05:07 AM
Got the same issue with both UI and notebook. Tried with sleep/pause at the top of the notebook but didn't work. Please let me know if you got any other solution for this issue.
02-21-2023 12:23 PM
Alright, we've implemented a workaround for this, and so far it's been working very well:
Here is the code:
import time

retries = 0
max_retries = 10
while True:
  try:
    # Use this table to check if Hive is ready, since it's very small & all in 1 file.
    # `spark` is the SparkSession that Databricks notebooks provide.
    spark.table("database.small_table")
    break
  except Exception:
    if retries == max_retries:
      # Out of retries: re-raise the original exception with its traceback.
      raise

    retries += 1
    print(f"Hive is not initialized yet. Retrying in 60 seconds. (Retry #{retries})")
    time.sleep(60)

print("Hive is initialized!")

And here is what the output looks like: (screenshot not preserved)
02-22-2023 07:15 PM
Alright, good news! We've had one job fail after the 10 maximum retries, and it ended up producing a much more complete stack trace than the single `AnalysisException` we typically get.
tl;dr: It seems like the underlying issue (in our case, at least), is too many connections to the Hive metastore, which is basically a MariaDB instance hosted by Databricks. This answer provides some context behind the error.
The full stack trace is attached below, where we can see that the originating exception (at the bottom) is an SQLException due to too many connections, in the org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol package. Here is the snippet:
Caused by: java.sql.SQLException: Too many connections
	at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.authentication(AbstractConnectProtocol.java:856)
	at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.handleConnectionPhases(AbstractConnectProtocol.java:777)
	at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connect(AbstractConnectProtocol.java:451)
	at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connectWithoutProxy(AbstractConnectProtocol.java:1103)
This, along with the original HiveMetaStoreClient exception, pretty much confirms that the root cause of the issue is indeed too many connections to the Hive metastore (the MariaDB instance).
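If the metastore is rejecting clients because too many connect at once, retrying on a fixed interval can make many jobs reconnect in lockstep. A possible variant of the retry workaround is exponential backoff with random jitter, sketched below. This is not an official Databricks fix; `wait_until_ready` and all of its parameters are hypothetical names chosen for illustration.

```python
import random
import time


def wait_until_ready(check, max_retries=10, base_delay=5.0, max_delay=120.0, sleep=time.sleep):
    """Call check() until it succeeds and return its result.

    After `max_retries` failed retries, the last exception is re-raised.
    """
    for attempt in range(max_retries + 1):
        try:
            return check()
        except Exception:
            if attempt == max_retries:
                raise
            # Exponential backoff capped at max_delay, with full jitter so that
            # concurrent jobs do not all retry at the same moment.
            delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
            print(f"Check failed. Retrying in {delay:.1f} seconds. (Retry #{attempt + 1})")
            sleep(delay)
```

In a notebook this could be used as `wait_until_ready(lambda: spark.table("database.small_table"))`, reusing the small table from the earlier workaround. The `sleep` parameter exists only so the waiting can be stubbed out when testing.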
07-10-2024 06:49 AM
That's exactly my case!
This is what I saw in `databricks bundle run my_job`:
AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
And this is what I found in the Log4j output of the cluster:
Caused by: org.datanucleus.exceptions.NucleusException: Attempt to invoke the "HikariCP" plugin to create a ConnectionPool gave an error : Failed to initialize pool: Could not connect to address=(host=consolidated-northeuropec2-prod-metastore-0.mysql.database.azure.com)(port=3306)(type=master) : Could not connect to consolidated-northeuropec2-prod-metastore-0.mysql.database.azure.com:3306 : Connection reset
	at org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:232)
	at org.datanucleus.store.rdbms.ConnectionFactoryImpl.initialiseDataSources(ConnectionFactoryImpl.java:117)
	at org.datanucleus.store.rdbms.ConnectionFactoryImpl.<init>(ConnectionFactoryImpl.java:82)
	... 123 more
Caused by: com.zaxxer.hikari.pool.HikariPool$PoolInitializationException: Failed to initialize pool: Could not connect to address=(host=consolidated-northeuropec2-prod-metastore-0.mysql.database.azure.com)(port=3306)(type=master) : Could not connect to consolidated-northeuropec2-prod-metastore-0.mysql.database.azure.com:3306 : Connection reset
	at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:512)
	at com.zaxxer.hikari.pool.HikariPool.<init>(HikariPool.java:105)
	at com.zaxxer.hikari.HikariDataSource.<init>(HikariDataSource.java:71)
	at org.datanucleus.store.rdbms.connectionpool.HikariCPConnectionPoolFactory.createConnectionPool(HikariCPConnectionPoolFactory.java:176)
	at org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:213)
	... 125 more
Caused by: java.sql.SQLNonTransientConnectionException: Could not connect to address=(host=consolidated-northeuropec2-prod-metastore-0.mysql.database.azure.com)(port=3306)(type=master) : Could not connect to consolidated-northeuropec2-prod-metastore-0.mysql.database.azure.com:3306 : Connection reset
	at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.createException(ExceptionFactory.java:73)
	at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.create(ExceptionFactory.java:197)
	at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connectWithoutProxy(AbstractConnectProtocol.java:1404)
	at org.mariadb.jdbc.internal.util.Utils.retrieveProxy(Utils.java:635)
	at org.mariadb.jdbc.MariaDbConnection.newConnection(MariaDbConnection.java:150)
	at org.mariadb.jdbc.Driver.connect(Driver.java:89)
	at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:95)
	at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:101)
	at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:341)
	at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:506)
	... 129 more
Caused by: java.sql.SQLNonTransientConnectionException: Could not connect to consolidated-northeuropec2-prod-metastore-0.mysql.database.azure.com:3306 : Connection reset
	at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.createException(ExceptionFactory.java:73)
	at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.create(ExceptionFactory.java:188)
	at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.createConnection(AbstractConnectProtocol.java:588)
	at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connectWithoutProxy(AbstractConnectProtocol.java:1399)
	... 136 more
Caused by: java.net.SocketException: Connection reset
	at java.net.SocketInputStream.read(SocketInputStream.java:210)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at java.io.FilterInputStream.read(FilterInputStream.java:133)
	at org.mariadb.jdbc.internal.io.input.ReadAheadBufferedStream.fillBuffer(ReadAheadBufferedStream.java:131)
	at org.mariadb.jdbc.internal.io.input.ReadAheadBufferedStream.read(ReadAheadBufferedStream.java:104)
	at org.mariadb.jdbc.internal.io.input.StandardPacketInputStream.getPacketArray(StandardPacketInputStream.java:247)
	at org.mariadb.jdbc.internal.io.input.StandardPacketInputStream.getPacket(StandardPacketInputStream.java:218)
	at org.mariadb.jdbc.internal.com.read.ReadInitialHandShakePacket.<init>(ReadInitialHandShakePacket.java:89)
	at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.createConnection(AbstractConnectProtocol.java:540)
	... 137 more
consolidated-northeuropec2-prod-metastore-0.mysql.database.azure.com:3306 is mentioned in https://learn.microsoft.com/en-us/azure/databricks/release-notes/product/2022/january#additional-met....
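Since the top-level `AnalysisException` hides the real cause, it can help to pull the `Caused by:` chain out of the cluster's Log4j output. The sketch below assumes the log text is already available as a string; `cause_chain` and `root_cause` are hypothetical helper names, and the regular expression only handles single-line `Caused by:` entries like the ones shown above.

```python
import re

# Matches lines such as "Caused by: java.net.SocketException: Connection reset".
CAUSED_BY = re.compile(r"^Caused by: ([\w.$]+)(?:: (.*))?$")


def cause_chain(log_text):
    """Return (exception_class, message) pairs for each 'Caused by:' line, in order."""
    chain = []
    for line in log_text.splitlines():
        match = CAUSED_BY.match(line.strip())
        if match:
            chain.append((match.group(1), match.group(2) or ""))
    return chain


def root_cause(log_text):
    """Return the last 'Caused by:' entry (the innermost cause), or None if there is none."""
    chain = cause_chain(log_text)
    return chain[-1] if chain else None
```

Applied to the trace above, the innermost entry would be the `java.net.SocketException` with the message `Connection reset`, whereas in the earlier post it was a `java.sql.SQLException` with `Too many connections`.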