Data Governance
Join discussions on data governance practices, compliance, and security within the Databricks Community. Exchange strategies and insights to ensure data integrity and regulatory compliance.

How to remove the legacy hive_metastore under the Catalog section?

DSoni
New Contributor II

Hello,

We are currently using Unity Catalog in our workspace, but we still see the legacy hive_metastore under the Catalog section. Is there any way we can remove it? The issue we are facing is that our cluster still tries to connect to the Hive metastore client even though we are using Unity Catalog, and it continuously throws exceptions.

It would be great if someone could help resolve this issue.

Thanks 🙂

8 REPLIES

Alberto_Umana
Databricks Employee

Hello @DSoni,

Removing the hive_metastore catalog from your environment is not currently supported. Can you share more details about the issue you are having? What cluster access mode are you using, and what are the exact errors?
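One common reason a Unity Catalog cluster still touches the Hive metastore is a query that references a table without a catalog qualifier (or that names hive_metastore explicitly), since unqualified names can resolve against the default catalog. As a rough illustration only (not an official Databricks tool), a quick scan of SQL text for such references might look like this; the function name and regex are my own and a real SQL parser would handle quoting, comments, and subqueries:

```python
import re

# Captures one-, two-, or three-part names after FROM/JOIN/TABLE keywords.
# Purely illustrative; real SQL parsing is more involved.
_REF = re.compile(
    r"\b(?:FROM|JOIN|TABLE)\s+([A-Za-z_]\w*(?:\.[A-Za-z_]\w*){0,2})",
    re.IGNORECASE,
)

def hive_metastore_suspects(sql_text: str) -> list[str]:
    """Return table references that either name hive_metastore explicitly
    or omit the catalog (and so may fall back to the default catalog)."""
    suspects = []
    for ref in _REF.findall(sql_text):
        parts = ref.split(".")
        if parts[0].lower() == "hive_metastore" or len(parts) < 3:
            suspects.append(ref)
    return suspects

print(hive_metastore_suspects(
    "SELECT * FROM sales.orders JOIN main.ref.dim ON 1=1"
))
```

Running this over notebook or job SQL can surface references worth fully qualifying with a UC catalog name.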

DSoni
New Contributor II

Hello @Alberto_Umana,
Would you mind providing an update on the information I provided? It would be really appreciated!
Thanks 🙂

DSoni
New Contributor II

Thank you for your response.

I understand that removing the hive_metastore catalog is not supported. To provide more context, here are the details of the issue I am encountering:

We are using Unity Catalog in our workspace and no longer need the hive_metastore.

 

  1. Cluster access mode: Multi-node (Shared), single job cluster
  2. Mechanism: Structured Streaming
  3. Cloud provider: AWS
  4. Databricks Runtime version: 15.4 LTS

I hope this information helps in diagnosing the issue. Please let me know if you need any further details or logs.

Thank you for your assistance.

 

DSoni
New Contributor II

The error we encounter is below:

Caused by: org.datanucleus.exceptions.NucleusException: Attempt to invoke the "HikariCP" plugin to create a ConnectionPool gave an error: Failed to initialize pool: Could not connect to address=(host=xyz.region.rds.amazonaws.com)(port=<port>)(type=master): Could not connect to xyz.region.rds.amazonaws.com:<port>: Connection reset
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:232)
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl.initialiseDataSources(ConnectionFactoryImpl.java:117)
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl.<init>(ConnectionFactoryImpl.java:82)
    ... 127 more
Caused by: com.zaxxer.hikari.pool.HikariPool$PoolInitializationException: Failed to initialize pool: Could not connect to address=(host=xyz.region.rds.amazonaws.com)(port=<port>)(type=master): Could not connect to xyz.region.rds.amazonaws.com:<port>: Connection reset
    at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:512)
    at com.zaxxer.hikari.pool.HikariPool.<init>(HikariPool.java:105)
    at com.zaxxer.hikari.HikariDataSource.<init>(HikariDataSource.java:71)
    at org.datanucleus.store.rdbms.connectionpool.HikariCPConnectionPoolFactory.createConnectionPool(HikariCPConnectionPoolFactory.java:176)
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:213)
    ... 129 more
Caused by: java.sql.SQLNonTransientConnectionException: Could not connect to address=(host=xyz.region.rds.amazonaws.com)(port=<port>)(type=master): Could not connect to xyz.region.rds.amazonaws.com:<port>: Connection reset
    at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.createException(ExceptionFactory.java:73)
    at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.create(ExceptionFactory.java:197)
    at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connectWithoutProxy(AbstractConnectProtocol.java:1404)
    at org.mariadb.jdbc.internal.util.Utils.retrieveProxy(Utils.java:635)
    at org.mariadb.jdbc.MariaDbConnection.newConnection(MariaDbConnection.java:150)
    at org.mariadb.jdbc.Driver.connect(Driver.java:89)
    at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:95)
    at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:101)
    at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:341)
    at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:506)
    ... 133 more
Caused by: java.sql.SQLNonTransientConnectionException: Could not connect to xyz.region.rds.amazonaws.com:<port>: Connection reset
    at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.createException(ExceptionFactory.java:73)
    at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.create(ExceptionFactory.java:188)
    at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.createConnection(AbstractConnectProtocol.java:588)
    at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connectWithoutProxy(AbstractConnectProtocol.java:1399)
    ... 140 more
Caused by: java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:210)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at org.mariadb.jdbc.internal.io.input.ReadAheadBufferedStream.fillBuffer(ReadAheadBufferedStream.java:131)
    at org.mariadb.jdbc.internal.io.input.ReadAheadBufferedStream.read(ReadAheadBufferedStream.java:104)
    at org.mariadb.jdbc.internal.io.input.StandardPacketInputStream.getPacketArray(StandardPacketInputStream.java:247)
    at org.mariadb.jdbc.internal.io.input.StandardPacketInputStream.getPacket(StandardPacketInputStream.java:218)
    at org.mariadb.jdbc.internal.com.read.ReadInitialHandShakePacket.<init>(ReadInitialHandShakePacket.java:89)
    at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.createConnection(AbstractConnectProtocol.java:540)
    ... 141 more
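Since the trace bottoms out in a plain TCP "Connection reset", it can help to separate network reachability from metastore configuration. A minimal, generic connectivity probe using only the Python standard library (run from a notebook on the affected cluster, substituting the real host and port from the error, which are redacted above) could look like this sketch:

```python
import socket

def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# On the affected cluster, run this with the actual redacted values, e.g.:
# print(can_reach("xyz.region.rds.amazonaws.com", 3306))
```

If this returns False from the driver, the problem is at the network layer (security groups, NACLs, routing); if True, the resets are more likely happening during the MySQL handshake or under load.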

Alberto_Umana
Databricks Employee

Hi @DSoni,

Thanks for the details you have shared!

A couple of questions: are you interacting with any table under the hive_metastore path? Do you have a Private Link setup?

Based on the error, there may be a connection-pool limit on the RDS instance backing the Hive metastore in your AWS account. Can you check the RDS metrics to validate this and increase the limit if needed?

DSoni
New Contributor II

Hello @Alberto_Umana,
I consulted with our DevOps team: we are not interacting with any table in hive_metastore, nor do we have a Private Link setup. Also, we are not using RDS!

Alberto_Umana
Databricks Employee

Hi @DSoni,

By default, the hive_metastore on AWS is backed by an RDS instance; in the failure you can see xyz.region.rds.amazonaws.com (which I think you redacted), and that is the host with the issue. I would need more details to understand how hive_metastore comes into the picture if, as you mentioned, no hive_metastore table is being used.

Was the error observed during a workflow run, or during a manual query?

DSoni
New Contributor II

Hello @Alberto_Umana,
We are running a workflow, and we see this exception in the driver logs. We are using Structured Streaming in our job. Everything is in UC.
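Since the job is a Structured Streaming workflow, one way to confirm what the stream actually reads and writes is to look at its progress events: in PySpark, `StreamingQuery.lastProgress` returns a dict whose `sources` and `sink` entries carry a `description` field. A small helper to pull those out, shown here against a made-up sample payload (the table names are hypothetical; the field names match Spark's StreamingQueryProgress JSON), might look like:

```python
def stream_endpoints(progress: dict) -> dict:
    """Extract source and sink descriptions from a Structured Streaming
    progress payload (the dict shape of StreamingQuery.lastProgress)."""
    return {
        "sources": [s.get("description", "") for s in progress.get("sources", [])],
        "sink": progress.get("sink", {}).get("description", ""),
    }

# Hypothetical sample resembling a lastProgress payload.
sample = {
    "sources": [{"description": "DeltaSource[main.sales.orders_raw]"}],
    "sink": {"description": "DeltaSink[main.sales.orders_clean]"},
}
print(stream_endpoints(sample))
```

If every source and sink description here resolves to a three-part UC name, the metastore connection attempts are more likely coming from cluster startup or library initialization than from the stream itself.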
