Data Governance
Join discussions on data governance practices, compliance, and security within the Databricks Community. Exchange strategies and insights to ensure data integrity and regulatory compliance.
How to remove legacy hive_metastore under the Catalog section?

DSoni
New Contributor III

Hello,

We are currently using Unity Catalog in our workspace, but we still see the legacy hive_metastore under the Catalog section. Is there any way we can remove it? The issue we are facing is that our cluster still tries to connect with the Hive metastore client even though we are using Unity Catalog, and it throws exceptions continuously.

It would be great if someone can help resolve this issue.

Thanks 🙂
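One thing worth checking before trying to remove the catalog: unqualified table names resolve against the session's default catalog, which is often hive_metastore unless it has been changed, so a stray one- or two-level name can still wake the Hive metastore client. A minimal, hedged sketch (plain Python; the table names are made-up examples, not from any real workspace) for auditing whether references in job code are fully qualified three-level UC names:

```python
def is_three_level(name: str) -> bool:
    """Return True if `name` looks like a fully qualified
    Unity Catalog identifier: catalog.schema.table."""
    parts = [p for p in name.split(".") if p]
    return len(parts) == 3

# Hypothetical table references as they might appear in job code.
refs = ["main.sales.orders", "sales.orders", "orders"]

# Anything that is not three-level may fall back to the default
# catalog (frequently hive_metastore) and trigger the Hive client.
suspect = [r for r in refs if not is_three_level(r)]
print(suspect)  # ['sales.orders', 'orders']
```

In a notebook, `SELECT current_catalog()` and `SHOW CATALOGS` show what the session actually resolves against, which helps confirm whether a fallback is in play.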

9 REPLIES

Alberto_Umana
Databricks Employee

Hello @DSoni,

Removing the hive_metastore catalog from your environment is not currently supported. Can you share more details on the issue you are having? What cluster access mode are you using? Also, what are the errors?

DSoni
New Contributor III

Hello @Alberto_Umana ,
Would you mind providing an update based on the information I provided?
It would be really appreciated!
Thanks 🙂

DSoni
New Contributor III

Thank you for your response.

I understand that removing the hive_metastore catalog is not supported. To provide more context, here are the details of the issue I am encountering:

We are using Unity Catalog in our workspace and no longer need the hive_metastore.

 

  1. Cluster Access Mode: Multi Node (Shared) (single job cluster)
  2. Mechanism: Structured Streaming
  3. Cloud Provider: AWS
  4. DBR Version: 15.4 LTS

I hope this information helps in diagnosing the issue. Please let me know if you need any further details or logs.

Thank you for your assistance.

 

bhanu_dp
New Contributor III

Were you able to get any solution/workaround to your requirement?

DSoni
New Contributor III

The error we encounter is below:

Caused by: org.datanucleus.exceptions.NucleusException: Attempt to invoke the "HikariCP" plugin to create a ConnectionPool gave an error: Failed to initialize pool: Could not connect to address=(host=xyz.region.rds.amazonaws.com)(port=<port>)(type=master): Could not connect to xyz.region.rds.amazonaws.com:<port>: Connection reset
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:232)
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl.initialiseDataSources(ConnectionFactoryImpl.java:117)
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl.<init>(ConnectionFactoryImpl.java:82)
    ... 127 more
Caused by: com.zaxxer.hikari.pool.HikariPool$PoolInitializationException: Failed to initialize pool: Could not connect to address=(host=xyz.region.rds.amazonaws.com)(port=<port>)(type=master): Could not connect to xyz.region.rds.amazonaws.com:<port>: Connection reset
    at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:512)
    at com.zaxxer.hikari.pool.HikariPool.<init>(HikariPool.java:105)
    at com.zaxxer.hikari.HikariDataSource.<init>(HikariDataSource.java:71)
    at org.datanucleus.store.rdbms.connectionpool.HikariCPConnectionPoolFactory.createConnectionPool(HikariCPConnectionPoolFactory.java:176)
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:213)
    ... 129 more
Caused by: java.sql.SQLNonTransientConnectionException: Could not connect to address=(host=xyz.region.rds.amazonaws.com)(port=<port>)(type=master): Could not connect to xyz.region.rds.amazonaws.com:<port>: Connection reset
    at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.createException(ExceptionFactory.java:73)
    at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.create(ExceptionFactory.java:197)
    at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connectWithoutProxy(AbstractConnectProtocol.java:1404)
    at org.mariadb.jdbc.internal.util.Utils.retrieveProxy(Utils.java:635)
    at org.mariadb.jdbc.MariaDbConnection.newConnection(MariaDbConnection.java:150)
    at org.mariadb.jdbc.Driver.connect(Driver.java:89)
    at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:95)
    at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:101)
    at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:341)
    at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:506)
    ... 133 more
Caused by: java.sql.SQLNonTransientConnectionException: Could not connect to xyz.region.rds.amazonaws.com:<port>: Connection reset
    at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.createException(ExceptionFactory.java:73)
    at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.create(ExceptionFactory.java:188)
    at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.createConnection(AbstractConnectProtocol.java:588)
    at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connectWithoutProxy(AbstractConnectProtocol.java:1399)
    ... 140 more
Caused by: java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:210)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at org.mariadb.jdbc.internal.io.input.ReadAheadBufferedStream.fillBuffer(ReadAheadBufferedStream.java:131)
    at org.mariadb.jdbc.internal.io.input.ReadAheadBufferedStream.read(ReadAheadBufferedStream.java:104)
    at org.mariadb.jdbc.internal.io.input.StandardPacketInputStream.getPacketArray(StandardPacketInputStream.java:247)
    at org.mariadb.jdbc.internal.io.input.StandardPacketInputStream.getPacket(StandardPacketInputStream.java:218)
    at org.mariadb.jdbc.internal.com.read.ReadInitialHandShakePacket.<init>(ReadInitialHandShakePacket.java:89)
    at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.createConnection(AbstractConnectProtocol.java:540)
    ... 141 more
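A "Connection reset" at the socket level means the TCP peer dropped the connection during or after the handshake, which points at the network path or the server side rather than credentials. A small, generic probe (Python stdlib only; the host and port in the trace are redacted, so substitute your own values rather than these placeholders) can help separate DNS failures from reachability problems and resets:

```python
import socket

def probe(host: str, port: int, timeout: float = 5.0) -> str:
    """Classify basic TCP reachability of host:port."""
    s = socket.socket()
    s.settimeout(timeout)
    try:
        s.connect((host, port))        # resolves and connects
        return "tcp-ok"
    except socket.gaierror:
        return "dns-failure"           # name did not resolve
    except (ConnectionRefusedError, ConnectionResetError):
        return "refused-or-reset"      # peer actively rejected/dropped
    except socket.timeout:
        return "timeout"               # likely firewall/security group
    finally:
        s.close()

# Placeholder endpoint; use the metastore host from your own trace.
print(probe("localhost", 1))
```

Run from a notebook on the affected cluster, this distinguishes "the security group silently drops packets" (timeout) from "something answers and then resets" (the case in the trace above).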

Alberto_Umana
Databricks Employee
Databricks Employee

Hi @DSoni,

Thanks for the details you have shared!

A couple of questions: are you interacting with any table under the hive_metastore path?

Do you have a Private Link setup?

Based on the error, there might be a connection limit on the RDS instance backing the Hive metastore in your AWS account. Can you check the RDS metrics to validate this, and increase the limit if needed?
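For readers hitting the same suggestion: the usual place to check this is the CloudWatch `DatabaseConnections` metric in the `AWS/RDS` namespace. A hedged sketch; the instance identifier is hypothetical, and the actual `get_metric_statistics` call (commented out) requires boto3 and AWS credentials, so only the parameter-building part runs here:

```python
from datetime import datetime, timedelta, timezone

def connections_metric_query(db_instance_id: str) -> dict:
    """Build the parameter dict for CloudWatch get_metric_statistics
    to read RDS DatabaseConnections over the last hour."""
    now = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/RDS",
        "MetricName": "DatabaseConnections",
        "Dimensions": [{"Name": "DBInstanceIdentifier",
                        "Value": db_instance_id}],
        "StartTime": now - timedelta(hours=1),
        "EndTime": now,
        "Period": 300,          # 5-minute datapoints
        "Statistics": ["Maximum"],
    }

params = connections_metric_query("my-metastore-rds")  # hypothetical id
# import boto3
# cw = boto3.client("cloudwatch")
# print(cw.get_metric_statistics(**params)["Datapoints"])
```

Note that for the Databricks-managed default metastore the RDS instance lives in Databricks' account, not yours, so this only applies if you run an external metastore you control.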

DSoni
New Contributor III

Hello @Alberto_Umana ,
I consulted with our DevOps team: we are not interacting with any table in hive_metastore, nor do we have any Private Link setup. Also, we are not using RDS!

Alberto_Umana
Databricks Employee
Databricks Employee

Hi @DSoni,

By default, hive_metastore is backed by an RDS instance in AWS; in the failure you can see xyz.region.rds.amazonaws.com (which I think you redacted), and that is the host with the issue. I would need more details to work out how hive_metastore comes into the picture if, as you mentioned, no hive_metastore table is being used.

Was the error observed during a workflow run, or during a manual query?

DSoni
New Contributor III

Hello @Alberto_Umana ,
We are running a workflow, and we see this exception in the driver logs. We are using Structured Streaming in our job. Everything is in UC.
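When the exception only shows up in driver logs, it helps to confirm which component is opening the metastore connection. A rough, generic sketch that scans exported driver-log text for the metastore-related class names seen in the trace earlier in this thread (the sample text here is illustrative, not a real log file):

```python
import re

# Patterns taken from the stack trace in this thread: DataNucleus,
# HikariCP, and MariaDB JDBC frames all indicate Hive metastore
# RDBMS access from the driver.
METASTORE_HINTS = re.compile(
    r"org\.datanucleus|com\.zaxxer\.hikari|org\.mariadb\.jdbc"
)

def metastore_lines(log_text: str) -> list:
    """Return log lines that mention Hive-metastore-related classes."""
    return [ln for ln in log_text.splitlines()
            if METASTORE_HINTS.search(ln)]

sample = (
    "INFO stream started\n"
    "Caused by: org.datanucleus.exceptions.NucleusException: ...\n"
    "at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:512)\n"
)
print(metastore_lines(sample))
```

Lines just above the first match in a real log (thread name, query ID) often reveal whether the connection attempt comes from the streaming query itself or from some background/init code path.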
