Metaexception [Version information not found in metastore] during cluster [re]start

User16783853906 · 06-23-2021

I am trying to configure a new external metastore and am running into the following exception during cluster initialization:

Caused by: MetaException(message:Version information not found in metastore. )
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:83)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:92)
	at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6902)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:164)
	at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:70)
	... 96 more
Caused by: MetaException(message:Version information not found in metastore. )
	at org.apache.hadoop.hive.metastore.ObjectStore.checkSchema(ObjectStore.java:7810)
	at org.apache.hadoop.hive.metastore.ObjectStore.verifySchema(ObjectStore.java:7788)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess

User16783853906 · 06-23-2021

The above exception happens when the Hive schema is not available in the metastore instance. Please check your init scripts to make sure the following flag is enabled, so that the Hive schema and tables are created if they are not already present:

	datanucleus.autoCreateAll true

If the above option does not work, you can copy the Hive schema that you need (0.13, 2.1, 2.3, etc.) and manually import it into RDS using the following syntax:

mysql -h<host> -u<uname> -p <dbname> < hive_13.sql
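
To confirm whether the schema is actually in place after the import, one option is to look at the VERSION table that the Hive metastore schema scripts create. The sketch below is only an illustration under stated assumptions: read_schema_versions is a hypothetical helper, it expects any Python DB-API connection to the metastore database, and the in-memory SQLite database in the demo is a stand-in so the snippet runs without a MySQL server. The SQL assumes the standard Hive layout (a VERSION table with a SCHEMA_VERSION column).

```python
import sqlite3


def read_schema_versions(conn, db_error):
    """Return the SCHEMA_VERSION values found in the metastore VERSION table.

    conn is any DB-API connection to the metastore database, and db_error is
    that driver's base exception class. An empty list means the table is
    missing or has no rows, so no version information is recorded.
    """
    cur = conn.cursor()
    try:
        cur.execute("SELECT SCHEMA_VERSION FROM VERSION")
    except db_error:
        # Most likely the VERSION table does not exist in this database.
        return []
    return [row[0] for row in cur.fetchall()]


if __name__ == "__main__":
    # SQLite stands in for the real metastore database (assumption for the demo).
    demo = sqlite3.connect(":memory:")
    print(read_schema_versions(demo, sqlite3.Error))

    # Simulate a database where the Hive schema has been imported.
    demo.execute(
        "CREATE TABLE VERSION ("
        "VER_ID INTEGER PRIMARY KEY, SCHEMA_VERSION TEXT, VERSION_COMMENT TEXT)"
    )
    demo.execute("INSERT INTO VERSION VALUES (1, '0.13.0', 'demo row')")
    print(read_schema_versions(demo, sqlite3.Error))
```

Against a real MySQL metastore you would pass a connection from your MySQL driver and that driver's error class instead; an empty result would be consistent with the "Version information not found in metastore" message.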

Changes to metastore health are recorded in cluster events.

Look for "Metastore health check ok" or "Metastore health check failed" in the driver logs to see the status of this health check. Additionally, you can check the cluster event logs for METASTORE_DOWN events to find any failed or timed-out health checks.
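
If you have a copy of the driver log, a small script can tally those two messages for you. This is a minimal sketch: summarize_health_checks is a hypothetical helper, the two marker strings are the ones quoted above, and the timestamp and logger prefix in the sample lines are made up for illustration (the real log line layout may differ).

```python
from collections import Counter

# Marker strings quoted in this thread; everything else on a log line is ignored.
OK_MARKER = "Metastore health check ok"
FAILED_MARKER = "Metastore health check failed"


def summarize_health_checks(lines):
    """Count ok/failed health check messages and report the most recent status."""
    counts = Counter()
    last_status = None
    for line in lines:
        if FAILED_MARKER in line:
            counts["failed"] += 1
            last_status = "failed"
        elif OK_MARKER in line:
            counts["ok"] += 1
            last_status = "ok"
    return {"ok": counts["ok"], "failed": counts["failed"], "last": last_status}


if __name__ == "__main__":
    # Sample lines with an assumed prefix format, for demonstration only.
    sample = [
        "21/06/23 10:00:00 INFO MetastoreMonitor: Metastore health check ok",
        "21/06/23 10:05:00 INFO MetastoreMonitor: Metastore health check ok",
        "21/06/23 10:10:00 WARN MetastoreMonitor: Metastore health check failed",
    ]
    print(summarize_health_checks(sample))
```

To run it on a real file, pass the open file object, for example summarize_health_checks(open("driver.log")), where driver.log is a placeholder for wherever you saved the log.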

In addition to the issue reported above, this health check could fail under the following circumstances:

1. The metastore RDS is unreachable because of network connectivity problems between the cluster IPs and the metastore instance.
2. There is a high degree of concurrency and contention for metastore access.

Note that if the health check fails due to reason 2 (concurrency and contention), it does not mean that the metastore is permanently down, just that it is temporarily unavailable. If a cluster consistently registers METASTORE_DOWN events while the MetastoreMonitor itself is up, the cause is reason 2. It can probably be fixed by increasing spark.databricks.hive.metastore.client.pool.size on the cluster; in a few scenarios you might also have to upgrade the metastore instance to allow higher concurrency.

Metaexception [Version information not found in metastore] during cluster [re]start