03-09-2022 08:41 PM
We are using Databricks on AWS. I am not able to run a notebook even when the cluster is running. When I run a cell, it returns "cancel". When I check the event log for the cluster, it shows "Metastore is down". I also can't see any of the databases or tables that I created in the "Data" tab; it keeps showing "loading" and then returns an error.
I would appreciate any help on this.
04-13-2022 05:34 AM
This means the network is fine, but something in the Spark config is amiss.
What are the DBR version and the Hive version? Please check if you are using a compatible version.
If you don't specify any version, it will take 1.3, and you wouldn't have to use jars in this case.
Please use the config below with your appropriate values:
spark.hadoop.datanucleus.fixedDatastore false
spark.hadoop.datanucleus.autoCreateSchema true
spark.hadoop.datanucleus.autoCreateTables true
spark.hadoop.javax.jdo.option.ConnectionURL jdbc:mariadb://<database-name>.cj11tymkwz5w.us-west-2.rds.amazonaws.com:3306/databricks_metastore?createDatabaseIfNotExist=true
spark.hadoop.javax.jdo.option.ConnectionDriverName org.mariadb.jdbc.Driver
spark.hadoop.javax.jdo.option.ConnectionUserName <value>
spark.hadoop.javax.jdo.option.ConnectionPassword <value>
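A quick way to catch a mis-pasted config (for example, a JDBC URL that wrapped onto a second line) is to check the text before saving it on the cluster. The following is only a minimal Python sketch: `parse_spark_conf` and `check_metastore_conf` are hypothetical helper names, not part of any Databricks API, and the list of required keys is taken from the config above.

```python
from urllib.parse import urlsplit

# Keys taken from the config shown in this reply.
REQUIRED_KEYS = [
    "spark.hadoop.javax.jdo.option.ConnectionURL",
    "spark.hadoop.javax.jdo.option.ConnectionDriverName",
    "spark.hadoop.javax.jdo.option.ConnectionUserName",
    "spark.hadoop.javax.jdo.option.ConnectionPassword",
]

def parse_spark_conf(text):
    """Parse 'key value' lines (one setting per line) into a dict."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        conf[key] = value.strip()
    return conf

def check_metastore_conf(conf):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for key in conf:
        if not key.startswith("spark."):
            problems.append(f"not a Spark key (wrapped line?): {key}")
    for key in REQUIRED_KEYS:
        if not conf.get(key):
            problems.append(f"missing or empty: {key}")
    url = conf.get("spark.hadoop.javax.jdo.option.ConnectionURL", "")
    if url:
        if not url.startswith("jdbc:"):
            problems.append("ConnectionURL does not start with 'jdbc:'")
        else:
            parts = urlsplit(url[len("jdbc:"):])
            try:
                port = parts.port
            except ValueError:
                port = None
            if not parts.hostname:
                problems.append("ConnectionURL has no host")
            if port is None:
                problems.append("ConnectionURL has no valid port")
            if not parts.path.strip("/"):
                problems.append("ConnectionURL has no database name")
    return problems
```

Passing the pasted config text through `check_metastore_conf(parse_spark_conf(text))` should return an empty list when all four connection settings are present and the URL is on a single line. It only checks the shape of the settings, not whether the credentials or versions are correct.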
03-09-2022 11:39 PM
Do you happen to use an external metastore?
03-10-2022 01:34 AM
@Werner Stinckens No, we don't use an external metastore.
03-10-2022 01:40 AM
OK, that is not normal behaviour.
Is it a new Databricks workspace? Community, Premium or Standard?
It seems like something is wrong with the workspace deployment, because Databricks uses a default metastore that is normally stable.
03-10-2022 02:07 AM
Yes, I have created a new Databricks trial account (on AWS, Premium).
03-10-2022 02:10 AM
@Werner Stinckens I created the workspace using "custom AWS configuration". It was working fine until yesterday, but it suddenly started returning this error.
03-10-2022 02:15 AM
Seems like a workspace deployment issue.
But I see you have another topic open, which seems related tbh.
03-10-2022 05:32 AM
The metastore is hosted in AWS RDS.
Additionally, you can check the cluster start logs. They contain the metastore URL and the connection message; most likely you have simply blocked traffic to that address.
This page also lists the URLs/IPs that should be allowed outbound or routed through the internet gateway: https://docs.databricks.com/administration-guide/cloud-configurations/aws/customer-managed-vpc.html
03-11-2022 12:57 AM
As suggested by @Hubert Dudek, please check your network settings, in case you have a firewall or rules that may be blocking the RDS connection on port 3306.
Also, check that you are whitelisting the correct endpoint for your region.
To debug this further, you can SSH into one of the AWS worker instances used by a Databricks cluster and test the connectivity.
E.g., for the us-east-1 region, run this command:
nc -zv mdb7sywh50xhpr.chkweekm4xjq.us-east-1.rds.amazonaws.com 3306
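If SSH access to the worker is inconvenient, roughly the same reachability test can be run from a notebook cell. This is a minimal sketch using only Python's standard `socket` module; `can_connect` is a hypothetical helper name, and the host should be the RDS endpoint for your own region.

```python
import socket

def can_connect(host, port=3306, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds (similar to `nc -zv`)."""
    try:
        # create_connection resolves the host name and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False

# Example call (us-east-1 endpoint from the command above):
# can_connect("mdb7sywh50xhpr.chkweekm4xjq.us-east-1.rds.amazonaws.com", 3306)
```

A `True` result only shows that the port is reachable from that machine. It does not confirm that the metastore credentials or the Spark config are correct.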
03-13-2022 10:14 PM
@Hubert Dudek @Kavya Manohar Parag
Thanks for the input. I tried running the command below, and it succeeded.
"!nc -zv md1n4trqmokgnhr.csnrqwqko4ho.ap-southeast-1.rds.amazonaws.com 3306".
However, the database is still not accessible. Please see the error returned for "show databases".
Also, I have tried several trial accounts using this AWS account. Is there any restriction for trial accounts?
04-17-2022 10:46 AM
So this means you still need to check the config of your external metastore.
If you run the code below, what is the output?
%scala
val metastoreURL = spark.sparkContext.hadoopConfiguration.get("javax.jdo.option.ConnectionURL")
val metastoreUser = spark.sparkContext.hadoopConfiguration.get("javax.jdo.option.ConnectionUserName")
val metastorePassword = spark.sparkContext.hadoopConfiguration.get("javax.jdo.option.ConnectionPassword")
println(metastoreURL)
println(metastoreUser)
Also, it's worth rechecking the configuration.
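Once the metastore URL has been printed, it can help to confirm that its host and port match the endpoint tested earlier with nc. The Python sketch below is only illustrative: `jdbc_host_port` is a hypothetical helper, and the default port of 3306 is an assumption based on the port used elsewhere in this thread.

```python
from urllib.parse import urlsplit

def jdbc_host_port(jdbc_url, default_port=3306):
    """Extract (host, port) from a JDBC URL such as jdbc:mariadb://host:3306/db."""
    if not jdbc_url or not jdbc_url.startswith("jdbc:"):
        raise ValueError(f"not a JDBC URL: {jdbc_url!r}")
    # Drop the 'jdbc:' prefix so the remainder parses like an ordinary URL.
    parts = urlsplit(jdbc_url[len("jdbc:"):])
    if not parts.hostname:
        raise ValueError(f"no host found in: {jdbc_url!r}")
    return parts.hostname, parts.port or default_port
```

If the host returned here differs from the one that was tested with nc, the earlier connectivity result says nothing about the metastore the cluster is actually trying to reach.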
04-25-2022 02:04 PM
Hi @Dhusanth Thangavadivel,
Just a friendly follow-up. Did @Kavya Manohar Parag's response help you resolve this issue?