Not able to run notebook even when cluster is running and databases/tables are not visible in "data" tab.

Anonymous
Not applicable

We are using Databricks on AWS. I am not able to run a notebook even though the cluster is running. When I run a cell, it just returns "cancel". When I check the event log for the cluster, it shows "Metastore is down". I also can't see any of the databases or tables I created in the "Data" tab; it keeps showing "loading" and then returns an error.

Appreciate any help on this.


1 ACCEPTED SOLUTION


User16753725182
Contributor III

This means the network is fine, but something in the Spark config is amiss.

What are the DBR version and the Hive version? Please check if you are using a compatible combination.

If you don't specify any version, it will default to 1.3, and you won't need to supply any jars in that case.
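If you do need to pin a specific Hive client version, a minimal sketch of the extra cluster Spark config (the version here is just an example; pick one that matches your DBR and metastore):

spark.sql.hive.metastore.version 2.3.9
spark.sql.hive.metastore.jars builtin

With "builtin" the cluster uses the Hive client jars bundled with the DBR, which only works when the pinned version matches the bundled one; for other versions you would point spark.sql.hive.metastore.jars at your own jars or set it to maven.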

Please use the config below with your appropriate values:

spark.hadoop.datanucleus.fixedDatastore false
spark.hadoop.datanucleus.autoCreateSchema true
spark.hadoop.datanucleus.autoCreateTables true
spark.hadoop.javax.jdo.option.ConnectionURL jdbc:mariadb://<database-name>.cj11tymkwz5w.us-west-2.rds.amazonaws.com:3306/databricks_metastore?createDatabaseIfNotExist=true
spark.hadoop.javax.jdo.option.ConnectionDriverName org.mariadb.jdbc.Driver
spark.hadoop.javax.jdo.option.ConnectionUserName <value>
spark.hadoop.javax.jdo.option.ConnectionPassword <value>
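After applying the config and restarting the cluster, a quick sanity check from a notebook cell (just a sketch: it reads back the connection URL set above and lists databases, so any remaining metastore error surfaces in the cell output):

%scala

// Read back the metastore URL the cluster actually picked up from the Spark config
val metastoreUrl = spark.sparkContext.hadoopConfiguration.get("javax.jdo.option.ConnectionURL")
println(metastoreUrl)
// If the metastore connection is healthy, this lists your databases instead of failing
spark.sql("SHOW DATABASES").show()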



-werners-
Esteemed Contributor III

Do you happen to use an external metastore?

Anonymous
Not applicable

@Werner Stinckens No, we don't use an external metastore.

-werners-
Esteemed Contributor III

OK, that is not normal behaviour.

Is it a new Databricks workspace? Community, Premium, or Standard?

It seems like something is wrong with the workspace deployment, because Databricks uses a default metastore which is normally stable.

Anonymous
Not applicable

Yes, I have created a new Databricks trial account on AWS (Premium).

Anonymous
Not applicable

@Werner Stinckens I created the workspace using a "custom AWS configuration". It was working fine until yesterday, but suddenly it started returning this error.

-werners-
Esteemed Contributor III

Seems like a workspace deployment issue.

But I see you have another topic open, which seems related tbh.

Hubert-Dudek
Esteemed Contributor III

The metastore is hosted in AWS RDS.

Additionally, you can check the cluster start logs; they contain the metastore URL and a connection message. Most likely you have just blocked traffic to that endpoint.

On that page you can also find the URLs/IPs which should be allowed outbound/routed through the internet gateway: https://docs.databricks.com/administration-guide/cloud-configurations/aws/customer-managed-vpc.html


User16753725182
Contributor III

As suggested by @Hubert Dudek, please check your network settings in case you have any firewall rules that may be blocking the RDS connection on port 3306.

Also, check if you are whitelisting the correct endpoint as per your region.

To debug this further, you can SSH into one of the AWS worker instances used by a Databricks cluster and test the connectivity.

E.g., for the us-east-1 region, run this command:

nc -zv mdb7sywh50xhpr.chkweekm4xjq.us-east-1.rds.amazonaws.com 3306
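If SSH to the workers is not an option, a rough equivalent you can run from a notebook cell on the driver (just a sketch; the hostname and port are the us-east-1 example above, so substitute your region's metastore endpoint):

%scala

// Attempt a raw TCP connection to the metastore RDS endpoint with a 5-second timeout.
// Success means the network path is open; an exception points to a firewall/VPC issue.
import java.net.{InetSocketAddress, Socket}
val socket = new Socket()
try {
  socket.connect(new InetSocketAddress("mdb7sywh50xhpr.chkweekm4xjq.us-east-1.rds.amazonaws.com", 3306), 5000)
  println("Port 3306 is reachable")
} finally {
  socket.close()
}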

Anonymous
Not applicable

@Hubert Dudek @Kavya Manohar Parag

Thanks for the input. I have tried running the command below, and it succeeded.

"!nc -zv md1n4trqmokgnhr.csnrqwqko4ho.ap-southeast-1.rds.amazonaws.com 3306"

However, the database is still not accessible. Please see the error that comes up for SHOW DATABASES.

Also, I have tried several trial accounts using this AWS account. Is there any restriction for trial accounts?


Atanu
Esteemed Contributor

So this means you still need to check the config of your external metastore.

If you run the below, what's the output?

%scala

val metastoreURL = spark.sparkContext.hadoopConfiguration.get("javax.jdo.option.ConnectionURL")
val metastoreUser = spark.sparkContext.hadoopConfiguration.get("javax.jdo.option.ConnectionUserName")
val metastorePassword = spark.sparkContext.hadoopConfiguration.get("javax.jdo.option.ConnectionPassword")

print(metastoreURL)
print(metastoreUser)

Also, it's worth rechecking the configuration.


Hi @Dhusanth Thangavadivel,

Just a friendly follow-up. Did @Kavya Manohar Parag's response help you resolve this issue?
