Not able to run notebook even when cluster is running and databases/tables are not visible in "data" tab.

Anonymous
Not applicable

We are using Databricks on AWS. I am not able to run a notebook even though the cluster is running. When I run a cell, it just returns "cancel". When I check the event log for the cluster, it shows "Metastore is down". I also can't see any of the databases or tables I created in the "Data" tab; it keeps showing "loading" and then returns an error.

Appreciate any help on this.


1 ACCEPTED SOLUTION


User16753725182
Contributor III

This means the network is fine, but something in the Spark config is amiss.

What are the DBR version and the Hive version? Please check if you are using a compatible combination.

If you don't specify any version, it will default to 1.3, and you won't need to supply any jars in that case.
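If you do need to pin a specific Hive client version, a minimal sketch of the extra cluster Spark config (the version here is just an example; pick one that matches your DBR and metastore):

spark.sql.hive.metastore.version 2.3.9
spark.sql.hive.metastore.jars builtin

With "builtin" the cluster uses the Hive client jars bundled with the DBR, which only works when the pinned version matches the bundled one; for other versions you would point spark.sql.hive.metastore.jars at your own jars or set it to maven.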

Please use the config below with your appropriate values:

spark.hadoop.datanucleus.fixedDatastore false
spark.hadoop.datanucleus.autoCreateSchema true
spark.hadoop.datanucleus.autoCreateTables true
spark.hadoop.javax.jdo.option.ConnectionURL jdbc:mariadb://<database-name>.cj11tymkwz5w.us-west-2.rds.amazonaws.com:3306/databricks_metastore?createDatabaseIfNotExist=true
spark.hadoop.javax.jdo.option.ConnectionDriverName org.mariadb.jdbc.Driver
spark.hadoop.javax.jdo.option.ConnectionUserName <value>
spark.hadoop.javax.jdo.option.ConnectionPassword <value>
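After applying the config and restarting the cluster, a quick sanity check from a notebook cell (just a sketch: it reads back the connection URL set above and lists databases, so any remaining metastore error surfaces in the cell output):

%scala

// Read back the metastore URL the cluster actually picked up from the Spark config
val metastoreUrl = spark.sparkContext.hadoopConfiguration.get("javax.jdo.option.ConnectionURL")
println(metastoreUrl)
// If the metastore connection is healthy, this lists your databases instead of failing
spark.sql("SHOW DATABASES").show()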



-werners-
Esteemed Contributor III

Do you happen to use an external metastore?

Anonymous
Not applicable

@Werner Stinckens No, we don't use an external metastore.

-werners-
Esteemed Contributor III

OK, that is not normal behaviour.

Is it a new Databricks workspace? Community, Premium, or Standard?

It seems like something is wrong with the workspace deployment, because Databricks uses a default metastore which is normally stable.

Anonymous
Not applicable

Yes, I have created a new Databricks trial account on AWS (Premium).

Anonymous
Not applicable

@Werner Stinckens I created the workspace using a "custom AWS configuration". It was working fine until yesterday, but suddenly it started returning this error.

-werners-
Esteemed Contributor III

Seems like a workspace deployment issue.

But I see you have another topic open, which seems related tbh.

Hubert-Dudek
Esteemed Contributor III

The metastore is hosted in AWS RDS.

Additionally, you can check the cluster start logs; they contain the metastore URL and a connection message. Most likely you have just blocked traffic to that endpoint.

On that page you can also find the URLs/IPs which should be allowed outbound/routed through the internet gateway: https://docs.databricks.com/administration-guide/cloud-configurations/aws/customer-managed-vpc.html


User16753725182
Contributor III

As suggested by @Hubert Dudek, please check your network settings in case you have any firewall rules that may be blocking the RDS connection on port 3306.

Also, check if you are whitelisting the correct endpoint as per your region.

To debug this further, you can SSH into one of the AWS worker instances used by a Databricks cluster and test the connectivity.

E.g., for the us-east-1 region, run this command:

nc -zv mdb7sywh50xhpr.chkweekm4xjq.us-east-1.rds.amazonaws.com 3306
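If SSH to the workers is not an option, a rough equivalent you can run from a notebook cell on the driver (just a sketch; the hostname and port are the us-east-1 example above, so substitute your region's metastore endpoint):

%scala

// Attempt a raw TCP connection to the metastore RDS endpoint with a 5-second timeout.
// Success means the network path is open; an exception points to a firewall/VPC issue.
import java.net.{InetSocketAddress, Socket}
val socket = new Socket()
try {
  socket.connect(new InetSocketAddress("mdb7sywh50xhpr.chkweekm4xjq.us-east-1.rds.amazonaws.com", 3306), 5000)
  println("Port 3306 is reachable")
} finally {
  socket.close()
}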

Anonymous
Not applicable

@Hubert Dudek @Kavya Manohar Parag

Thanks for the input. I have tried running the command below, and it succeeded.

"!nc -zv md1n4trqmokgnhr.csnrqwqko4ho.ap-southeast-1.rds.amazonaws.com 3306"

However, the database is still not accessible. Please see the error that comes up for SHOW DATABASES.

Also, I have tried several trial accounts using this AWS account. Is there any restriction for trial accounts?


Atanu
Esteemed Contributor

So this means you still need to check the config of your external metastore.

If you run the below, what's the output?

%scala

val metastoreURL = spark.sparkContext.hadoopConfiguration.get("javax.jdo.option.ConnectionURL")
val metastoreUser = spark.sparkContext.hadoopConfiguration.get("javax.jdo.option.ConnectionUserName")
val metastorePassword = spark.sparkContext.hadoopConfiguration.get("javax.jdo.option.ConnectionPassword")

print(metastoreURL)
print(metastoreUser)

Also, it's worth rechecking the configuration.


Hi @Dhusanth Thangavadivel,

Just a friendly follow-up. Did @Kavya Manohar Parag's response help you resolve this issue?
