Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Not able to run notebook even when cluster is running and databases/tables are not visible in "data" tab.

Anonymous
Not applicable

We are using Databricks on AWS. I am not able to run a notebook even though the cluster is running. When I run a cell, it returns "cancel". When I check the event log for the cluster, it shows "Metastore is down". I also can't see any of the databases or tables that I created in the "data" tab; it keeps showing "loading" and then returns an error.

I would appreciate any help with this.

1 ACCEPTED SOLUTION

Accepted Solutions

User16753725182
Databricks Employee

This means the network is fine, but something in the Spark config is amiss.

What are your DBR (Databricks Runtime) version and Hive version? Please check that you are using compatible versions.

If you don't specify a Hive version, it defaults to 1.3, and you won't need to supply jars in that case.

Please use the config below, substituting your own values:

spark.hadoop.datanucleus.fixedDatastore false
spark.hadoop.datanucleus.autoCreateSchema true
spark.hadoop.datanucleus.autoCreateTables true
spark.hadoop.javax.jdo.option.ConnectionURL jdbc:mariadb://<database-name>.cj11tymkwz5w.us-west-2.rds.amazonaws.com:3306/databricks_metastore?createDatabaseIfNotExist=true
spark.hadoop.javax.jdo.option.ConnectionDriverName org.mariadb.jdbc.Driver
spark.hadoop.javax.jdo.option.ConnectionUserName <value>
spark.hadoop.javax.jdo.option.ConnectionPassword <value>
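A config like this is easy to break when pasting, for example if the JDBC URL wraps onto a second line. The following is a minimal Python sketch of a sanity check, assuming the cluster Spark config is written as one space-separated `key value` pair per line. The helper names (`parse_spark_conf`, `check_metastore_conf`) and the choice of which keys to treat as required are illustrative assumptions, not part of any Databricks API.

```python
import re

# Keys taken from the config in this thread; treating all four as
# required is an assumption made for this sketch.
REQUIRED_KEYS = [
    "spark.hadoop.javax.jdo.option.ConnectionURL",
    "spark.hadoop.javax.jdo.option.ConnectionDriverName",
    "spark.hadoop.javax.jdo.option.ConnectionUserName",
    "spark.hadoop.javax.jdo.option.ConnectionPassword",
]

# Matches jdbc:<subprotocol>://<host>:<port>/<database>[?params]
JDBC_URL = re.compile(r"^jdbc:(\w+)://([^/:\s]+):(\d+)/([^?\s]+)(\?\S*)?$")


def parse_spark_conf(text):
    """Parse 'key value' lines into a dict, skipping blank lines."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first space only; the value may contain anything.
        key, _, value = line.partition(" ")
        conf[key] = value.strip()
    return conf


def check_metastore_conf(conf):
    """Return a list of problems found; an empty list means none were found."""
    problems = [f"missing or empty: {k}" for k in REQUIRED_KEYS if not conf.get(k)]
    url = conf.get("spark.hadoop.javax.jdo.option.ConnectionURL", "")
    if url and not JDBC_URL.match(url):
        problems.append(f"malformed JDBC URL: {url!r}")
    return problems
```

Passing the pasted config text through `check_metastore_conf(parse_spark_conf(text))` flags a URL whose `:3306/...` part has ended up on its own line, because the remaining URL no longer contains a port and database name. It only checks the shape of the config; it cannot tell whether the credentials or the Hive version are correct.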


12 REPLIES

-werners-
Esteemed Contributor III

Do you happen to use an external metastore?

Anonymous
Not applicable

@Werner Stinckens No, we don't use an external metastore.

-werners-
Esteemed Contributor III

OK, that is not normal behaviour.

Is it a new Databricks workspace? Is it Community, Premium, or Standard?

It seems like something is wrong with the workspace deployment, because Databricks uses a default metastore, which is stable.

Anonymous
Not applicable

Yes, I created a new Databricks trial account (Premium, on AWS).

Anonymous
Not applicable

@Werner Stinckens I created the workspace using "custom AWS configuration". It was working fine until yesterday, but it suddenly started returning this error.

-werners-
Esteemed Contributor III

Seems like a workspace deployment issue.

But I see you have another topic open, which seems related tbh.

Hubert-Dudek
Esteemed Contributor III

The metastore is hosted in AWS RDS.

Additionally, you can check the cluster start logs. They contain the metastore URL and a connection message; you have probably just blocked traffic to it.

On this page you can also find the URLs/IPs that should be allowed outbound or routed through the internet gateway: https://docs.databricks.com/administration-guide/cloud-configurations/aws/customer-managed-vpc.html

User16753725182
Databricks Employee

As suggested by @Hubert Dudek, please check your network settings in case you have a firewall or rules that may be blocking the RDS connection on port 3306.

Also, check that you are whitelisting the correct endpoint for your region.

To debug this further, you can SSH into one of the AWS worker instances used by a Databricks cluster and test the connectivity.

E.g., for the us-east-1 region, run this command:

nc -zv mdb7sywh50xhpr.chkweekm4xjq.us-east-1.rds.amazonaws.com 3306
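Where SSH access to a worker is inconvenient, roughly the same reachability test can be run from a notebook cell. This is a minimal Python sketch using only the standard library; `can_reach` is a hypothetical helper name, and the host passed to it should be the metastore endpoint for your own region.

```python
import socket


def can_reach(host, port=3306, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Example call (hypothetical placeholder host, substitute your own endpoint):
# can_reach("<metastore-host>", 3306)
```

A `True` result only shows that the port is reachable from the node where the code runs. It says nothing about whether the credentials or the Hive configuration are correct.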

Anonymous
Not applicable

@Hubert Dudek @Kavya Manohar Parag

Thanks for the input. I ran the command below, and it succeeded:

!nc -zv md1n4trqmokgnhr.csnrqwqko4ho.ap-southeast-1.rds.amazonaws.com 3306

However, the database is still not accessible. Please see the error returned by "show databases" in the attached screenshot.

Also, I have tried several trial accounts using this AWS account. Is there any restriction for trial accounts?

Atanu
Databricks Employee

So this means you still need to check the config of your external metastore.

What is the output if you run the code below?

%scala
// Read the metastore connection settings from the Hadoop configuration.
val metastoreURL = spark.sparkContext.hadoopConfiguration.get("javax.jdo.option.ConnectionURL")
val metastoreUser = spark.sparkContext.hadoopConfiguration.get("javax.jdo.option.ConnectionUserName")
val metastorePassword = spark.sparkContext.hadoopConfiguration.get("javax.jdo.option.ConnectionPassword")
// Print only the URL and user name, one per line; the password is not printed.
println(metastoreURL)
println(metastoreUser)

Also, it's worth rechecking the configuration.
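Once the connection URL has been printed, its host and port can be pulled out and reused for a connectivity test against the exact endpoint the cluster is configured to use. A minimal Python sketch follows; `jdbc_host_port` is a hypothetical helper name, and the default port of 3306 is an assumption based on the MariaDB URL used in this thread.

```python
from urllib.parse import urlsplit


def jdbc_host_port(jdbc_url, default_port=3306):
    """Extract (host, port) from a JDBC URL such as jdbc:mariadb://host:3306/db."""
    if not jdbc_url or not jdbc_url.startswith("jdbc:"):
        raise ValueError(f"not a JDBC URL: {jdbc_url!r}")
    # Drop the leading "jdbc:" so the rest parses as an ordinary URL.
    parts = urlsplit(jdbc_url[len("jdbc:"):])
    if not parts.hostname:
        raise ValueError(f"no host in JDBC URL: {jdbc_url!r}")
    # Fall back to the default port when the URL does not specify one.
    return parts.hostname, parts.port or default_port
```

If the function raises `ValueError` for the printed value, the connection URL itself is missing or malformed, which points at the Spark config and not at the network.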

Hi @Dhusanth Thangavadivel,

Just a friendly follow-up. Did @Kavya Manohar Parag's response help you resolve this issue?
