03-26-2024 07:55 AM - edited 03-26-2024 07:59 AM
Hi!
We are currently PoC-ing Databricks with Unity Catalog on AWS but it seems there are some issues.
Creating a database in an existing (Unity) catalog takes over 10 minutes. Creating an external table on top of an existing Delta table (CREATE TABLE main.bronze.dummy_table USING DELTA LOCATION 's3://<CLOUD_URI>/dummy_data.delta';) also takes 10+ minutes. The cell's status is stuck at `Performing Hive catalog operation: databaseExists`. Working directly on tables via paths is snappy, as expected.
The logs show a lot of `java.sql.SQLNonTransientConnectionException: Could not connect to mdv2llxgl8lou0.ceptxxgorjrc.eu-central-1.rds.amazonaws.com:3306 : Connection reset` errors. However, this doesn't seem to be a network issue, as we can reach this URL/port via bash in the web terminal. The cluster's event log also shows multiple `metastore is down` messages.
Not sure if it's related, but even though our Unity Catalog is configured to store its data in a specific S3 directory (also shown via DESCRIBE CATALOG EXTENDED), this directory is still empty. Our metastore does NOT have a managed storage assigned, so I'm not even sure where it actually stores the database/table metadata!?
Almost forgot: Cluster runs on DBR 13.3
Does anyone have an idea what the issue could be here?
Thanks!
03-26-2024 10:05 AM
Hi @breaka, It appears that you’re encountering some challenges while PoC-ing Databricks with Unity Catalog on AWS.
Let’s break down the issues you’ve described:
Database Creation Delay: `Performing Hive catalog operation: databaseExists`.
External Table Creation Delay: (`s3://<CLOUD_URI>/dummy_data.delta`) also takes 10+ minutes.
Connection Errors: `java.sql.SQLNonTransientConnectionException: Could not connect to mdv2llxgl8lou0.ceptxxgorjrc.eu-central-1.rds.amazonaws.com:3306 : Connection reset`.
Empty S3 Directory:
Given these observations, let’s explore potential solutions:
Unity Catalog Configuration:
Metastore Health:
Storage Credentials and Locations:
Connection Troubleshooting:
Metadata Storage Location:
Good luck with your PoC! 🚀
03-29-2024 05:50 AM
Hi @Kaniz_Fatma ,
thank you for your reply!
> Unity Catalog Configuration
We configured the metastore, workspace and catalog to the best of our knowledge and according to Databricks' documentation. The DB runtime and AWS itself should be fully supported.
> Metastore H ealth (Consider restarting or verifying the h ealth of the metastore service)
AFAIK, the error message is related to the legacy Hive metastore at the mdv2llxgl8lou0.ceptxxgorjrc.eu-central-1.rds.amazonaws.com address, which is hosted and maintained centrally by Databricks. Nothing we can do here.
> Storage Credentials and Locations
Testing the external location that "should" hold the Unity Catalog data via the data catalog web UI shows: "All Permissions Confirmed. The associated Storage Credential grants permission to perform all necessary operations." We successfully use the same storage creds for external volumes on the same S3 bucket (though in a different sub-folder).
> Connection Troubleshooting
I'm not sure how we can set any credentials for the legacy Hive metastore. Shouldn't this be fully managed by Databricks (via keystore)?
> Since you can reach the URL/port via the web terminal, consider checking the security group rules and firewall settings
With respect to security group rules and firewall settings, is there a difference between making a network connection via Spark (JVM) and via bash / Python if it is the very same VM/container? I can also successfully create a socket with Python or shell (%sh) within a Databricks notebook.
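For reference, a minimal sketch of the kind of socket check meant here, assuming plain Python sockets in a notebook cell. `can_connect` is a hypothetical helper name, not something from Databricks. Note that it only proves the TCP handshake succeeds; a `Connection reset` could still happen afterwards, during the MySQL/JDBC handshake, so a passing check does not rule out everything.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    This only tests the TCP handshake. It does not speak the MySQL protocol,
    so it cannot detect a reset that happens later in the JDBC handshake.
    """
    try:
        # create_connection resolves the host name and connects; the context
        # manager closes the socket again right away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


# Example call, using the host and port from the error message in the logs:
# can_connect("mdv2llxgl8lou0.ceptxxgorjrc.eu-central-1.rds.amazonaws.com", 3306)
```

If this returns True from the notebook while the JVM still logs `Connection reset`, the difference is more likely in what happens after the connection is opened than in basic reachability.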
> Metadata Storage Location
I fully agree that "understanding its behavior is crucial" but apparently I'm missing something here. I just created a catalog and set its storage root to an S3 directory, where the GUI shows that we have full access.
I replied to your potential solutions, I hope this clears things up a bit.
Thanks!
03-29-2024 05:54 AM
PS: Apparently I'm not allowed to use the word H E A L T H (without spaces) in my reply (The message body contains H e a l t h, which is not permitted in this community. Please remove this content before sending your post.)
03-29-2024 09:46 AM
This word has now been whitelisted, thank you for the tip!