Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.

Set up Unity Catalog external location to MinIO

gwilson
New Contributor II

We have a MinIO server running in Azure that we have connected to our Spark clusters directly. As we move to Unity Catalog, we would like to make the data stored on our MinIO server accessible as an external location in our Azure Databricks account via Unity Catalog. Is it possible to connect non-ADLS object storage as an external storage location in Azure Databricks?

3 REPLIES

Kaniz_Fatma
Community Manager

Hi @gwilson, here are the steps to connect a MinIO server to Azure Databricks as an external storage location:

  1. Configure your MinIO server to allow network access, and obtain the endpoint URL, access key, and secret key.

  2. Set the Spark configuration values on your Azure Databricks cluster so it can interface with MinIO through the s3a connector.

  3. Register the MinIO storage location as an external table in Unity Catalog (or via the Databricks CLI) using the Delta format.

  4. Access the data stored in MinIO through the external table minio_data from your Spark scripts.

By following these steps, you can connect your MinIO server to Azure Databricks as an external storage location and use it for your Spark workloads.
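
For step 2, a minimal sketch of the s3a settings, entered in the cluster's Spark config; the endpoint URL and credentials here are placeholders for your own MinIO values:

```
spark.hadoop.fs.s3a.endpoint https://minio.example.com:9000
spark.hadoop.fs.s3a.access.key <MINIO_ACCESS_KEY>
spark.hadoop.fs.s3a.secret.key <MINIO_SECRET_KEY>
spark.hadoop.fs.s3a.path.style.access true
spark.hadoop.fs.s3a.impl org.apache.hadoop.fs.s3a.S3AFileSystem
```

Note that path-style access is typically required for MinIO, since it does not use the virtual-hosted bucket addressing that AWS S3 defaults to. For production, prefer storing the keys in a secret scope rather than in plain cluster config.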

174817
New Contributor III

Hi @Kaniz_Fatma ,

I have a server on Azure that supports the S3 protocol, and I am trying to follow these instructions in order to use Unity Catalog on Azure Databricks with it. I am not sure about this part of your reply:


Set the Spark configuration values in the spark.conf file of your Azure Databricks workspace

I know how to configure Spark on a Databricks compute cluster, but I could not find where to do this for Unity Catalog. I tried in the web GUI under "Catalog > Catalog Explorer > External Data" and also under "Compute / SQL Warehouses".

Is it possible to set a Unity Catalog external location to access an S3-compatible endpoint on Azure?

Thanks!

gwilson
New Contributor II

Support for S3-compatible storage is not available with Unity Catalog.
