Unity Catalog - existing dbfs mounts and feature store

Ashley1
Contributor

Hi All,

We're currently considering turning on Unity Catalog, but before we flick the switch I'm hoping to get a bit more confidence about what will happen with our existing DBFS mounts and feature store. The bit that makes me nervous is the credential association with a storage URL. I'm hoping this only takes effect for operations performed via Unity Catalog objects. We have unstructured data in the same storage account that holds our externally managed tables, and I want to be sure we can still access it via the DBFS mounts once we switch on Unity Catalog. We also have a recently built feature store, and I'm not sure how, or if, it ties back into Unity Catalog or what the potential impacts are. Can anyone shed some light on whether turning on Unity Catalog will affect our existing access methods to S3 and ADLS mounted in DBFS, or our feature store?

Regards,

Ashley

5 REPLIES

Debayan
Esteemed Contributor III

Thanks @Debayan Mukherjee. Unfortunately they don't really answer the question, which was whether turning on Unity Catalog is likely to affect our existing DBFS mounts or feature store.

-werners-
Esteemed Contributor III

I was looking into this too and found the following:

https://learn.microsoft.com/en-us/azure/databricks/dbfs/mounts

"Mounted data does not work with Unity Catalog, and Databricks recommends migrating away from using mounts and managing data governance with Unity Catalog."

but also

"Databricks recommends using service principals with scheduled jobs and single user access mode for production workloads that need access to data managed by both DBFS and Unity Catalog."

So this is still not clear to me: is it impossible to use mounts once you enable Unity Catalog, or do mounts render Unity Catalog useless (because you can access everything through the mount)?

Hi @Werner Stinckens, yes, the wording leaves a lot of questions. I'm assuming it means you can't use externally managed tables (in Unity Catalog) whose location is in a mounted directory.

Since the post I created a new workspace, assigned the unity metastore to it and checked that mounts still operated (for things not in Unity Catalog). This seemed to work but I must say my testing wasn't extensive. It would be nice to get an official position from Databricks. I didn't test the feature store tables so I'm still not sure what this means for these.
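For anyone wanting to repeat that kind of check a bit more systematically, here is a minimal sketch rather than an official procedure. The helper name `check_mounts` is hypothetical; it simply tries to list each mount point and records whether the call succeeded. It assumes you run it in a Databricks notebook where `dbutils` is available, and passes `dbutils.fs.ls` in as the listing function so the helper itself stays plain Python.

```python
from typing import Callable, Dict, Iterable


def check_mounts(
    mount_points: Iterable[str],
    list_dir: Callable[[str], object],
) -> Dict[str, str]:
    """Try to list each mount point and record "ok" or the error message.

    `check_mounts` is a hypothetical helper for illustration only.
    """
    results: Dict[str, str] = {}
    for mount_point in mount_points:
        try:
            list_dir(mount_point)
            results[mount_point] = "ok"
        except Exception as exc:  # any failure is treated as "mount not accessible"
            results[mount_point] = f"failed: {exc}"
    return results


# In a Databricks notebook (assumption: `dbutils` is provided by the runtime):
# results = check_mounts((m.mountPoint for m in dbutils.fs.mounts()), dbutils.fs.ls)
# for mount_point, status in results.items():
#     print(mount_point, status)
```

This only confirms that listing works for the cluster and access mode it runs on, so it is worth repeating on each cluster configuration you use. It says nothing about feature store tables.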

karthik_p
Esteemed Contributor

@Ashley Betts can you please check the article below? As far as I know, we can use external mount points by configuring storage credentials in Unity Catalog. The default is managed tables, but we can point to external tables as well:

1. You can upgrade existing managed tables in the default Hive metastore.
2. You can upgrade external tables from the default Hive metastore.
3. You can create external tables that store their data in external DBFS locations.

links:

Upgrade tables and views to Unity Catalog | Databricks on AWS

Please let me know if this helps.
