12-05-2023 10:51 PM
Hi all,
We are in the process of rolling out a new Unity Catalog-enabled Databricks environment with two tiers: dev and prod.
Initially we planned to completely decouple dev and prod, each with its own data lake as storage.
While this is the safest option, it does give me some headaches on how to get good data into that dev data lake.
So I started thinking: is it possible to define TWO storage credentials (with different Access Connectors!) to the same (prod) data lake, define one of the two as read-only, and use that for our dev environment? We would then create a read-only external location, catalog, etc. on top of it.
For the prod env we would do the same, but write-enabled.
That way we can read prod data from dev without risking writes to the prod data lake.
Is this possible? I know Unity is quite strict concerning overlap, but perhaps it works with different Access Connectors/storage credentials?
12-05-2023 11:51 PM
That is awesome. I was not sure it would work like that but apparently it does.
Tnx!
12-06-2023 05:50 AM
Unfortunately it does not seem to work. When creating a second external location to the same path, I get the dreaded error saying an external location already exists on that path. And that is exactly what I want to do 😞
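To illustrate why the second external location is rejected, here is a rough sketch of a path-overlap check. It is only an approximation of the idea (same container, and one path equal to or nested under the other); the actual validation inside Unity Catalog is not shown in this thread, and the account and container names below are made up.

```python
from urllib.parse import urlparse


def _parts(url: str) -> tuple[str, str, list[str]]:
    """Split a storage URL into (scheme, authority, path segments)."""
    parsed = urlparse(url)
    segments = [s for s in parsed.path.split("/") if s]
    return parsed.scheme.lower(), parsed.netloc.lower(), segments


def paths_overlap(a: str, b: str) -> bool:
    """Return True if both URLs share a container and one path is equal to,
    or a parent of, the other. Illustrative only, not the real UC check."""
    scheme_a, auth_a, seg_a = _parts(a)
    scheme_b, auth_b, seg_b = _parts(b)
    if (scheme_a, auth_a) != (scheme_b, auth_b):
        return False
    shorter = min(len(seg_a), len(seg_b))
    # Compare whole segments so that /raw and /rawdata do not count as overlapping.
    return seg_a[:shorter] == seg_b[:shorter]


# Hypothetical prod data lake path, registered twice with different credentials.
PROD_RW = "abfss://data@prodlake.dfs.core.windows.net/raw"
DEV_RO = "abfss://data@prodlake.dfs.core.windows.net/raw/"
print(paths_overlap(PROD_RW, DEV_RO))
```

Note that the storage credential does not appear anywhere in the comparison, which matches what I saw: a different Access Connector does not make the path any less of a duplicate.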
12-28-2023 07:39 PM
Were you able to resolve this? We are in a similar situation: we want to create two catalogs with different permissions, and we are getting an overlap error.
We thought about going with the approach above. How should this be handled for DLT with Unity Catalog? We created two catalogs: one uses the default metastore, and the other uses an entirely new storage account and container, with catalog-level segregation via a managed location. When we execute, we get an overlap error on the managed storage for the second catalog. @-werners- @Retired_mod
12-06-2023 06:21 AM
I could apply 'force create', but it does seem weird to do so.
12-28-2023 10:40 PM
I think you are overcomplicating it a bit.
You can have two workspaces, prod and dev, and two catalogs, prod and dev.
You can make the prod catalog read-only in the dev environment, or you can do shallow clones from prod to dev with some scripts.
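A minimal sketch of what such a clone script could look like, assuming the catalogs are literally named `prod` and `dev` and that the schema and table names below (which are made up) exist. It only builds the SQL text; whether shallow clones suit your table types is something to verify in your own workspace.

```python
def shallow_clone_statements(tables, source_catalog="prod", target_catalog="dev"):
    """Build one CREATE ... SHALLOW CLONE statement per (schema, table) pair."""
    statements = []
    for schema, table in tables:
        source = f"`{source_catalog}`.`{schema}`.`{table}`"
        target = f"`{target_catalog}`.`{schema}`.`{table}`"
        statements.append(f"CREATE OR REPLACE TABLE {target} SHALLOW CLONE {source}")
    return statements


# Hypothetical tables to refresh in dev.
TABLES = [("sales", "orders"), ("sales", "customers")]

for stmt in shallow_clone_statements(TABLES):
    print(stmt)
```

In a notebook or job, each statement could be passed to `spark.sql(stmt)` instead of being printed.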
Additionally, you have ACLs over external locations and catalogs. Say you have engineers and analysts: you grant engineers access to create objects in the dev catalog and to write tables to the dev external location, and you grant analysts read access on prod.
If you screw up the ACLs in Unity, the backend setup does not matter.
I allow write operations on prod only to service principals (via jobs).
I have used the above security setup in a few Unity-enabled projects, with no surprises so far.
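As a rough sketch of that ACL layout, the snippet below generates the corresponding GRANT statements. The group names (`engineers`, `analysts`), the service principal name (`prod-jobs-sp`), and the external location name (`dev_landing`) are placeholders, and the privilege lists are one possible reading of the setup described, not a definitive policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    privileges: tuple[str, ...]
    securable_type: str  # e.g. "CATALOG" or "EXTERNAL LOCATION"
    securable_name: str
    principal: str

    def to_sql(self) -> str:
        """Render the grant as a Unity Catalog GRANT statement."""
        privs = ", ".join(self.privileges)
        return (
            f"GRANT {privs} ON {self.securable_type} "
            f"`{self.securable_name}` TO `{self.principal}`"
        )


# All principals and securable names here are hypothetical.
GRANTS = [
    # Engineers may create and modify objects in dev.
    Grant(("USE CATALOG", "USE SCHEMA", "CREATE SCHEMA", "CREATE TABLE", "SELECT", "MODIFY"),
          "CATALOG", "dev", "engineers"),
    # Engineers may write files and external tables on the dev external location.
    Grant(("READ FILES", "WRITE FILES", "CREATE EXTERNAL TABLE"),
          "EXTERNAL LOCATION", "dev_landing", "engineers"),
    # Analysts may only read prod.
    Grant(("USE CATALOG", "USE SCHEMA", "SELECT"), "CATALOG", "prod", "analysts"),
    # Only the job service principal may write to prod.
    Grant(("USE CATALOG", "USE SCHEMA", "SELECT", "MODIFY"), "CATALOG", "prod", "prod-jobs-sp"),
]


def grant_statements(grants=GRANTS):
    return [g.to_sql() for g in grants]


for stmt in grant_statements():
    print(stmt)
```

Keeping the grants in one list like this makes it easy to review who can write where before anything is executed.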
01-03-2024 12:31 AM
That is how I am working right now: assign the workspace to the catalog and set it to read-only if necessary.
It would be easier though if it was possible to define a 2nd external location in read-only, as this cannot break anything (of course in read-only mode).
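For reference, a sketch of how the read-only workspace assignment might be scripted rather than clicked through the UI. The endpoint path and the `binding_type` value reflect my understanding of the Unity Catalog workspace-bindings REST API and should be checked against the current API reference; the host, catalog name, and workspace ID are placeholders. The function only builds the request and does not send anything.

```python
import json


def read_only_binding_request(host: str, catalog: str, workspace_id: int):
    """Build (method, url, body) for binding a catalog to a workspace read-only.

    Assumption: the workspace-bindings endpoint and BINDING_TYPE_READ_ONLY
    value are as I recall them; verify before use.
    """
    url = f"{host.rstrip('/')}/api/2.1/unity-catalog/bindings/catalog/{catalog}"
    body = {
        "add": [
            {"workspace_id": workspace_id, "binding_type": "BINDING_TYPE_READ_ONLY"}
        ]
    }
    return "PATCH", url, json.dumps(body)


# Hypothetical account host and dev workspace ID.
method, url, body = read_only_binding_request(
    "https://adb-1234567890123456.7.azuredatabricks.net/", "prod", 1111111111111111
)
print(method, url)
print(body)
```

The request would still need an authenticated HTTP client, and the catalog has to be restricted to specific workspaces for the binding to have any effect.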