Monday
Hi all,
I'm setting up access control for Databricks jobs and have two questions:
Ephemeral Job Clusters: Since job clusters are created at runtime, is it best practice to set ACLs on the job itself? The /api/2.0/permissions/clusters/{cluster_id} endpoint requires a cluster ID, but ephemeral clusters don't exist beforehand.
All Jobs & New Jobs: What's the recommended way to manage ACLs for all existing jobs and automatically apply permissions to newly created jobs?
Looking for scalable, best-practice guidance.
Tuesday
Hi @shweta_m,
I don't think this is exactly what you're asking for (you seem to want some kind of configuration at the account management console level, and I don't know of a way to do that), but here is what worked for us.
We had a similar problem in my organisation. We opted to migrate our jobs/pipelines to Databricks Asset Bundles, and the Azure Entra ID integration gives us direct access to the user groups we already have in Azure.
Therefore, when developing and deploying to different environments (including dev), we set the permissions each group has in the targets section of databricks.yml, so that only selected people can see the jobs/pipelines, each with a different level to view, run, or manage.
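For illustration, here is a minimal sketch of that targets section. The workspace host and group names are made-up placeholders; the levels shown (CAN_VIEW, CAN_RUN, CAN_MANAGE) are the ones bundle permissions accept:

```yaml
# databricks.yml (sketch) - per-target permissions applied to everything the bundle deploys.
# Workspace host and group names below are hypothetical examples.
bundle:
  name: my_project

targets:
  dev:
    workspace:
      host: https://adb-1111111111111111.11.azuredatabricks.net
    permissions:
      - level: CAN_MANAGE
        group_name: platform-admins
      - level: CAN_RUN
        group_name: data-engineers
      - level: CAN_VIEW
        group_name: data-analysts
```

Permissions declared under a target are applied to every job and pipeline the bundle deploys there, so new jobs pick up the same ACLs automatically on the next deploy.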
In our case we have several different job cluster configurations, so we decided to declare the compute inline in each job's declaration YAML.
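For example, a sketch of a job declaration with its compute inline (job name, notebook path, Spark version, and node type here are hypothetical):

```yaml
# resources/nightly_etl.yml (sketch) - job-scoped compute declared with the job itself.
resources:
  jobs:
    nightly_etl:
      name: nightly_etl
      job_clusters:
        - job_cluster_key: etl_cluster
          new_cluster:
            spark_version: 15.4.x-scala2.12
            node_type_id: Standard_DS3_v2
            num_workers: 2
      tasks:
        - task_key: run_etl
          job_cluster_key: etl_cluster
          notebook_task:
            notebook_path: ../notebooks/etl
```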
Now, nobody can run or even see anything they shouldn't.
I've attached some resources:
https://learn.microsoft.com/en-us/azure/databricks/dev-tools/bundles/permissions
https://learn.microsoft.com/en-us/azure/databricks/dev-tools/bundles/settings
Tuesday
Hi @shweta_m,
I agree with @juan_maedo.
Just to add on top of that: the answer to your problem is automated deployment pipelines embedded in your project repo using Databricks Asset Bundles, which is a scalable and reliable way to apply permissions dynamically.
We dynamically create instance pools with the right configuration defined in the asset bundle files and grant CAN_MANAGE permissions to a master SP (this SP is owned by our team, since the entire infra depends on it; it is basically the owner of the resource group and part of the service connection).
We also deploy the Databricks jobs using asset bundles (these use those instance pools to configure a job cluster), set the owner and run_as to the same master SP, and additionally grant CAN_MANAGE permissions to the team's AD group, which is synced from Azure Entra ID to Databricks.
As a side note, apply tags to these instance pools; that will help you segregate cost per workload or per domain, however you prefer. The sketch below includes an example of both the permissions and the tags.
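A rough sketch of what this can look like in a bundle resource file, assuming a pre-created pool; the pool ID, SP application ID, group name, notebook path, and tag are all placeholders:

```yaml
# resources/domain_etl.yml (sketch) - job runs as a master SP, the team group gets
# CAN_MANAGE, and the job cluster draws from an instance pool. All IDs/names are examples.
resources:
  jobs:
    domain_etl:
      name: domain_etl
      run_as:
        service_principal_name: 00000000-0000-0000-0000-000000000000  # master SP (placeholder)
      permissions:
        - level: CAN_MANAGE
          group_name: team-data-platform   # AD group synced from Azure Entra ID
      job_clusters:
        - job_cluster_key: pooled_cluster
          new_cluster:
            spark_version: 15.4.x-scala2.12
            instance_pool_id: 1234-567890-pool123   # placeholder pool ID
            num_workers: 4
            custom_tags:
              cost_center: data-platform   # tag for cost segregation per workload/domain
      tasks:
        - task_key: run_domain_etl
          job_cluster_key: pooled_cluster
          notebook_task:
            notebook_path: ../notebooks/domain_etl
```

This keeps the execution identity on the SP while the team group manages the job through its job-level ACLs, so nobody needs permissions on the ephemeral cluster itself.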
Br
Tuesday
Thanks! @juan_maedo @saurabh18cs