Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.

Forum Posts

noorbasha534
by Valued Contributor II
  • 1659 Views
  • 2 replies
  • 0 kudos

Enforcing developers to use something like a single user cluster

Dear all, we have a challenge. Developers create/recreate tables/views in the PRD environment by running notebooks on all-purpose clusters, whereas the same notebooks already exist as jobs. Not sure why the developers feel comfortable using all-purpose...

Latest Reply
noorbasha534
Valued Contributor II
  • 0 kudos

Hi Stefan, exactly, we have the same. The CI/CD process invokes jobs that run as a service principal. So far, so good. But please note that not all situations fall under this ideal case. There will be cases wherein I have to recreate 50 views ou...
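One way to enforce this at the platform level is a cluster policy. As a hedged sketch (the cluster-policy definition language has a virtual `cluster_type` attribute; verify the attribute name against the current policy docs before relying on it), a policy that only permits job clusters might look like:

```python
import json

# Sketch of a Databricks cluster policy that fixes the (virtual)
# cluster_type attribute to "job", so compute created under this policy
# can never be an all-purpose cluster. Attribute names follow the
# cluster-policy definition language; confirm against current docs.
policy_definition = {
    "cluster_type": {
        "type": "fixed",
        "value": "job",
    }
}

policy_json = json.dumps(policy_definition, indent=2)
print(policy_json)
```

If developers are granted only this policy (and the unrestricted cluster-creation entitlement is removed), interactive all-purpose clusters cannot be created outside the CI/CD path.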

1 More Replies
SmileyVille
by New Contributor III
  • 8202 Views
  • 3 replies
  • 0 kudos

Leverage Azure PIM with Databricks with the Contributor role privilege

We are trying to leverage Azure PIM. This works great for most things; however, we've run into a snag. We want to limit the Contributor role to a group and only at the resource group level, not the subscription. We wish to elevate via PIM. This will ...

Latest Reply
SmileyVille
New Contributor III
  • 0 kudos

Never did, so we scrapped PIM with Databricks for now.

2 More Replies
KLin
by Databricks Partner
  • 2601 Views
  • 7 replies
  • 1 kudos

Resolved! Unable to Pinpoint where network traffic originates from in GCP

Hi everyone, I have a question regarding networking. A bit of background first: for security reasons, the current allow-policy from GCP to our on-prem infrastructure is being replaced by a deny-policy for traffic originating from GCP. Therefore access...

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

Hi @KLin, happy to help! The reason traffic originates from the pods subnet for clusters/SQL warehouses without the x-databricks-nextgen-cluster tag (still using GKE), and from the node subnet for clusters with the GCE tag, is due to the underly...

6 More Replies
mnorland
by Valued Contributor II
  • 5194 Views
  • 1 reply
  • 0 kudos

Resolved! Custom VPC Subranges for New GCP Databricks Deployment

What Pods and Services subranges would you recommend for a /22 subnet for a custom VPC for a new GCP Databricks deployment in the GCE era?  

Latest Reply
mnorland
Valued Contributor II
  • 0 kudos

The secondary ranges are there to support legacy GKE clusters. While required in the UI, they can be empty in Terraform (per a source) for new deployments, as clusters are GCE now. (There is a green GCE label next to the cluster name.) When observing the ...

hartenc
by New Contributor II
  • 1552 Views
  • 2 replies
  • 0 kudos

Workflow job runs are disabled

I'm not totally clear on the financial details, but from what I've been told: a few months ago our contract with Databricks expired and changed into a per-month subscription. In those months there was a problem with payments due to bills being sent to a wr...

Latest Reply
hartenc
New Contributor II
  • 0 kudos

We contacted them, but were told that we could only use community support unless we got a premium support subscription (not sure about the exact term; somebody else asked them). Our account ID is ddcb191f-aff5-4ba5-be46-41adf1705e03. If the workspace...

1 More Replies
Georgi
by New Contributor
  • 935 Views
  • 1 reply
  • 0 kudos

How to set a static IP to a cluster

Is there a way to set a static IP for a cluster on a Databricks instance? I'm trying to establish a connection with a service outside AWS, and it seems the only way to allow inbound connections is by adding the IP to a set of rules. Thanks! I couldn't f...

Latest Reply
Takuya-Omi
Valued Contributor III
  • 0 kudos

Hi @Georgi, Databricks clusters on AWS don't have a built-in way to assign a static IP address. Instead, the typical workaround is to route all outbound traffic from your clusters through a NAT Gateway (or similar solution) that has an Elastic IP ass...
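As a hedged illustration of that workaround (assuming boto3 conventions: `ec2` stands in for an EC2 client such as `boto3.client("ec2")`, and the route-table changes that send the cluster subnets' 0.0.0.0/0 traffic through the NAT gateway are elided):

```python
def create_nat_with_static_egress_ip(ec2, public_subnet_id):
    """Allocate an Elastic IP and attach it to a new NAT gateway.

    Every cluster subnet that routes 0.0.0.0/0 through this NAT gateway
    then shares the Elastic IP as its stable outbound address, which is
    the IP the external service should allow-list.
    """
    # Allocate a stable public IP in the VPC scope
    eip = ec2.allocate_address(Domain="vpc")
    # Create the NAT gateway in a public subnet, bound to that IP
    nat = ec2.create_nat_gateway(
        SubnetId=public_subnet_id,
        AllocationId=eip["AllocationId"],
    )
    return {
        "static_ip": eip["PublicIp"],
        "nat_gateway_id": nat["NatGateway"]["NatGatewayId"],
    }
```

Note this gives the clusters a fixed *outbound* address; it does not make the clusters reachable inbound, which matches the allow-list use case described in the question.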

mzs
by Contributor
  • 4937 Views
  • 1 reply
  • 1 kudos

Resolved! Understanding Azure frontend private link endpoints

Hi, I've been reading up on Private Link (https://learn.microsoft.com/en-us/azure/databricks/security/network/classic/private-link) and have some questions: in the standard deployment, do the transit VNet (front-end private endpoint) and Databricks work...

Latest Reply
Zubisid
New Contributor III
  • 1 kudos

Below are the answers to your questions: 1) No, they don't have to be in the same subscription. You can have the transit VNet (with the front-end Private Endpoint) in one subscription and the Databricks workspace in another, as long as you set up the...

mzs
by Contributor
  • 7075 Views
  • 2 replies
  • 2 kudos

Using a proxy server to install packages from PyPI in Azure Databricks

Hi, I'm setting up a workspace in Azure and would like to put some restrictions in place on outbound Internet access to reduce the risk of data exfiltration from notebooks and jobs. I plan to use VNet injection and SCC + back-end Private Link for comp...

Latest Reply
mzs
Contributor
  • 2 kudos

Thanks Isi, this is great info. I'll update once I've tried it.

1 More Replies
meshko
by New Contributor II
  • 1975 Views
  • 4 replies
  • 1 kudos

Help understanding RAM utilization graph

I am trying to understand the following graph Databricks is showing me, and failing: what is that constant lightly shaded area close to 138 GB? It is not explained in the "Usage type" legend. The job is running completely on the driver node, not utilizi...

[attachment: databricks.png]
Latest Reply
koji_kawamura
Databricks Employee
  • 1 kudos

Hi @meshko, the lightly shaded area represents the total available RAM size. The tooltip shows it when you hover the mouse over it.

3 More Replies
Mr_7199
by New Contributor
  • 1951 Views
  • 1 reply
  • 1 kudos

Assigning Dedicated (SINGLE_USER) ML Clusters to a Group in Databricks

I'm working with Databricks Runtime ML and have configured a cluster in Dedicated access mode (formerly SINGLE_USER). The documentation indicates that a compute resource with Dedicated access can be assigned to a group, allowing user permissions to a...

[attachment: Mr_7199_0-1742418753327.png]
Latest Reply
Isi
Honored Contributor III
  • 1 kudos

Hey @Mr_7199, yes, I've successfully configured a dedicated ML cluster assigned to a group. Here are three things to check: 1. Cluster policy – ensure the cluster policy does not impose restrictions; using an unrestricted policy simplifies testing. 2. Perm...
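For reference, a hedged sketch of the relevant cluster fields (assuming the Clusters API shape: `data_security_mode` and `single_user_name` are the fields behind the "Dedicated" UI label, and `single_user_name` accepts a group name as well as a user name; the group name below is illustrative):

```python
def dedicated_group_cluster_fields(group_name):
    """Cluster-spec fragment assigning dedicated (formerly single-user)
    access mode to a group rather than an individual user."""
    return {
        # "SINGLE_USER" is the API value behind the Dedicated access mode
        "data_security_mode": "SINGLE_USER",
        # Despite the field name, a group name is accepted here
        "single_user_name": group_name,
    }
```

Merging this fragment into the cluster create/edit request (alongside the usual node type, runtime, etc.) is what the UI does when you pick a group in the Dedicated access dropdown.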

noorbasha534
by Valued Contributor II
  • 2406 Views
  • 1 reply
  • 1 kudos

Disable usage of serverless jobs & serverless all-purpose clusters usage

Dear all, I see some developers have started using serverless jobs and serverless all-purpose clusters. As a platform admin, I'd like to disable them, as we are not yet prepared as a team to move to serverless; we get huge discounts on compute from Microsoft ...

Latest Reply
ashraf1395
Honored Contributor
  • 1 kudos

You can disable the serverless compute feature from your account console: https://docs.databricks.com/aws/en/admin/workspace-settings/serverless#enable-serverless-compute I have heard that for some, if this option is not available, it means it is au...

bhanu_dp
by New Contributor III
  • 2790 Views
  • 2 replies
  • 0 kudos

How to restore if a catalog is deleted

I am looking to identify potential pitfalls in the decentralized workspace framework where the key business owners have full access to their respective workspaces and catalogs. In case of an accidental delete/drop of a schema or catalog from UC, what are th...

Labels: Administration & Architecture, catalog, DR, Recovery
Latest Reply
KaranamS
Contributor III
  • 0 kudos

Hi @bhanu_dp, to recover accidental deletes, you can: 1. Restore to a previous version using the time travel feature: https://docs.databricks.com/gcp/en/delta/history#restore-a-delta-table-to-an-earlier-state 2. Use the UNDROP command: https://docs.databrick...
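As a sketch of what those two recovery paths look like (statements would be run via `spark.sql(...)` or the SQL editor; the table name and version below are illustrative, and UNDROP applies to managed tables within Unity Catalog's retention window — recovering a whole dropped schema or catalog may need further steps or support):

```python
def restore_table_sql(table, version):
    # Delta time travel: roll a table back to an earlier version
    return f"RESTORE TABLE {table} TO VERSION AS OF {version}"

def undrop_table_sql(table):
    # Unity Catalog: recover a dropped managed table within the retention window
    return f"UNDROP TABLE {table}"

print(restore_table_sql("main.sales.orders", 3))
print(undrop_table_sql("main.sales.orders"))
```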

1 More Replies
pranav_
by New Contributor
  • 5557 Views
  • 1 reply
  • 1 kudos

How to Query All the users who have access to a databricks workspace?

Hi there, I'm new to Databricks, and we currently have a lot of users among different groups with access to a Databricks workspace. I would like to know how I could query the users, groups, and entitlements of each group using SQL or the API. In case ...

Latest Reply
tejaskelkar
Databricks Employee
  • 1 kudos

To query all users who have access to a Databricks workspace, you can follow these steps: 1. Check workspace users via the Admin Console: if you are a workspace admin, navigate to the Admin Console in the Databricks UI. Under the "Users" tab, you can vie...
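For the API route, a hedged sketch (assuming the workspace-level SCIM 2.0 `Users` endpoint and a personal access token with admin rights; pagination via `startIndex`/`count` is elided for brevity):

```python
import json
import urllib.request

def scim_users_url(host):
    # Workspace-level SCIM 2.0 endpoint that lists workspace users
    return f"{host.rstrip('/')}/api/2.0/preview/scim/v2/Users"

def list_workspace_users(host, token):
    """Return the userName of every user in the workspace (admin token required)."""
    req = urllib.request.Request(
        scim_users_url(host),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return [u.get("userName") for u in body.get("Resources", [])]
```

The same SCIM tree exposes `Groups`, whose members and entitlements can be walked the same way to build the full user/group/entitlement picture.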

SANJAYKJ
by New Contributor II
  • 1321 Views
  • 1 reply
  • 0 kudos

Spark Executor - Parallelism Question

While reading the book Spark: The Definitive Guide, I came across the statement below in Chapter 2 on partitions: "If you have many partitions but only one executor, Spark will still have a parallelism of only one because there is only one computation res...

Latest Reply
Isi
Honored Contributor III
  • 0 kudos

Hey @SANJAYKJ, it is correct in the sense that a single executor is a limiting factor, but the actual parallelism within that executor depends on the number of cores assigned to it. If you want to leverage multiple partitions effectively, you either n...
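The arithmetic behind this is simple enough to sketch: effective parallelism is capped by both the partition count and the total task slots (executors × cores per executor):

```python
def task_slots(num_executors, cores_per_executor):
    # Spark runs at most one task per core slot at a time
    return num_executors * cores_per_executor

def effective_parallelism(num_partitions, num_executors, cores_per_executor):
    # Parallelism is bounded by whichever is smaller: partitions or slots
    return min(num_partitions, task_slots(num_executors, cores_per_executor))

# One executor with one core: parallelism 1, no matter how many partitions.
print(effective_parallelism(100, 1, 1))
# One executor with 8 cores can still run 8 tasks concurrently.
print(effective_parallelism(100, 1, 8))
```

So the book's claim holds for a one-core executor; with more cores per executor, a single executor still runs multiple tasks in parallel.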

mrstevegross
by Contributor III
  • 2715 Views
  • 4 replies
  • 0 kudos

Resolved! Possible to programmatically adjust Databricks instance pool more intelligently?

We'd like to adopt a Databricks instance pool in order to reduce instance-acquisition times (a significant contributor to our test latency). Based on my understanding of the docs, the main levers we can control are min instance count, max instance cou...

Latest Reply
Isi
Honored Contributor III
  • 0 kudos

Hi Steve, if the goal is to pre-warm 100 instances in the Databricks instance pool, you could create a temporary job that requests instances from the pool. This ensures that Databricks provisions the required instances before the actual test run. T...
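A hedged sketch of that temporary job (assuming the `POST /api/2.1/jobs/runs/submit` payload shape; the notebook path and Spark version are placeholders — any no-op notebook works, since the point is only to force the pool to provision instances):

```python
def prewarm_run_payload(pool_id, num_workers):
    """One-time run that pulls num_workers workers (plus a driver) from the pool."""
    return {
        "run_name": "prewarm-instance-pool",
        "tasks": [
            {
                "task_key": "warmup",
                # Placeholder: any trivial notebook that exits immediately
                "notebook_task": {"notebook_path": "/Shared/noop"},
                "new_cluster": {
                    "spark_version": "15.4.x-scala2.12",  # placeholder runtime
                    "instance_pool_id": pool_id,
                    "num_workers": num_workers,
                },
            }
        ],
    }

# 99 workers + 1 driver drawn from the pool ~= 100 warm instances
payload = prewarm_run_payload("pool-0123-abcdef", 99)
```

After the run finishes, the instances return to the pool as idle capacity (subject to the pool's idle-instance auto-termination setting), which is what makes subsequent test clusters start quickly.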

3 More Replies