Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.

Forum Posts

sharat_n
by New Contributor
  • 1688 Views
  • 1 replies
  • 1 kudos

Delta lake : delete data from storage manually instead of vacuum

Hi All, we have a unique use case where we are unable to run VACUUM to clean up the storage used by our Delta Lake tables. Since our data is partitioned by date, we plan to delete files older than a certain date directly from storage. Could this lead to any...

Latest Reply
Walter_C
Databricks Employee
  • 1 kudos

Deleting files older than a certain date directly from storage without using the VACUUM command can lead to potential issues with your Delta Lake tables. Here are the key points to consider: Corruption Risk: Directly deleting files from storage can ...

  • 1 kudos
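
To illustrate why deleting by date alone is risky, here is a minimal sketch that lists the data files a Delta table's transaction log still references, so a candidate delete list can be checked against it first. It assumes the `_delta_log` directory still holds every JSON commit from version 0 (no checkpoint-only history) and that the log is readable as local files; `live_data_files` is a hypothetical helper name, not part of any Delta Lake API.

```python
import json
from pathlib import Path


def live_data_files(table_dir):
    """Return the relative paths of data files the Delta log still references.

    Assumption: every JSON commit since version 0 is present, so replaying
    the add/remove actions in order reconstructs the current file set.
    """
    log_dir = Path(table_dir) / "_delta_log"
    live = set()
    # Commit file names are zero-padded version numbers, so a lexical sort
    # is also version order. Non-numeric names are skipped.
    for commit in sorted(log_dir.glob("*.json")):
        if not commit.stem.isdigit():
            continue
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                live.add(action["add"]["path"])
            elif "remove" in action:
                live.discard(action["remove"]["path"])
    return live
```

Any file in the returned set is part of the current table version, so removing it from storage would break reads. Files outside the set may still be needed for time travel until the retention period has passed.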
Erik
by Valued Contributor III
  • 1153 Views
  • 1 replies
  • 1 kudos

Resolved! Where is the Open Apache Hive Metastore API?

In 2023 it was announced that Databricks has made a "Hive Metastore (HMS) interface for Databricks Unity Catalog, which allows any software compatible with Apache Hive to connect to Unity Catalog". Is this discontinued? If not, is there any documentati...

Latest Reply
Walter_C
Databricks Employee
  • 1 kudos

It seems this option has been deprecated. It was a private preview, but it is no longer available for enrollment.

  • 1 kudos
ambigus9
by Contributor
  • 4486 Views
  • 10 replies
  • 0 kudos

Resolved! Failed to add 3 workers to the compute. Will attempt retry: true. Reason: Driver unresponsive

I am currently trying to create a compute cluster on a workspace with PrivateLink and a custom VPC. I'm using Terraform: https://registry.terraform.io/providers/databricks/databricks/latest/docs/guides/aws-private-link-workspace After the deployment is com...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @ambigus9, based on the connectivity test, the connection to the RDS is not working. Can you check whether a firewall is blocking the request, since the connection is not getting through to the RDS?

  • 0 kudos
9 More Replies
xecel
by New Contributor II
  • 1401 Views
  • 1 replies
  • 0 kudos

Resolved! How to Retrieve Admin and Non-Admin Permissions at Workspace Level in Azure Databricks.

Hello, I am working on a project to document permissions for both admins and non-admin users across all relevant objects at the workspace level in Azure Databricks (e.g., tables, jobs, clusters, etc.). I understand that admin-level permissions might be...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

In Databricks, permissions are defined on the object itself, not on the user. Unfortunately, there is currently no way to get the permissions of all objects in a single built-in query. There are custom options; for example, for clusters, first run...

  • 0 kudos
kunalmishra9
by Contributor
  • 2479 Views
  • 2 replies
  • 1 kudos

Resolved! Databricks Connect: Enabling Arrow on Serverless Compute

I recently upgraded my Databricks Connect version to 15.4 and got set up for Serverless, but ran into the following error when I ran the standard code to enable Arrow on PySpark: >>> spark.conf.set(key='spark.sql.execution.arrow.pyspark.enabled', val...

Latest Reply
kunalmishra9
Contributor
  • 1 kudos

Gotcha, thanks! Missed it in the limitations.

  • 1 kudos
1 More Replies
ashraf1395
by Honored Contributor
  • 895 Views
  • 3 replies
  • 0 kudos

Can we change our cloud service connected with our Databricks account

We are moving from our old AWS account to an Azure account. Is there any way I can migrate my old Databricks account to the new Azure account? My Databricks partner workspace access is tied to this Databricks account. That's the reason I want to keep thi...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Unfortunately, I am not able to find a way to move the workspace. If you have an account representative at Databricks, I suggest reaching out to see what options you have, including migrating these credits if possible.

  • 0 kudos
2 More Replies
LMe
by New Contributor
  • 1860 Views
  • 4 replies
  • 0 kudos

Get a static IP for my Databricks App

Hello, I'm trying to find out how to set up a static IP for an Azure Databricks App. I tried setting up a NAT gateway to get a static IP for the workspace, but it doesn't change anything: I still can't access my OpenAI resource even if I authorize the NaT...

Latest Reply
TechGuy329
New Contributor II
  • 0 kudos

Hi, I’m following up here as I have the same issue. Did the solution provided in the replies help resolve this for you?

  • 0 kudos
3 More Replies
drumcircle
by New Contributor II
  • 997 Views
  • 1 replies
  • 1 kudos

Determining spill from system tables

I'm trying to optimize machine selection (D, E, or L types on Azure) for job clusters and all-purpose compute and am struggling to identify where performance is sagging on account of disk spill.  Disk spill would suggest that more memory is needed.  ...

Latest Reply
Walter_C
Databricks Employee
  • 1 kudos

For historical diagnostics, you might need to consider setting up a custom logging mechanism that captures these metrics over time and stores them in a persistent storage solution, such as a database or a logging service. This way, you can query and ...

  • 1 kudos
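
As a rough sketch of what such a custom logging step could look at, the function below filters stage records for spill. It assumes each record is a dict shaped like the stage JSON returned by Spark's monitoring REST API (fields `stageId`, `name`, `diskBytesSpilled`, `memoryBytesSpilled`); how the records are fetched and where they are persisted is left out, and `stages_with_spill` is a hypothetical helper name.

```python
def stages_with_spill(stages):
    """Return the stages that spilled, largest disk spill first.

    Assumption: `stages` is a list of dicts carrying the spill counters
    under the field names used by Spark's monitoring REST API.
    """
    rows = []
    for stage in stages:
        disk = stage.get("diskBytesSpilled", 0)
        mem = stage.get("memoryBytesSpilled", 0)
        if disk > 0 or mem > 0:
            rows.append(
                {
                    "stageId": stage.get("stageId"),
                    "name": stage.get("name"),
                    "diskBytesSpilled": disk,
                    "memoryBytesSpilled": mem,
                }
            )
    # Sort so the stages most likely to benefit from more memory come first.
    return sorted(rows, key=lambda r: r["diskBytesSpilled"], reverse=True)
```

Appending the returned rows, together with a cluster ID and timestamp, to a persistent table after each run would give a history that can be compared across machine types.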
rtreves
by Contributor
  • 3817 Views
  • 15 replies
  • 0 kudos

Resolved! Permissions error on cluster requirements.txt installation

Hi Databricks Community, I'm looking to resolve the following error: Library installation attempted on the driver node of cluster {My cluster ID} and failed. Please refer to the following error message to fix the library or contact Databricks support. ...

Latest Reply
rtreves
Contributor
  • 0 kudos

Noting here for other users: I was able to resolve the issue on a shared cluster by cloning the cluster and using the clone.

  • 0 kudos
14 More Replies
ambigus9
by Contributor
  • 2977 Views
  • 8 replies
  • 3 kudos

PrivateLink Validation Error - When trying to access to Workspace

We have a workspace that was deployed on AWS customer architecture using Terraform PrivateLink: https://registry.terraform.io/providers/databricks/databricks/latest/docs/guides/aws-private-link-workspace The fact is that when we disable the Public Acc...

Latest Reply
Walter_C
Databricks Employee
  • 3 kudos

Can you share your workspace id so I can do a validation?  

  • 3 kudos
7 More Replies
jjsnlee
by New Contributor II
  • 764 Views
  • 2 replies
  • 0 kudos

Can't create cluster in AWS with p3 instance type

Hi, I'm trying to create a `p3.2xlarge` in my workspace, but the cluster fails to instantiate, specifically getting this error message: `No zone supports both the driver instance type [p3.2xlarge] and the worker instance type [p3.2xlarge]` (though I ...

Latest Reply
jjsnlee
New Contributor II
  • 0 kudos

Yes sorry for the double post (I couldn't figure out how to delete this one)

  • 0 kudos
1 More Replies
John_OC
by New Contributor
  • 472 Views
  • 1 replies
  • 0 kudos

Querying on multi-node cluster on AWS does not complete

Querying in isolation mode is completely fine, but when trying to run the same query on the multi-node cluster it does not complete or error out. Any assistance to troubleshoot this issue? Oh, and Happy New Year if you're reading this.

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Hello John, Happy New Year to you. Can you please confirm what error message you received? When you say isolation mode, do you mean single node, or are you referring to a single-user cluster while the other is in shared mode?

  • 0 kudos
staskh
by New Contributor III
  • 2169 Views
  • 5 replies
  • 2 kudos

Resolved! S3 access credentials: Pandas vs Spark

Hi, I need to read Parquet files located in S3 into a pandas DataFrame. I configured an "external location" to access my S3 bucket and have `df = spark.read.parquet(s3_parquet_file_path)` working perfectly well. However, df = pd.read_parquet(s3_parquet_file_...

Latest Reply
Walter_C
Databricks Employee
  • 2 kudos

Yes, you understand correctly. The Spark library in Databricks uses the Unity Catalog credential model, which includes the use of "external locations" for managing data access. This model ensures that access control and permissions are centrally mana...

  • 2 kudos
4 More Replies
nanda_
by New Contributor
  • 1891 Views
  • 2 replies
  • 1 kudos

Assistance Required: Integrating Databricks ODBC Connector with Azure App Service

Hi, I have successfully established an ODBC connection with Databricks to retrieve data from the Unity Catalog in a local C# application using the Simba Spark ODBC Driver, and it is working as expected. I now need to integrate this functionality into a...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @nanda_, basically what you need to do is install the Simba ODBC driver in your Azure App Service environment. Your code should then work the same way as on your local machine. One possibility is to use Windows or Linux Containers on Azure App...

  • 1 kudos
1 More Replies
soumiknow
by Contributor II
  • 4318 Views
  • 1 replies
  • 0 kudos

Resolved! How to add 'additionallyAllowedTenants' in Databricks config or PySpark config?

I have a multi-tenant Azure app. I am using this app's credentials to read ADLS container files from Databricks cluster using PySpark dataframe.I need to set this 'additionallyAllowedTenants' flag value to '*' or a specific tenant_id of the multi-ten...

Latest Reply
soumiknow
Contributor II
  • 0 kudos

Update: Currently Spark does not have this feature.

  • 0 kudos