Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.


Forum Posts

TSK
by New Contributor
  • 313 Views
  • 0 replies
  • 0 kudos

GitLab on DCS (Databricks Container Services)

I would like to set up GitLab and Grafana servers using Databricks Container Services (DCS). The reason is that our development team is small, and the management costs of using EKS are not justifiable. We want to make GitLab and Grafana accessible in...

Administration & Architecture
AWS
Container
DevOps
EKS
Kubernetes
noorbasha534
by Contributor
  • 555 Views
  • 4 replies
  • 0 kudos

Libraries installation governance

Dear all, I would like to know the best practices around library installation on Databricks compute (all-purpose and job). The need is to screen the libraries, conduct vulnerability tests, and then let them be installed through a centralized CI/CD process. How...

Latest Reply
noorbasha534
Contributor
  • 0 kudos

@filipniziol thanks again for your time. The thing is, we would like to block access to these URLs, as at times we have found developers and data scientists downloading packages that were marked as vulnerable by Maven.

3 More Replies
VJ5
by New Contributor
  • 347 Views
  • 2 replies
  • 0 kudos

Azure Databricks Serverless Compute

Hello, I am looking for documentation on Azure Databricks Serverless Compute. What do we need to consider from a security point of view when we decide to use serverless compute?

Latest Reply
David-jono123
New Contributor II
  • 0 kudos

These steps are really helpful. I especially appreciate the reminder to check my credentials and consider browser-related issues, as those are often overlooked. I'll make sure to clear my cache and cookies first, and if that doesn't work, I’ll try us...

1 More Replies
matthiasjg
by New Contributor II
  • 270 Views
  • 1 reply
  • 0 kudos

How to NOT install or disable or uninstall Databricks Delta Live Tables dlt module on jobs cluster?

I need to avoid having the Databricks Delta Live Tables (DLT) Python stub installed on a job cluster because of a naming conflict with the pip library dlt (and I also don't need Delta Live Tables). There is no "simple" way of uninstalling it. It's not installed via pip as...

Latest Reply
matthiasjg
New Contributor II
  • 0 kudos

For anyone facing a similar problem: I've addressed the dlt module conflict on my job cluster by using an init script to remove the dlt module from the cluster's Python environment. Simply by doing: %bash #!/bin/bash rm -rf /databricks/sp...
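
To confirm which dlt a cluster's Python environment actually resolves, before and after an init script like the one above runs, a small diagnostic along these lines may help. This is only a sketch: the helper name module_origin is made up for illustration, it uses nothing beyond the standard-library importlib, and it only reports a location without removing anything.

```python
import importlib.util
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file a top-level module would be loaded from.

    Returns None when the module cannot be found (or has no file origin,
    as with namespace packages). The module itself is not imported.
    """
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # ValueError can occur for an already-imported module without a spec.
        return None
    if spec is None:
        return None
    return spec.origin
```

Calling module_origin("dlt") in a notebook cell or job task shows whether the name points at the pip-installed package in site-packages or at some other location on the cluster.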

Awoke101
by New Contributor III
  • 219 Views
  • 0 replies
  • 0 kudos

Ray cannot detect GPU on the cluster

I am trying to run Ray on Databricks for chunking and embedding tasks. The cluster I’m using is: g4dn.xlarge, 1-4 workers with 4-16 cores, 1 GPU and 16 GB memory. I have set spark.task.resource.gpu.amount to 0.5 currently. This is how I have set up my Ray clus...

JessieWen
by Databricks Employee
  • 233 Views
  • 1 reply
  • 0 kudos

legacy repo error fetching git status files over 200MB

Working directory contains files that exceed the allowed limit of 200 MB. How can I solve this?

Latest Reply
filipniziol
Contributor III
  • 0 kudos

Hi @JessieWen, besides removing some files from the repo, you can use "Sparse checkout mode" and select only certain paths to be synchronized with Databricks Repos. Hope it helps.

harripy
by New Contributor III
  • 4005 Views
  • 8 replies
  • 0 kudos

Databricks SQL connectivity in Python with Service Principals

Tried to use M2M OAuth connectivity on Databricks SQL Warehouse in Python:
from databricks.sdk.core import Config, oauth_service_principal
from databricks import sql
....
config = Config(host=f"https://{host}", client_...

Latest Reply
Mat_Conquest
New Contributor II
  • 0 kudos

Did anyone get this to work? I have tried the code above, but I get a slightly different error and I don't see the same level of detail in the logs:
2024-10-04 14:59:25,508 [databricks.sdk][DEBUG] Attempting to configure auth: pat
2024-10-04 14:59:25,...

7 More Replies
OU_Professor
by New Contributor II
  • 11913 Views
  • 1 reply
  • 0 kudos

Connect Community Edition to Power BI Desktop

I have submitted this question several times to Databricks over the past few weeks, and I have gotten no response at all, not even an acknowledgement that my request was received. Please help. How can I connect a certain dataset in Databricks Community...

Latest Reply
Knguyen
New Contributor II
  • 0 kudos

Hi @Retired_mod, it seems the Community Edition no longer lets us generate a personal access token. Could you let us know where we can get the token in the Community Edition? Thanks.

echiro
by New Contributor II
  • 450 Views
  • 1 reply
  • 0 kudos

cluster administrator

Is an individual cluster or a shared group cluster more cost-effective?

Latest Reply
rangu
New Contributor III
  • 0 kudos

This is very generic; it depends on the use case. If you have a number of users reading data from catalogs and performing data analysis or analytics, creating a common cluster will be more cost-effective and provide better performance. Also, largel...

JKR
by Contributor
  • 496 Views
  • 1 reply
  • 0 kudos

How to assign user group for email notification in databricks Alerts

How can I assign an Azure Databricks user group to an alert for notification? Currently, whenever we need to add a user for alert email notification, we manually add that user's email address to each alert we have set up (more than 100), which is very ...

Latest Reply
rangu
New Contributor III
  • 0 kudos

One option is to handle the logic inside a Python notebook and trigger alerts using the email and smtp libraries, addressing them to Databricks local groups and AD groups that are synced.
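
A rough sketch of that notebook-side approach, using only Python's standard email and smtplib modules, might look like the following. Everything specific here is an assumption for illustration: the GROUP_MEMBERS mapping, the example addresses, the function names, and the SMTP host, port and STARTTLS choice. How group membership is actually looked up is not described in the thread, so it is stubbed as a plain dictionary.

```python
import smtplib
from email.message import EmailMessage
from typing import Dict, Iterable, List

# Hypothetical group-to-address mapping. In practice this would be filled
# from wherever group membership is maintained (not shown in the thread).
GROUP_MEMBERS: Dict[str, List[str]] = {
    "data-alerts": ["alice@example.com", "bob@example.com"],
}


def build_alert_email(sender: str, groups: Iterable[str],
                      subject: str, body: str) -> EmailMessage:
    """Build one message addressed to every member of the given groups."""
    recipients: List[str] = []
    for group in groups:
        for address in GROUP_MEMBERS.get(group, []):
            if address not in recipients:  # keep order, drop duplicates
                recipients.append(address)
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_alert(msg: EmailMessage, host: str, port: int = 587) -> None:
    """Send through an SMTP relay. Host, port and STARTTLS are assumptions."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.send_message(msg)
```

With this shape, adding a user means updating the group once rather than editing each of the 100+ alerts, which is the pain point raised in the question.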

cgrass
by New Contributor III
  • 271 Views
  • 0 replies
  • 0 kudos

Creating Group in Terraform using external_id

The documentation here doesn't give much information about how to use `external_id` when creating a new group. If I reference the object_id for an Azure AD Group, the databricks group gets created but the members from the AD group are not added, nor ...

cgrass
by New Contributor III
  • 579 Views
  • 1 reply
  • 0 kudos

Resolved! Resource organization in a large company

Hello, we are using Azure Databricks in a single tenant. We will have many teams working in multiple (Unity Enabled) Workspaces using a variety of Catalogs, External Locations, Storage Credentials, etc. Some of those resources will be shared (e.g., an...

Administration & Architecture
Architecture
azure
catalogs
design
Latest Reply
cgrass
New Contributor III
  • 0 kudos

We are using Azure Databricks in a single tenant. We will have many teams working in multiple (Unity Enabled) Workspaces using a variety of Catalogs, External Locations, Storage Credentials, etc. Some of those resources will be shared (e.g., an Exter...

SunilSamal
by New Contributor II
  • 590 Views
  • 3 replies
  • 0 kudos

HTTPSConnectionPool(host='sandvik.peakon.com', port=443): Max retries exceeded with url: /api/v1/seg

While connecting to an API from a Databricks notebook with a bearer token, I am getting the error below: HTTPSConnectionPool(host='sandvik.peakon.com', port=443): Max retries exceeded with url: /api/v1/segments?page=1 (Caused by SSLError(SSLCertVerifica...

Latest Reply
saikumar246
Databricks Employee
  • 0 kudos

Hi @SunilSamal, the error you are encountering, SSLCertVerificationError, indicates that the SSL certificate verification failed because the local issuer certificate could not be obtained. This is a common issue when the SSL certificate chain is inco...
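
One common way to deal with a missing issuer certificate, without turning verification off, is to point the client at a CA bundle that contains it. The sketch below shows that idea with Python's standard ssl module. The function name build_ssl_context and any bundle path passed to it are hypothetical, and the rest of the reply is truncated, so this is not necessarily the fix it goes on to recommend.

```python
import ssl
from typing import Optional


def build_ssl_context(ca_bundle: Optional[str] = None) -> ssl.SSLContext:
    """Create a verifying TLS context.

    If ca_bundle (path to a PEM file) is given, its certificates are
    trusted in addition to the system defaults.
    """
    context = ssl.create_default_context()
    if ca_bundle is not None:
        context.load_verify_locations(cafile=ca_bundle)
    return context
```

The resulting context can be passed to urllib.request.urlopen(url, context=...). When using the requests library, as the traceback suggests, the equivalent is passing the bundle path through the verify parameter of the request call.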

2 More Replies
Seb_G
by New Contributor
  • 494 Views
  • 0 replies
  • 0 kudos

Unity Catalog Volume mounting broken by cluster environment variables (http proxy)

Hello all, I have a slightly niche issue here, albeit one that others are likely to run into. Using Databricks on Azure, my organisation has extended our WAN into the cloud, so that all compute clusters are granted a private IP address that ca...

