Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.

Forum Posts

Carsten03
by New Contributor III
  • 24638 Views
  • 9 replies
  • 6 kudos

Resolved! Run workflow using git integration with service principal

Hi, I want to run a dbt workflow task and would like to use the git integration for that. Using my personal user I am able to do so, but I am running my workflows using a service principal. I added git credentials and the repository using terraform. I a...

Latest Reply
Ramana
Valued Contributor
  • 6 kudos

Databricks has updated its documentation on authorizing a service principal to access Git folders. Databricks now has 3 different options to run jobs that point to Git code. 1. User PAT - Configure Git credentials & connect a remote repo to Databr...

  • 6 kudos
8 More Replies
alcole
by New Contributor II
  • 81 Views
  • 2 replies
  • 4 kudos

Databricks Release Hub

I launched a new app this week to help keep track of Databricks releases. You can view and filter the latest releases in the timeline view, or go to the resources page, pick a product area, and see the latest releases alongside useful links for blo...

Latest Reply
Raman_Unifeye
Contributor
  • 4 kudos

@alcole - thanks for sharing it. I already bookmarked it last week when I saw it on social.

  • 4 kudos
1 More Replies
Marco37
by Contributor II
  • 2567 Views
  • 13 replies
  • 6 kudos

Resolved! Install python packages from Azure DevOps feed with service principal authentication

At the moment I install Python packages from our Azure DevOps feed with a PAT as the authentication mechanism. This works well, but I want to use a service principal instead of the PAT. I have created an Azure service principal and assigned it...

Latest Reply
FilipD
Visitor
  • 6 kudos

I'm kinda late to the party, but what is the suggested way of retrieving the access token right now? Using some bash or Python code stored in a global init script or cluster-scoped init scripts? I don't want to store this code in the notebook. The idea is to block...

  • 6 kudos
12 More Replies
APJESK
by New Contributor III
  • 43 Views
  • 1 reply
  • 0 kudos

Can anyone share Databricks security model documentation or best-practice references

Can anyone share Databricks security model documentation or best-practice references

Latest Reply
Coffee77
Contributor III
  • 0 kudos

Here is the official Databricks documentation: https://docs.databricks.com/aws/en/security/ Do you need to dive deeper into any specific area?

  • 0 kudos
chandru44
by New Contributor
  • 59 Views
  • 1 reply
  • 0 kudos

Moving Databricks Metastore Storage Account Between Azure Subscriptions

I have two Azure subscriptions: one for Prod and another for Non-Prod. During the initial setup of the Non-Production Databricks Workspace, I configured the metastore storage account in the Non-Prod subscription. However, I now want to move this meta...

Latest Reply
Coffee77
Contributor III
  • 0 kudos

Assuming the metastore is the same for your DEV and PROD environments and what you want is just to use the same storage account + container to place managed tables, volumes, etc., in theory you just need to copy all content from your source storage ac...

  • 0 kudos
margarita_shir
by New Contributor
  • 74 Views
  • 1 reply
  • 0 kudos

AWS Databricks with frontend PrivateLink

In the AWS Databricks documentation, frontend PrivateLink assumes a separate transit VPC connected via Direct Connect/VPN. However, I'm implementing a different architecture using Tailscale for private network access. My setup: Tailscale subnet router de...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

Hello @margarita_shir, short answer: yes. If your clients can privately reach the existing Databricks “Workspace (including REST API)” interface endpoint, you can reuse that same VPC endpoint for front-end (user) access. You must not try to use the se...

  • 0 kudos
pdiamond
by Contributor
  • 66 Views
  • 1 reply
  • 1 kudos

Lakebase query history / details

Is there somewhere in Databricks where I can see details about queries run against one of my Lakebase databases (similar to the query history system tables)? What I'm ultimately trying to figure out is where the time is being spent between when I issue the ...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @pdiamond, there's a feature currently in beta that lets you monitor active queries: https://docs.databricks.com/aws/en/oltp/projects/active-queries Also in beta is a Lakebase SQL editor that will allow you to analyze queries: https://docs.databr...

  • 1 kudos
RDE305
by New Contributor II
  • 166 Views
  • 1 reply
  • 0 kudos

A single DLT for Ingest - feedback on this architecture

What are your thoughts on this Databricks pipeline design? Different facilities will send me backups of a proprietary transactional database containing tens of thousands of tables. Each facility may have differences in how these tables are populated o...

Latest Reply
nayan_wylde
Esteemed Contributor
  • 0 kudos

Your design shows strong alignment with the Medallion Architecture principles and addresses schema variability well, but there are some scalability and governance considerations worth discussing. Also, pre-Bronze, building a schema registry early is e...

  • 0 kudos
FabianGutierrez
by Contributor
  • 1330 Views
  • 3 replies
  • 0 kudos

Looking for experiences with DABS CLI Deployment, Terraform and Security

Hi Community, I hope my topic finds you well. Within our Databricks landscape we decided to use DABS (Databricks Asset Bundles); however, we found out (the hard way) that it uses Terraform for deployment purposes. This is a concern now for Security and ...

Latest Reply
Coffee77
Contributor III
  • 0 kudos

Always try to use service principals to deploy your asset bundles. If desired, take a look here: https://www.youtube.com/watch?v=5WreXn0zbt8 Concerning Terraform state, it is indeed generated; take a look at this picture extracted from one of my deplo...

  • 0 kudos
2 More Replies
maikel
by New Contributor III
  • 187 Views
  • 3 replies
  • 1 kudos

Agent outside Databricks communicating with a Databricks MCP server

Hello Community! I have the following use case in my project:
User -> AI agent -> MCP Server -> Databricks data from Unity Catalog.
- The AI agent is not created in Databricks
- The MCP server is created in Databricks and should expose tools to get data fr...

Latest Reply
mark_ott
Databricks Employee
  • 1 kudos

Hopefully this helps... You can securely connect your external AI agent to a Model Context Protocol (MCP) server and Unity Catalog while maintaining strong control over authentication and resource management. The method depends on whether MCP is outs...

  • 1 kudos
2 More Replies
eshwari
by New Contributor III
  • 140 Views
  • 1 reply
  • 1 kudos

Restricting Catalog and External Location Visibility Across Databricks Workspaces

Restricting Catalog and External Location Visibility Across Databricks Workspaces: I am facing the exact same issue, but I don't want to create a separate metastore, and I have added the environment name as a prefix to all external locations. All the locatio...

Latest Reply
mark_ott
Databricks Employee
  • 1 kudos

You can hide or scope external locations and catalogs so they are only visible within their respective Databricks workspaces—even when using a shared metastore—by using "workspace binding" (also called isolation mode or workspace-catalog/workspace-ex...

  • 1 kudos
erigaud
by Honored Contributor
  • 2590 Views
  • 5 replies
  • 4 kudos

Resolved! DLT-Asset bundle : Pipelines do not support a setting a run_as user that is different from the owner

Hello! We're using Databricks Asset Bundles to deploy to several environments using a DevOps pipeline. The service principal running the CI/CD pipeline and creating the job (owner) is not the same as the SPN that will be running the jobs (run_as). This...

Latest Reply
Coffee77
Contributor III
  • 4 kudos

Maybe I'm not catching this or am missing something else, but I've got the following job in one of my demo workspaces: the creator is my user and the job runs as a service principal account. Those are different identities. I got this by deploying the job with...

  • 4 kudos
4 More Replies
LEE_SUKJUN
by New Contributor II
  • 101 Views
  • 1 reply
  • 0 kudos

Inquiry about the location where a metastore resource is created

I created Databricks on AWS today. Of course, we're planning to switch to paid. By the way, the metastore ended up in a US region after it was created. I'm in Korea (APJ); does the metastore only run in the US? Does the metastore have no impact if I query or...

Latest Reply
Coffee77
Contributor III
  • 0 kudos

Hi @LEE_SUKJUN, I think the general principle should be to keep all components (metastore, workspace, and cloud storage) in the same region in order to avoid cross-region latency, data egress costs, and compliance issues. Concerning the number of metasto...

  • 0 kudos
DazMunro
by New Contributor
  • 122 Views
  • 2 replies
  • 0 kudos

Using or integrating Silver or Gold Zone data in an Operational API

I am looking to understand what sort of approach we can take to use Silver or Gold zone data in an Operational style API, or even if we should. We have data that makes its way to the Silver and Gold zones in our Medallion Architecture and it kind of...

Latest Reply
Rjdudley
Honored Contributor
  • 0 kudos

Databricks is an analytics system and isn't optimized to perform as an OLTP system. Additionally, Databricks compute can scale to zero if you set it to do so. This means if you want to use gold/silver data in a real-time way you need to keep a clus...

  • 0 kudos
1 More Replies