Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.
Hi all, I have a couple of use cases that may benefit from using graphs. I'm interested in whether anyone has graph databases in Production and, if so, whether you're using GraphFrames, Neo4j or something else? What is the architecture you have the...
Up to now the way to go is GraphX or GraphFrames. There is also the possibility to use Python libraries or others (single node, that is), perhaps even Arrow-based. Another option is to load the data into a graph database and then move back to Databricks a...
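As a minimal sketch of the single-node option mentioned above (no GraphFrames or Neo4j required), a connected-components pass over an edge list can be done with the standard library alone; the edge data here is made up for illustration:

```python
from collections import defaultdict, deque

# Toy edge list standing in for whatever your real graph source is.
edges = [("a", "b"), ("b", "c"), ("d", "e")]

# Build an undirected adjacency list.
adj = defaultdict(set)
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

def connected_components(adj):
    """BFS from every unvisited vertex, yielding one set per component."""
    seen = set()
    for start in adj:
        if start in seen:
            continue
        comp, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            if node in comp:
                continue
            comp.add(node)
            queue.extend(adj[node] - comp)
        seen |= comp
        yield comp

print(sorted(sorted(c) for c in connected_components(adj)))
# → [['a', 'b', 'c'], ['d', 'e']]
```

For graphs that fit on the driver node this kind of plain-Python (or networkx) approach is often simpler than distributing the work; GraphFrames pays off when the edge list itself no longer fits on one machine.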
Hi Team, In our company we are planning to migrate our workflows to Databricks Asset Bundles. Is it mandatory to install the Databricks CLI to get started with DABs? Anyone who has integrated with GitHub in a CI/CD pipeline, please let me know the ...
I forgot the CI/CD part: that is not that hard. Basically, in DAB you define the type of environment you are using. If you use 'development', DAB assumes you are in actual development mode (feature branch), so there you can connect git and put the fil...
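A minimal GitHub Actions sketch of the CI/CD wiring described above; the workflow filename, target name, trigger, and secret names are assumptions, and you should check the bundle docs for your setup before relying on it:

```yaml
# .github/workflows/deploy-bundle.yml (sketch; names are illustrative)
name: deploy-dab
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Official action that installs the Databricks CLI on the runner
      - uses: databricks/setup-cli@main
      - name: Deploy the bundle to the dev target
        run: databricks bundle deploy -t dev
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
```

The CLI is what actually performs the deploy, so in practice a CI pipeline installs it on the runner even if developers never run it locally.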
Hello, I am working on creating an architecture diagram for Databricks on AWS. I would like to adopt the de facto standard used by enterprises. Based on my research, I have identified the following components: Network: customer-managed VPC, Secure Cluste...
I would not call it a 'standard' but a possible architecture. The great thing about the cloud is that you can complete the puzzle in many ways and make it as complex or as simple as you like. Also, I would not consider Fivetran to be standard in companies. ...
I am using Databricks Asset Bundles to deploy Databricks workflows to all of my target environments (dev, staging, prod). However, I have one specific workflow that is supposed to be deployed only to the dev target environment. How can I implement tha...
Hi, I'm also looking to deploy different jobs to different targets. These jobs are defined in a separate .yml file, and we'll need to reference them in the targets accordingly. Any updates on this implementation?
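One hedged way to scope a job to a single target is to declare it only under that target's `resources` mapping instead of at the bundle's top level, since target-level resources are merged into the deployment for that target only; the bundle, job, and target names below are illustrative:

```yaml
# databricks.yml (sketch; names are illustrative)
bundle:
  name: my_bundle

resources:
  jobs:
    shared_job:            # top-level: deployed to every target
      name: shared-job

targets:
  dev:
    resources:
      jobs:
        dev_only_job:      # declared only under dev, so deployed only to dev
          name: dev-only-job
  staging: {}
  prod: {}
```

The same pattern works with `include:` files, as long as the per-target job definitions end up nested under the intended target rather than at the top level.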
@bigger_dave If you are trying to create a compute policy, the Permissions tab should be available during configuration. If you want to grant access on an existing policy, the Permissions tab is available once you choose to edit the policy. If you are looking f...
Hello, My organization is experiencing difficulties updating our Google Kubernetes Engine (GKE) cluster. We've reviewed the official GKE documentation for automated cluster updates, but it appears to primarily focus on AWS integrations. We haven't foun...
Hi, I am trying to set up an OAuth connection with Databricks, so I ask the user to enter their workspace URL and client ID. Once the user enters these values, I want to validate whether they are correct, so I ask them to log in by redirecting them ...
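A sketch of constructing the authorization-code redirect URL from the user-supplied workspace URL and client ID. The `/oidc/v1/authorize` path is what Databricks documents for user-to-machine OAuth, but verify it for your cloud; the workspace URL, client ID, redirect URI, and scopes below are assumptions for illustration:

```python
import secrets
import urllib.parse

def build_authorize_url(workspace_url: str, client_id: str, redirect_uri: str) -> str:
    """Build an OAuth 2.0 authorization-code URL for a Databricks workspace."""
    state = secrets.token_urlsafe(16)  # CSRF protection; persist and verify on callback
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": "all-apis offline_access",
        "state": state,
    }
    base = workspace_url.rstrip("/") + "/oidc/v1/authorize"
    return base + "?" + urllib.parse.urlencode(params)

# Illustrative values only
url = build_authorize_url(
    "https://adb-1234.azuredatabricks.net",
    "my-client-id",
    "http://localhost:8020/callback",
)
print(url)
```

If the workspace URL or client ID is wrong, the login page itself will fail to render or report an invalid client, which gives you the validation signal before any token exchange happens.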
Hi All, Based on the article below, to enable Genie one needs to: 1. Enable Azure AI services-powered features. That is done. 2. Enable Genie from the Previews page. However, I do not see Genie among the Previews. I am using Azure Databricks. Any idea how ...
We have a self-service portal through which users can launch Databricks clusters of different configurations. This portal is set up to work in Dev, Sandbox and Prod environments. We have configured Databricks workspaces only for the Sandbox and Prod por...
@Alberto_Umana Thanks for sharing the doc links. We have the exact same setup to support a shared Databricks workspace, but I'm still facing an issue while adding the instance profile. I am trying to add an AWS instance profile created in the source AWS account (no Databricks w...
Many packages output an HTML report, e.g. ydata-profiling. The report contains links to other parts of the report, but when the user clicks a link a new window opens instead of scrolling to the correct section of the displayed HTML. Could this be...
Hello @invalidargument,
Currently, there is no direct support from Databricks to modify this behavior without such a workaround. The displayHTML function renders HTML content within an iframe, and the injected JavaScript h...
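A sketch of the kind of workaround alluded to above: prepend a small script to the report HTML that intercepts clicks on in-page anchor links and scrolls within the iframe instead of opening a new window. `displayHTML` is the real Databricks notebook function; the `patch_report` helper and the handler details are assumptions for illustration:

```python
# Hypothetical helper: inject a click handler at the top of a self-contained
# HTML report so '#section' links scroll inside the iframe.
SCROLL_FIX = """
<script>
document.addEventListener('click', function (e) {
  const a = e.target.closest('a[href^="#"]');
  if (!a) return;
  e.preventDefault();
  const target = document.getElementById(a.getAttribute('href').slice(1));
  if (target) target.scrollIntoView({behavior: 'smooth'});
});
</script>
"""

def patch_report(html: str) -> str:
    """Prepend the scroll-fix script to the report's HTML."""
    return SCROLL_FIX + html

# In a Databricks notebook you would then call something like:
# displayHTML(patch_report(profile.to_html()))
print(patch_report("<html><body></body></html>")[:30])
```

This only works for reports whose anchors use `id` attributes and plain `href="#..."` links; reports that navigate via their own JavaScript would need a different hook.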
OK, as it turns out: in order to bypass the proxy we needed to set the no_proxy env variable in both upper and lower case (!), like this: NO_PROXY="adb-xxx.azuredatabricks.net"
docker run \
-v %teamcity.build.checkoutDir%:/my-bundle \
-v %teamcity.build...
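A sketch of setting both spellings on the CI host and forwarding them into the container; the hostname comes from the post above, and the `docker run` flags shown in the comment are the generic `-e` pass-through, not the exact command from the truncated snippet:

```shell
# Set both spellings: some tools read NO_PROXY, others read no_proxy.
export NO_PROXY="adb-xxx.azuredatabricks.net"
export no_proxy="$NO_PROXY"

# Forward both into the container, e.g.:
#   docker run -e NO_PROXY -e no_proxy ... my-image
echo "$NO_PROXY $no_proxy"
```

Passing `-e NAME` without a value forwards the host's current value, which keeps the hostname in one place.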
We are exploring Databricks Apps. We want a Databricks App to interact with AWS Secrets Manager. How can we configure this, and how do we configure IAM on the AWS side for this to take place? @app
Thanks @Alberto_Umana. Yes, we will try to use Databricks secrets; that can be helpful. A couple of other questions on Databricks Apps: 1) Can we use a framework other than those mentioned in the documentation (Streamlit, Flask, Dash, Gradio, Shiny)? 2) If required, can w...
Hello, We are following this Microsoft tutorial to secure our storage access: https://learn.microsoft.com/en-us/azure/databricks/security/network/serverless-network-security/serverless-firewall We see weird behavior when we create several NCCs in th...
OK, so no: I had correctly set the subnets of my NCC in the Virtual Networks setting as documented: https://learn.microsoft.com/en-us/azure/databricks/security/network/serverless-network-security/serverless-firewall This setting is working fine, without th...
Hi, I want to create a SQL warehouse with {"data_security_mode": "USER_ISOLATION"}; however, I don't find the section to get the JSON file of my cluster. Thanks
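Worth noting: `data_security_mode` is an attribute of all-purpose/job clusters (the Clusters API), not of SQL warehouses, which may be why no JSON section appears on the warehouse page. On a cluster's edit page the JSON view can be reached from the UI, and a cluster spec carrying the field looks roughly like the sketch below; every value other than `data_security_mode` is illustrative:

```json
{
  "cluster_name": "user-isolation-cluster",
  "spark_version": "15.4.x-scala2.12",
  "node_type_id": "Standard_DS3_v2",
  "num_workers": 2,
  "data_security_mode": "USER_ISOLATION"
}
```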
Hello, In my job I have a task where I need to modify a notebook to get the environment dynamically, for example. This is how we get it: dic = {"D":"dev", "Q":"qa", "P":"prod"} managedResourceGroup = spark.conf.get("spark.databricks.xxxxx") xxxxx_Index = m...
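The truncated snippet above appears to map a single letter from the managed resource group name to an environment label. A self-contained sketch of that lookup, with the Spark conf call replaced by a plain string since the real conf key is elided; the resource-group name, helper name, and letter position are assumptions for illustration:

```python
# Same mapping as in the post above.
env_map = {"D": "dev", "Q": "qa", "P": "prod"}

def environment_from_resource_group(name: str, index: int) -> str:
    """Map the single letter at name[index] to an environment label.

    In the notebook, `name` would come from
    spark.conf.get("spark.databricks.xxxxx") instead of a literal.
    """
    return env_map[name[index].upper()]

# "rg-D-analytics" is a made-up resource group name; the letter sits at index 3.
print(environment_from_resource_group("rg-D-analytics", 3))  # → dev
```

Keeping the index as an argument (or deriving it with a regex) avoids silently breaking when the resource-group naming convention changes between environments.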