Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.
Hello, I've configured the DABs on our project successfully. Moreover, I could switch from setuptools to poetry almost successfully. In the project's databricks.yml I configured it as the documentation suggested, I've just changed the name of the arti...
Hi @Fiabane, could you first check: Do you see your .whl file in your artifacts folder? Could you try to install the package by running this code in your notebook: %pip install <path to your wheel> As far as I understand you want to have a job ...
I have a schema that has grown very large. There are mainly two types of tables in it. One of those types accounts for roughly 80% of the storage. Is there a way to somehow set a policy for those tables only to transfer them to a different storage cl...
I would like to consolidate all our Spark jobs in Databricks. One of those jobs, currently running in Azure HDInsight, does not work properly as a Databricks JAR job. It uses Spark 3.3 RDDs and requires configuring Kryo serialisation. There...
Integrating Spark tasks with Databricks can greatly improve your workflow. For tasks that require Kryo serialization, make sure you configure your Spark session correctly. You may need to adjust the serialization settings in your Spark configuration....
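As a rough illustration of the serialization settings mentioned above, here is a minimal sketch that assembles the standard Spark properties for Kryo and renders them as `key value` lines, the form a cluster's Spark config field accepts. The helper names, the registrator class `com.example.MyRegistrator`, and the buffer size are assumptions for illustration, not taken from the original thread. `spark.serializer` is read when the Spark context starts, so on Databricks it most likely needs to be set on the cluster rather than inside the JAR.

```python
def kryo_spark_conf(registrator=None, buffer_max="512m"):
    """Build Spark properties that switch serialization to Kryo.

    `registrator` is an optional fully qualified class name of a custom
    KryoRegistrator (hypothetical here); `buffer_max` is an assumed value.
    """
    conf = {
        "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
        "spark.kryoserializer.buffer.max": buffer_max,
    }
    if registrator:
        conf["spark.kryo.registrator"] = registrator
    return conf


def to_cluster_config_lines(conf):
    """Render properties as space-separated `key value` lines, sorted by key."""
    return "\n".join(f"{key} {value}" for key, value in sorted(conf.items()))


if __name__ == "__main__":
    # Hypothetical registrator class name, used only for illustration.
    print(to_cluster_config_lines(kryo_spark_conf("com.example.MyRegistrator")))
```

The printed lines can be pasted into the cluster's Spark configuration; whether a custom registrator is needed depends on the classes the RDD job serializes.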
I’m designing a compute plane configuration that will align our data platform with internal policies from a security perspective. As part of this exercise I'm documenting how the permissible traffic inbound and outbound is controlled using NSG rules,...
@Jim-Shady wrote: I’m designing a compute plane configuration that will align our data platform with internal policies from a security perspective. As part of this exercise I'm documenting how the permissible traffic inbound and outbound is controlled...
I have a CI/CD process that deploys to staging after a Pull Request (PR) to main. It works using a Personal Access Token in Azure Pipelines. From local, deploying using a Service Principal works (https://community.databricks.com/t5/administration-a...
I needed to deploy a job using CI/CD Azure Pipelines without using OAuth. This is the way: First you need to have the Service Principal configured. For that you need to generate it in your workspace; with this you will have: A host: Which is your wo...
I am working on a notebook to help me create Azure Databricks Groups. When I create a group in a workspace using the UI, it automatically creates the group at the account level and links them. When I create a group using the API, and I create the w...
I am attempting to create a task in a job using the Git Provider as a source and GitHub is the provider. The repo is a private repo. Regardless of how I enter the path to the notebook I receive the same error that the notebook path is invalid and o...
As I said in a previous response, this started working automatically a few days ago with no changes on our end. The developer who was working on this decided to try it one more time and it just worked, no error this time. I don't know if Databri...
Good afternoon to all, I am new to this community. We are trying to bring data from Databricks to a SharePoint list using the Power Automate app (create a workflow and trigger it when there is a new record or an existing record is modified in the source table in...
Browsing this page of the documentation, the displayed GIF shows a notebook that is opened in its own tab. I've been looking for how to enable this feature in my own workspace, but cannot find it. Does anyone know how to enable this feature?
Hello, I currently have a Service Principal (SP) Client_Id and its associated secret. I generated it directly from my workspace in Databricks. I was following this post: https://github.com/databricks/cli/issues/1722, but I don't know how to generate ...
Learn to summon an Azure Subscription from a Databricks-generated Service Principal. Harness the power of data with this vital step in Azure infrastructure management. Mastering it is as crucial as surviving Fnaf
As recommended by Databricks, we are trying to use Compute Policies to set environment variables, which are used by our notebooks, across clusters. However, when specifying a JSON string as env var, we are getting this error upon applying the policy t...
This is because you use Shared access mode. This enables multiple users to use the cluster simultaneously. However, there are features that do not work on these Shared access mode clusters: https://docs.databricks.com/en/compute/access-mode-limitations....
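For readers following the question about a JSON string as an environment variable, here is a minimal sketch of how such a value could be embedded in a compute policy definition. It assumes the policy uses a `spark_env_vars.<name>` attribute with a `fixed` value; the variable name `APP_SETTINGS` and the payload are hypothetical. It only shows the double encoding (a JSON string nested inside the policy JSON) and does not address the access-mode limitation described in the reply.

```python
import json


def env_var_policy(name, payload):
    """Return a policy fragment that pins env var `name` to `payload` as JSON.

    The inner json.dumps turns the payload into a compact string; the outer
    json.dumps (applied to the whole policy) then escapes its quotes.
    """
    value = json.dumps(payload, separators=(",", ":"))
    return {f"spark_env_vars.{name}": {"type": "fixed", "value": value}}


# Hypothetical variable name and settings, for illustration only.
policy = env_var_policy("APP_SETTINGS", {"region": "westeurope", "retries": 3})
policy_json = json.dumps(policy, indent=2)

if __name__ == "__main__":
    print(policy_json)
```

A notebook reading the variable would call `json.loads` once on the environment value to recover the original dictionary.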
Hi there, We have one Azure tenant with multiple subscriptions. Each subscription is a project for itself. At this moment, we have only one Azure Databricks account, and all workspaces (created under different subscriptions) are associated with it. Can ...
Hello @stevanovic, as far as I understand, in Azure, you can create one Databricks account per tenant, meaning for example Unity Catalog is also a tenant-level resource. There is a fantastic blog post available here: https://community.databricks.com/t5/t...
We have a few people working in Databricks right now in different clones of the same repository. Occasionally we'll have multiple people with the same branch open: one working, another just has it open to see what it looks like, that sort of deal. This has...
Hi @Kayla, I think the easiest way to check the current notebook location when opened is to hover the mouse cursor over the name of the notebook (top left, "ADE 3.1 - Streaming Deduplication" in this case) and wait for about 1-2 seconds; after that...
Is it safe to run a delete query when there are active writes to a Delta Lake table? Next question: Is it safe to run a vacuum when writes are being done actively?
Hello @sharat_n, Yes, it is generally safe to run a DELETE query on a Delta Lake table while active writes are happening. Delta Lake is designed with ACID transactions, meaning operations like DELETE, UPDATE, and MERGE are atomic and isolated. In other...
Has anyone encountered this error and knows how to resolve it? "Unexpected end of stream, read 0 bytes from 4 (socket was closed by server)." This occurs in Databricks while generating reports. I've already adjusted the wait_timeout to 28,800, and both ...