Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
When I try to delete a table, I'm getting this error: [ErrorClass=INVALID_STATE] TABLE catalog.schema.table_name cannot be deleted because it is being shared via Delta Sharing. I have checked on the internet about it, but could not find any info about ...
Hi @IGRACH, you are likely facing this issue because the table you want to delete is being shared via Delta Sharing. You can go to the shared object by following this doc: https://docs.databricks.com/aws/en/delta-sharing/create-share#update-shares and then, ...
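In practice that usually means removing the table from the share before dropping it. A minimal sketch, assuming a hypothetical share named `my_share`:

```python
# Remove the table from the Delta Sharing share first, then drop it.
# `my_share` is a hypothetical name; list your shares with SHOW SHARES.
spark.sql("ALTER SHARE my_share REMOVE TABLE catalog.schema.table_name")
spark.sql("DROP TABLE catalog.schema.table_name")
```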
Hi all, I am able to deploy Databricks assets to the target workspace. Jobs and workflows can also be created successfully. But I have a special requirement: I need to copy the notebooks to the target folder in the Databricks workspace. Example: on local I have...
Hello @ashraf1395, nice to hear from you and thank you for your hints. With your idea, I could reach half of my aim. You can see here the folder structure in my VS Code, and here is part of my `databricks.yml` file: targets: dev: # The default tar...
I have an issue with DAB where all the project files, starting from the root ., get deployed to the /files folder in the bundle. I would prefer to deploy certain util notebooks, but not all the files of the project. I'm able to not deploy any ...
Hello everyone, we currently have 2 streaming (Bronze) tasks in the same job, running on the same job compute, and both merge data into the same table (Silver table). If I create it like this, sometimes I get an error related to "insert...
Hello community, I have implemented a DLT pipeline. In the "Destination" setting of the pipeline I have specified a Unity Catalog with a target schema of type external, referring to an S3 destination. My DLT pipeline works well. Yet, I noticed that all str...
This won't work. The best approach is to create a DLT sink that writes to the external Delta table. The pipeline should only be one step: read the source table and append to the sink using an append flow. It works fine.
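A minimal sketch of that single-step pipeline, assuming a hypothetical bronze source table and S3 path:

```python
import dlt

# Define a Delta sink pointing at the external S3 location (path is hypothetical).
dlt.create_sink(
    name="silver_sink",
    format="delta",
    options={"path": "s3://my-bucket/silver/events"},
)

# One step: read the source table as a stream and append it to the sink.
@dlt.append_flow(name="silver_flow", target="silver_sink")
def silver_flow():
    return spark.readStream.table("catalog.schema.bronze_events")
```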
We have a CI/CD pipeline where we run: databricks bundle deploy [...] The code works fine; however, if we misconfigure it, we see an error message in the output such as:
Deploying resources...
Updating deployment state...
Warning: Detected unresolved va...
Since last year, we have adopted Databricks Asset Bundles for deploying our workflows to the production and staging environments. The tool has proven to be quite effective, and we currently use Azure DevOps Pipelines to automate bundle deployment, tr...
I am running the code below: df = spark.read.json('xyz.json') followed by df.count(). I want to understand the actual workings of Spark: how many jobs & stages will be created? I want a detailed yet easy-to-follow explanation of how it works.
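For reference, a commented sketch of what typically happens for this snippet (assuming no explicit schema is supplied, so Spark has to infer one):

```python
# Reading JSON without a schema is not fully lazy: Spark runs one job up
# front that scans the files to infer the schema.
df = spark.read.json("xyz.json")

# count() is an action, so it triggers a second job, typically with two
# stages: per-partition partial counts, then a final aggregation.
print(df.count())
```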
I am trying to run a job with (1) custom containers, and (2) via an instance pool. Here's the setup: the custom container is just the DBR-provided `databricksruntime/standard:12.2-LTS`; the instance pool is defined via the UI (see screenshot below). At ...
I think I have solved this. I added a URL for `preloaded_docker_image` to my instance pool, and the job worked correctly. This suggests that the DBR docs for preloaded_docker_image are incomplete; they should clarify that a user must add an entry in o...
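For anyone hitting the same thing, a minimal sketch of creating a pool with a preloaded image via the Databricks SDK (pool name and node type are hypothetical; in the API the field is `preloaded_docker_images`):

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.compute import DockerImage

w = WorkspaceClient()

# Create an instance pool that pre-pulls the same image the job will use.
pool = w.instance_pools.create(
    instance_pool_name="custom-container-pool",  # hypothetical name
    node_type_id="i3.xlarge",                    # hypothetical node type
    preloaded_docker_images=[
        DockerImage(url="databricksruntime/standard:12.2-LTS"),
    ],
)
print(pool.instance_pool_id)
```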
I am running a job on a cluster from a compute pool that installs a package from our Azure Artifacts feed. My task is supposed to run a wheel task from our library, which has about a dozen dependencies. For more than 95% of the runs this job works...
I'm trying to use a custom docker image for my job. This is my Dockerfile:
FROM databricksruntime/standard:12.2-LTS
COPY . .
RUN /databricks/python3/bin/pip install -U pip
RUN /databricks/python3/bin/pip install -r requirements.txt
USER root
My job ...
Hi. I have a workflow in which I write a few rows into a Delta table with auto-generated IDs. Then, I need to retrieve them back just after they're written into the table to collect those generated IDs, so I read the table and I use two columns (one is ...
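In case a concrete shape helps, a minimal sketch of that write-then-read-back pattern with an identity column (all table and column names are hypothetical):

```python
# The table has an identity column; after appending, read the rows back
# by their batch key to collect the generated IDs.
spark.sql("""
    CREATE TABLE IF NOT EXISTS demo.events (
        id BIGINT GENERATED ALWAYS AS IDENTITY,
        batch_id STRING,
        payload STRING
    ) USING DELTA
""")

new_rows = spark.createDataFrame(
    [("batch-42", "a"), ("batch-42", "b")],
    "batch_id STRING, payload STRING",
)
new_rows.write.format("delta").mode("append").saveAsTable("demo.events")

# Read the table right after the write; the committed snapshot already
# contains the generated IDs for this batch.
ids = [r.id for r in spark.table("demo.events")
                         .where("batch_id = 'batch-42'")
                         .select("id")
                         .collect()]
```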
We run several workflows and tasks in parallel using serverless compute. In many different places in the code we started to get errors like the ones below. It looks like when one task fails, every other task running at the same moment fails as well. After retry on on...
I am trying to assign my databricks_current_metastore in Terraform and I get the following error back as output: Error: cannot read current metastore: cannot get client current metastore: invalid Databricks Workspace configuration with data.databric...
@badari_narayan Based on the above Terraform code, you are trying to use the databricks.accounts provider to read the current workspace metastore, which is incorrect: the databricks_current_metastore data source is a workspace-level resource, and must b...
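Not the Terraform fix itself, but as a sanity check outside Terraform, the workspace-level SDK call below reads the same assignment (a sketch, assuming workspace auth is already configured):

```python
from databricks.sdk import WorkspaceClient

# Authenticates against the *workspace* (not the account console), which is
# the same scope the databricks_current_metastore data source needs.
w = WorkspaceClient()
assignment = w.metastores.current()
print(assignment.metastore_id, assignment.default_catalog_name)
```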
Hi, question: are expectations supposed to function in conjunction with create_streaming_table() and apply_changes_from_snapshot()? Our team is investigating Delta Live Tables and we have a working prototype using Autoloader to ingest some files from a m...
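For context on how the two are meant to combine, a minimal sketch (table and column names are hypothetical): create_streaming_table() accepts expect_all/expect_all_or_drop/expect_all_or_fail directly, and apply_changes_from_snapshot() then writes into that target.

```python
import dlt

# Expectations are attached to the streaming target itself; the snapshot
# CDC flow then populates that target.
dlt.create_streaming_table(
    name="customers",
    expect_all_or_drop={"valid_id": "customer_id IS NOT NULL"},
)

dlt.apply_changes_from_snapshot(
    target="customers",
    source="raw_customer_snapshots",  # hypothetical snapshot source table
    keys=["customer_id"],
    stored_as_scd_type=1,
)
```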
Hi Stefan-Koch, we reached out to our account rep and were instructed to create an Azure support ticket since we do not yet have a paid support plan. We are hoping to negotiate for paid support. However, I do not believe the documentation surrounding...