Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Hubert-Dudek
by Databricks MVP
  • 28993 Views
  • 4 replies
  • 26 kudos

How to connect your Azure Data Lake Storage to Azure Databricks

How to connect your Azure Data Lake Storage to Azure Databricks. Standard Workspace. Private link: In your storage accounts, please go to “Networking” -> “Private endpoint connections” and click Add Private Endpoint. It is important to add private links in ...

Latest Reply
dollyb
Contributor II
  • 26 kudos

This should be updated for Unity Catalog workspaces. 

3 More Replies
Trilleo
by New Contributor III
  • 1646 Views
  • 1 replies
  • 0 kudos

STATEMENT_TIMEOUT on a specific SQL Warehouse

Hi, I would like to set STATEMENT_TIMEOUT for a specific SQL warehouse and not on a global level. How would I do that? P.S. I would like to avoid setting it on a session level; I want just a one-time configuration for a given SQL warehouse.

Latest Reply
MoJaMa
Databricks Employee
  • 0 kudos

Unfortunately, we do not support that. We only support Global and Session level settings. We have an internal feature request for this (DB-I-6556), but it has not been prioritized in the Roadmap.

haroon_24
by New Contributor II
  • 1281 Views
  • 2 replies
  • 0 kudos

Error when trying to run model in staging

I am learning dbt and Apache Airflow. I am using the samples catalog and the tpch schema/database. When I try to run a SQL query in my staging folder, I get this error. I am using the premium trial version of Databricks. 09:30:21 Running with dbt=1.8.8 using l...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

The error indicates that the command you are trying to run is not supported in Unity Catalog (UC). Can you please share the SQL command you are currently running?

1 More Replies
mkEngineer
by New Contributor III
  • 1170 Views
  • 2 replies
  • 0 kudos

Configuring DLT _delta_logs with Log Analytics Workspace on Job Clusters

Hi, How do I configure my DLT (Delta Live Tables pipeline notebook) _delta_logs with my Azure Log Analytics workspace? I'm encountering issues because the pipeline runs on a job cluster, which doesn't allow me to specify the destination of the log file...

Latest Reply
mkEngineer
New Contributor III
  • 0 kudos

Hi @Alberto_Umana, The error I received was related to cells not being connected to the DLT pipeline, as mentioned in my other post, "Cannot run a cell when connected to the pipeline Databricks." However, after browsing the web, I realized that there...

1 More Replies
Milliman
by New Contributor
  • 2837 Views
  • 1 replies
  • 0 kudos

How can we automatically rerun the complete job if any of its associated tasks fails?

I need to rerun the complete job automatically if any of its associated tasks fails. Any help would be appreciated. Thanks

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

You could use the suggestions being provided in Community post https://community.databricks.com/t5/data-engineering/need-to-automatically-rerun-the-failed-jobs-in-databricks/td-p/89074 
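
As a rough illustration of the rerun idea in the question, here is a minimal sketch of a loop that repeats a whole job run until it succeeds or a retry budget is exhausted. The `run_job` callable is hypothetical: it stands in for whatever triggers one complete job run (for example through the Jobs API) and reports whether every task succeeded. It is not taken from the linked post.

```python
from typing import Callable


def rerun_until_success(run_job: Callable[[], bool], max_attempts: int = 3) -> int:
    """Repeat a complete job run until it succeeds or attempts run out.

    run_job is a hypothetical callable that triggers one full job run and
    returns True only when every task in that run succeeded.
    Returns the number of the attempt that succeeded.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        if run_job():
            return attempt
    raise RuntimeError(f"job still failing after {max_attempts} attempts")
```

A caller would wrap its own trigger-and-wait logic in `run_job`; a fixed attempt limit keeps a permanently failing job from looping forever.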

mkEngineer
by New Contributor III
  • 1384 Views
  • 1 replies
  • 1 kudos

Resolved! DLT: "cannot run a cell when connected to pipeline databricks"

Hi, I have several different cells in my notebook that are connected to a DLT pipeline. Why are some cells skipped and others aren't? I get the message "cannot run a cell when connected to the pipeline Databricks" when I try running a cell when I'm conne...

Latest Reply
Takuya-Omi
Valued Contributor III
  • 1 kudos

Hi, @mkEngineer. When working with Delta Live Tables (DLT) in Databricks, you cannot run individual cells interactively as you would in a standard Databricks notebook. DLT Pipeline Behavior: Delta Live Tables notebooks are executed as part of a managed ...

issa
by New Contributor III
  • 4911 Views
  • 9 replies
  • 5 kudos

Resolved! How to access bronze dlt in silver dlt

I have a job in Workflows that runs two DLT pipelines, one for Bronze_Transaction and one for Silver_Transaction. The reason for two DLT pipelines is that I want the tables to be created in bronze catalog and erp schema, and silver catalog and erp...

Data Engineering
dlt
DLT pipeline
Medallion
Workflows
Latest Reply
issa
New Contributor III
  • 5 kudos

Final solution for the Bronze:
# Define view as the source
@dlt.view
def Transactions_Bronze_View():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("inferSchema", True)
        .option(...

8 More Replies
Timmes0815
by New Contributor III
  • 1798 Views
  • 3 replies
  • 0 kudos

Resolved! Set up Location using widget

I'm struggling to use the Databricks widget to set the location in a SQL CREATE TABLE statement. I tried the following to set up the location: Step 1: Creating a notebook (Notebook1) to define the variable. Location_Path =   'abfss:xxxxx@xxxx.xxx.ne...

Latest Reply
Timmes0815
New Contributor III
  • 0 kudos

I finally solved my problem by using the parameters in Python f-strings:
dbutils.widgets.text("Location_Path", "")
Location_Path = dbutils.widgets.get("Location_Path")
# create table using widget
query = f""" CREATE OR REPLACE TABLE schema.Tabelname1 LOCATION '{Location_Path}' AS SELECT...
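
The f-string approach in this reply can be sketched as a small helper that assembles the statement from a location value. This is only an illustration under stated assumptions: the function name `build_create_table_sql`, the example table name, and the `SELECT` text are hypothetical, and in a notebook the path would be read with `dbutils.widgets.get` rather than passed as a literal.

```python
def build_create_table_sql(table_name: str, location_path: str, select_sql: str) -> str:
    """Assemble a CREATE OR REPLACE TABLE ... LOCATION ... AS <select> statement.

    All three arguments are illustrative inputs, not values from the post.
    """
    # A single quote in the path would end the SQL string literal early,
    # so this sketch simply refuses such input instead of trying to escape it.
    if "'" in location_path:
        raise ValueError("location_path must not contain a single quote")
    return (
        f"CREATE OR REPLACE TABLE {table_name} "
        f"LOCATION '{location_path}' "
        f"AS {select_sql}"
    )


# In a Databricks notebook the value would come from the widget, for example:
#   dbutils.widgets.text("Location_Path", "")
#   location_path = dbutils.widgets.get("Location_Path")
#   spark.sql(build_create_table_sql("schema.Tabelname1", location_path, "SELECT ..."))
```

Keeping the string assembly in a plain function makes it possible to check the generated SQL without a cluster.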

2 More Replies
holychs
by Databricks Partner
  • 1569 Views
  • 1 replies
  • 0 kudos

Running child job under parent job using run_job_task

Hi Community, I am trying to call another job under a workflow job using run_job_task. Currently I am manually providing the job_id of the child job. I want to know if there is any way to pass job_name instead of job_id. This will automate the deployment ...

Latest Reply
filipniziol
Esteemed Contributor
  • 0 kudos

Hi @holychs, It is possible to do this using lookup in Databricks Asset Bundles. You define a job ID variable that finds the ID of the job based on its name, and use this variable to specify job_id in the run_job_task. Here is the code: variables: my_job_id...

holychs
by Databricks Partner
  • 1364 Views
  • 2 replies
  • 0 kudos

Resolved! Concurrent Workflow Jobs

Hi Community, I am trying to run a Databricks workflow job using run_job_task under a for_loop. I have set the concurrent jobs as 2. I can see 2 iteration jobs getting triggered successfully. But both fail with an error: "ConnectException: Connection ...

Latest Reply
holychs
Databricks Partner
  • 0 kudos

It was an internal bug, resolved by managing different parameters for each loop job.

1 More Replies
JK2021
by New Contributor III
  • 6451 Views
  • 6 replies
  • 3 kudos

Resolved! Exception handling in Databricks

We are planning to customise code on Databricks to call Salesforce Bulk API 2.0 to load data from a Databricks Delta table to Salesforce. My question is: All the exception handling, retries and all around Bulk API can be coded explicitly in Databricks...
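
To make the "exception handling and retries" part of this question concrete, here is a minimal sketch of retrying a call with exponential backoff. Everything in it is an assumption for illustration: `TransientError` is a hypothetical marker for failures worth retrying (timeouts, rate limits), and `call` stands in for whatever function submits a Bulk API request. It does not use any real Salesforce client.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def call_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Invoke call(), retrying on TransientError with doubling delays.

    Any other exception propagates immediately; the last TransientError
    is re-raised once max_attempts is reached.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except TransientError:
            if attempt == max_attempts:
                raise
            # Waits base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter is a design choice that lets the retry schedule be checked without actually waiting.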

Latest Reply
Rolx
New Contributor II
  • 3 kudos

Is the Bulk API working as expected for loading data?

5 More Replies
suryateja405555
by New Contributor III
  • 1760 Views
  • 1 replies
  • 1 kudos

Databricks workflow deployment issue

Below is the Databricks workflow. (Note: ETL_schema check is an if/else task in the Databricks workflow.) I am declaring the taskValues below based on some conditions in the ETL_data_check notebooks. Based on the below output the next task "ETL_schema_checks" (if/e...

Data Engineering
assetbundles
DAB
Latest Reply
Brahmareddy
Esteemed Contributor
  • 1 kudos

Hi @suryateja405555, How are you doing today? As per my understanding, make sure ETL_data_checks is properly declared in the tasks section of your workflow configuration in the root module. For example, add it with a task_key and its respective proper...

Faisal
by Contributor
  • 3466 Views
  • 2 replies
  • 1 kudos

DLT maintenance clusters

How do maintenance clusters do the cleanup using OPTIMIZE, Z-order, and VACUUM? I read that it is handled automatically, but how does the maintenance cluster know which column to optimize, and where do we need to specify that info?

Latest Reply
cgrant
Databricks Employee
  • 1 kudos

At this time, Z-order columns must be specified in the asset definition; the property is pipelines.autoOptimize.zOrderCols. This may change in the future with Predictive Optimization.
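
As a small illustration of where that property goes, the sketch below builds the table properties dictionary that a DLT table definition could receive. The helper name `zorder_table_properties` and the column names are hypothetical; the property key is the one named in the reply, and its value is assumed to be a comma-separated string of column names.

```python
from typing import Dict, Sequence

# Property name taken from the reply above.
ZORDER_PROPERTY = "pipelines.autoOptimize.zOrderCols"


def zorder_table_properties(columns: Sequence[str]) -> Dict[str, str]:
    """Return table properties asking DLT maintenance to Z-order by columns."""
    if not columns:
        raise ValueError("at least one column is required")
    return {ZORDER_PROPERTY: ",".join(columns)}


# Inside a DLT pipeline notebook this could be used roughly as:
#   import dlt
#
#   @dlt.table(table_properties=zorder_table_properties(["year", "month"]))
#   def events():
#       return spark.read.table("some_source")  # hypothetical source table
```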

1 More Replies
Pramod_G
by New Contributor II
  • 1809 Views
  • 4 replies
  • 0 kudos

Job Cluster with Continuous Trigger Type: Is Frequent Restart Required?

Hi All, I have a job continuously processing IoT data. The workflow reads data from Azure Event Hub and inserts it into the Databricks bronze layer. From there, the data is read, processed, validated, and inserted into the Databricks silver layer. The...

Data Engineering
Driver or Cluster Stability Issues
Long-Running Job Challenges
Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

How are you ingesting the data? Are you using the Delta Live Table mechanism - https://docs.databricks.com/en/delta-live-tables/index.html?

3 More Replies
GS_S
by New Contributor III
  • 3165 Views
  • 7 replies
  • 0 kudos

Resolved! Error during merge operation: 'NoneType' object has no attribute 'collect'

Why does merge.collect() not return results in access mode: SINGLE_USER, but it does in USER_ISOLATION? I need to log the affected rows (inserted and updated) and can’t find a simple way to get this data in SINGLE_USER mode. Is there a solution or an...
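
One possible way to log affected rows without relying on the return value of the merge is to read the operation metrics that Delta records in the table history. The sketch below is an assumption, not the resolution from this thread: it only converts the string-valued `operationMetrics` map of a MERGE history entry into integer counts, and the table name in the comment is hypothetical.

```python
from typing import Dict, Mapping


def summarize_merge_metrics(operation_metrics: Mapping[str, str]) -> Dict[str, int]:
    """Convert a MERGE entry's operationMetrics (string values) to int counts."""
    return {
        "inserted": int(operation_metrics.get("numTargetRowsInserted", 0)),
        "updated": int(operation_metrics.get("numTargetRowsUpdated", 0)),
        "deleted": int(operation_metrics.get("numTargetRowsDeleted", 0)),
    }


# On a cluster, right after the merge has run, the metrics could be read as:
#   from delta.tables import DeltaTable
#   last = DeltaTable.forName(spark, "catalog.schema.target").history(1).collect()[0]
#   if last["operation"] == "MERGE":
#       counts = summarize_merge_metrics(last["operationMetrics"])
# This assumes no other write landed on the table between the merge and the
# history lookup; otherwise the latest entry may belong to a different operation.
```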

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

15.4 does not directly require serverless, but fine-grained access control does require it when running on Single User, as mentioned: "This data filtering is performed behind the scenes using serverless compute." In terms of costs: Customers are charged for ...

6 More Replies