Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Oliver_Angelil
by Valued Contributor II
  • 3854 Views
  • 2 replies
  • 0 kudos

Append-only table from non-streaming source in Delta Live Tables

I have a DLT pipeline where all tables are non-streaming (materialized views), except for the last one, which needs to be append-only and is therefore defined as a streaming table. The pipeline runs successfully on the first run. However, on the seco...

Latest Reply
nkarwa
New Contributor II
  • 0 kudos

@Oliver_Angelil - I was wondering if you found a solution? I have a similar use case. I want to create an archive table using DLT from a non-streaming source (MV). I would prefer a DLT solution. I was able to get it to work using a traditional merge approach (no...

1 More Reply
simonB2025
by New Contributor III
  • 2477 Views
  • 3 replies
  • 1 kudos

Resolved! Deploying Data Assets Bundle with VSCode Add-in

Deploying a bundle containing a Pipeline that references a DLT Notebook. In the YAML I am passing the relative path to the Notebook from the repository root (where the YAML lives). Deploying says 'success', but when validating, the Pipeline cannot find ...

Latest Reply
ilir_nuredini
Honored Contributor
  • 1 kudos

Hello @simonB2025, could you share a snippet of your folder/project structure and where the notebook resides, so we can suggest to you the exact solution? Thank you! Best, Ilir

2 More Replies
Miloud_G
by New Contributor III
  • 2876 Views
  • 1 reply
  • 1 kudos

Resolved! Connecting to Unity Catalog with Power BI

I created a group of Power BI users and granted consumer access permission to this group on Unity Catalog. I started a shared compute cluster (Standard_D4ds_v5). When trying to connect from Power BI, I can connect to tables only if I grant access to wor...

Latest Reply
radothede
Valued Contributor II
  • 1 kudos

Hi @Miloud_G, instead of granting full workspace access, you can grant the "Databricks SQL access" entitlement, which provides limited workspace access specifically for BI tools. Go to Workspace Settings → Identity and Access → Groups/Users/SPs Manage → Se...

smoortema
by Contributor
  • 1919 Views
  • 1 reply
  • 1 kudos

Resolved! Complex pipeline with many tasks and dependencies: orchestration in Jobs or in Notebook?

We need to set up a job that consists of several hundred tasks with many dependencies between each other. We are considering two different directions: 1. A Databricks job with tasks, with dependencies defined as code and deployed with Databricks asset b...

Latest Reply
radothede
Valued Contributor II
  • 1 kudos

Hi @smoortema, to the best of my knowledge: Option 1) You can create jobs that contain up to 1,000 tasks; however, it is recommended to split tasks into logical subgroups. Jobs with more than 100 tasks require API 2.2 and above. Jobs with a large number of tasks ...

Locomo_Dncr
by New Contributor
  • 1850 Views
  • 4 replies
  • 0 kudos

Time Travel vs Bronze historical archive

Hello, I am working on building a pipeline using the Medallion architecture. The source tables in this pipeline are overwritten each time the table is updated. In the bronze ingestion layer, I plan to append this new table to the current bronze table, addi...

Data Engineering
Medallion
Processing Time
Storage Costs
Time Travel
Latest Reply
MariuszK
Valued Contributor III
  • 0 kudos

Hi @Locomo_Dncr, time travel isn't recommended for storing historical data; it's for backup and audit purposes. You can store snapshot data or use SCD2 to keep history. "Databricks does not recommend using Delta Lake table history as a long-term backup solutio...

3 More Replies
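To illustrate the snapshot-append pattern the reply recommends (rather than relying on time travel), here is a minimal sketch in plain Python. In a real pipeline the rows would be appended to a bronze Delta table; the column names `batch_id` and `ingest_ts` are hypothetical, chosen only for the example.

```python
from datetime import datetime, timezone

def append_snapshot(bronze, snapshot, batch_id):
    """Append a full source snapshot to the bronze history,
    stamping each row with the batch id and ingest time."""
    ingest_ts = datetime.now(timezone.utc).isoformat()
    for row in snapshot:
        bronze.append({**row, "batch_id": batch_id, "ingest_ts": ingest_ts})
    return bronze

# The source table is overwritten between loads; bronze keeps every version.
bronze = []
append_snapshot(bronze, [{"id": 1, "name": "a"}], batch_id=1)
append_snapshot(bronze, [{"id": 1, "name": "b"}], batch_id=2)  # source overwritten
# bronze now holds both versions of id=1, one row per batch
```

Because every load is an append with its own batch id, the full history survives even though the source only ever shows its latest state, which is exactly what table history/time travel should not be relied on for long term.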
jar
by Contributor
  • 1804 Views
  • 6 replies
  • 0 kudos

Understanding infra costs of Databricks compute

Hi. I can see in our Azure cost analysis tool that a not insignificant part of our costs comes from the managed Databricks RG deployed with the workspace, and that it relates particularly to VMs (so compute, I assume?) and storage, the latter of which, tho...

Latest Reply
jar
Contributor
  • 0 kudos

Thank you all for your replies. The issue is not getting an overview of costs - I already have that from the Cost Management Export function in Azure, and by using the system.billing tables in Databricks. The issue is understanding the relation betwe...

5 More Replies
Sanjeeb2024
by Valued Contributor
  • 2127 Views
  • 4 replies
  • 2 kudos

Not able to get the location of the table in Databricks Free edition

Hi Team, I am exploring Delta table properties and want to see the internal folder structure of a Delta table. I created a Delta table using Databricks Free Edition, and when I executed the DESCRIBE EXTENDED command, the location field is coming...

Sanjeeb2024_0-1752760734723.png
Latest Reply
akshayt272
New Contributor II
  • 2 kudos

Then how can we see the versions of our Delta files?

3 More Replies
parthSundarka
by Databricks Employee
  • 2663 Views
  • 4 replies
  • 6 kudos

Job Failures Observed In DBR 13.3 due to changes in pip package - aiosignal

Problem Statement: A few jobs running on DBR 13.3 have started failing with the error below: TypeError: TypeVarTuple.__init__() got an unexpected keyword argument 'default'...

image (1) (1).png
Latest Reply
pradr
New Contributor II
  • 6 kudos

Pinning aiosignal==1.3.2 worked for me; are you sure the package you are building is using the pinned version?

3 More Replies
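One way to guard against this class of failure is to verify the pin at job start, before any work runs. This is a minimal, stdlib-only sketch; the package name and version come from the thread, everything else is illustrative.

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

def is_pinned(package, expected):
    """Check whether the environment matches the pinned version."""
    return installed_version(package) == expected

# After pinning, a guard like this at the top of the job would catch drift:
#   assert is_pinned("aiosignal", "1.3.2"), "aiosignal pin not applied"
```

A check like this turns a confusing downstream TypeError into an immediate, explicit failure when a cluster library or transitive dependency silently overrides the pin.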
vedant1611
by New Contributor
  • 3255 Views
  • 1 reply
  • 1 kudos

Resolved! How can I connect databricks apps with Unity Catalog?

Hi, I'm working on building a Databricks app to streamline several team tasks through a user-friendly data application. One key feature I’d like to implement is integrating Databricks Apps with Unity Catalog. Specifically, I want to display a dropdown ...

Latest Reply
ximenesfel
Databricks Employee
  • 1 kudos

Hi @vedant1611, yes, it is possible to integrate Databricks Apps with Unity Catalog to achieve the functionality you're describing. You can build a Databricks app, such as one using Streamlit, that dynamically queries Unity Catalog to list schemas, ta...

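As a rough illustration of the reply above, this sketch builds the `information_schema` query such a dropdown could run against a SQL warehouse. The catalog and schema names are hypothetical, and the query shape is an assumption based on standard Unity Catalog `information_schema` tables; verify the exact columns in your workspace.

```python
def list_tables_query(catalog, schema):
    """Build the information_schema query a dropdown could run
    against a SQL warehouse to list tables in a schema.
    Note: interpolating identifiers like this is fine for trusted
    inputs; use parameterized statements for anything user-supplied."""
    return (
        f"SELECT table_name FROM `{catalog}`.information_schema.tables "
        f"WHERE table_schema = '{schema}' ORDER BY table_name"
    )

# Hypothetical catalog/schema for illustration:
q = list_tables_query("main", "default")
```

The app would submit this query through its SQL warehouse connection and feed the result set into the dropdown widget.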
noorbasha534
by Valued Contributor II
  • 1138 Views
  • 5 replies
  • 4 kudos

Resolved! OPTIMIZE in parallel with actual data load

Dear all, if I understand correctly, OPTIMIZE cannot run in parallel with the actual data load. We see 'concurrent update' errors in our environment when this happens, due to which we are unable to dedicate a maintenance window for table health. And, I s...

Latest Reply
noorbasha534
Valued Contributor II
  • 4 kudos

@MariuszK @szymon_dybczak thanks both. appreciate your support.

4 More Replies
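Where OPTIMIZE and the loads cannot be fully separated in time, a common workaround is to retry the conflicting operation with backoff. This is a generic sketch, not a Databricks API: the exception names are matched as substrings of the error message (the usual Delta conflict classes), and the delays are illustrative.

```python
import random
import time

def run_with_retry(operation, max_attempts=5, base_delay=1.0,
                   retryable=("ConcurrentAppendException",
                              "ConcurrentModificationException")):
    """Run operation(), retrying with exponential backoff plus jitter
    when the error message looks like a Delta concurrency conflict."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as exc:
            conflict = any(name in str(exc) for name in retryable)
            if not conflict or attempt == max_attempts:
                raise  # non-retryable, or out of attempts
            # back off: base, 2x, 4x, ... plus up to base_delay of jitter
            time.sleep(base_delay * 2 ** (attempt - 1)
                       + random.random() * base_delay)
```

In a notebook this could wrap the maintenance call, e.g. `run_with_retry(lambda: spark.sql("OPTIMIZE my_table"))`; check which conflict messages your environment actually produces before relying on the substring match.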
jollymon
by New Contributor II
  • 682 Views
  • 3 replies
  • 1 kudos

Resolved! Access notebooks parameters from a bash cell

How can I access a notebook parameter from a bash cell (%sh)? For Python I use dbutils.widgets.get('param'), and for SQL I can use :param. Is there something similar for bash?

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @jollymon, I believe there is no direct way to do this, but there are some workarounds. You can read the widgets in Python and set those values as environment variables, then use the shell to read those variables. Something like...

2 More Replies
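A sketch of the workaround described in the reply: read the widget in Python, export it as an environment variable, and read it back from a shell. The widget name is hypothetical, and `dbutils` is only available inside a Databricks notebook, so the sketch stands in a literal value; whether a `%sh` cell inherits the variable should be verified in your runtime.

```python
import os
import subprocess

# In a notebook you would read the widget first (hypothetical name):
#   param = dbutils.widgets.get("param")
# Here a literal stands in so the sketch runs anywhere.
param = "hello"

# Export it so child shells inherit it.
os.environ["PARAM"] = param

# A %sh cell (or any shell) can then read $PARAM.
result = subprocess.run(["sh", "-c", "echo $PARAM"],
                        capture_output=True, text=True)
print(result.stdout.strip())  # prints: hello
```

The key point is that `os.environ` changes made in the Python process are inherited by shells it spawns, which is what makes the widget value visible to shell commands.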
raypritha
by New Contributor II
  • 676 Views
  • 1 reply
  • 1 kudos

Resolved! Switch from Partner Academy to Customer Academy

I accidentally signed up for the partner academy when I should have signed up for the customer academy. How can I switch to the customer academy? My e-mail is the same as I use for this community platform.

Latest Reply
Advika
Community Manager
  • 1 kudos

Hello @raypritha! Please raise a ticket with the Databricks Support Team. They’ll be able to assist you with switching to the Customer Academy.

yvesbeutler
by New Contributor III
  • 1556 Views
  • 2 replies
  • 5 kudos

Resolved! run_if dependencies configuration within YAML

Hi guys, I have a workflow with various Python wheel tasks and one job task that calls another workflow. How can I prevent my original workflow from getting an unsuccessful state if the second workflow fails? These workflows are independent and shouldn't ...

dbx-issue.png
Latest Reply
eniwoke
Contributor II
  • 5 kudos

Hi @yvesbeutler, here is a sample of how I did it using Databricks Asset Bundles for notebook tasks:

resources:
  jobs:
    chained_jobs:
      name: chained-jobs
      tasks:
        - task_key: main
          notebook_task:
            notebook_path: /W...

1 More Reply
pgruetter
by Contributor
  • 33682 Views
  • 8 replies
  • 2 kudos

Resolved! How to use a Service Principal to connect Power BI to a Databricks SQL Warehouse

Hi all, I'm struggling to connect the Power BI service to a Databricks SQL Warehouse using a service principal. I'm mostly following this guide. I created a new app registration in AAD and created a client secret for it. Now I'm particularly struggling wi...

Latest Reply
Lone
New Contributor II
  • 2 kudos

Hello all, after successfully adding a service principal to Databricks and generating a client ID and client secret, I plan to use these credentials for authentication when configuring Databricks as a data source in Power BI. Could you please clar...

7 More Replies
HariPrasad1
by Databricks Partner
  • 4834 Views
  • 3 replies
  • 0 kudos

Jobs in Spark UI

Is there a way to get the URL where all the Spark jobs created in a specific notebook run can be found? I am creating an audit framework; the requirement is to get the Spark jobs of a specific task or notebook run so that we can d...

Latest Reply
eniwoke
Contributor II
  • 0 kudos

Hi @HariPrasad1, here is a way to get the job list (note: this works for non-serverless clusters):

from dbruntime.databricks_repl_context import get_context
cluster_id = spark.conf.get("spark.databricks.clusterUsageTags.clusterId")
workspaceUrl = spark.conf...

2 More Replies
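For completeness, a Spark UI link can be assembled from values like those read in the reply above. The path pattern below is an assumption about how the workspace deep-links to the Spark UI, and all values are hypothetical; verify the pattern against a link copied from your own workspace before relying on it.

```python
def spark_ui_url(workspace_url, cluster_id, spark_context_id):
    """Assemble a link to the Spark UI for a given cluster.
    The '#setting/sparkui/<cluster>/driver-<context>' path is an
    assumption; check it against your workspace."""
    return (f"https://{workspace_url}/#setting/sparkui/"
            f"{cluster_id}/driver-{spark_context_id}")

# Hypothetical values for illustration:
url = spark_ui_url("adb-123.4.azuredatabricks.net", "0101-abc", "555")
```

An audit framework could log this URL per task run alongside the job and run IDs, so each notebook run points straight at its own Spark jobs.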