Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Oliver_Angelil
by Valued Contributor II
  • 3854 Views
  • 2 replies
  • 0 kudos

Append-only table from non-streaming source in Delta Live Tables

I have a DLT pipeline where all tables are non-streaming (materialized views), except for the last one, which needs to be append-only and is therefore defined as a streaming table. The pipeline runs successfully on the first run. However, on the seco...

Latest Reply
nkarwa
New Contributor II
  • 0 kudos

@Oliver_Angelil - I was wondering if you found a solution? I have a similar use case. I want to create an archive table using DLT from a non-streaming source (MV). I would prefer a DLT solution. I was able to get it to work using a traditional merge approach (no...

1 More Reply
simonB2025
by New Contributor III
  • 2477 Views
  • 3 replies
  • 1 kudos

Resolved! Deploying Data Assets Bundle with VSCode Add-in

Deploying a bundle containing a Pipeline that references a DLT Notebook. In the YAML I am passing the relative path to the Notebook from the repository root (where the YAML lives). Deploying says 'success', but when validating, the Pipeline cannot find ...

Latest Reply
ilir_nuredini
Honored Contributor
  • 1 kudos

Hello @simonB2025, could you share a snippet of your folder/project structure and where the notebook resides, so we can suggest to you the exact solution? Thank you! Best, Ilir

2 More Replies
Miloud_G
by New Contributor III
  • 2876 Views
  • 1 reply
  • 1 kudos

Resolved! Connecting to Unity Catalog with Power BI

I created a group of Power BI users and granted consumer access permission to this group on Unity Catalog. I started a shared compute cluster (Standard_D4ds_v5). When trying to connect from Power BI, I can connect to tables only if I grant access to wor...

Latest Reply
radothede
Valued Contributor II
  • 1 kudos

Hi @Miloud_G, instead of granting full workspace access, you can grant the "Databricks SQL access" entitlement, which provides limited workspace access specifically for BI tools. Go to Workspace Settings → Identity and Access → Groups/Users/SPs Manage → Se...

smoortema
by Contributor
  • 1919 Views
  • 1 reply
  • 1 kudos

Resolved! Complex pipeline with many tasks and dependencies: orchestration in Jobs or in Notebook?

We need to set up a job that consists of several hundred tasks with many dependencies between each other. We are considering two different directions: 1. A Databricks job with tasks, with dependencies defined as code and deployed with Databricks asset b...

Latest Reply
radothede
Valued Contributor II
  • 1 kudos

Hi @smoortema, to the best of my knowledge: Option 1) You can create jobs that contain up to 1,000 tasks; however, it is recommended to split tasks into logical subgroups. Jobs with more than 100 tasks require API 2.2 and above. Jobs with a large number of tasks ...

Locomo_Dncr
by New Contributor
  • 1850 Views
  • 4 replies
  • 0 kudos

Time Travel vs Bronze historical archive

Hello, I am working on building a pipeline using the Medallion architecture. The source tables in this pipeline are overwritten each time the table is updated. In the bronze ingestion layer, I plan to append this new table to the current bronze table, addi...

Data Engineering
Medallion
Processing Time
Storage Costs
Time Travel
Latest Reply
MariuszK
Valued Contributor III
  • 0 kudos

Hi @Locomo_Dncr, time travel isn't recommended for storing historical data; it's for backup and audit purposes. You can store snapshot data or use SCD2 to keep history. "Databricks does not recommend using Delta Lake table history as a long-term backup solutio...

3 More Replies
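To illustrate the snapshot-append pattern the reply recommends (rather than relying on time travel), here is a minimal sketch in plain Python. In a real pipeline the rows would be appended to a bronze Delta table; the column names `batch_id` and `ingest_ts` are hypothetical, chosen only for the example.

```python
from datetime import datetime, timezone

def append_snapshot(bronze, snapshot, batch_id):
    """Append a full source snapshot to the bronze history,
    stamping each row with the batch id and ingest time."""
    ingest_ts = datetime.now(timezone.utc).isoformat()
    for row in snapshot:
        bronze.append({**row, "batch_id": batch_id, "ingest_ts": ingest_ts})
    return bronze

# The source table is overwritten between loads; bronze keeps every version.
bronze = []
append_snapshot(bronze, [{"id": 1, "name": "a"}], batch_id=1)
append_snapshot(bronze, [{"id": 1, "name": "b"}], batch_id=2)  # source overwritten
# bronze now holds both versions of id=1, one row per batch
```

Because every load is an append with its own batch id, the full history survives even though the source only ever shows its latest state, which is exactly what table history/time travel should not be relied on for long term.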
jar
by Contributor
  • 1804 Views
  • 6 replies
  • 0 kudos

Understanding infra costs of Databricks compute

Hi. I can see in our Azure cost analysis tool that a not insignificant part of our costs comes from the managed Databricks RG deployed with the workspace, and that it relates particularly to VMs (so compute, I assume?) and storage, the latter of which, tho...

Latest Reply
jar
Contributor
  • 0 kudos

Thank you all for your replies. The issue is not getting an overview of costs - I already have that from the Cost Management Export function in Azure, and by using the system.billing tables in Databricks. The issue is understanding the relation betwe...

5 More Replies
Sanjeeb2024
by Valued Contributor
  • 2127 Views
  • 4 replies
  • 2 kudos

Not able to get the location of the table in Databricks Free edition

Hi Team, I am exploring Delta table properties and want to see the internal folder structure of a Delta table. I created a Delta table using Databricks Free Edition, and when I executed the DESCRIBE EXTENDED command, the location field is coming...

Sanjeeb2024_0-1752760734723.png
Latest Reply
akshayt272
New Contributor II
  • 2 kudos

Then how can we see the versions of our Delta files?

3 More Replies
parthSundarka
by Databricks Employee
  • 2663 Views
  • 4 replies
  • 6 kudos

Job Failures Observed In DBR 13.3 due to changes in pip package - aiosignal

Problem Statement: A few jobs running on DBR 13.3 have started failing with the error below: TypeError: TypeVarTuple.__init__() got an unexpected keyword argument 'default'...

image (1) (1).png
Latest Reply
pradr
New Contributor II
  • 6 kudos

Pinning aiosignal==1.3.2 worked for me; are you sure the package you are building is using the pinned version?

3 More Replies
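One way to guard against this class of failure is to verify the pin at job start, before any work runs. This is a minimal, stdlib-only sketch; the package name and version come from the thread, everything else is illustrative.

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

def is_pinned(package, expected):
    """Check whether the environment matches the pinned version."""
    return installed_version(package) == expected

# After pinning, a guard like this at the top of the job would catch drift:
#   assert is_pinned("aiosignal", "1.3.2"), "aiosignal pin not applied"
```

A check like this turns a confusing downstream TypeError into an immediate, explicit failure when a cluster library or transitive dependency silently overrides the pin.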
vedant1611
by New Contributor
  • 3255 Views
  • 1 reply
  • 1 kudos

Resolved! How can I connect databricks apps with Unity Catalog?

Hi, I'm working on building a Databricks app to streamline several team tasks through a user-friendly data application. One key feature I’d like to implement is integrating Databricks Apps with Unity Catalog. Specifically, I want to display a dropdown ...

Latest Reply
ximenesfel
Databricks Employee
  • 1 kudos

Hi @vedant1611, yes, it is possible to integrate Databricks Apps with Unity Catalog to achieve the functionality you're describing. You can build a Databricks app, such as one using Streamlit, that dynamically queries Unity Catalog to list schemas, ta...

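As a rough illustration of the reply above, this sketch builds the `information_schema` query such a dropdown could run against a SQL warehouse. The catalog and schema names are hypothetical, and the query shape is an assumption based on standard Unity Catalog `information_schema` tables; verify the exact columns in your workspace.

```python
def list_tables_query(catalog, schema):
    """Build the information_schema query a dropdown could run
    against a SQL warehouse to list tables in a schema.
    Note: interpolating identifiers like this is fine for trusted
    inputs; use parameterized statements for anything user-supplied."""
    return (
        f"SELECT table_name FROM `{catalog}`.information_schema.tables "
        f"WHERE table_schema = '{schema}' ORDER BY table_name"
    )

# Hypothetical catalog/schema for illustration:
q = list_tables_query("main", "default")
```

The app would submit this query through its SQL warehouse connection and feed the result set into the dropdown widget.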
noorbasha534
by Valued Contributor II
  • 1138 Views
  • 5 replies
  • 4 kudos

Resolved! OPTIMIZE in parallel with actual data load

Dear all, if I understand correctly, OPTIMIZE cannot run in parallel with the actual data load. We see 'concurrent update' errors in our environment when this happens, due to which we are unable to dedicate a maintenance window for table health. And, I s...

Latest Reply
noorbasha534
Valued Contributor II
  • 4 kudos

@MariuszK @szymon_dybczak thanks both. appreciate your support.

4 More Replies
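Where OPTIMIZE and the loads cannot be fully separated in time, a common workaround is to retry the conflicting operation with backoff. This is a generic sketch, not a Databricks API: the exception names are matched as substrings of the error message (the usual Delta conflict classes), and the delays are illustrative.

```python
import random
import time

def run_with_retry(operation, max_attempts=5, base_delay=1.0,
                   retryable=("ConcurrentAppendException",
                              "ConcurrentModificationException")):
    """Run operation(), retrying with exponential backoff plus jitter
    when the error message looks like a Delta concurrency conflict."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as exc:
            conflict = any(name in str(exc) for name in retryable)
            if not conflict or attempt == max_attempts:
                raise  # non-retryable, or out of attempts
            # back off: base, 2x, 4x, ... plus up to base_delay of jitter
            time.sleep(base_delay * 2 ** (attempt - 1)
                       + random.random() * base_delay)
```

In a notebook this could wrap the maintenance call, e.g. `run_with_retry(lambda: spark.sql("OPTIMIZE my_table"))`; check which conflict messages your environment actually produces before relying on the substring match.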
jollymon
by New Contributor II
  • 682 Views
  • 3 replies
  • 1 kudos

Resolved! Access notebooks parameters from a bash cell

How can I access a notebook parameter from a bash cell (%sh)? For Python I use dbutils.widgets.get('param'), and for SQL I can use :param. Is there something similar for bash?

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @jollymon, I believe there is no direct way to do this, but there are some workarounds. You can read the widgets in Python and set those values as environment variables, then use the shell to read those variables. Something like...

2 More Replies
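A sketch of the workaround described in the reply: read the widget in Python, export it as an environment variable, and read it back from a shell. The widget name is hypothetical, and `dbutils` is only available inside a Databricks notebook, so the sketch stands in a literal value; whether a `%sh` cell inherits the variable should be verified in your runtime.

```python
import os
import subprocess

# In a notebook you would read the widget first (hypothetical name):
#   param = dbutils.widgets.get("param")
# Here a literal stands in so the sketch runs anywhere.
param = "hello"

# Export it so child shells inherit it.
os.environ["PARAM"] = param

# A %sh cell (or any shell) can then read $PARAM.
result = subprocess.run(["sh", "-c", "echo $PARAM"],
                        capture_output=True, text=True)
print(result.stdout.strip())  # prints: hello
```

The key point is that `os.environ` changes made in the Python process are inherited by shells it spawns, which is what makes the widget value visible to shell commands.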
raypritha
by New Contributor II
  • 676 Views
  • 1 reply
  • 1 kudos

Resolved! Switch from Partner Academy to Customer Academy

I accidentally signed up for the partner academy when I should have signed up for the customer academy. How can I switch to the customer academy? My e-mail is the same as I use for this community platform.

Latest Reply
Advika
Community Manager
  • 1 kudos

Hello @raypritha! Please raise a ticket with the Databricks Support Team. They’ll be able to assist you with switching to the Customer Academy.

yvesbeutler
by New Contributor III
  • 1556 Views
  • 2 replies
  • 5 kudos

Resolved! run_if dependencies configuration within YAML

Hi guys, I have a workflow with various Python wheel tasks and one job task that calls another workflow. How can I prevent my original workflow from getting an unsuccessful state if the second workflow fails? These workflows are independent and shouldn't ...

dbx-issue.png
Latest Reply
eniwoke
Contributor II
  • 5 kudos

Hi @yvesbeutler, here is a sample of how I did it using Databricks Asset Bundles for notebook tasks:

resources:
  jobs:
    chained_jobs:
      name: chained-jobs
      tasks:
        - task_key: main
          notebook_task:
            notebook_path: /W...

1 More Reply
pgruetter
by Contributor
  • 33682 Views
  • 8 replies
  • 2 kudos

Resolved! How to use a Service Principal to connect Power BI to a Databricks SQL Warehouse

Hi all, I'm struggling to connect the Power BI service to a Databricks SQL Warehouse using a service principal. I'm mostly following this guide. I created a new app registration in AAD and created a client secret for it. Now I'm particularly struggling wi...

Latest Reply
Lone
New Contributor II
  • 2 kudos

Hello all, after successfully adding a service principal to Databricks and generating a client ID and client secret, I plan to use these credentials for authentication when configuring Databricks as a data source in Power BI. Could you please clar...

7 More Replies
HariPrasad1
by Databricks Partner
  • 4834 Views
  • 3 replies
  • 0 kudos

Jobs in Spark UI

Is there a way to get the URL where all the Spark jobs created in a specific notebook run can be found? I am creating an audit framework; the requirement is to get the Spark jobs of a specific task or notebook run so that we can d...

Latest Reply
eniwoke
Contributor II
  • 0 kudos

Hi @HariPrasad1, here is a way to get the job list (note: this works for non-serverless clusters):

from dbruntime.databricks_repl_context import get_context
cluster_id = spark.conf.get("spark.databricks.clusterUsageTags.clusterId")
workspaceUrl = spark.conf...

2 More Replies
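For completeness, a Spark UI link can be assembled from values like those read in the reply above. The path pattern below is an assumption about how the workspace deep-links to the Spark UI, and all values are hypothetical; verify the pattern against a link copied from your own workspace before relying on it.

```python
def spark_ui_url(workspace_url, cluster_id, spark_context_id):
    """Assemble a link to the Spark UI for a given cluster.
    The '#setting/sparkui/<cluster>/driver-<context>' path is an
    assumption; check it against your workspace."""
    return (f"https://{workspace_url}/#setting/sparkui/"
            f"{cluster_id}/driver-{spark_context_id}")

# Hypothetical values for illustration:
url = spark_ui_url("adb-123.4.azuredatabricks.net", "0101-abc", "555")
```

An audit framework could log this URL per task run alongside the job and run IDs, so each notebook run points straight at its own Spark jobs.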