Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

PriyankaM1 (New Contributor II)
  • 1004 Views
  • 4 replies
  • 4 kudos

Cobrix Library

I need the Cobrix package free in my account. How do I install it, as there is only one compute? Do I need to email the admin? Also, I am not able to open a support ticket for this.

Latest Reply
szymon_dybczak (Esteemed Contributor III)

Hi @PriyankaM1, Assuming you're using a Unity Catalog compute cluster with Dedicated access mode (formerly called single user), to install that library you need to: grab the Maven coordinates -> za.co.absa.cobrix:spark-cobol_2.13:2.8.4, then go to the Compute sect...
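For illustration, a minimal sketch of reading a mainframe file with Cobrix once the Maven library above is installed on the cluster; the copybook and data paths are hypothetical placeholders.

```python
# Read a fixed-length EBCDIC file with Cobrix after installing
# za.co.absa.cobrix:spark-cobol_2.13:2.8.4 as a cluster library.
df = (
    spark.read.format("cobol")
    .option("copybook", "/Volumes/main/default/files/layout.cpy")  # COBOL copybook describing the record layout
    .load("/Volumes/main/default/files/mainframe_data.dat")        # placeholder path to the data file
)
df.show(5)
```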

3 More Replies
noorbasha534 (Valued Contributor II)
  • 677 Views
  • 5 replies
  • 1 kudos

EXPLAIN PLAN parser

Hello all, have you come across a freely available parser that does a good job of parsing explain plans for SQL queries in Databricks...

Latest Reply
WiliamRosa (Contributor III)

Got it. If execution is not an option, you can still extract columns without running the queries by parsing their AST. Use Spark's internal SQL parser (no execution): you can parse to an unresolved logical plan and walk the tree for Filter, Join, and Ag...
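As a minimal sketch of that approach (these are internal Catalyst APIs, so they may change between Spark versions; the query text is a hypothetical example):

```python
# Parse a query to an unresolved logical plan via the JVM gateway,
# without executing it.
sql_text = "SELECT t1.a FROM t1 JOIN t2 ON t1.id = t2.id WHERE t1.a > 10"

parser = spark._jsparkSession.sessionState().sqlParser()
plan = parser.parsePlan(sql_text)  # unresolved Catalyst LogicalPlan (JVM object)

# Inspect the tree; Filter, Join, and Aggregate nodes appear here.
print(plan.treeString())
```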

4 More Replies
ayush_273 (New Contributor)
  • 4277 Views
  • 6 replies
  • 1 kudos

Migrate transformations from snowflake to databricks

I want to migrate my data as well as the transformations I use to convert my raw data into BI data. Is there a way I can move those transformations to Databricks? Some of the transformations use native Snowflake functions. Thanks in advance.

Latest Reply
WiliamRosa (Contributor III)

Hi @thelogicplus, I’ve also used Travinto Technologies’ tool for this kind of migration.

5 More Replies
saicharandeepb (Contributor)
  • 682 Views
  • 1 reply
  • 0 kudos

Clarification on how Streaming Backlog Duration & Records are calculated

Hi all, I'm working on a dashboard for streaming observability and I'm trying to understand how some of the backlog metrics shown in Databricks are actually calculated. In particular, I'm looking at: Streaming Backlog (records): described as t...

Latest Reply
WiliamRosa (Contributor III)

Hi @saicharandeepb, here's what I found after reproducing this in Databricks (Auto Loader and the rate source) and inspecting lastProgress. TL;DR: Streaming Backlog (records) = how many input records are still discoverable but not yet committed by the co...
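A minimal sketch of pulling the raw numbers from lastProgress; the rate stream below is a toy placeholder (any active query, e.g. an Auto Loader stream, works the same), and the exact metric keys vary by source and runtime version.

```python
import json

# Hypothetical toy stream just to have an active query to inspect.
q = (
    spark.readStream.format("rate").option("rowsPerSecond", 10).load()
    .writeStream.format("noop").start()
)

progress = q.lastProgress  # dict describing the most recent micro-batch (None before the first batch)
if progress:
    print(json.dumps(progress, indent=2))
    for src in progress.get("sources", []):
        # Backlog-style counters, when reported, show up per source.
        print(src.get("numInputRows"), src.get("metrics"))
```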

javasquez (New Contributor)
  • 290 Views
  • 1 reply
  • 0 kudos

For each: nested JAR task can’t reference upstream taskValues

I'm creating a For each with a nested JAR task. init_job (notebook) sets dbutils.jobs.taskValues.set("my_value", ...), and the For each container depends on init_job, but the UI says: "Reference … can only be used in a task that depends on task 'init_...

Latest Reply
WiliamRosa (Contributor III)

Yep, this is a current Workflows limitation. Task value refs ({{tasks.<key>.values.<name>}}) only work if the consuming task directly depends on the producer. A For Each/group dependency doesn't count; nested tasks don't inherit those edges for interpo...
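For notebook child tasks (not JAR tasks), one possible workaround is fetching the value at runtime instead of using a {{tasks...}} reference; whether this applies depends on your task graph. A minimal sketch:

```python
# Fetch the task value set by init_job at runtime. debugValue is only
# used when the notebook is run interactively outside a job.
my_value = dbutils.jobs.taskValues.get(
    taskKey="init_job",   # task that called taskValues.set
    key="my_value",
    default=None,
    debugValue="test",
)
print(my_value)
```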

Nidhig (Contributor)
  • 844 Views
  • 1 reply
  • 1 kudos

Resolved! Databricks Badge Challenge

Hi Team, as part of the badges challenge from Databricks for partners, is the criterion that the badge come from Sales/Pre-Sales, not from Tech-Sales?

Latest Reply
Advika (Databricks Employee)

Hello @Nidhig! Please raise a ticket with the Databricks Support Team, they’ll be able to provide clarity on this topic.

jeremy98 (Honored Contributor)
  • 633 Views
  • 2 replies
  • 1 kudos

how to manage the databricks failure notifications to slack webhook?

Hi community, I'd like to handle the case when one of our Databricks jobs fails. From the documentation, I understand that the HTTP response from Databricks will look like this: { "event_type": "jobs.on_failure", "workspace_id": "your_workspace_id"...
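For context, a minimal sketch of a receiver that forwards such a payload to a Slack incoming webhook; the webhook URL is a placeholder, and any payload fields beyond event_type and workspace_id are assumptions.

```python
import json
import urllib.request

SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/XXX/YYY/ZZZ"  # hypothetical webhook

def handle_databricks_event(payload: dict) -> None:
    # Forward job-failure events to Slack; ignore other event types.
    if payload.get("event_type") != "jobs.on_failure":
        return
    text = f"Databricks job failed in workspace {payload.get('workspace_id')}"
    req = urllib.request.Request(
        SLACK_WEBHOOK_URL,
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)
```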

Latest Reply
jeremy98 (Honored Contributor)

Because this is not correct, I suppose, right?

1 More Replies
RPalmer (New Contributor III)
  • 4714 Views
  • 6 replies
  • 4 kudos

Confusion around the dollar param being flagged as deprecated

Over the past week we have seen a warning showing in our notebooks about the dollar param being deprecated, which has apparently been the case since runtime 7, but I cannot find any info about when it will actually be removed. Will the removal be t...

Latest Reply
BS_THE_ANALYST (Esteemed Contributor III)

Hey everyone, I've just had a look into this. I think the documentation is pretty clear on what's expected when using the legacy-style parameters: https://docs.databricks.com/aws/en/notebooks/legacy-widgets. @Yourstruly, I appreciate the frustration...
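A minimal sketch of moving off the legacy $param substitution in a Python notebook, assuming a runtime where spark.sql accepts named parameters (DBR 13+); the widget, table, and column names are placeholders.

```python
# Define and read the widget explicitly instead of relying on "$env"
# substitution in the query text.
dbutils.widgets.text("env", "dev")
env = dbutils.widgets.get("env")

# Named parameter marker (:env) replaces the legacy $env interpolation.
df = spark.sql("SELECT * FROM events WHERE env = :env", args={"env": env})
df.show()
```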

5 More Replies
Subhrajyoti (New Contributor)
  • 444 Views
  • 1 reply
  • 0 kudos

Issue in Databricks Cluster Configuration in Pre Prod Environment

Hi team, hope you are doing well! This is to share an incident regarding one of the difficulties we are facing with a UC-enabled cluster (both interactive and job clusters) in our pre-prod environment: the data is not getting refreshed properl...

Latest Reply
Sidhant07 (Databricks Employee)

Hi @Subhrajyoti, can you please try running REFRESH TABLE table_name when you encounter this issue? Can you also try disabling Delta caching and check whether it returns the correct result (spark.databricks.io.cache.enabled false)?
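A minimal sketch of those two checks; main.default.my_table is a placeholder for the affected table.

```python
# Disable the Databricks disk (Delta) cache for this session, then force
# a metadata refresh and re-read the table to verify freshness.
spark.conf.set("spark.databricks.io.cache.enabled", "false")
spark.sql("REFRESH TABLE main.default.my_table")
print(spark.table("main.default.my_table").count())
```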

bharathelsker (New Contributor)
  • 379 Views
  • 1 reply
  • 0 kudos

Resolved! Why does disabling Photon fix my ConcurrentDeleteDeleteException in Databricks?

I'm running a Databricks 15.4 LTS job with Photon acceleration enabled. I have a wrapper notebook that uses ThreadPoolExecutor to trigger multiple child notebooks in parallel. Each thread calls a function that runs a child notebook and updates an audit...

Latest Reply
Sidhant07 (Databricks Employee)

Hi @bharathelsker, a ConcurrentDeleteDeleteException occurs when a concurrent operation deletes a file that your operation also deletes. This can be caused by two concurrent compaction operations rewriting the same files. To further i...
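One common mitigation for parallel writers hitting the same table (independent of Photon) is retrying the conflicting write; a minimal sketch, with the table and column names as placeholders and assuming delta-spark's Python exception bindings are available.

```python
import time
from delta.exceptions import ConcurrentDeleteDeleteException

def update_audit(run_id: str, status: str, max_retries: int = 5) -> None:
    # Retry the audit-table update when a concurrent commit rewrote the
    # same files; exponential backoff between attempts.
    for attempt in range(max_retries):
        try:
            spark.sql(
                "UPDATE audit_log SET status = :s WHERE run_id = :r",
                args={"s": status, "r": run_id},
            )
            return
        except ConcurrentDeleteDeleteException:
            time.sleep(2 ** attempt)
    raise RuntimeError("audit update kept conflicting after retries")
```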

Ovasheli (New Contributor)
  • 368 Views
  • 1 reply
  • 0 kudos

How to Get CDF Metadata from an Overwritten Batch Source in DLT?

Hello, I'm working on a Delta Live Tables pipeline and need help with a data source challenge. My source tables are batch-loaded SCD2 tables with CDF (Change Data Feed) enabled. These tables are updated daily using a complete overwrite operation. For my...

Latest Reply
Sidhant07 (Databricks Employee)

Hi @Ovasheli, I believe the error message would be something like the one below. Error: com.databricks.sql.transaction.tahoe.DeltaUnsupportedOperationException: [DELTA_SOURCE_TABLE_IGNORE_CHANGES] Detected a data update (for example DELETE (Map(predicate - ...
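If the stream should simply tolerate the rewrites, a minimal sketch using skipChangeCommits (available on recent Databricks runtimes); note that the stream will then see appended data only, and the table name is a placeholder.

```python
# Ignore commits that update/delete existing rows instead of failing the
# stream. Caveat: with full overwrites, rewritten rows will NOT be emitted.
df = (
    spark.readStream
    .option("skipChangeCommits", "true")
    .table("catalog.schema.scd2_source")
)
```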

turagittech (Contributor)
  • 660 Views
  • 1 reply
  • 1 kudos

Credential Sharing Across Cluster Nodes - spark.conf()

Hi All, I am struggling to understand how to manage credentials for Azure Storage across the cluster when trying to use Azure Python libraries within functions that may end up on the cluster worker nodes. I am building a task to load blobs from Azure stora...

Latest Reply
Sidhant07 (Databricks Employee)

Hi @turagittech, I found a KB article related to this error. Let me know if this helps: https://kb.databricks.com/data-sources/keyproviderexception-error-when-trying-to-create-an-external-table-on-an-external-schema-with-authentication-at-the-noteboo...
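The pattern that KB describes, roughly, is configuring authentication at the cluster (or session) level rather than per notebook, so executors share the same credentials as the driver. A minimal sketch of the standard ABFS OAuth configs, assuming a service principal stored in a hypothetical secret scope named kv:

```python
storage = "mystorageacct"  # placeholder storage account name

# Session-level ABFS OAuth configs; on a shared cluster these would
# normally go into the cluster's Spark config instead.
spark.conf.set(f"fs.azure.account.auth.type.{storage}.dfs.core.windows.net", "OAuth")
spark.conf.set(
    f"fs.azure.account.oauth.provider.type.{storage}.dfs.core.windows.net",
    "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
)
spark.conf.set(
    f"fs.azure.account.oauth2.client.id.{storage}.dfs.core.windows.net",
    dbutils.secrets.get("kv", "sp-client-id"),
)
spark.conf.set(
    f"fs.azure.account.oauth2.client.secret.{storage}.dfs.core.windows.net",
    dbutils.secrets.get("kv", "sp-client-secret"),
)
spark.conf.set(
    f"fs.azure.account.oauth2.client.endpoint.{storage}.dfs.core.windows.net",
    "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
)
```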

JothyGanesan (New Contributor III)
  • 949 Views
  • 5 replies
  • 2 kudos

Resolved! History Retention in DLT table (Delta Live Table)

Hi All, we have a DLT table as our curated layer with APPLY CHANGES. The DLT pipeline runs in continuous mode for real-time streaming ingestion. There is a regulatory requirement to retain only 1 year of data in the DLT table and to move th...

Latest Reply
Ranga_naik1180 (New Contributor III)

@szymon_dybczak, can you please suggest how we can delete records? We have an SCD2 target table (Silver), and on top of that SCD2 table we have another SCD2 target table (Gold layer). The idea was: if I delete a row in the Silver table, how can it pro...

4 More Replies
Datalight (Contributor)
  • 1011 Views
  • 9 replies
  • 2 kudos

Adobe Campaign to Azure Databricks file transfer

I have to create a data pipeline which pushes data (2. JSON FILE) from the Adobe source using a cron job to ADLS Gen2. 1. How will my ADLS Gen2 know a new file came into the container from Adobe? I am using Databricks as the orchestrator and ETL tool. 2. What all ...

Latest Reply
nachoBot (New Contributor II)

Datalight, with regards to 1), I see that you are using the medallion architecture. Have you considered using Auto Loader for the detection and ingestion of new files in ADLS Gen2?
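A minimal sketch of Auto Loader picking up newly landed JSON files; all container names, paths, and the target table are placeholders.

```python
# Incrementally detect and ingest new JSON files from ADLS Gen2.
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "abfss://bronze@acct.dfs.core.windows.net/_schemas/adobe")
    .load("abfss://landing@acct.dfs.core.windows.net/adobe/")
)

(
    df.writeStream
    .option("checkpointLocation", "abfss://bronze@acct.dfs.core.windows.net/_chk/adobe")
    .trigger(availableNow=True)  # process all pending files, then stop
    .toTable("bronze.adobe_raw")
)
```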

8 More Replies
Anubhav2603 (New Contributor)
  • 558 Views
  • 1 reply
  • 0 kudos

DLT Pipeline Question

I am new to DLT and trying to understand the process. My bronze table will receive incremental data from SAP in real time. In my bronze table we are not maintaining any history, and any data older than 2 weeks will be deleted. This data from bronze w...

Latest Reply
Louis_Frolio (Databricks Employee)

When loading data from SAP, how will you determine which records are new? With Lakeflow, incremental loads from cloud storage or another Delta table are fully automated. However, when pulling directly from SAP, Lakeflow does not have visibility into ...
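One common pattern when the source can expose a change timestamp is a high-watermark pull; a minimal sketch under that assumption, with hypothetical table, column, secret, and JDBC details.

```python
# Read only rows newer than the latest timestamp already landed in bronze.
last_ts = (
    spark.sql("SELECT max(change_ts) AS ts FROM bronze.sap_orders").first()["ts"]
    or "1900-01-01 00:00:00"
)

new_rows = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sap://sap-host:30015")  # placeholder SAP HANA endpoint
    .option("query", f"SELECT * FROM ORDERS WHERE CHANGE_TS > '{last_ts}'")
    .option("user", dbutils.secrets.get("kv", "sap-user"))
    .option("password", dbutils.secrets.get("kv", "sap-password"))
    .load()
)

new_rows.write.mode("append").saveAsTable("bronze.sap_orders")
```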

