Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

WiliamRosa
by Databricks Partner
  • 1159 Views
  • 5 replies
  • 10 kudos

Databricks Hungary Meetup

Details: Free event, registration is required: https://datapao.com/databricks-hungary-meetup/
SUMMARY: Join DATAPAO for the autumn edition of our Databricks Hungary Meetup. Join us and our exciting guests from Databricks for talks, product updates and ne...

Latest Reply
BS_THE_ANALYST
Databricks Partner
  • 10 kudos

@WiliamRosa, if you make any material available after the meeting, I'd be interested to take a look! I hope you have a great time. All the best, BS

4 More Replies
PriyankaM1
by New Contributor II
  • 1355 Views
  • 4 replies
  • 4 kudos

Cobrix Library

I need the Cobrix package (free) in my account. How do I install it, given there is only one compute? Do I need to email the admin? Also, I am not able to open a support ticket for this.

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 4 kudos

Hi @PriyankaM1, Assuming you're using a Unity Catalog compute cluster with Dedicated access mode (formerly called single user), to install that library you need to:
- grab the Maven coordinates -> za.co.absa.cobrix:spark-cobol_2.13:2.8.4
- go to the Compute sect...
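If you prefer to script the install rather than click through the UI, the same Maven coordinate can be sent to the Libraries API; a minimal sketch (the cluster ID is a placeholder, and this assumes you have permission to attach libraries to that cluster):

```json
{
  "cluster_id": "<your-cluster-id>",
  "libraries": [
    {"maven": {"coordinates": "za.co.absa.cobrix:spark-cobol_2.13:2.8.4"}}
  ]
}
```

POSTing this body to /api/2.0/libraries/install installs the package on the cluster; the UI route described above achieves the same thing.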

3 More Replies
noorbasha534
by Valued Contributor II
  • 1034 Views
  • 5 replies
  • 1 kudos

EXPLAIN PLAN parser

Hello all, have you come across a freely available parser that does a good job of parsing explain plans for SQL queries in Databricks...

Latest Reply
WiliamRosa
Databricks Partner
  • 1 kudos

Got it. If execution is not an option, you can still extract columns without running the queries by parsing their AST. Use Spark’s internal SQL parser (no execution): you can parse to an unresolved logical plan and walk the tree for Filter, Join, and Ag...
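To illustrate the tree-walk idea without a cluster, here is a minimal pure-Python sketch; the nested-dict "plan" is a toy stand-in for the unresolved logical plan (real Catalyst nodes have different shapes, so the op/columns/children keys here are invented for illustration):

```python
def collect_columns(node, wanted=("Filter", "Join", "Aggregate")):
    """Walk a toy logical-plan tree and collect the column names
    referenced by Filter/Join/Aggregate nodes."""
    found = set()
    if not isinstance(node, dict):
        return found
    if node.get("op") in wanted:
        found.update(node.get("columns", []))
    for child in node.get("children", []):
        found |= collect_columns(child, wanted)
    return found

# Toy stand-in for a parsed plan of:
#   SELECT country, count(*) FROM orders WHERE order_date > '2024-01-01' GROUP BY country
plan = {
    "op": "Aggregate", "columns": ["country"],
    "children": [
        {"op": "Filter", "columns": ["order_date"],
         "children": [{"op": "Relation", "children": []}]}
    ],
}
print(sorted(collect_columns(plan)))  # ['country', 'order_date']
```

The same traversal applies to the real plan; you would just pattern-match on Catalyst node classes instead of dict keys.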

4 More Replies
ayush_273
by New Contributor
  • 5856 Views
  • 6 replies
  • 1 kudos

Migrate transformations from snowflake to databricks

I want to migrate my data as well as the transformations I use to convert my raw data into BI data. Is there a way I can move those transformations to Databricks? Some of the transformations use native Snowflake functions. Thanks in advance.

Latest Reply
WiliamRosa
Databricks Partner
  • 1 kudos

Hi @thelogicplus, I’ve also used Travinto Technologies’ tool for this kind of migration.

5 More Replies
saicharandeepb
by Contributor
  • 1013 Views
  • 1 reply
  • 0 kudos

Clarification on how Streaming Backlog Duration & Records are calculated

Hi all, I’m working on preparing a dashboard for streaming observability and I’m trying to understand how some of the backlog metrics shown in Databricks are actually calculated. In particular, I’m looking at: Streaming Backlog (records), described as t...

Latest Reply
WiliamRosa
Databricks Partner
  • 0 kudos

Hi @saicharandeepb, Here’s what I found after reproducing this in Databricks (Auto Loader and the rate source) and inspecting lastProgress. TL;DR: Streaming Backlog (records) = how many input records are still discoverable but not yet committed by the co...
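As a sketch of how a dashboard might read those numbers, here is a pure-Python helper over a lastProgress-style dict; numFilesOutstanding and numBytesOutstanding are the Auto Loader backlog metrics, and the sample values below are made up:

```python
def backlog_summary(progress):
    """Sum the outstanding-work metrics across sources in a
    lastProgress-style dict. In real progress events the metric
    values arrive as strings, hence the int() casts."""
    out = {"files": 0, "bytes": 0}
    for src in progress.get("sources", []):
        m = src.get("metrics", {})
        out["files"] += int(m.get("numFilesOutstanding", 0))
        out["bytes"] += int(m.get("numBytesOutstanding", 0))
    return out

sample = {"sources": [{"metrics": {"numFilesOutstanding": "12",
                                   "numBytesOutstanding": "3456789"}}]}
print(backlog_summary(sample))  # {'files': 12, 'bytes': 3456789}
```

The exact metric keys depend on the source (Kafka exposes offset lags instead), so check lastProgress for your stream before wiring up the dashboard.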

javasquez
by New Contributor
  • 388 Views
  • 1 reply
  • 0 kudos

For each: nested JAR task can’t reference upstream taskValues

I’m creating a For each with a nested JAR task. init_job (notebook) sets dbutils.jobs.taskValues.set("my_value", ...), and the For each container depends on init_job. But the UI says: “Reference … can only be used in a task that depends on task 'init_...

Latest Reply
WiliamRosa
Databricks Partner
  • 0 kudos

Yep, this is a current Workflows limitation. Task value refs ({{tasks.<key>.values.<name>}}) only work if the consuming task directly depends on the producer. A For Each/group dependency doesn’t count; nested tasks don’t inherit those edges for interpo...
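One workaround is to resolve the reference at the For Each container, which does depend on init_job, and hand the value to the nested task through the loop input. A hedged sketch of the Jobs API shape (field names should be verified against the current Jobs API docs; the main class is a placeholder):

```json
{
  "task_key": "loop",
  "depends_on": [{"task_key": "init_job"}],
  "for_each_task": {
    "inputs": "[\"{{tasks.init_job.values.my_value}}\"]",
    "task": {
      "task_key": "run_jar",
      "spark_jar_task": {
        "main_class_name": "com.example.Main",
        "parameters": ["{{input}}"]
      }
    }
  }
}
```

The reference is interpolated at the container level, so the nested JAR task only ever sees the already-resolved {{input}}.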

Nidhig
by Databricks Partner
  • 1050 Views
  • 1 reply
  • 1 kudos

Resolved! Databricks Badge Challenge

Hi Team, as part of the badges challenge from Databricks for partners, is the criterion that the badge comes from Sales/Pre-Sales and not from Tech-Sales?

Latest Reply
Advika
Community Manager
  • 1 kudos

Hello @Nidhig! Please raise a ticket with the Databricks Support Team; they’ll be able to provide clarity on this topic.

jeremy98
by Honored Contributor
  • 1234 Views
  • 2 replies
  • 1 kudos

how to manage the databricks failure notifications to slack webhook?

Hi community, I’d like to handle the case when one of our Databricks jobs fails. From the documentation, I understand that the HTTP payload from Databricks will look like this: { "event_type": "jobs.on_failure", "workspace_id": "your_workspace_id"...
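For the forwarding side, a minimal sketch of the glue code: only event_type and workspace_id appear in the payload above, so the job/run fields below are hypothetical placeholders, and the Slack side is just the standard incoming-webhook text body:

```python
import json

def slack_body(event):
    """Turn a Databricks job-failure event into a Slack webhook payload.
    Only event_type/workspace_id come from the documented payload; the
    job.name and run_id keys are hypothetical placeholders."""
    job = event.get("job", {})
    text = (f":red_circle: Job `{job.get('name', 'unknown')}` failed "
            f"(workspace {event.get('workspace_id', '?')}, "
            f"run {event.get('run_id', '?')})")
    return json.dumps({"text": text})

event = {"event_type": "jobs.on_failure", "workspace_id": "123",
         "job": {"name": "nightly_etl"}, "run_id": "42"}
print(slack_body(event))
```

The resulting JSON string is what you would POST to the Slack incoming-webhook URL.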

Latest Reply
jeremy98
Honored Contributor
  • 1 kudos

Because this is not correct, I suppose, right?

1 More Replies
RPalmer
by New Contributor III
  • 5897 Views
  • 6 replies
  • 4 kudos

Confusion around the dollar param being flagged as deprecated

Over the past week we have seen a warning in our notebooks about the dollar param being deprecated, which has apparently been the case since runtime 7, but I cannot find any info about when it will actually be removed. Will the removal be t...

Latest Reply
BS_THE_ANALYST
Databricks Partner
  • 4 kudos

Hey everyone, I've just had a look into this. I think the documentation seems pretty clear on what's expected when using the legacy-style parameters: https://docs.databricks.com/aws/en/notebooks/legacy-widgets. @Yourstruly, I appreciate the frustration...

5 More Replies
Subhrajyoti
by Databricks Partner
  • 555 Views
  • 1 reply
  • 0 kudos

Issue in Databricks Cluster Configuration in Pre Prod Environment

Hi team, hope you are doing well! This is just to share one incident regarding a difficulty we are facing with a UC-enabled cluster (both interactive and job clusters) in our pre-prod environment: the data is not getting refreshed properl...

Latest Reply
Sidhant07
Databricks Employee
  • 0 kudos

Hi @Subhrajyoti, can you please try running REFRESH TABLE table_name when you encounter this issue? Can you also try disabling Delta caching (spark.databricks.io.cache.enabled false) and check whether it returns the correct result?

bharathelsker
by New Contributor
  • 520 Views
  • 1 reply
  • 0 kudos

Resolved! Why does disabling Photon fix my ConcurrentDeleteDeleteException in Databricks?

I’m running a Databricks 15.4 LTS job with Photon acceleration enabled. I have a wrapper notebook that uses ThreadPoolExecutor to trigger multiple child notebooks in parallel. Each thread calls a function that runs a child notebook and updates an audit...

Latest Reply
Sidhant07
Databricks Employee
  • 0 kudos

Hi @bharathelsker, a ConcurrentDeleteDeleteException occurs when a concurrent operation deleted a file that your operation also deletes. This could be caused by two concurrent compaction operations rewriting the same files. To further i...
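Since these conflicts are often transient, parallel writers are commonly wrapped in a retry with backoff. A minimal sketch (matching on the exception message here is a stand-in; in a real job you would catch the specific Delta concurrent-modification exception class):

```python
import random
import time

def with_retries(fn, attempts=4, base=1.0):
    """Retry fn on write-conflict errors with jittered exponential
    backoff; re-raise anything else, or the last conflict."""
    for i in range(attempts):
        try:
            return fn()
        except Exception as e:
            if "Concurrent" not in str(e) or i == attempts - 1:
                raise
            # scaled down for the demo; use seconds in a real job
            time.sleep(base * (2 ** i) * random.random() * 0.01)

# Simulated writer that conflicts twice, then succeeds.
calls = {"n": 0}
def flaky_write():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("ConcurrentDeleteDeleteException (simulated)")
    return "ok"

print(with_retries(flaky_write))  # ok
```

Partitioning the audit updates so each thread rewrites disjoint files (and avoiding overlapping OPTIMIZE runs) reduces how often the retry is needed.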

Ovasheli
by New Contributor
  • 481 Views
  • 1 reply
  • 0 kudos

How to Get CDF Metadata from an Overwritten Batch Source in DLT?

Hello, I'm working on a Delta Live Tables pipeline and need help with a data source challenge. My source tables are batch-loaded SCD2 tables with CDF (Change Data Feed) enabled. These tables are updated daily using a complete overwrite operation. For my...

Latest Reply
Sidhant07
Databricks Employee
  • 0 kudos

Hi @Ovasheli, I believe the error message would be something like the one below. Error: com.databricks.sql.transaction.tahoe.DeltaUnsupportedOperationException: [DELTA_SOURCE_TABLE_IGNORE_CHANGES] Detected a data update (for example DELETE (Map(predicate - ...

turagittech
by Contributor
  • 925 Views
  • 1 reply
  • 1 kudos

Credential Sharing Across Cluster Nodes - spark.conf()

Hi All, I am struggling to understand how to manage credentials for Azure storage across the cluster when trying to use Azure Python libraries within functions that may end up on the cluster worker nodes. I am building a task to load blobs from Azure stora...

Latest Reply
Sidhant07
Databricks Employee
  • 1 kudos

Hi @turagittech, I found a KB article related to this error. Let me know if this helps: https://kb.databricks.com/data-sources/keyproviderexception-error-when-trying-to-create-an-external-table-on-an-external-schema-with-authentication-at-the-noteboo...
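A common way to avoid shipping credentials into worker-side functions is to set the ADLS OAuth configuration at the cluster level so every node inherits it. A sketch with placeholder account, tenant, and secret-scope names (keys follow the ADLS Gen2 OAuth docs; verify against your setup):

```
fs.azure.account.auth.type.<storage-account>.dfs.core.windows.net OAuth
fs.azure.account.oauth.provider.type.<storage-account>.dfs.core.windows.net org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider
fs.azure.account.oauth2.client.id.<storage-account>.dfs.core.windows.net <application-id>
fs.azure.account.oauth2.client.secret.<storage-account>.dfs.core.windows.net {{secrets/<scope>/<key>}}
fs.azure.account.oauth2.client.endpoint.<storage-account>.dfs.core.windows.net https://login.microsoftonline.com/<tenant-id>/oauth2/token
```

With this in the cluster's Spark config, abfss:// paths resolve on driver and workers alike, without passing secrets into the functions themselves.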

JothyGanesan
by New Contributor III
  • 1339 Views
  • 5 replies
  • 2 kudos

Resolved! History Retention in DLT table (Delta Live Table)

Hi All, we have the DLT table as our curated layer with apply changes. The DLT pipeline runs in continuous mode for streaming real-time data ingestion. There is a regulatory requirement to retain only 1 year of data in the DLT table and to move th...

Latest Reply
Ranga_naik1180
Databricks Partner
  • 2 kudos

@szymon_dybczak, can you please suggest how we can delete records? We have an SCD2 target table (Silver), and on top of that SCD2 table we have another SCD2 target table (Gold layer). The idea was: if I delete a row in the Silver table, how can it pro...

4 More Replies
Datalight
by Contributor
  • 2026 Views
  • 9 replies
  • 2 kudos

Adobe Campaign to Azure Databricks file transfer

I have to create a data pipeline which pushes data (2. JSON FILE) from source Adobe to ADLS Gen2 using a cron job. 1. How will my ADLS Gen2 know a new file came into the container from Adobe? I am using Databricks as the orchestrator and ETL tool. 2. What all ...

Latest Reply
nachoBot
New Contributor II
  • 2 kudos

Datalight, with regards to 1): I see that you are using the Medallion Architecture. Have you considered using Auto Loader for the detection and ingestion of new files in ADLS Gen2?
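As a sketch of what that looks like, the detection piece reduces to a couple of cloudFiles options (the paths are placeholders, and the readStream call is shown as a comment because it needs a cluster):

```python
def autoloader_options(schema_path, fmt="json"):
    """Build the cloudFiles options Auto Loader needs to detect and
    ingest new files landing in ADLS Gen2. Option names follow the
    Auto Loader docs; paths here are placeholders."""
    return {
        "cloudFiles.format": fmt,
        "cloudFiles.schemaLocation": schema_path,
    }

opts = autoloader_options("abfss://meta@<account>.dfs.core.windows.net/adobe/_schema")
# In a notebook (sketch):
# spark.readStream.format("cloudFiles").options(**opts) \
#      .load("abfss://landing@<account>.dfs.core.windows.net/adobe/")
```

Auto Loader tracks which blobs it has already seen, so the cron-dropped Adobe files are picked up incrementally without any extra "new file" signalling.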

8 More Replies