Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

ChristianRRL
by Honored Contributor
  • 9 Views
  • 1 reply
  • 1 kudos

Resolved! Serverless Compute Spark Version Flexibility?

Hi there, I'm wondering what determines the Serverless Compute Spark version. Is it based on the current DBR LTS? And is there a way to modify the Spark version for serverless compute? For example, when I check the Spark version for our serverless com...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @ChristianRRL, the answer is simple: serverless compute always runs on the latest runtime version. You cannot choose it like in standard compute. See "Connect to serverless compute" in the Databricks on AWS docs. In serverless compute you can only choose different en...
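A hedged sketch of the one version knob serverless jobs do expose, the environment version, expressed as a Jobs API payload. The job name, version string, and dependency below are assumptions for illustration, and the field names follow my reading of the serverless environment spec, not this thread:

```python
# Sketch only: on serverless you pick an *environment* (client) version, not a
# DBR/Spark version -- the runtime itself always tracks the latest serverless release.
job_payload = {
    "name": "my-serverless-job",  # hypothetical job name
    "environments": [
        {
            "environment_key": "default",
            "spec": {
                "client": "2",  # serverless environment version (assumed value)
                "dependencies": ["my-package==1.0"],  # hypothetical pip dependency
            },
        }
    ],
}
```

Tasks in the job would then reference `environment_key: default` rather than a cluster spec.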

PabloCSD
by Valued Contributor II
  • 75 Views
  • 2 replies
  • 0 kudos

How to use/install a driver in Spark Declarative Pipelines (ETL)?

Salutations, I'm using SDP for an ETL that extracts data from HANA and puts it into Unity Catalog. I defined a policy with the needed driver, but I get this error: An error occurred while calling o1013.load. : java.lang.ClassNotFoundException: com.sap....

Latest Reply
anshu_roy
Databricks Employee
  • 0 kudos

Hello, it is recommended that you upload libraries to source locations that support installation onto compute with standard access mode (formerly shared access mode), as this is the recommended mode for all workloads. Please refer to the documentation ...

1 More Replies
ChristianRRL
by Honored Contributor
  • 75 Views
  • 4 replies
  • 7 kudos

Resolved! Testing Spark Declarative Pipeline in Docker Container > PySparkRuntimeError

Hi there, I saw via an announcement last year that Spark Declarative Pipelines (previously DLT) was getting open sourced into Apache Spark, and I see that this is now true as of Apache Spark 4.1 (Spark Declarative Pipelines Programming Guide). I'm trying ...

Latest Reply
aleksandra_ch
Databricks Employee
  • 7 kudos

Hi @ChristianRRL, in addition to @osingh's answers, check out this old but good blog post about how to structure the pipeline's code to enable a dev and test cycle: https://www.databricks.com/blog/applying-software-development-devops-best-practices-d...

3 More Replies
Chandana_Ramesh
by New Contributor II
  • 81 Views
  • 3 replies
  • 1 kudos

Lakebridge SetUp Issue

Hi, I'm getting the below error upon executing the databricks labs lakebridge analyze command. All the dependencies were installed before executing the command. Can someone please give a solution, or suggest if anything is missing? Below attached ...

Latest Reply
aleksandra_ch
Databricks Employee
  • 1 kudos

Hi Chandana_Ramesh, please rerun the command with the --debug flag and share the command and the whole output. From the message you shared, it looks like the Analyzer.exe binary is not accessible. Verify the binary exists and is accessible: C:\Use...

2 More Replies
Anish_2
by New Contributor II
  • 47 Views
  • 2 replies
  • 0 kudos

Databricks workflow design

Hello Team, I have a use case in which I want to trigger another DLT pipeline when one table succeeds in my parent DLT pipeline. I don't want to create a pipeline-to-pipeline dependency. Is there any way to create a table-to-pipeline dependency? Thank you, Anis...

Data Engineering
deltalivetable
workflowdesign
Latest Reply
Raman_Unifeye
Contributor III
  • 0 kudos

@Anish_2 - A table update trigger (TUT) is the solution. With a TUT, instead of the parent pipeline "pushing" a notification, the child job is "pulled" into action by a metadata change. Set it up as below: create a Databricks Job and add a Pipeline task pointing to your secondary ...
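The setup described above can be sketched as a Jobs API payload. This is a sketch under assumptions: the table name and pipeline id are hypothetical, and the trigger field names follow my reading of the Jobs API trigger settings, not this thread:

```python
# Sketch: a job that is "pulled" into action when the watched table is updated,
# instead of the parent pipeline "pushing" a notification.
job_payload = {
    "name": "trigger-child-pipeline",
    "trigger": {
        "pause_status": "UNPAUSED",
        "table_update": {
            # Fully qualified UC table produced by the parent pipeline (hypothetical name)
            "table_names": ["main.sales.orders_silver"],
            "condition": "ALL_UPDATED",  # fire only after all listed tables update
        },
    },
    "tasks": [
        {
            "task_key": "run_child_pipeline",
            "pipeline_task": {"pipeline_id": "<child-pipeline-id>"},  # placeholder id
        }
    ],
}
```

This keeps the parent pipeline unaware of the child: the dependency lives on the table, not between pipelines.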

1 More Replies
Alf01
by New Contributor
  • 114 Views
  • 1 reply
  • 0 kudos

Databricks Serverless Pipelines - Incremental Refresh Doubts

Hello everyone, I would like to clarify some doubts regarding how Databricks pipelines (DLT) behave when using serverless pipelines with incremental updates. In general, incremental processing is enabled and works as expected. However, I have observed ...

Latest Reply
aleksandra_ch
Databricks Employee
  • 0 kudos

Hi Alf01, and welcome to the Databricks Community! The Lakeflow Spark Declarative Pipelines (SDP) cost model considers multiple factors when deciding whether to perform an incremental refresh or a full recompute. It makes a best-effort attempt to increm...

Dhruv-22
by Contributor II
  • 129 Views
  • 4 replies
  • 0 kudos

Feature request: Allow to set value as null when not present in schema evolution

I want to raise a feature request as follows. Currently, with automatic schema evolution for MERGE, when a column is not present in the source dataset it is left unchanged in the target dataset. For example: %sql CREATE OR REPLACE TABLE edw_nprd_aen.bronze.t...

Latest Reply
ManojkMohan
Honored Contributor II
  • 0 kudos

@Dhruv-22 Problem: when using MERGE INTO ... WITH SCHEMA EVOLUTION, if a column exists in the target table but is not present in the source dataset, that column is left unchanged on matched rows. Solution thinking: this can be emulated by introspecting th...
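As a sketch of that emulation (the helper and column names are hypothetical, not from the thread): introspect the target and source schemas, and generate an explicit UPDATE SET clause that assigns NULL to any target column the source no longer carries, instead of relying on schema evolution's default of leaving it unchanged:

```python
def build_update_set(target_cols, source_cols):
    """Build the UPDATE SET assignments for a MERGE:
    columns present in the source are taken from the source row;
    target columns absent from the source are explicitly set to NULL."""
    source = {c.lower() for c in source_cols}  # case-insensitive, as Spark SQL resolves names
    assignments = []
    for col in target_cols:
        if col.lower() in source:
            assignments.append(f"t.{col} = s.{col}")
        else:
            assignments.append(f"t.{col} = NULL")
    return ", ".join(assignments)

# Example: the target has an extra 'discount' column the source no longer sends.
clause = build_update_set(["id", "amount", "discount"], ["id", "amount"])
# clause == "t.id = s.id, t.amount = s.amount, t.discount = NULL"
```

The generated clause can then be spliced into the `WHEN MATCHED THEN UPDATE SET ...` branch of the MERGE statement built from the live schemas.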

3 More Replies
Dhruv-22
by Contributor II
  • 93 Views
  • 3 replies
  • 0 kudos

Merge with schema evolution fails because of upper case columns

The following is a minimal reproducible example of what I'm facing right now: %sql CREATE OR REPLACE TABLE edw_nprd_aen.bronze.test_table ( id INT ); INSERT INTO edw_nprd_aen.bronze.test_table VALUES (1); SELECT * FROM edw_nprd_aen.bronze.test_tab...

Latest Reply
css-1029
New Contributor
  • 0 kudos

Hi @Dhruv-22, it's actually not a bug. Let me explain what's happening. The root cause: the issue stems from how schema evolution works with Delta Lake's MERGE statement, combined with Spark SQL's case-insensitivity settings. Here's the key insight: spark...
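One hedged workaround sketch (the helper name is hypothetical): rename the source columns to the exact casing used by the target schema before the merge, matching case-insensitively the way Spark SQL resolves names by default, so schema evolution only sees genuinely new columns rather than case variants of existing ones:

```python
def align_column_case(source_cols, target_cols):
    """Rename source columns to the exact casing used by the target table,
    matching case-insensitively (as Spark SQL does by default).
    Columns with no target counterpart keep their original spelling."""
    canonical = {c.lower(): c for c in target_cols}
    return [canonical.get(c.lower(), c) for c in source_cols]

renamed = align_column_case(["ID", "Name", "extra_COL"], ["id", "name"])
# renamed == ["id", "name", "extra_COL"]  -- only truly new columns keep their case
```

In PySpark the renamed list would feed a `df.toDF(*renamed)` call on the source DataFrame before running the MERGE.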

2 More Replies
NotCuriosAtAll
by New Contributor
  • 74 Views
  • 2 replies
  • 3 kudos

Resolved! Cluster crashes occasionally but not all of the time

We have a small cluster (Standard D2ads v6) with 8 GB of RAM and 2 cores. This is an all-purpose cluster and, for some reason, the client demands we use this one for our ETL process. The ETL process is simple: the client drops parquet files in the b...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 3 kudos

Hi @NotCuriosAtAll, you have an undersized cluster for your workload. This error is typical of driver nodes with such high CPU consumption. You can check the article below (and the related solution): Job run fails with error message "Could not reach driver of clu...

1 More Replies
JothyGanesan
by New Contributor III
  • 41 Views
  • 1 reply
  • 0 kudos

DLT Continuous Pipeline load

Hi All, in our project we are working on a DLT pipeline with DLT tables as the target, running in continuous mode. These tables are common to multiple countries, and we go live in batches for different countries. So, every time a new change is request...

Latest Reply
ManojkMohan
Honored Contributor II
  • 0 kudos

@JothyGanesan Use dynamic schema handling and selective table updates to apply metadata changes incrementally from the current watermark, preserving history across country go-lives. Replace static @dlt.table definitions with Auto Loader's schema infer...

bsr
by New Contributor II
  • 933 Views
  • 4 replies
  • 4 kudos

Resolved! DBR 17.3.3 introduced unexpected DEBUG logs from ThreadMonitor – how to disable?

After upgrading from DBR 17.3.2 to DBR 17.3.3, we started seeing a flood of DEBUG logs like this in job outputs:```DEBUG:ThreadMonitor:Logging python thread stack frames for MainThread and py4j threads: DEBUG:ThreadMonitor:Logging Thread-8 (run) stac...

Latest Reply
WAHID
New Contributor II
  • 4 kudos

@iyashk-DB We are currently using DBR 17.3 LTS, and the issue is still occurring. Do you know when the fix is expected to be applied? We need this information to decide whether we should wait for the fix or proceed with the workaround you propo...
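Until a fix ships, a minimal workaround sketch, assuming the noise comes from the standard Python logging module under the logger name visible in the output ("ThreadMonitor"), is to raise that logger's threshold at the top of the job:

```python
import logging

# Hedged workaround: suppress DEBUG records from the noisy logger by raising
# its level. The logger name "ThreadMonitor" is taken from the log lines in the post.
logging.getLogger("ThreadMonitor").setLevel(logging.WARNING)

# DEBUG records from this logger are now filtered out:
assert not logging.getLogger("ThreadMonitor").isEnabledFor(logging.DEBUG)
```

This only mutes the one logger rather than changing the root level, so other DEBUG output (if deliberately enabled) is unaffected.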

3 More Replies
rijin-thomas
by New Contributor II
  • 249 Views
  • 4 replies
  • 3 kudos

Mongo Db connector - Connection timeout when trying to connect to AWS Document DB

I am on Databricks Runtime LTS 14.3, Spark 3.5.0, Scala 2.12, and mongodb-spark-connector_2.12:10.2.0. Trying to connect to DocumentDB using the connector, all I get is a connection timeout. I tried using PyMongo, which works as expected, and I can ...

Latest Reply
Sanjeeb2024
Contributor III
  • 3 kudos

Hi @rijin-thomas - Can you please allow the CIDR block of the Databricks account VPC in the AWS DocumentDB security group (executor connectivity, as stated by @bianca_unifeye)?

3 More Replies
tvdh
by New Contributor
  • 56 Views
  • 1 reply
  • 0 kudos

Tab navigation between fields in dashboards is random

Tab navigation between fields in published dashboards seems very random. I have a dashboard with multiple text input fields (mapped to query parameters / filters). I expect to move logically between them when pressing Tab (keyboard navigation), but I mo...

Latest Reply
Advika
Community Manager
  • 0 kudos

Hello @tvdh! You can share this as product feedback so it’s visible to the Databricks product team and can be tracked and prioritized.

Upendra_Dwivedi
by Contributor
  • 3024 Views
  • 2 replies
  • 1 kudos

Resolved! Databricks APP OBO User Authorization

Hi All, we are using the on-behalf-of user authorization method for our app, and the x-forwarded-access-token is expiring after some time; we have to redeploy our app to rectify the issue. I am not sure what the issue is or how we can keep the token aliv...

Latest Reply
jpt
New Contributor
  • 1 kudos

I am confronted with a similar error. I am also using OBO user auth and have implemented accessing the token via st.context.headers.get('x-forwarded-access-token') for every query, and I do not save it in a cache. Still, after 1 hour, I am hit with the...

1 More Replies