Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Chandana_Ramesh
by New Contributor II
  • 66 Views
  • 3 replies
  • 1 kudos

Lakebridge Setup Issue

Hi, I'm getting the error below when executing the databricks labs lakebridge analyze command. All dependencies were installed before running the command. Can someone please suggest a solution, or point out if anything is missing? Below attached ...

Latest Reply
aleksandra_ch
Databricks Employee
  • 1 kudos

Hi Chandana_Ramesh, Please rerun the command with a --debug flag and share the command and the whole output. From the message that you shared, it looks like the Analyzer.exe binary is not accessible: Verify the binary exists and is accessible: C:\Use...

2 More Replies
Anish_2
by New Contributor II
  • 42 Views
  • 2 replies
  • 0 kudos

Databricks workflow design

Hello Team, I have a use case in which I want to trigger another DLT pipeline when one table succeeds in my parent DLT pipeline. I don't want to create a pipeline-to-pipeline dependency. Is there any way to create a table-to-pipeline dependency? Thank you, Anis...

Data Engineering
deltalivetable
workflowdesign
Latest Reply
Raman_Unifeye
Contributor III
  • 0 kudos

@Anish_2 - TUT is the solution. With a TUT, instead of the parent pipeline "pushing" a notification, the child job is "pulled" into action by a metadata change. Set it up as below: create a Databricks Job and add a Pipeline task pointing to your Secondary ...
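The "pulled" setup described above corresponds to a job-level table update trigger. A minimal sketch of the Jobs API payload, with a hypothetical table name and placeholder pipeline ID (trigger field names are as I recall them from the Jobs API table-update trigger; verify against your workspace's API docs before relying on them):

```json
{
  "name": "child-pipeline-on-table-update",
  "trigger": {
    "pause_status": "UNPAUSED",
    "table_update": {
      "table_names": ["main.bronze.orders"],
      "condition": "ANY_UPDATED"
    }
  },
  "tasks": [
    {
      "task_key": "run_secondary_pipeline",
      "pipeline_task": { "pipeline_id": "<secondary-pipeline-id>" }
    }
  ]
}
```

With this in place, any committed update to the watched table starts the job, so the parent pipeline never needs to know the child exists.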

1 More Replies
Alf01
by New Contributor
  • 105 Views
  • 1 reply
  • 0 kudos

Databricks Serverless Pipelines - Incremental Refresh Doubts

Hello everyone, I would like to clarify some doubts regarding how Databricks Pipelines (DLT) behave when using serverless pipelines with incremental updates. In general, incremental processing is enabled and works as expected. However, I have observed ...

Latest Reply
aleksandra_ch
Databricks Employee
  • 0 kudos

Hi Alf0 and welcome to the Databricks Community! The Lakeflow Spark Declarative Pipelines (SDP) cost model considers multiple factors when deciding whether to perform an incremental refresh or a full recompute. It makes a best-effort attempt to increm...

Dhruv-22
by Contributor II
  • 123 Views
  • 4 replies
  • 0 kudos

Feature request: Allow setting a value to null when not present in schema evolution

I want to raise a feature request as follows. Currently, with automatic schema evolution for MERGE, when a column is not present in the source dataset it is left unchanged in the target dataset. For example: %sql CREATE OR REPLACE TABLE edw_nprd_aen.bronze.t...

Latest Reply
ManojkMohan
Honored Contributor II
  • 0 kudos

@Dhruv-22 Problem: When using MERGE INTO ... WITH SCHEMA EVOLUTION, if a column exists in the target table but is not present in the source dataset, that column is left unchanged on matched rows. Solution thinking: This can be emulated by introspecting th...
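The introspection idea can be sketched without Spark: given the target and source column lists, generate an explicit UPDATE SET clause that writes NULL into any target column the source lacks, instead of leaving it unchanged. A minimal sketch with invented table aliases and column names (in a real pipeline the column lists would come from the DataFrame schemas):

```python
def build_update_set(target_cols, source_cols):
    """Build an explicit MERGE UPDATE SET clause: columns present in the
    source are taken from it; columns missing from the source are nulled."""
    source = {c.lower() for c in source_cols}
    parts = []
    for col in target_cols:
        if col.lower() in source:
            parts.append(f"t.{col} = s.{col}")
        else:
            # Explicit NULL instead of the default "leave unchanged" behavior
            parts.append(f"t.{col} = NULL")
    return "UPDATE SET " + ", ".join(parts)

clause = build_update_set(["id", "name", "city"], ["id", "name"])
# clause -> "UPDATE SET t.id = s.id, t.name = s.name, t.city = NULL"
```

The generated clause then replaces the implicit UPDATE SET * in the MERGE statement, giving the null-when-absent semantics the feature request asks for.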

3 More Replies
Dhruv-22
by Contributor II
  • 91 Views
  • 3 replies
  • 0 kudos

Merge with schema evolution fails because of upper case columns

The following is a minimal reproducible example of what I'm facing right now.%sql CREATE OR REPLACE TABLE edw_nprd_aen.bronze.test_table ( id INT ); INSERT INTO edw_nprd_aen.bronze.test_table VALUES (1); SELECT * FROM edw_nprd_aen.bronze.test_tab...

Latest Reply
css-1029
New Contributor
  • 0 kudos

Hi @Dhruv-22, it's actually not a bug. Let me explain what's happening. The root cause: the issue stems from how schema evolution works with Delta Lake's MERGE statement, combined with Spark SQL's case-insensitivity settings. Here's the key insight: spark...
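The case-insensitivity point can be illustrated outside Spark: under the default spark.sql.caseSensitive=false, two names differing only by case resolve to the same column, so a source column like ID is not treated as new against a target column id. A small sketch of that matching rule (column names are hypothetical, and this only mirrors the name-resolution behavior, not Delta's full evolution logic):

```python
def new_columns(target_cols, source_cols, case_sensitive=False):
    """Return the source columns that name resolution would treat as new,
    mirroring Spark's default case-insensitive matching."""
    norm = (lambda c: c) if case_sensitive else str.lower
    target = {norm(c) for c in target_cols}
    return [c for c in source_cols if norm(c) not in target]

new_columns(["id"], ["ID", "name"])                       # case-insensitive: only "name" is new
new_columns(["id"], ["ID", "name"], case_sensitive=True)  # both "ID" and "name" are new
```

This is why an upper-cased source column can collide with its lower-cased target counterpart during MERGE schema evolution rather than being added as a separate column.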

2 More Replies
NotCuriosAtAll
by Visitor
  • 61 Views
  • 2 replies
  • 3 kudos

Resolved! Cluster crashes occasionally but not all of the time

We have a small cluster (Standard D2ads v6) with 8 GB of RAM and 2 cores. This is an all-purpose cluster and, for some reason, the client demands we use this one for our ETL process. The ETL process is simple: the client drops Parquet files in the b...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 3 kudos

Hi @NotCuriosAtAll, your cluster is undersized for your workload. This error is typical when the driver node is under that much CPU pressure. You can check the article below (and its related solution): Job run fails with error message "Could not reach driver of clu...

1 More Replies
JothyGanesan
by New Contributor III
  • 40 Views
  • 1 reply
  • 0 kudos

DLT Continuous Pipeline load

Hi All, in our project we are working on a DLT pipeline with DLT tables as the target, running in continuous mode. These tables are common to multiple countries, and we go live in batches for different countries. So, every time a new change is request...

Latest Reply
ManojkMohan
Honored Contributor II
  • 0 kudos

@JothyGanesan Use dynamic schema handling and selective table updates to apply metadata changes incrementally from the current watermark, preserving history across country go-lives. Replace static @dlt.table definitions with Auto Loader's schema infer...

bsr
by New Contributor II
  • 894 Views
  • 4 replies
  • 4 kudos

Resolved! DBR 17.3.3 introduced unexpected DEBUG logs from ThreadMonitor – how to disable?

After upgrading from DBR 17.3.2 to DBR 17.3.3, we started seeing a flood of DEBUG logs like this in job outputs:```DEBUG:ThreadMonitor:Logging python thread stack frames for MainThread and py4j threads: DEBUG:ThreadMonitor:Logging Thread-8 (run) stac...

Latest Reply
WAHID
New Contributor II
  • 4 kudos

@iyashk-DB We are currently using DBR version 17.3 LTS, and the issue is still occurring. Do you know when the fix is expected to be applied? We need this information to decide whether we should wait for the fix or proceed with the workaround you propo...

3 More Replies
rijin-thomas
by New Contributor II
  • 248 Views
  • 4 replies
  • 3 kudos

MongoDB connector - connection timeout when trying to connect to AWS DocumentDB

I am on Databricks Runtime LTS 14.3, Spark 3.5.0, Scala 2.12, and mongodb-spark-connector_2.12:10.2.0. Trying to connect to DocumentDB using the connector, all I get is a connection timeout. I tried using PyMongo, which works as expected, and I can ...

Latest Reply
Sanjeeb2024
Contributor III
  • 3 kudos

Hi @rijin-thomas - can you please allow the CIDR block for the Databricks account VPC in the AWS DocumentDB security group (the executor connectivity stated by @bianca_unifeye)?

3 More Replies
tvdh
by New Contributor
  • 56 Views
  • 1 reply
  • 0 kudos

Tab navigation between fields in dashboards is random

Tab navigation between fields in published dashboards seems very random. I have a dashboard with multiple text input fields (mapped to query parameters / filters). I expect to move logically between them when pressing Tab (keyboard navigation), but I mo...

Latest Reply
Advika
Community Manager
  • 0 kudos

Hello @tvdh! You can share this as product feedback so it’s visible to the Databricks product team and can be tracked and prioritized.

Upendra_Dwivedi
by Contributor
  • 3015 Views
  • 2 replies
  • 1 kudos

Resolved! Databricks App OBO User Authorization

Hi All, we are using the on-behalf-of user authorization method for our app, and the x-forwarded-access-token is expiring after some time, so we have to redeploy our app to rectify the issue. I am not sure what the issue is or how we can keep the token aliv...

Latest Reply
jpt
New Contributor
  • 1 kudos

I am confronted with a similar error. I am also using OBO user auth and have implemented accessing the token via st.context.headers.get('x-forwarded-access-token') for every query, and I do not save it in a cache. Still, after 1 hour, I am hit with the...

1 More Replies
Ved88
by New Contributor II
  • 208 Views
  • 5 replies
  • 1 kudos

Databricks all-purpose cluster

Getting the below error - "Failure starting repl. Try detaching and re-attaching the notebook." - while executing a notebook, even though I can see the cluster has all libraries installed.

Latest Reply
Ved88
New Contributor II
  • 1 kudos

Hi, we are not using the Hive metastore anywhere, so I'm not sure why that host ((host=consolidated-westeuropec2-prod-metastore-0.mysql.database.azure.com)(port=3306)) is appearing in the driver log. Will I need to whitelist it? We have another use case simi...

4 More Replies
csondergaardp
by New Contributor
  • 114 Views
  • 2 replies
  • 2 kudos

Resolved! [PATH_NOT_FOUND] Structured Streaming uses wrong checkpoint location

I'm trying to perform a simple example using Structured Streaming on a directory created as a Volume. The use case is purely educational; I am investigating various forms of triggers. Basic info: Catalog: "dev_catalog", Schema: "stream", Volume: "streamin...

Latest Reply
cgrant
Databricks Employee
  • 2 kudos

Your checkpoint code looks correct. What is the source of `df`? Is it `/Volumes/dev_catalog/default/streaming_basics/`? That path looks incorrect - add `stream` to it.
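The fix suggested above is purely a path question: a Unity Catalog volume lives under its schema, so every segment of /Volumes/&lt;catalog&gt;/&lt;schema&gt;/&lt;volume&gt; must appear in the source path. A tiny sketch building both paths from the names in the post (the "_checkpoint" subdirectory name is an invented convention, not something the thread specifies):

```python
from pathlib import PurePosixPath

def uc_volume_path(catalog: str, schema: str, volume: str, *parts: str) -> str:
    """Unity Catalog volumes are addressed as /Volumes/<catalog>/<schema>/<volume>/..."""
    return str(PurePosixPath("/Volumes", catalog, schema, volume, *parts))

source_path = uc_volume_path("dev_catalog", "stream", "streaming_basics")
checkpoint_path = uc_volume_path("dev_catalog", "stream", "streaming_basics", "_checkpoint")
```

Deriving both the stream source and the checkpoint location from one helper keeps the schema segment from being silently dropped in one of them.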

1 More Replies
