Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

deepu1
by Visitor
  • 1 View
  • 0 replies
  • 0 kudos

DLT Gold aggregation with apply_change

I am building a Gold table using Delta Live Tables (DLT). The Gold table contains aggregated data derived from a Silver table. Aggregation happens monthly. However, the requirement is that only the current (year, month) should be recalculated. Previous mo...
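A minimal sketch of one way to meet that requirement outside of apply_changes (which is aimed at CDC feeds): recompute only the current month's aggregate and MERGE it into the Gold table so earlier months are never rewritten. The table and column names (silver_events, gold_monthly_agg, event_ts, amount) are hypothetical.

from pyspark.sql import functions as F

# Recompute only the current (year, month) aggregate from the Silver table.
current = (
    spark.table("silver_events")                                        # hypothetical Silver table
    .filter(
        (F.year("event_ts") == F.year(F.current_date()))
        & (F.month("event_ts") == F.month(F.current_date()))
    )
    .groupBy(F.year("event_ts").alias("year"), F.month("event_ts").alias("month"))
    .agg(F.sum("amount").alias("total_amount"))
)
current.createOrReplaceTempView("gold_current_month")

# Upsert just that month; previous months in the Gold table stay untouched.
spark.sql("""
    MERGE INTO gold_monthly_agg AS t
    USING gold_current_month AS s
    ON t.year = s.year AND t.month = s.month
    WHEN MATCHED THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
""")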

Chandana_Ramesh
by New Contributor II
  • 16 Views
  • 0 replies
  • 0 kudos

Lakebridge SetUp Issue

Hi, I'm getting the below error upon executing the databricks labs lakebridge analyze command. All the dependencies had been installed before executing the command. Can someone please give a solution, or suggest if anything is missing? Below attached ...

HarishKumarM
by Visitor
  • 34 Views
  • 1 reply
  • 0 kudos

Zerobus Connector Issue

I was trying to implement the example posted at the below link for the Zerobus connector to test its functionality on my free edition workspace, but unfortunately I am getting the below error. Reference Code: https://learn.microsoft.com/en-us/azure/databricks/...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

Hey @HarishKumarM, I did some digging and found some helpful information to help you troubleshoot. What the error means: your workspace isn't currently enrolled in the Zerobus Ingest preview. Even though Zerobus is labeled a Public Preview, it's st...

vijsharm
by New Contributor II
  • 46 Views
  • 4 replies
  • 0 kudos

checkpoint changes not working on my databricks job

Hi, I have a job processing a Kafka stream using a readStream process. Due to an issue we changed the checkpoint path to another path and it pulled all the records; later, when I changed back to the original checkpoint location, it is not pulling ...

Data Engineering
checkpoint
Latest Reply
cgrant
Databricks Employee
  • 0 kudos

When you swapped back to the old checkpoint, were any records flowing through, and were batches completing? It's possible that you've accumulated a big backlog with the old checkpoint, and/or records in Kafka have expired. And the "startingOffsets" o...
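For reference, a minimal sketch of the setup under discussion; the broker, topic, target table, and checkpoint path are placeholders. Note that "startingOffsets" only applies when the checkpoint is empty, so once the old checkpoint holds committed offsets the stream resumes from those instead.

# Hypothetical Kafka-to-Delta stream with an explicit checkpoint location.
raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")     # placeholder broker
    .option("subscribe", "events")                         # placeholder topic
    .option("startingOffsets", "earliest")                 # ignored once the checkpoint has offsets
    .load()
)

query = (
    raw.writeStream
    .format("delta")
    .option("checkpointLocation", "/Volumes/main/default/checkpoints/events")  # original path
    .trigger(availableNow=True)
    .toTable("main.default.bronze_events")                 # placeholder target table
)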

3 More Replies
csondergaardp
by New Contributor
  • 67 Views
  • 1 reply
  • 0 kudos

[PATH_NOT_FOUND] Structured Streaming uses wrong checkpoint location

I'm trying to perform a simple example using Structured Streaming on a directory created as a Volume. The use case is purely educational; I am investigating various forms of triggers. Basic info: Catalog: "dev_catalog", Schema: "stream", Volume: "streamin...

Latest Reply
cgrant
Databricks Employee
  • 0 kudos

Your checkpoint code looks correct. What is the source of `df`? Is it `/Volumes/dev_catalog/default/streaming_basics/` ? The path looks incorrect - add `stream` to it.  
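A minimal sketch of the three-level Volume path convention the reply points to (/Volumes/&lt;catalog&gt;/&lt;schema&gt;/&lt;volume&gt;/...), built from the catalog, schema, and volume named in the thread; the input subfolder, file format, and output table are assumptions.

source_path = "/Volumes/dev_catalog/stream/streaming_basics/input/"            # assumed input folder
checkpoint_path = "/Volumes/dev_catalog/stream/streaming_basics/_checkpoint/"  # assumed checkpoint folder

df = (
    spark.readStream
    .format("cloudFiles")                          # Auto Loader; a plain file source also works
    .option("cloudFiles.format", "json")           # assumed file format
    .option("cloudFiles.schemaLocation", checkpoint_path)
    .load(source_path)
)

query = (
    df.writeStream
    .option("checkpointLocation", checkpoint_path)
    .trigger(availableNow=True)
    .toTable("dev_catalog.stream.demo_output")     # assumed output table
)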

SatabrataMuduli
by New Contributor II
  • 61 Views
  • 1 reply
  • 1 kudos

Unable to Connect to Oracle from Databricks UC Cluster (DBR 15.4) – ORA-12170 Timeout Error

 Hi all,I’m trying to connect to an Oracle database from my Databricks UC cluster (DBR 15.4) using the ojdbc8.jar driver, which I’ve installed on the cluster. Here’s the code I’m using:df = spark.read.format("jdbc")\ .option("url", jdbc_url)\ ...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @SatabrataMuduli, I'm quite sure this is a networking issue. You didn't provide many details about your environment, so I'll give you general advice. You cannot reach an on-premises Oracle database unless networking is explicitly configured or your dat...
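To confirm the networking theory before touching any JDBC options, a quick reachability check from a notebook on the same cluster helps; the host, port, service name, and credentials below are placeholders.

import socket

host, port = "oracle-host.example.com", 1521          # placeholder host and listener port
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.settimeout(5)
    reachable = s.connect_ex((host, port)) == 0       # 0 means the TCP connection succeeded
print("reachable" if reachable else "not reachable - ORA-12170 is usually a firewall/routing issue")

# Once the port is reachable, the JDBC read from the post should work as written:
df = (
    spark.read.format("jdbc")
    .option("url", f"jdbc:oracle:thin:@//{host}:{port}/ORCLPDB1")   # placeholder service name
    .option("dbtable", "SCHEMA_NAME.TABLE_NAME")
    .option("user", "db_user")
    .option("password", "db_password")
    .option("driver", "oracle.jdbc.driver.OracleDriver")
    .load()
)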

Dhruv-22
by Contributor II
  • 77 Views
  • 2 replies
  • 0 kudos

Feature request: Allow to set value as null when not present in schema evolution

I want to raise a feature request as follows. Currently, with automatic schema evolution for MERGE, when a column is not present in the source dataset it is not changed in the target dataset. For example: %sql CREATE OR REPLACE TABLE edw_nprd_aen.bronze.t...

Latest Reply
ManojkMohan
Honored Contributor II
  • 0 kudos

@Dhruv-22 Problem: when using MERGE INTO ... WITH SCHEMA EVOLUTION, if a column exists in the target table but is not present in the source dataset, that column is left unchanged on matched rows. Solution thinking: this can be emulated by introspecting th...
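A minimal sketch of that idea, assuming a Delta target, a hypothetical source DataFrame source_df, and a hypothetical join key id: columns present in the target but missing from the source are added as typed NULL literals, so UPDATE SET * overwrites them on matched rows. The target table name is a placeholder.

from delta.tables import DeltaTable
from pyspark.sql import functions as F

target = DeltaTable.forName(spark, "catalog.schema.target_table")   # placeholder target table
target_schema = {f.name: f.dataType for f in target.toDF().schema.fields}

# Add every target column missing from the source as a NULL of the matching type.
for name, dtype in target_schema.items():
    if name not in source_df.columns:
        source_df = source_df.withColumn(name, F.lit(None).cast(dtype))

(
    target.alias("t")
    .merge(source_df.alias("s"), "t.id = s.id")                     # assumed join key
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)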

1 More Replies
RevanthV
by New Contributor III
  • 76 Views
  • 3 replies
  • 2 kudos

Data validation with df writes using append mode

Hi Team, recently I came across a situation where I had to write a huge amount of data and it took 6 hrs to complete... Later, when I checked the target data, I saw that 20% of the total records had been written incorrectly or were corrupted because the source data itself was corr...

Latest Reply
RevanthV
New Contributor III
  • 2 kudos

Hey @K_Anudeep, thanks a lot for tagging me into the GitHub issue. This is exactly the "validate and commit" feature I want, and I see you have already raised a PR for it with a new option called . I will try this out and check if it satisfie...
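Until that option is available, a simple pre-write validation pattern can keep corrupt rows out of the append; the rule, column, and table names below are hypothetical.

from pyspark.sql import functions as F

rule = F.col("amount").isNotNull() & (F.col("amount") >= 0)   # hypothetical validity rule

valid = source_df.filter(rule)                                # source_df is hypothetical
invalid = source_df.filter(~rule)

# Append only the rows that pass; quarantine the rest for inspection instead of
# discovering corruption in the target after a long write.
valid.write.mode("append").saveAsTable("main.sales.target_table")
invalid.write.mode("append").saveAsTable("main.sales.target_table_quarantine")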

2 More Replies
ramsai
by New Contributor
  • 94 Views
  • 5 replies
  • 2 kudos

Updating Job Creator to Service Principal

Regarding data governance best practices: I have jobs created by a user who has left the organization, and I need to change the job creator to a service principal. Currently, it seems the only option is to clone the job and update it. Is this the rec...

Latest Reply
Sanjeeb2024
Contributor III
  • 2 kudos

I agree with @nayan_wylde; for auditing, the creator is important and it should be immutable by nature.

4 More Replies
Rose_15
by New Contributor
  • 96 Views
  • 3 replies
  • 0 kudos

Databricks SQL Warehouse fails when streaming ~53M rows via Python (token/session expiry)

Hello Team, I am facing a consistent issue when streaming a large table (~53 million rows) from a Databricks SQL Warehouse using Python (databricks-sql-connector) with OAuth authentication. I execute a single long-running query and fetch data in batche...

Latest Reply
Sanjeeb2024
Contributor III
  • 0 kudos

Hi @Rose_15 - thanks for the details. It is better to plan around the number of tables, their size, and the number of records, and to extract the files to cloud storage and then reload the data using any mechanism. Once your extraction is complete, you wi...
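A minimal sketch of that approach, exporting the table to files in a Unity Catalog Volume from a notebook or job instead of pulling ~53M rows through a single long-lived connector session; the table name and path are placeholders.

# Export once to cloud storage; downstream consumers then read the files,
# with no dependency on one OAuth session staying alive for hours.
(
    spark.table("main.sales.big_table")                  # placeholder table
    .write
    .mode("overwrite")
    .parquet("/Volumes/main/sales/exports/big_table/")   # placeholder Volume path
)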

2 More Replies
jfvizoso
by New Contributor II
  • 12923 Views
  • 6 replies
  • 0 kudos

Can I pass parameters to a Delta Live Table pipeline at running time?

I need to execute a DLT pipeline from a Job, and I would like to know if there is any way of passing a parameter. I know you can have settings in the pipeline that you use in the DLT notebook, but it seems you can only assign values to them when crea...

Latest Reply
Sudharsan
New Contributor II
  • 0 kudos

@DeepakAI: May I know how you resolved it?

5 More Replies
Phani1
by Databricks MVP
  • 3163 Views
  • 8 replies
  • 0 kudos

Triggering DLT Pipelines with Dynamic Parameters

Hi Team, We have a scenario where we need to pass a dynamic parameter to a Spark job that will trigger a DLT pipeline in append mode. Can you please suggest an approach for this? Regards, Phani

Latest Reply
Sudharsan
New Contributor II
  • 0 kudos

@koji_kawamura: I have more or less the same scenario; say I have 3 tables. The sources and targets are different, but I would like to use a generic pipeline, pass in the source and target as parameters, and run them in parallel. @sas30: can you be m...
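One hedged sketch of the generic-pipeline idea: DLT pipelines accept key/value configuration in their settings, readable in the notebook via spark.conf.get, so the same notebook can back several pipelines (one per source/target pair), or a pipeline whose configuration is updated via the API before each triggered run. The configuration keys and the fallback default below are assumptions.

import dlt

source_table = spark.conf.get("mypipeline.source_table")              # assumed configuration key
target_table = spark.conf.get("mypipeline.target_table", "gold_out")  # assumed key with a default

@dlt.table(name=target_table)
def generic_target():
    # A streaming source would use spark.readStream.table(source_table) instead.
    return spark.read.table(source_table)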

7 More Replies
