Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

aliacovella
by Contributor
  • 2997 Views
  • 3 replies
  • 1 kudos

Resolved! Custom Checkpointing

The following is my scenario: I need to query, on a daily basis, an external table that maintains a row version. I would like to be able to query for all records where the row version is greater than the max previously processed row version. The sour...

Latest Reply
jeremy98
Honored Contributor
  • 1 kudos

Hi, I totally agree with VZLA. Within my internal team we have a similar issue, and we used a table to track the latest version of each table, since we don't have a streaming process on our side. DLT pipelines could be a choice, but it also depends if you ...
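The control-table approach jeremy98 describes reduces to a high-water-mark filter: remember the largest row version already processed, and only take rows above it. A minimal pure-Python sketch of the pattern (the `row_version` field name and the in-memory dict rows are illustrative assumptions; in practice the mark would live in a tracking table):

```python
# Sketch of version-based incremental loading; "row_version" is an assumed
# column name, and dicts stand in for table rows.
def load_increment(source_rows, last_version):
    """Return rows newer than last_version plus the new high-water mark."""
    new_rows = [r for r in source_rows if r["row_version"] > last_version]
    high_water = max((r["row_version"] for r in new_rows), default=last_version)
    return new_rows, high_water

rows = [{"id": 1, "row_version": 5}, {"id": 2, "row_version": 9}]
batch, mark = load_increment(rows, last_version=5)  # batch keeps only id 2; mark becomes 9
```

After processing, `mark` would be written back to the tracking table so the next daily run starts from there.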

2 More Replies
ashraf1395
by Honored Contributor
  • 2850 Views
  • 3 replies
  • 0 kudos

Resolved! Databricks Workflow design

I have 7-8 different DLT pipelines which have to be run at the same time according to their batch type, i.e. hourly or daily. Right now they are triggered effectively according to their batch type. I want to move to the next stage, where I want to clu...

Latest Reply
ashraf1395
Honored Contributor
  • 0 kudos

Hi @VZLA, I got the idea. There will be a small change in the way we will use it: since we don't schedule the workflow in Databricks, we trigger it using the API. So I will pass a job parameter along with the trigger, according to the timestamp, wheth...
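ashraf1395's plan — triggering via the Jobs API and passing the batch type as a job parameter derived from the timestamp — can be sketched as building the "run now" request body. A hedged sketch (the midnight-means-daily rule and the parameter name `batch_type` are assumptions for illustration, not the poster's actual setup):

```python
from datetime import datetime

def run_now_payload(job_id: int, now: datetime) -> dict:
    """Build a Jobs API run-now body carrying the batch type as a job parameter."""
    batch_type = "daily" if now.hour == 0 else "hourly"  # assumed scheduling rule
    return {"job_id": job_id, "job_parameters": {"batch_type": batch_type}}

daily = run_now_payload(123, datetime(2024, 1, 1, 0, 5))    # batch_type "daily"
hourly = run_now_payload(123, datetime(2024, 1, 1, 14, 0))  # batch_type "hourly"
```

The caller would then POST this body to the workspace's Jobs run-now endpoint with its usual authentication.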

2 More Replies
maddan80
by New Contributor II
  • 1304 Views
  • 3 replies
  • 0 kudos

History load from Source and

Hi, as part of our requirement we wanted to load huge historical data from the source system to Databricks in Bronze and then process it to Gold. We wanted to use batch read and write so that the historical load is done, and then for the delta o...

Latest Reply
MariuszK
Valued Contributor III
  • 0 kudos

I imported 16 TB of data using ADF. In this scenario I'd create a process that extracts data from the source using ADF and then executes the rest of the logic to populate tables in Gold. For the new data I'd create a separate process using Autoloade...

2 More Replies
javiomotero
by New Contributor III
  • 3965 Views
  • 4 replies
  • 4 kudos

How to consume Fabric Datawarehouse inside a Databricks notebook

Hello, I'm having a hard time figuring out (and finding the right documentation on) how to connect my Databricks notebook to consume tables from a Fabric data warehouse. I've checked this, but it seems to work only with OneLake, and this, but I'm not ...

Data Engineering
datawarehouse
fabric
Latest Reply
javiomotero
New Contributor III
  • 4 kudos

Hello, I would like a few more options regarding reading views. Using abfss is fine for reading tables, but I don't know how to load views, which are visible in the SQL endpoint. Is there any alternative for connecting to Fabric and be abl...

3 More Replies
Avinash_Narala
by Databricks Partner
  • 2085 Views
  • 3 replies
  • 4 kudos

Redshift to Databricks Migration

Hi, I want a detailed plan of the steps to migrate my data from Redshift to Databricks: where to start, what to assess, and what to migrate. It would really help me if you provided a detailed explanation of the migration. Thanks in advance.

Latest Reply
MariuszK
Valued Contributor III
  • 4 kudos

I migrated Oracle to Databricks and have experience with Redshift. The cost and effort will depend on your technical stack: What do you use for ETL? What do you use for data ingestion? Reporting tools? In general, the simplest steps are: data and mo...

2 More Replies
ahen
by New Contributor
  • 4853 Views
  • 1 reply
  • 0 kudos

Deployed DABs job via GitLab CI/CD; it is creating duplicate jobs.

We had an error in the DABs deploy, and subsequent retries resulted in a locked state. As suggested in the logs, we used the --force-lock option and the deploy succeeded. However, it created duplicate jobs for all assets in the bundle instead of updating the...

Latest Reply
Satyadeepak
Databricks Employee
  • 0 kudos

@ahen When you used the --force-lock option during the Databricks Asset Bundle (DAB) deployment, it likely bypassed certain checks that would normally prevent duplicate resource creation. This option is used to force a deployment even when a lock is ...

shubham_007
by Contributor III
  • 3242 Views
  • 6 replies
  • 0 kudos

Resolved! Need urgent help and guidance (with reference links) on the topics below:

Dear experts, I need urgent help and guidance (with reference links) on the topics below: steps for package installation with serverless in Databricks; what are Delta Lake connectors with serverless; how to run Delta Lake queries outside...

Latest Reply
brockb
Databricks Employee
  • 0 kudos

Were you able to review the documentation provided here: https://docs.databricks.com/en/compute/serverless/dependencies.html#install-notebook-dependencies?

5 More Replies
mrkure
by New Contributor II
  • 1237 Views
  • 2 replies
  • 0 kudos

Databricks connect, set spark config

Hi, I am using Databricks Connect to compute with a Databricks cluster. I need to set some Spark configurations, namely spark.files.ignoreCorruptFiles. As I have experienced, setting a Spark configuration in Databricks Connect for the current session has...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Have you tried setting it up in your code as:

from pyspark.sql import SparkSession

# Create a Spark session
spark = SparkSession.builder \
    .appName("YourAppName") \
    .config("spark.files.ignoreCorruptFiles", "true") \
    .getOrCreate()

# Yo...

1 More Replies
Buranapat
by New Contributor II
  • 2830 Views
  • 4 replies
  • 4 kudos

Error when accessing 'num_inserted_rows' in Spark SQL (DBR 15.4 LTS)

Hello Databricks Community, I've encountered an issue while trying to capture the number of rows inserted after executing a SQL insert statement in Databricks (DBR 15.4 LTS). My code attempts to access the number of inserted rows as follows: row...

Latest Reply
GeorgeP1
Databricks Partner
  • 4 kudos

Hi, we are experiencing the same issue. We also turned on liquid clustering on the table, and we had additional checks on the inserted-data information, which was really helpful. @GavinReeves3, did you manage to solve the issue? @MuthuLakshmi, any idea? Thank ...

3 More Replies
zg
by New Contributor III
  • 2365 Views
  • 4 replies
  • 3 kudos

Resolved! Unable to Create Alert Using API

Hi all, I'm trying to create an alert using the Databricks REST API, but I keep encountering the following error: Error creating alert: 400 {"message": "Alert name cannot be empty or whitespace"}: {"alert": {"seconds_to_retrigger": 0, "display_name": "A...

Latest Reply
filipniziol
Esteemed Contributor
  • 3 kudos

Hi @zg, you are sending the payload for the new endpoint (/api/2.0/sql/alerts) to the old endpoint (/api/2.0/preview/sql/alerts). These are the docs for the old endpoint: https://docs.databricks.com/api/workspace/alertslegacy/create. As you can see ...
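The fix filipniziol points out is purely an endpoint change: the same create-alert call goes to /api/2.0/sql/alerts instead of the legacy /api/2.0/preview/sql/alerts. A small helper to make the choice explicit (authentication and payload schema are out of scope here; the workspace host is a placeholder):

```python
def alerts_endpoint(host: str, legacy: bool = False) -> str:
    """Return the Databricks SQL alerts create endpoint for a workspace host."""
    path = "/api/2.0/preview/sql/alerts" if legacy else "/api/2.0/sql/alerts"
    return host.rstrip("/") + path

new_url = alerts_endpoint("https://adb-123.azuredatabricks.net")
old_url = alerts_endpoint("https://adb-123.azuredatabricks.net", legacy=True)
```

POSTing the new-style payload from the question to `new_url` should avoid the "Alert name cannot be empty" 400, since that error comes from the legacy endpoint expecting a different body shape.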

3 More Replies
Mattias
by New Contributor II
  • 2640 Views
  • 3 replies
  • 0 kudos

How to increase timeout in Databricks Workflows DBT task

Hi, I have a Databricks Workflows dbt task that targets a PRO SQL warehouse. However, the task fails with a "too many retries" error (see below) if the PRO SQL warehouse is not up and running when the task starts. How can I increase the timeout or allo...

Latest Reply
Mattias
New Contributor II
  • 0 kudos

One option seems to be to reference a custom "profiles.yml" in the job configuration and specify a custom DBT Databricks connector timeout there (https://docs.getdbt.com/docs/core/connect-data-platform/databricks-setup#additional-parameters).However,...
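The custom profiles.yml route might look like the following; a hedged sketch assuming the connection parameters described in the dbt-databricks setup docs linked above, with placeholder profile, catalog, and connection values throughout:

```yaml
my_project:            # placeholder profile name
  target: prod
  outputs:
    prod:
      type: databricks
      catalog: main                # placeholder
      schema: analytics            # placeholder
      host: <workspace-hostname>
      http_path: <warehouse-http-path>
      token: "{{ env_var('DATABRICKS_TOKEN') }}"
      connect_retries: 10          # keep retrying while the warehouse spins up
      connect_timeout: 60          # seconds to wait between attempts
```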

2 More Replies
Mkk1
by New Contributor
  • 1703 Views
  • 1 reply
  • 0 kudos

Joining tables across DLT pipelines

How can I join a silver table (S1) from a DLT pipeline (D1) to another silver table (S2) from a different DLT pipeline (D2)? #DLT #DeltaLiveTables

Latest Reply
JothyGanesan
New Contributor III
  • 0 kudos

@Mkk1, did you manage to get this completed? We are in a similar situation; how did you achieve this?

MAHANK
by New Contributor II
  • 3699 Views
  • 3 replies
  • 0 kudos

How to compare two Databricks notebooks which are in different folders? Note: we don't have Git set up

We would like to compare two notebooks which are in different folders; we have yet to set up a Git repo for these folders. What other options do we have to compare two notebooks? Thanks, Nanda

Latest Reply
arekmust
New Contributor III
  • 0 kudos

Then using Repos and Git (GitHub/Azure DevOps) is the way to go!
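For a quick Git-free comparison in the meantime, one option (my own suggestion, not from the thread) is to export both notebooks as source files, for example via the workspace export feature, and diff them locally with Python's difflib. The notebook contents below are illustrative stand-ins:

```python
import difflib

# Illustrative stand-ins for the two exported notebook source files.
nb_a = "print('hello')\nx = 1\n"
nb_b = "print('hello')\nx = 2\n"

diff = list(difflib.unified_diff(
    nb_a.splitlines(), nb_b.splitlines(),
    fromfile="folder_a/notebook.py", tofile="folder_b/notebook.py",
    lineterm="",
))
print("\n".join(diff))  # shows the changed line as "-x = 1" / "+x = 2"
```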

2 More Replies
MatthewMills
by Databricks Partner
  • 5592 Views
  • 3 replies
  • 7 kudos

Resolved! DLT Apply Changes Tables corrupt

Got a weird DLT error. Test harness using the new(ish) 'Apply Changes from Snapshot' functionality and DLT Serverless (Current channel), Azure Aus East region. It has been working for several months without issue, but within the last week these DLT table...

Data Engineering
Apply Changes From Snapshot
dlt
Latest Reply
Lakshay
Databricks Employee
  • 7 kudos

We have an open ticket on this issue. The issue is caused by the maintenance pipeline renaming the backing table. We expect the fix to be rolled out soon for this issue.

2 More Replies
shubham_007
by Contributor III
  • 1427 Views
  • 1 reply
  • 0 kudos

Urgent!! Need information/details and reference links on the two topics below:

Dear experts, I need urgent help and guidance (with reference links) on the topics below: steps for package installation with serverless in Databricks; what are Delta Lake connectors with serverless; how to run Delta Lake queries outside...

Latest Reply
brockb
Databricks Employee
  • 0 kudos

Seems like a duplicate: https://community.databricks.com/t5/data-engineering/urgent-need-information-details-and-reference-link-on-below-two/td-p/107260
