Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

mrkure
by New Contributor II
  • 1229 Views
  • 2 replies
  • 0 kudos

Databricks Connect: set Spark config

Hi, I am using Databricks Connect to compute with a Databricks cluster. I need to set some Spark configurations, namely spark.files.ignoreCorruptFiles. As I have experienced, setting a Spark configuration in Databricks Connect for the current session has...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Have you tried setting it up in your code as:

from pyspark.sql import SparkSession

# Create a Spark session
spark = SparkSession.builder \
    .appName("YourAppName") \
    .config("spark.files.ignoreCorruptFiles", "true") \
    .getOrCreate()
# Yo...

1 More Replies
Buranapat
by New Contributor II
  • 2820 Views
  • 4 replies
  • 4 kudos

Error when accessing 'num_inserted_rows' in Spark SQL (DBR 15.4 LTS)

Hello Databricks Community, I've encountered an issue while trying to capture the number of rows inserted after executing a SQL insert statement in Databricks (DBR 15.4 LTS). My code is attempting to access the number of inserted rows as follows: row...

Latest Reply
GeorgeP1
Databricks Partner
  • 4 kudos

Hi, we are experiencing the same issue. We also turned on liquid clustering on the table, and we had additional checks on the inserted data information, which was really helpful. @GavinReeves3 did you manage to solve the issue? @MuthuLakshmi any idea? Thank ...

3 More Replies
zg
by New Contributor III
  • 2356 Views
  • 4 replies
  • 3 kudos

Resolved! Unable to Create Alert Using API

Hi All, I'm trying to create an alert using the Databricks REST API, but I keep encountering the following error:

Error creating alert: 400 {"message": "Alert name cannot be empty or whitespace"}

The payload:

{"alert": {"seconds_to_retrigger": 0, "display_name": "A...

Latest Reply
filipniziol
Esteemed Contributor
  • 3 kudos

Hi @zg, you are sending the payload for the new endpoint (/api/2.0/sql/alerts) to the old endpoint (/api/2.0/preview/sql/alerts). These are the docs for the old endpoint: https://docs.databricks.com/api/workspace/alertslegacy/create. As you can see ...

3 More Replies
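For anyone hitting the same 400 error, a minimal sketch of pointing the request at the new endpoint might look like the following. The workspace URL and token are placeholders, and the payload keeps the shape shown in the post; the snippet only constructs the request (calling urllib.request.urlopen(req) against a real workspace would actually send it):

```python
import json
import urllib.request

# Placeholder workspace URL and token -- replace with your own.
HOST = "https://example.cloud.databricks.com"
TOKEN = "dapi-REDACTED"

# Payload shape from the post, which matches the new endpoint,
# not the legacy /api/2.0/preview/sql/alerts one.
payload = {
    "alert": {
        "seconds_to_retrigger": 0,
        "display_name": "Alert name",  # must be non-empty
    }
}

req = urllib.request.Request(
    url=f"{HOST}/api/2.0/sql/alerts",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/json",
    },
    method="POST",
)

print(req.full_url)
```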
Mattias
by New Contributor II
  • 2627 Views
  • 3 replies
  • 0 kudos

How to increase timeout in Databricks Workflows DBT task

Hi, I have a Databricks Workflows dbt task that targets a PRO SQL warehouse. However, the task fails with a "too many retries" error (see below) if the PRO SQL warehouse is not up and running when the task starts. How can I increase the timeout or allo...

Latest Reply
Mattias
New Contributor II
  • 0 kudos

One option seems to be to reference a custom "profiles.yml" in the job configuration and specify a custom dbt-databricks connector timeout there (https://docs.getdbt.com/docs/core/connect-data-platform/databricks-setup#additional-parameters). However,...

2 More Replies
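For reference, a profiles.yml along these lines might do it; connect_retries and connect_timeout are documented dbt-databricks connection parameters, and every other value below (profile name, catalog, schema, host, warehouse path) is a placeholder:

my_profile:
  target: prod
  outputs:
    prod:
      type: databricks
      catalog: main                            # placeholder
      schema: analytics                        # placeholder
      host: example.cloud.databricks.com       # placeholder
      http_path: /sql/1.0/warehouses/abc123    # placeholder
      token: "{{ env_var('DBT_DATABRICKS_TOKEN') }}"
      # Give the PRO SQL warehouse time to start before giving up:
      connect_retries: 10
      connect_timeout: 60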
Mkk1
by New Contributor
  • 1699 Views
  • 1 reply
  • 0 kudos

Joining tables across DLT pipelines

How can I join a silver table (S1) from a DLT pipeline (D1) to another silver table (S2) from a different DLT pipeline (D2)? #DLT #DeltaLiveTables

Latest Reply
JothyGanesan
New Contributor III
  • 0 kudos

@Mkk1 Did you manage to get this completed? We are in a similar situation; how did you achieve this?

MAHANK
by New Contributor II
  • 3682 Views
  • 3 replies
  • 0 kudos

How to compare two Databricks notebooks in different folders? Note: we don't have Git set up

We would like to compare two notebooks which are in different folders; we have not yet set up a Git repo for these folders. What are the other options we have to compare two notebooks? Thanks, Nanda

Latest Reply
arekmust
New Contributor III
  • 0 kudos

Then using Repos and Git (GitHub/Azure DevOps) is the way to go!

2 More Replies
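Without Git, one workaround is to export both notebooks as source files (for example with `databricks workspace export`, assuming the CLI is configured) and diff them locally. The sketch below uses Python's difflib on two notebook sources held in memory; the file names and contents are invented for illustration:

```python
import difflib

# Stand-ins for two exported notebook sources, e.g. produced by:
#   databricks workspace export /FolderA/my_nb notebook_a.py
#   databricks workspace export /FolderB/my_nb notebook_b.py
source_a = "print('hello')\nx = 1\n".splitlines(keepends=True)
source_b = "print('hello')\nx = 2\n".splitlines(keepends=True)

# Produce a unified diff of the two notebooks.
diff = list(difflib.unified_diff(
    source_a, source_b,
    fromfile="notebook_a.py",
    tofile="notebook_b.py",
))
print("".join(diff))
```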
MatthewMills
by Databricks Partner
  • 5580 Views
  • 3 replies
  • 7 kudos

Resolved! DLT Apply Changes Tables corrupt

Got a weird DLT error. Test harness using the new(ish) 'Apply Changes from Snapshot' functionality and DLT Serverless (Current channel), in the Azure Australia East region. It has been working for several months without issue, but within the last week these DLT table...

Data Engineering
Apply Changes From Snapshot
dlt
Latest Reply
Lakshay
Databricks Employee
  • 7 kudos

We have an open ticket on this issue, which is caused by the maintenance pipeline renaming the backing table. We expect the fix to be rolled out soon.

2 More Replies
shubham_007
by Contributor III
  • 1427 Views
  • 1 reply
  • 0 kudos

Urgent! Need information/details and reference links on two topics

Dear experts, I need urgent help and guidance, with reference links, on the topics below: steps for package installation with serverless compute in Databricks; what is the Delta Lake connector with serverless, and how to run Delta Lake queries outside...

Latest Reply
brockb
Databricks Employee
  • 0 kudos

Seems like a duplicate: https://community.databricks.com/t5/data-engineering/urgent-need-information-details-and-reference-link-on-below-two/td-p/107260

data-grassroots
by New Contributor III
  • 8574 Views
  • 7 replies
  • 1 kudos

Resolved! Ingesting Files - Same file name, modified content

We have a data feed with files whose filenames stay the same but whose contents change over time (brand_a.csv, brand_b.csv, brand_c.csv, ...). COPY INTO seems to ignore the files when they change. If we set the force flag to true and run it, we end up w...

Latest Reply
data-grassroots
New Contributor III
  • 1 kudos

Thanks for the validation, Werners! That's the path we've been heading down (copy + merge). I still have some DLT experiments planned but - at least for this situation - copy + merge works just fine.

6 More Replies
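For anyone landing here, the copy + merge pattern mentioned above can be sketched roughly as follows. The table, path, and column names are invented for illustration; COPY INTO's force option re-ingests files whose names have already been seen, and the MERGE then updates rows in place instead of duplicating them:

-- 1. Force-load the changed files into a staging table.
COPY INTO staging.brands
FROM '/Volumes/landing/brand_files/'
FILEFORMAT = CSV
FORMAT_OPTIONS ('header' = 'true')
COPY_OPTIONS ('force' = 'true');

-- 2. Merge the staged rows into the target table.
MERGE INTO silver.brands AS t
USING staging.brands AS s
ON t.brand_id = s.brand_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;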
peter_ticker
by New Contributor III
  • 11681 Views
  • 17 replies
  • 2 kudos

XML Auto Loader rescuedDataColumn Doesn't Rescue Array Fields

Hiya! I'm interested in whether anyone has a solution to the following problem. If you load XML using Auto Loader or otherwise, and set the schema such that a single value is assumed for a given XPath but the actual XML contains multiple values (i....

Latest Reply
Witold
Databricks Partner
  • 2 kudos

Let me rephrase it. You can't use Message as the rowTag, because it's the root element. rowTag implies a tag within the root element, which might occur multiple times. Check the docs around reading and writing XML files; there you'll find exa...

16 More Replies
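To illustrate the rowTag point with plain Python (no Spark needed), the snippet below parses a small invented XML document: the root element (Message) cannot serve as the rowTag, while the repeated Record children inside it are the natural row boundary:

```python
import xml.etree.ElementTree as ET

# Invented sample: <Message> is the root, <Record> repeats inside it.
xml_doc = """
<Message>
  <Record><id>1</id><value>a</value></Record>
  <Record><id>2</id><value>b</value></Record>
</Message>
"""

root = ET.fromstring(xml_doc)
print(root.tag)                   # the root element: Message
rows = root.findall("Record")     # the repeating elements -> rows
print(len(rows))                  # 2
```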
evangelos
by New Contributor III
  • 6729 Views
  • 5 replies
  • 0 kudos

Resolved! Databricks asset bundles: name_prefix doesn't work with presets

Hello! I am deploying a Databricks workflow using bundles and want to attach the prefix "prod_" to the name of my job. My target uses `mode: production`, and I follow the instructions in https://learn.microsoft.com/en-us/azure/databricks/dev-tools/b...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

To attach the prefix "prod_" to the name of your job in a Databricks workflow using bundles, you need to ensure that the name_prefix preset is correctly configured in your databricks.yml file.

targets:
  prod:
    mode: production
    pres...

4 More Replies
oakhill
by New Contributor III
  • 6039 Views
  • 3 replies
  • 1 kudos

How do we create a job cluster in Databricks Asset Bundles for use across different jobs?

When developing jobs on DABs, we use new_cluster to create a cluster for a particular job. I think it's a lot of lines of YAML when what I really need is a "small cluster" and a "big cluster" to reference for certain kinds of jobs. Tags would be on the...

Latest Reply
filipniziol
Esteemed Contributor
  • 1 kudos

Hi @oakhill, you can specify your job cluster configuration in your variables:

variables:
  small_cluster_id:
    description: "The small cluster with 2 workers used by the jobs"
    type: complex
    default:
      spark_version: "15.4.x-scala2.12"
      ...

2 More Replies
saniok
by New Contributor II
  • 2270 Views
  • 2 replies
  • 0 kudos

How to Handle Versioning in Databricks Asset Bundles?

Hi everyone, in our organization we are transitioning from defining Databricks jobs using the UI to managing them with asset bundles. Since asset bundles can be deployed across multiple workspaces, each potentially having multiple targets (e.g., stag...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @saniok, in the databricks.yml file you can include version information to manage different versions of your bundles. Example:

bundle:
  name: my-bundle
  version: 1.0.0

resources:
  jobs:
    my-job:
      name: my-job
      ...

1 More Replies
Avinash_Narala
by Databricks Partner
  • 3500 Views
  • 7 replies
  • 3 kudos

Resolved! SQL Server to Databricks Migration

Hi, I want to build a Python function to migrate SQL Server tables to Databricks. Are there any guides or best practices on how to do so? It'll be really helpful if there are. Regards, Avinash N

Latest Reply
filipniziol
Esteemed Contributor
  • 3 kudos

Hi @Avinash_Narala, if it is lift-and-shift, then try this:
1. Set up Lakehouse Federation to SQL Server.
2. Use CTAS statements to copy each table into Unity Catalog:

CREATE TABLE catalog_name.schema_name.table_name
AS SELECT * FROM sql_server_catalog_...

6 More Replies
jeremy98
by Honored Contributor
  • 9660 Views
  • 22 replies
  • 1 kudos

Wheel package to install in a serverless workflow

Hi guys, what is the way, through Databricks Asset Bundles, to declare a new job definition with serverless compute associated with each task that composes the workflow, such that inside each notebook task it is possible to catch the dep...

Latest Reply
jeremy98
Honored Contributor
  • 1 kudos

Ping @Alberto_Umana 

21 More Replies