Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

mrkure
by New Contributor II
  • 1229 Views
  • 2 replies
  • 0 kudos

Databricks Connect: set Spark config

Hi, I am using Databricks Connect to compute with a Databricks cluster. I need to set some Spark configurations, namely spark.files.ignoreCorruptFiles. As I have experienced, setting a Spark configuration in Databricks Connect for the current session has...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Have you tried setting it up in your code as:

from pyspark.sql import SparkSession

# Create a Spark session
spark = SparkSession.builder \
    .appName("YourAppName") \
    .config("spark.files.ignoreCorruptFiles", "true") \
    .getOrCreate()
# Yo...

1 More Replies
Buranapat
by New Contributor II
  • 2820 Views
  • 4 replies
  • 4 kudos

Error when accessing 'num_inserted_rows' in Spark SQL (DBR 15.4 LTS)

Hello Databricks Community, I've encountered an issue while trying to capture the number of rows inserted after executing a SQL insert statement in Databricks (DBR 15.4 LTS). My code is attempting to access the number of inserted rows as follows: row...

Latest Reply
GeorgeP1
Databricks Partner
  • 4 kudos

Hi, we are experiencing the same issue. We also turned on liquid clustering on the table, and we had additional checks on the inserted data information, which was really helpful. @GavinReeves3 did you manage to solve the issue? @MuthuLakshmi any idea? Thank ...

3 More Replies
zg
by New Contributor III
  • 2356 Views
  • 4 replies
  • 3 kudos

Resolved! Unable to Create Alert Using API

Hi All, I'm trying to create an alert using the Databricks REST API, but I keep encountering the following error:

Error creating alert: 400 {"message": "Alert name cannot be empty or whitespace"}

The payload:

{"alert": {"seconds_to_retrigger": 0, "display_name": "A...

Latest Reply
filipniziol
Esteemed Contributor
  • 3 kudos

Hi @zg, you are sending the payload for the new endpoint (/api/2.0/sql/alerts) to the old endpoint (/api/2.0/preview/sql/alerts). These are the docs for the old endpoint: https://docs.databricks.com/api/workspace/alertslegacy/create. As you can see ...

3 More Replies
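For anyone hitting the same 400 error, a minimal sketch of pointing the request at the new endpoint might look like the following. The workspace URL and token are placeholders, and the payload keeps the shape shown in the post; the snippet only constructs the request (calling urllib.request.urlopen(req) against a real workspace would actually send it):

```python
import json
import urllib.request

# Placeholder workspace URL and token -- replace with your own.
HOST = "https://example.cloud.databricks.com"
TOKEN = "dapi-REDACTED"

# Payload shape from the post, which matches the new endpoint,
# not the legacy /api/2.0/preview/sql/alerts one.
payload = {
    "alert": {
        "seconds_to_retrigger": 0,
        "display_name": "Alert name",  # must be non-empty
    }
}

req = urllib.request.Request(
    url=f"{HOST}/api/2.0/sql/alerts",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/json",
    },
    method="POST",
)

print(req.full_url)
```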
Mattias
by New Contributor II
  • 2627 Views
  • 3 replies
  • 0 kudos

How to increase timeout in Databricks Workflows DBT task

Hi, I have a Databricks Workflows dbt task that targets a PRO SQL warehouse. However, the task fails with a "too many retries" error (see below) if the PRO SQL warehouse is not up and running when the task starts. How can I increase the timeout or allo...

Latest Reply
Mattias
New Contributor II
  • 0 kudos

One option seems to be to reference a custom "profiles.yml" in the job configuration and specify a custom dbt-databricks connector timeout there (https://docs.getdbt.com/docs/core/connect-data-platform/databricks-setup#additional-parameters). However,...

2 More Replies
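For reference, a profiles.yml along these lines might do it; connect_retries and connect_timeout are documented dbt-databricks connection parameters, and every other value below (profile name, catalog, schema, host, warehouse path) is a placeholder:

my_profile:
  target: prod
  outputs:
    prod:
      type: databricks
      catalog: main                            # placeholder
      schema: analytics                        # placeholder
      host: example.cloud.databricks.com       # placeholder
      http_path: /sql/1.0/warehouses/abc123    # placeholder
      token: "{{ env_var('DBT_DATABRICKS_TOKEN') }}"
      # Give the PRO SQL warehouse time to start before giving up:
      connect_retries: 10
      connect_timeout: 60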
Mkk1
by New Contributor
  • 1699 Views
  • 1 reply
  • 0 kudos

Joining tables across DLT pipelines

How can I join a silver table (S1) from a DLT pipeline (D1) to another silver table (S2) from a different DLT pipeline (D2)? #DLT #DeltaLiveTables

Latest Reply
JothyGanesan
New Contributor III
  • 0 kudos

@Mkk1 Did you manage to get this completed? We are in a similar situation; how did you achieve this?

MAHANK
by New Contributor II
  • 3682 Views
  • 3 replies
  • 0 kudos

How to compare two Databricks notebooks in different folders? Note: we don't have Git set up

We would like to compare two notebooks which are in different folders; we have not yet set up a Git repo for these folders. What are the other options we have to compare two notebooks? Thanks, Nanda

Latest Reply
arekmust
New Contributor III
  • 0 kudos

Then using Repos and Git (GitHub/Azure DevOps) is the way to go!

2 More Replies
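Without Git, one workaround is to export both notebooks as source files (for example with `databricks workspace export`, assuming the CLI is configured) and diff them locally. The sketch below uses Python's difflib on two notebook sources held in memory; the file names and contents are invented for illustration:

```python
import difflib

# Stand-ins for two exported notebook sources, e.g. produced by:
#   databricks workspace export /FolderA/my_nb notebook_a.py
#   databricks workspace export /FolderB/my_nb notebook_b.py
source_a = "print('hello')\nx = 1\n".splitlines(keepends=True)
source_b = "print('hello')\nx = 2\n".splitlines(keepends=True)

# Produce a unified diff of the two notebooks.
diff = list(difflib.unified_diff(
    source_a, source_b,
    fromfile="notebook_a.py",
    tofile="notebook_b.py",
))
print("".join(diff))
```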
MatthewMills
by Databricks Partner
  • 5580 Views
  • 3 replies
  • 7 kudos

Resolved! DLT Apply Changes Tables corrupt

Got a weird DLT error. Test harness using the new(ish) 'Apply Changes from Snapshot' functionality and DLT Serverless (Current channel), in the Azure Australia East region. It has been working for several months without issue, but within the last week these DLT table...

Data Engineering
Apply Changes From Snapshot
dlt
Latest Reply
Lakshay
Databricks Employee
  • 7 kudos

We have an open ticket on this issue, which is caused by the maintenance pipeline renaming the backing table. We expect the fix to be rolled out soon.

2 More Replies
shubham_007
by Contributor III
  • 1427 Views
  • 1 reply
  • 0 kudos

Urgent! Need information/details and reference links on two topics

Dear experts, I need urgent help and guidance, with reference links, on the topics below: steps for package installation with serverless compute in Databricks; what is the Delta Lake connector with serverless, and how to run Delta Lake queries outside...

Latest Reply
brockb
Databricks Employee
  • 0 kudos

Seems like a duplicate: https://community.databricks.com/t5/data-engineering/urgent-need-information-details-and-reference-link-on-below-two/td-p/107260

data-grassroots
by New Contributor III
  • 8574 Views
  • 7 replies
  • 1 kudos

Resolved! Ingesting Files - Same file name, modified content

We have a data feed with files whose filenames stay the same but whose contents change over time (brand_a.csv, brand_b.csv, brand_c.csv, ...). COPY INTO seems to ignore the files when they change. If we set the force flag to true and run it, we end up w...

Latest Reply
data-grassroots
New Contributor III
  • 1 kudos

Thanks for the validation, Werners! That's the path we've been heading down (copy + merge). I still have some DLT experiments planned but - at least for this situation - copy + merge works just fine.

6 More Replies
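For anyone landing here, the copy + merge pattern mentioned above can be sketched roughly as follows. The table, path, and column names are invented for illustration; COPY INTO's force option re-ingests files whose names have already been seen, and the MERGE then updates rows in place instead of duplicating them:

-- 1. Force-load the changed files into a staging table.
COPY INTO staging.brands
FROM '/Volumes/landing/brand_files/'
FILEFORMAT = CSV
FORMAT_OPTIONS ('header' = 'true')
COPY_OPTIONS ('force' = 'true');

-- 2. Merge the staged rows into the target table.
MERGE INTO silver.brands AS t
USING staging.brands AS s
ON t.brand_id = s.brand_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;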
peter_ticker
by New Contributor III
  • 11681 Views
  • 17 replies
  • 2 kudos

XML Auto Loader rescuedDataColumn Doesn't Rescue Array Fields

Hiya! I'm interested in whether anyone has a solution to the following problem. If you load XML using Auto Loader or otherwise, and set the schema such that a single value is assumed for a given XPath but the actual XML contains multiple values (i....

Latest Reply
Witold
Databricks Partner
  • 2 kudos

Let me rephrase it. You can't use Message as the rowTag, because it's the root element. rowTag implies a tag within the root element, which might occur multiple times. Check the docs around reading and writing XML files; there you'll find exa...

16 More Replies
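To illustrate the rowTag point with plain Python (no Spark needed), the snippet below parses a small invented XML document: the root element (Message) cannot serve as the rowTag, while the repeated Record children inside it are the natural row boundary:

```python
import xml.etree.ElementTree as ET

# Invented sample: <Message> is the root, <Record> repeats inside it.
xml_doc = """
<Message>
  <Record><id>1</id><value>a</value></Record>
  <Record><id>2</id><value>b</value></Record>
</Message>
"""

root = ET.fromstring(xml_doc)
print(root.tag)                   # the root element: Message
rows = root.findall("Record")     # the repeating elements -> rows
print(len(rows))                  # 2
```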
evangelos
by New Contributor III
  • 6729 Views
  • 5 replies
  • 0 kudos

Resolved! Databricks asset bundles: name_prefix doesn't work with presets

Hello! I am deploying a Databricks workflow using bundles and want to attach the prefix "prod_" to the name of my job. My target uses `mode: production`, and I follow the instructions in https://learn.microsoft.com/en-us/azure/databricks/dev-tools/b...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

To attach the prefix "prod_" to the name of your job in a Databricks workflow using bundles, you need to ensure that the name_prefix preset is correctly configured in your databricks.yml file.

targets:
  prod:
    mode: production
    pres...

4 More Replies
oakhill
by New Contributor III
  • 6039 Views
  • 3 replies
  • 1 kudos

How do we create a job cluster in Databricks Asset Bundles for use across different jobs?

When developing jobs on DABs, we use new_cluster to create a cluster for a particular job. I think it's a lot of lines of YAML when what I really need is a "small cluster" and a "big cluster" to reference for certain kinds of jobs. Tags would be on the...

Latest Reply
filipniziol
Esteemed Contributor
  • 1 kudos

Hi @oakhill, you can specify your job cluster configuration in your variables:

variables:
  small_cluster_id:
    description: "The small cluster with 2 workers used by the jobs"
    type: complex
    default:
      spark_version: "15.4.x-scala2.12"
      ...

2 More Replies
saniok
by New Contributor II
  • 2270 Views
  • 2 replies
  • 0 kudos

How to Handle Versioning in Databricks Asset Bundles?

Hi everyone, in our organization we are transitioning from defining Databricks jobs using the UI to managing them with asset bundles. Since asset bundles can be deployed across multiple workspaces, each potentially having multiple targets (e.g., stag...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @saniok, in the databricks.yml file you can include version information to manage different versions of your bundles. Example:

bundle:
  name: my-bundle
  version: 1.0.0

resources:
  jobs:
    my-job:
      name: my-job
      ...

1 More Replies
Avinash_Narala
by Databricks Partner
  • 3500 Views
  • 7 replies
  • 3 kudos

Resolved! SQL Server to Databricks Migration

Hi, I want to build a Python function to migrate SQL Server tables to Databricks. Are there any guides or best practices on how to do so? It'll be really helpful if there are. Regards, Avinash N

Latest Reply
filipniziol
Esteemed Contributor
  • 3 kudos

Hi @Avinash_Narala, if it is lift-and-shift, then try this:
1. Set up Lakehouse Federation to SQL Server.
2. Use CTAS statements to copy each table into Unity Catalog:

CREATE TABLE catalog_name.schema_name.table_name
AS SELECT * FROM sql_server_catalog_...

6 More Replies
jeremy98
by Honored Contributor
  • 9660 Views
  • 22 replies
  • 1 kudos

Wheel package to install in a serverless workflow

Hi guys, what is the way, through Databricks Asset Bundles, to declare a new job definition with serverless compute associated with each task that composes the workflow, such that inside each notebook task it is possible to catch the dep...

Latest Reply
jeremy98
Honored Contributor
  • 1 kudos

Ping @Alberto_Umana 

21 More Replies