Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

lsrinivas2k13
by New Contributor II
  • 1689 Views
  • 3 replies
  • 0 kudos

Not able to run Python script even after everything is in place in Azure Databricks

Getting the below error while running a Python script which connects to Azure SQL DB: Database connection error: ('01000', "[01000] [unixODBC][Driver Manager]Can't open lib 'ODBC Driver 17 for SQL Server' : file not found (0) (SQLDriverConnect)") can some on...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

The error occurs because the Microsoft ODBC Driver 17 for SQL Server is missing on your Azure Databricks cluster. Here's how to fix it. Steps to resolve: Step 1: Create an init script to install the ODBC driver. 1. Create a file named `odbc-install.sh` with...
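The truncated steps above can be sketched as a cluster-scoped init script. The file name `odbc-install.sh` comes from the reply; the package commands are an assumption based on Microsoft's published Ubuntu install instructions, so adjust the Ubuntu release to match your runtime:

```bash
#!/bin/bash
# Sketch: install Microsoft ODBC Driver 17 for SQL Server on an
# Ubuntu-based Databricks cluster. Assumes outbound network access.
set -e
curl -s https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
# Adjust 20.04 to the Ubuntu release of your Databricks Runtime.
curl -s https://packages.microsoft.com/config/ubuntu/20.04/prod.list \
  > /etc/apt/sources.list.d/mssql-release.list
apt-get update
ACCEPT_EULA=Y apt-get install -y msodbcsql17 unixodbc-dev
```

Save it to a workspace or volume path and reference it under the cluster's Advanced options > Init Scripts; the installed driver then matches connection strings naming 'ODBC Driver 17 for SQL Server'.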

2 More Replies
NikosLoutas
by Databricks Partner
  • 1439 Views
  • 3 replies
  • 2 kudos

Resolved! Materialized Views Compute

When creating a Materialized View (MV) without a schedule, there seems to be a cost associated with the MV once it is created, even if it is not queried. The question is, once the MV is created, is there already a "hot" compute ready for use in case a...

Latest Reply
Louis_Frolio
Databricks Employee
  • 2 kudos

Please select "Accept as Solution" so that others can benefit from this exchange.  Regards, Louis.

2 More Replies
noname123
by New Contributor III
  • 7707 Views
  • 2 replies
  • 0 kudos

Resolved! Delta table version protocol

I do: df.write.format("delta").mode("append").partitionBy("timestamp").option("mergeSchema", "true").save(destination). If the table doesn't exist, it creates a new table with "minReaderVersion":3,"minWriterVersion":7. Yesterday it was creating tables with "min...

Latest Reply
AddBox45
New Contributor II
  • 0 kudos

Hello, how did you fix this explicitly? How did you enable/disable the auto-enable deletion vectors setting to write again with minReaderVersion 1 and minWriterVersion 2?
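For later readers: a jump to minReaderVersion 3 / minWriterVersion 7 is typically the deletion-vectors table feature being auto-enabled on new tables. A hedged sketch of the relevant knobs, with property and feature names taken from the Delta Lake docs (verify against your runtime version):

```sql
-- Keep new tables created in this session on the legacy protocol
SET spark.databricks.delta.properties.defaults.enableDeletionVectors = false;

-- Disable the feature on an existing table (does not by itself downgrade
-- a protocol that was already upgraded)
ALTER TABLE my_table SET TBLPROPERTIES ('delta.enableDeletionVectors' = 'false');

-- On recent runtimes, dropping the feature can downgrade the protocol again
ALTER TABLE my_table DROP FEATURE deletionVectors;
```

`my_table` is a placeholder; check the deletion-vectors documentation before dropping features on production tables.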

1 More Replies
kmodelew
by New Contributor III
  • 1571 Views
  • 2 replies
  • 1 kudos

Do I need a separate wheel for each job in a project?

I have a project with my commons, like a SparkSession object (to run code in PyCharm using the databricks-connect library, and the same code directly on Databricks). Under src I have a few packages from which DAB creates separate jobs. I'm using PyCharm. S...

Latest Reply
kmodelew
New Contributor III
  • 1 kudos

Hi, I hope it will be useful. Here are my files: project structure -> DAB_project_structure.png; each yml file for job definitions -> task_group_1_job.png and task_group_2_job.png. Each .py file has a main() method. setup.py: description="wheel file based ...

1 More Replies
jeremy98
by Honored Contributor
  • 1184 Views
  • 2 replies
  • 0 kudos

how to install the package using --index-url

Hi community, I created a job using a Databricks asset bundle, but I'm worried about how to install this dependency the right way, because when I was testing the related job it seems it doesn't install the torch library properly.

Latest Reply
jeremy98
Honored Contributor
  • 0 kudos

I tried to do it manually and it works; through Databricks asset bundle it doesn't. But I did at the end: dependencies: - torch==2.5.1 - --index-url https://download.pytorch.org/whl/cpu It says: Error: file doesn't exi...
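An "Error: file doesn't exist" here usually means the CLI treated the bare `--index-url ...` entry as a path to a requirements file rather than a pip flag. One plausible layout is to pass the flag as its own pip-style dependency line in the job's environment spec; this is a sketch only, and the key names should be checked against your CLI version's bundle schema:

```yaml
environments:
  - environment_key: default
    spec:
      client: "1"
      dependencies:
        - "--extra-index-url=https://download.pytorch.org/whl/cpu"
        - "torch==2.5.1"
```

If the spec rejects flag lines, an alternative is a requirements file in the bundle that contains both lines, referenced from the dependency list.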

1 More Replies
pratik21
by New Contributor II
  • 8851 Views
  • 4 replies
  • 1 kudos

Unexpected error while calling Notebook string matching regex `\$[\w_]+' expected but `M' found

Run result unavailable: job failed with error message INVALID_PARAMETER_VALUE: Failed to parse %run command: string matching regex `\$[\w_]+' expected but `M' found). Stacktrace: /Notebookpath: scala. To call the notebook we are using dbutils.notebook.run("N...

Latest Reply
thedeadturtle
Databricks Partner
  • 1 kudos

Since you're using dbutils.notebook.run() properly now, the issue is not in your current notebook, but actually in the target notebook you're calling. Specifically, Databricks is trying to parse a %run command in that notebook, and it's hitting a synt...
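The regex in the error message (`\$[\w_]+`) is what the %run argument parser expects, which suggests the target notebook contains a %run line whose argument doesn't start with `$`. A hedged sketch of the two call styles, with paths and argument names as placeholders:

```
# Programmatic call: arguments are passed as a dict
dbutils.notebook.run("/path/to/TargetNotebook", 600, {"my_arg": "My Value"})

# Magic-command form inside a notebook cell:
# %run /path/to/TargetNotebook $my_arg="My Value"
# Each argument must match $[\w_]+ — a bare token such as MyArg="My Value"
# (no leading $) produces the "`M' found" parse error seen above.
```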

3 More Replies
Vasu_Kumar_T
by Databricks Partner
  • 538 Views
  • 1 replies
  • 0 kudos

BladeBridge Analyzer out-of-memory issue

We are running BladeBridge Analyzer and it is running out of memory. We tried to increase the RAM and it still gives the same error. We cannot run the analyzer against a subset of metadata, as it would not generate a comprehensive report with how th...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 0 kudos

Hi Vasu_Kumar_T, how are you doing today? As per my understanding, running out of memory with BladeBridge Analyzer can be tough, especially when you're working with large and complex metadata where you need the full picture. Even if you've increased ...

patacoing
by New Contributor II
  • 772 Views
  • 1 replies
  • 1 kudos

Medallion architecture

Hello, I have an S3 data lake containing a structure of files in different formats: json, csv, text, binary, ... Would you consider this my bronze layer? Or a "pre-bronze" layer, since it can't be processed directly by Spark (because of d...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 1 kudos

Hi patacoing, how are you doing today? As per my understanding, the structure you described in your S3 data lake sounds more like a "pre-bronze" layer, since the files are in mixed formats (JSON, CSV, text, binary), which makes it tricky to process t...

jeremy98
by Honored Contributor
  • 3608 Views
  • 9 replies
  • 0 kudos

Resolved! Error Databricks Bundle Deploy with changes in the wheel file

Hello Community, suddenly I have an error when deploying a new bundle to Databricks after changing the Python script: the cluster continues to point to an old version of the .py script uploaded from the Databricks asset bundle. Why is this?

Latest Reply
denis-dbx
Databricks Employee
  • 0 kudos

We've added a solution for this problem in v0.245.0. There is an opt-in "dynamic_version: true" flag on the artifact to enable automated wheel patching that breaks the cache (Example). Once set, "bundle deploy" will transparently patch the version suffix in the ...
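Per the reply above, the flag lives on the artifact definition in databricks.yml; a minimal sketch, where the artifact name and path are placeholders:

```yaml
artifacts:
  my_wheel:
    type: whl
    path: ./my_package
    dynamic_version: true  # opt-in; requires Databricks CLI v0.245.0 or later
```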

8 More Replies
Tommabip
by Databricks Partner
  • 2672 Views
  • 3 replies
  • 2 kudos

Resolved! Databricks Cluster Policies

Hi, I'm trying to create a Terraform script that does the following: - create a policy where I specify env variables and libraries - create a cluster that inherits from that policy and uses the env variables specified in the policy. I saw in the docume...

Latest Reply
Louis_Frolio
Databricks Employee
  • 2 kudos

You're correct in observing this discrepancy. When a cluster policy is defined and applied through the Databricks UI, fixed environment variables (`spark_env_vars`) specified in the policy automatically propagate to clusters created under that policy...
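For context, a fixed environment variable in a cluster-policy definition looks like this (variable name and value are placeholders), following the cluster-policy definition reference:

```json
{
  "spark_env_vars.MY_ENV_VAR": {
    "type": "fixed",
    "value": "some-value"
  }
}
```

When the cluster is created through Terraform rather than the UI, the `databricks_cluster` resource must reference the policy's `policy_id`, and setting `apply_policy_default_values = true` on the cluster resource is typically needed for policy-supplied values to propagate; check the provider documentation for your version.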

2 More Replies
Alex_Persin
by New Contributor III
  • 10123 Views
  • 6 replies
  • 8 kudos

How can the shared memory size (/dev/shm) be increased on databricks worker nodes with custom docker images?

PyTorch uses shared memory to efficiently share tensors between its dataloader workers and its main process. However in a docker container the default size of the shared memory (a tmpfs file system mounted at /dev/shm) is 64MB, which is too small to ...

Latest Reply
stevewb
New Contributor III
  • 8 kudos

Bump again... does anyone have a solution for this?
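One workaround sometimes used for custom containers is enlarging the tmpfs from a cluster init script; this is a sketch, the 4g size is a placeholder, and whether the remount is permitted depends on the container's privileges:

```bash
#!/bin/bash
# Remount /dev/shm with a larger size limit so PyTorch dataloader workers
# can share tensors; runs at cluster start as an init script.
sudo mount -o remount,size=4g /dev/shm
```

An application-side alternative is `torch.multiprocessing.set_sharing_strategy("file_system")`, which avoids /dev/shm at the cost of extra file handles, or reducing the dataloader's `num_workers`.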

5 More Replies
valde
by New Contributor
  • 932 Views
  • 1 replies
  • 0 kudos

Window function VS groupBy + map

Let's say we have an RDD like this: RDD(id: Int, measure: Int, date: LocalDate). Let's say we want to apply some function that compares 2 consecutive measures by date and outputs a number, and we want to get the sum of those numbers by id. The function is b...

Latest Reply
Renu_
Valued Contributor II
  • 0 kudos

Hi @valde, those two approaches give the same result, but they don't work the same way under the hood. Spark SQL uses optimized window functions that handle things like shuffling and memory more efficiently, often making it faster and lighter. On the o...
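Since the thread compares the two approaches, the computation itself (a per-id sum of a function over consecutive measures ordered by date) can be sketched in plain Python; the sample data and the pairwise `compare` function are placeholders, not the original poster's code:

```python
from itertools import groupby
from operator import itemgetter

# Rows mirror the RDD(id: Int, measure: Int, date: LocalDate) from the question.
rows = [
    (1, 10, "2024-01-01"),
    (1, 13, "2024-01-02"),
    (1, 11, "2024-01-03"),
    (2, 5,  "2024-01-01"),
    (2, 9,  "2024-01-02"),
]

def compare(prev, curr):
    # Placeholder for the poster's function over two consecutive measures.
    return curr - prev

def sum_by_id(rows):
    # groupBy(id), sort each group by date, then sum compare() over
    # consecutive measures -- the same result a lag() window + sum gives.
    result = {}
    for key, group in groupby(sorted(rows, key=itemgetter(0)), key=itemgetter(0)):
        measures = [m for _, m, _ in sorted(group, key=itemgetter(2))]
        result[key] = sum(compare(a, b) for a, b in zip(measures, measures[1:]))
    return result

print(sum_by_id(rows))  # {1: 1, 2: 4}
```

In Spark, the window version would use F.lag("measure").over(Window.partitionBy("id").orderBy("date")) followed by groupBy("id") and a sum, letting Spark pipeline the shuffle instead of materializing each group as a list in memory.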

Nathant93
by New Contributor III
  • 2647 Views
  • 2 replies
  • 0 kudos

(java.util.concurrent.ExecutionException) Boxed Error

Has anyone ever come across the error above? I am trying to get two tables from Unity Catalog and join them; the join is fairly complex, as it imitates a WHERE NOT EXISTS top-1 SQL query.

Latest Reply
pk13
New Contributor II
  • 0 kudos

Hello @VZLA, recently I am getting the exact same error. It has a "caused by" as below: ```Caused by: kafkashaded.org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition.``` Stacktrace - ERROR: Some ...

1 More Replies
eenaagrawal
by Databricks Partner
  • 5526 Views
  • 1 replies
  • 0 kudos
Latest Reply
SP_6721
Honored Contributor II
  • 0 kudos

Hi @eenaagrawal, there isn't a specific built-in integration in Databricks to directly interact with SharePoint. However, you can accomplish this by leveraging libraries like Office365-REST-Python-Client, which enable interaction with SharePoint's RE...
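As a concrete illustration of the library the reply names, a minimal sketch of downloading a file with Office365-REST-Python-Client; the site URL, credentials, and paths are placeholders, and the exact class paths should be checked against the library's documentation:

```python
from office365.sharepoint.client_context import ClientContext
from office365.runtime.auth.client_credential import ClientCredential

site_url = "https://yourtenant.sharepoint.com/sites/yoursite"  # placeholder
ctx = ClientContext(site_url).with_credentials(
    ClientCredential("client-id", "client-secret")  # placeholder app credentials
)

# Download a file from a document library to the driver's local disk
with open("/tmp/report.xlsx", "wb") as local_file:
    ctx.web.get_file_by_server_relative_url(
        "/sites/yoursite/Shared Documents/report.xlsx"
    ).download(local_file).execute_query()
```

From there the local file can be read with Spark or pandas as usual.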

rahuja
by Contributor
  • 2460 Views
  • 2 replies
  • 0 kudos

Resolved! Cloning Git Repository in Databricks via Rest API Endpoint using Azure Service principal

Hello, I have written a Python script that uses Databricks REST API(s). I am trying to clone/update an Azure DevOps repository inside Databricks using an Azure service principal. I am able to retrieve the credential_id for the service principal I am usin...

Latest Reply
rahuja
Contributor
  • 0 kudos

@nicole_lu_PM So sorry for coming back to this issue after such a long time. But I looked into it, and it seems the concept of an OBO token applies when we use Databricks with AWS as our cloud provider. In the case of Azure, most of the commen...

1 More Replies