Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Forum Posts

Vasu_Kumar_T
by Databricks Partner
  • 586 Views
  • 1 reply
  • 0 kudos

Job performance issue : Configurations

Hello All, one job is taking more than 7 hrs; when we added the configuration below it took <2:30 mins, but after deployment with the same parameters it is again taking 7+ hrs. 1) spark.conf.set("spark.sql.shuffle.partitions", 500) --> spark.conf.set("spark.sql.s...

Latest Reply
lingareddy_Alva
Esteemed Contributor
  • 0 kudos

Hi @Vasu_Kumar_T This is a classic Spark performance inconsistency issue. The fact that it works fine in your notebook but degrades after deployment suggests several potential causes. Here are the most likely culprits and solutions: Primary Suspects: 1. ...

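One common reason a tuning that works in a notebook "disappears" after deployment is that `spark.conf.set` calls made interactively are not carried into the deployed job's cluster; the value has to be pinned in the job cluster's `spark_conf`. As a side note, the poster's value of 500 partitions is consistent with a widely used sizing heuristic, sketched below. The function and its 128 MB target are illustrative assumptions, not a Databricks API:

```python
# Illustrative heuristic (an assumption, not a Databricks API) for sizing
# spark.sql.shuffle.partitions: aim for roughly target_mb of shuffle data
# per partition, never going below Spark's default of 200.

def recommend_shuffle_partitions(shuffle_gb: float, target_mb: int = 128) -> int:
    """Return a partition count targeting ~target_mb per shuffle partition."""
    partitions = int(shuffle_gb * 1024 / target_mb)
    return max(200, partitions)

# ~62.5 GB of shuffle data at 128 MB per partition:
print(recommend_shuffle_partitions(62.5))  # -> 500
```

To make the setting survive deployment, the same value would go into the job definition (e.g. `"spark_conf": {"spark.sql.shuffle.partitions": "500"}` on the job cluster) rather than into a notebook cell.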
Mahtab67
by New Contributor
  • 1880 Views
  • 1 reply
  • 0 kudos

Spark Kafka Client Not Using Certs from Default truststore

Hi Team, I'm working on connecting Databricks to an external Kafka cluster secured with SASL_SSL (SCRAM-SHA-512 + certificate trust). We've encountered an issue where certificates imported into the default JVM truststore (cacerts) via an init script ...

Latest Reply
lingareddy_Alva
Esteemed Contributor
  • 0 kudos

Hi @Mahtab67 This is a common issue with Databricks and Kafka SSL connectivity. The problem stems from how Spark's Kafka connector handles SSL context initialization versus the JVM's default truststore. Root Cause Analysis: The Spark Kafka connector cre...

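Because the connector builds its own SSL context, a common workaround is to hand it the truststore explicitly through `kafka.*` options instead of relying on the JVM default `cacerts`. A minimal sketch of such an options map follows; every host, path, username, and password in it is a hypothetical placeholder, and the shaded `kafkashaded.` class prefix is what Databricks runtimes typically require:

```python
# Hedged sketch: pass the truststore explicitly to the Kafka source rather
# than relying on the JVM default cacerts. All hosts, paths, and secrets
# below are hypothetical placeholders.
kafka_options = {
    "kafka.bootstrap.servers": "broker1.example.com:9093",  # placeholder
    "kafka.security.protocol": "SASL_SSL",
    "kafka.sasl.mechanism": "SCRAM-SHA-512",
    "kafka.sasl.jaas.config": (
        "kafkashaded.org.apache.kafka.common.security.scram.ScramLoginModule "
        'required username="user" password="secret";'  # placeholders
    ),
    # Point the connector at the truststore the init script populated:
    "kafka.ssl.truststore.location": "/databricks/certs/truststore.jks",
    "kafka.ssl.truststore.password": "changeit",  # placeholder
    "subscribe": "events",  # placeholder topic
}

# In a Databricks notebook these options would feed the streaming reader:
# df = spark.readStream.format("kafka").options(**kafka_options).load()
```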
Sainath368
by Contributor
  • 1292 Views
  • 1 reply
  • 0 kudos

COMPUTE DELTA STATISTICS vs COMPUTE STATISTICS - Data Skipping

Hi all, I recently altered the data skipping stats columns on my Delta Lake table to optimize data skipping. Now, I'm wondering about the best practice for updating statistics: is running ANALYZE TABLE <table_name> COMPUTE DELTA STATISTICS sufficient a...

Latest Reply
Advika
Community Manager
  • 0 kudos

Hello @Sainath368! Running ANALYZE TABLE <table_name> COMPUTE DELTA STATISTICS is a good practice after modifying data skipping stats columns on a Delta Lake table. However, this command doesn’t update query optimizer stats. For that, you’ll need to ...

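The two statement variants discussed above serve different consumers: COMPUTE DELTA STATISTICS refreshes the file-level min/max stats used for data skipping, while COMPUTE STATISTICS feeds the query optimizer (e.g. join planning). A sketch, with a placeholder table name:

```python
# The two ANALYZE variants discussed above, as SQL strings. The table name
# is a placeholder. COMPUTE DELTA STATISTICS recomputes file-level min/max
# stats after changing the data-skipping stats columns; COMPUTE STATISTICS
# updates query-optimizer stats and is a separate step.
table = "catalog.schema.my_table"  # placeholder

recompute_skipping_stats = f"ANALYZE TABLE {table} COMPUTE DELTA STATISTICS"
recompute_optimizer_stats = f"ANALYZE TABLE {table} COMPUTE STATISTICS FOR ALL COLUMNS"

# In a notebook, both would be run via spark.sql(...):
# spark.sql(recompute_skipping_stats)
# spark.sql(recompute_optimizer_stats)
```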
Miloud_G
by New Contributor III
  • 1926 Views
  • 2 replies
  • 2 kudos

Resolved! issue on databricks bundle deploy

Hi, I am trying to configure Databricks Asset Bundles but got an error on deployment. databricks bundle init ----------- OK; databricks bundle validate ----- OK; databricks bundle deploy ------ Fail. Error: PS C:\Databricks_DABs\DABs_Init\DABS_Init> databricks b...

Latest Reply
Miloud_G
New Contributor III
  • 2 kudos

Thank you Advika. I was able to enable workspace files with this script: from databricks.sdk.core import ApiClient; client = ApiClient(); client.do("PATCH", "/api/2.0/workspace-conf", body={"enableWorkspaceFilesystem": "true"}, headers={"Content-Type": "applica...

1 More Replies
ankit001mittal
by New Contributor III
  • 1084 Views
  • 1 reply
  • 0 kudos

How to stop SQL AI Functions usage

Hi Guys, recently Databricks came up with a new feature, SQL AI Functions. Is there a way to stop users from using it without downgrading the runtime on the cluster, e.g. by using policies? Also, is there a way to stop users from using serverless, before there w...

Latest Reply
Advika
Community Manager
  • 0 kudos

Hello @ankit001mittal! Currently, there's no direct way to disable SQL AI Functions in Databricks. To restrict the use of serverless compute, you can set up serverless budget policies that allow you to monitor and limit usage to some extent. However,...

Divya_Bhadauria
by New Contributor III
  • 14099 Views
  • 6 replies
  • 2 kudos

Unable to run python script from git repo in Databricks job

I'm getting a "cannot read Python file" error when running this job, which is configured to run a Python script from a git repo. Run result unavailable: run failed with error message Cannot read the python file /Repos/.internal/7c39d645692_commits/ff669d089cd8f93e9...

Latest Reply
SakthiGanesh
New Contributor II
  • 2 kudos

Hi @Divya_Bhadauria, I'm facing the same internal-commit issue on my end. I didn't give any internal path in the Databricks workflow; I set the source to Azure DevOps Services with a branch name, but when I ran the workflow it gave the below error a...

5 More Replies
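The "/Repos/.internal/..._commits" paths in errors like the one above come from jobs that check the repo out themselves. One way this is typically configured is a git-sourced job, where the Jobs API payload names the provider, branch, and script path directly. A hedged sketch follows; the URL, branch, and file path are placeholders, and a real payload would also need a cluster specification:

```python
# Hedged sketch of a Jobs API 2.1 settings payload that runs a Python
# script directly from a git provider. URL, branch, and script path are
# hypothetical placeholders; a cluster spec is omitted for brevity.
job_settings = {
    "name": "run-script-from-git",
    "git_source": {
        "git_url": "https://dev.azure.com/org/project/_git/repo",  # placeholder
        "git_provider": "azureDevOpsServices",
        "git_branch": "main",
    },
    "tasks": [
        {
            "task_key": "main",
            "spark_python_task": {
                "python_file": "jobs/etl.py",  # path relative to the repo root
                "source": "GIT",
            },
        }
    ],
}
```

With `"source": "GIT"`, the script path is resolved against the checked-out branch rather than a workspace `/Repos` path, which is the usual way to avoid stale internal-commit references.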
amarnathpal
by New Contributor III
  • 1375 Views
  • 4 replies
  • 0 kudos

Adding a New Column for Updated Date in Pipeline

I've successfully set up my pipeline and everything is working fine. I'd like to add a new column to our table that records the date whenever any record gets updated. Could you advise on how to go about this?

Latest Reply
nikhilj0421
Databricks Employee
  • 0 kudos

Do you want to add dates for the historical data as well?

3 More Replies
Ramakrishnan83
by New Contributor III
  • 3713 Views
  • 2 replies
  • 0 kudos

Optimize and Vacuum Commands

Hi team, I am running a weekly purge process from Databricks notebooks that cleans up chunks of records from my tables used for audit purposes. The tables are external tables. I need clarification on the items below: 1. Do I need to run Optimize and Vacuum c...

Latest Reply
JaimeAnders
New Contributor II
  • 0 kudos

That's a valid point about minimal read queries! However, while immediate storage reduction might not be necessary, consistent data integrity and potential future reporting needs might still warrant occasional optimize and vacuuming, even with extern...

1 More Replies
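For a weekly purge like the one described above, the maintenance pass usually boils down to an OPTIMIZE followed by a VACUUM per table. The helper below is an illustrative sketch (an assumption, not a Databricks API) that builds those statements; the table names are placeholders, and 168 hours matches Delta's default 7-day retention safety threshold:

```python
# Illustrative helper (an assumption, not a Databricks API) that builds the
# weekly maintenance statements discussed above for a list of audit tables.
# VACUUM's retention should respect delta.deletedFileRetentionDuration;
# 168 hours (7 days) is the default safety threshold.

def maintenance_statements(tables, retain_hours: int = 168):
    """Return OPTIMIZE + VACUUM SQL strings for each table, in order."""
    stmts = []
    for t in tables:
        stmts.append(f"OPTIMIZE {t}")
        stmts.append(f"VACUUM {t} RETAIN {retain_hours} HOURS")
    return stmts

for stmt in maintenance_statements(["audit.events", "audit.logins"]):  # placeholders
    print(stmt)
```

In a notebook each statement would be executed with `spark.sql(stmt)`; whether weekly OPTIMIZE is worthwhile for rarely-read external tables is the judgment call debated in the thread.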
jeremy98
by Honored Contributor
  • 3213 Views
  • 6 replies
  • 2 kudos

Resolved! Catch Metadata Workflow databricks

Hello community, is it possible to get the workflow metadata of a running Databricks job, like the start time, end time, triggered by, etc., using dbutils.widgets.get()?

Latest Reply
Juan_Cardona
Databricks Partner
  • 2 kudos

The best practice now is not to use the API (some functions were deprecated for this purpose); instead, use job parameters: job_id = dbutils.widgets.get("job parameter name with job_id") job_run = dbutils.widgets.get("job parameter ...

5 More Replies
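The job-parameter approach in the accepted answer works by pairing each parameter with a Databricks dynamic value reference, which the platform resolves before the task runs. A sketch of such a parameter map (the parameter names on the left are arbitrary choices; the `{{...}}` references are the documented dynamic values):

```python
# Hedged sketch: job parameters mapped to Databricks dynamic value
# references. The left-hand names are arbitrary; the {{...}} placeholders
# are resolved by the platform before the task sees them.
job_parameters = {
    "job_id": "{{job.id}}",
    "run_id": "{{job.run_id}}",
    "start_time": "{{job.start_time.iso_datetime}}",
    "trigger_type": "{{job.trigger.type}}",
}

# Inside the task notebook, each resolved value is read back with:
# job_id = dbutils.widgets.get("job_id")
```

The end time is not available this way while the job is still running; it has to come from the run record after completion.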
Ankit_Kothiya
by Databricks Partner
  • 1566 Views
  • 2 replies
  • 1 kudos

Databricks JDBC Driver Version 42 Limitations

We found that the Databricks JDBC driver does not support: Connection.setAutoCommit(false), Connection.commit(), Connection.rollback(), or execution of BEGIN TRANSACTION. Can you help us understand why these operations are not supported by the Databricks JDBC dr...

Latest Reply
Ankit_Kothiya
Databricks Partner
  • 1 kudos

Thank you, @SP_6721, for your input! Could you please share an example snippet demonstrating how to handle batch processing, similar to what we typically do in a relational database?

1 More Replies
venkad
by Contributor
  • 14788 Views
  • 5 replies
  • 7 kudos

Passing proxy configurations with databricks-sql-connector python?

Hi, I am trying to connect to a Databricks workspace that has IP access restriction enabled using databricks-sql-connector. Only my proxy server IPs are on the allow list. from databricks import sql   connection = sql.connect( server_hostname ='...

Latest Reply
ss2025
New Contributor II
  • 7 kudos

Is there any resolution for the above issue of setting up a proxy with the databricks-sql-connector?

4 More Replies
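One commonly suggested workaround, which I cannot confirm is honored by every connector version, is to route the connector's HTTPS traffic through the allow-listed proxy via the standard environment variables before opening the connection. The proxy hostname and port below are placeholders:

```python
# Hedged sketch: point standard proxy environment variables at the
# allow-listed proxy before creating the databricks-sql-connector
# connection. Whether a given connector version honors these variables
# depends on its HTTP stack; host and port are placeholders.
import os

os.environ["HTTPS_PROXY"] = "http://proxy.example.com:8080"  # placeholder
os.environ["HTTP_PROXY"] = "http://proxy.example.com:8080"   # placeholder

# The connection itself would then be opened as in the original post:
# from databricks import sql
# connection = sql.connect(server_hostname="...", http_path="...",
#                          access_token="...")
```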
Upendra_Dwivedi
by Databricks Partner
  • 3591 Views
  • 4 replies
  • 0 kudos

Resolved! How to enable Databricks Apps User Authorization?

Hi All, I am working on implementing user authorization in my Databricks app, but to enable user auth it asks: "A workspace admin must enable this feature to be able to request additional scopes. The user's API downscoped access token is incl...

Latest Reply
Upendra_Dwivedi
Databricks Partner
  • 0 kudos

Hi All, we can find this setting under Previews: go to the workspace > click your username > Previews.

3 More Replies
Ipshi
by New Contributor
  • 1011 Views
  • 1 reply
  • 0 kudos

Databricks Data Engineer Associate

Hi everyone, can anyone point me to practice tests or study materials for the Databricks Data Engineer Associate exam?

Latest Reply
Advika
Community Manager
  • 0 kudos

Hello @Ipshi! You can find resources for the Databricks Certified Data Engineer Associate exam in the Getting Ready for the Exam section of the exam-specific webpage. This section includes a detailed list of topics covered and sample q...

lawrence009
by Contributor
  • 3198 Views
  • 4 replies
  • 0 kudos

Blank Page after Logging In

On Feb 8 Singapore time, our Singapore workspace displayed a blank page (no interface or content) after login, while our workspace in Tokyo worked normally. This lasted the whole day and none of our troubleshooting yielded any clues. Then ever...

Latest Reply
ciro
New Contributor II
  • 0 kudos

After logging in, I’m getting a white screen, and it won’t load. I’ve tried clearing my cache and switching browsers, but nothing seems to work. This feels like something that really needs to be looked into. Has anyone figured out a way to fix it?

3 More Replies
pargit2
by New Contributor II
  • 951 Views
  • 1 reply
  • 0 kudos

feature store delta sharing

Hi, I have 2 workspaces, one for data engineers and one for the data science team, and I need to create the bronze and silver layers in the data engineering workspace. I want to build them a feature store; should I do it from the data science workspace or the data engineering ...

Latest Reply
ciro
New Contributor II
  • 0 kudos

I like the idea of using Feature Store with Delta Sharing, but I’m a bit worried about its limits like no partition filtering and no streaming support. These could cause problems with performance and scaling in real situations.
