Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

jeremy98
by Honored Contributor
  • 1637 Views
  • 3 replies
  • 0 kudos

how to pass secrets keys using a spark_python_task

Hello community, I was searching for a way to pass secrets to a spark_python_task. Using a notebook file is easy: just call dbutils.secrets.get(...). But how do you do the same thing with a spark_python_task running on serverless compute? Kind regards,

Latest Reply
analytics_eng
New Contributor III
  • 0 kudos

@Renu_ but passing them as spark_env will not work with serverless, I guess? See also the limitations in the docs: Serverless compute limitations | Databricks on AWS
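One approach that should work on serverless is to fetch the secret inside the Python file itself rather than via environment variables, since dbutils can be imported from the Databricks SDK runtime shim. A minimal sketch (scope and key names are hypothetical):

```python
# Sketch for the file referenced by the spark_python_task. Assumes it
# runs on a Databricks runtime where the SDK shim resolves dbutils;
# the scope/key names below are made up.

def get_db_password():
    # Imported lazily so the module can be inspected off-cluster too.
    from databricks.sdk.runtime import dbutils
    return dbutils.secrets.get(scope="my-scope", key="db-password")
```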

2 More Replies
dpc
by Contributor III
  • 1118 Views
  • 5 replies
  • 3 kudos

Resolved! Pass parameters between jobs

Hello, I have a job. In that job, a task (GetGid) executes a notebook and obtains some value using dbutils.jobs.taskValues.set, e.g. dbutils.jobs.taskValues.set(key = "gid", value = gid). As a result, I can use this and pass it to another task for ...

Latest Reply
dpc
Contributor III
  • 3 kudos

Thanks @Hubert-Dudek and @ilir_nuredini, I see this now. I'm setting the value with dbutils.jobs.taskValues.set(), passing it to the downstream task as a parameter with key gid and value {{tasks.GetGid.values.gid}}, then reading it with pid = dbutils.widgets.get("gid").
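A sketch of both ends of the hand-off described in this thread (dbutils is passed in explicitly here so the functions can be exercised off-cluster; the task and key names follow the thread, and note the method name is lower-case set()):

```python
# Upstream task (GetGid): publish the value for downstream tasks.
def publish_gid(dbutils, gid):
    dbutils.jobs.taskValues.set(key="gid", value=gid)

# Downstream task, option 1: read the task value directly.
def read_gid(dbutils):
    return dbutils.jobs.taskValues.get(taskKey="GetGid", key="gid", default=None)

# Downstream task, option 2: if the value was wired through a task
# parameter as {{tasks.GetGid.values.gid}}, read it as a widget.
def read_gid_param(dbutils):
    return dbutils.widgets.get("gid")
```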

4 More Replies
AlleyCat
by New Contributor II
  • 1345 Views
  • 3 replies
  • 0 kudos

How to identify deleted runs from the Workflows > Jobs UI in "system.lakeflow"

Hi, I executed a few runs in the Workflows > Jobs UI. I then deleted some of them. I am still seeing the deleted runs in "system.lakeflow.job_run_timeline". How do I know which runs are the deleted ones? Thanks

Latest Reply
Ayushi_Suthar
Databricks Employee
  • 0 kudos

Hi @AlleyCat, hope you are doing well! The jobs table includes a delete_time column that records the time when the job was deleted by the user. So to identify deleted jobs, you can run a query like the following: SELECT * FROM system.lakeflow.jobs ...
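Spelled out, the query pattern from the reply would look something like this (run on Databricks, e.g. via spark.sql; the WHERE clause is the key part, since deleted jobs carry a non-null delete_time):

```python
# Query sketch: list deleted jobs via the system.lakeflow.jobs table.
DELETED_JOBS_SQL = """
SELECT job_id, name, delete_time
FROM system.lakeflow.jobs
WHERE delete_time IS NOT NULL
ORDER BY delete_time DESC
"""
# On a cluster: spark.sql(DELETED_JOBS_SQL).display()
```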

2 More Replies
DM0341
by New Contributor II
  • 552 Views
  • 2 replies
  • 1 kudos

Resolved! SQL Stored Procedures - Notebook to always run the CREATE query

I have a stored procedure that is saved as a query file. I can run it and the proc is created. However, I want to take this one step further. I want my notebook to run the query file called sp_Remit.sql, so if there are any changes to the proc between t...
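One way to do this, assuming the notebook can read the workspace file, is simply to execute the file's contents on every run (the path in the usage comment is hypothetical, and a file containing multiple statements would need splitting first):

```python
# Sketch: re-run the CREATE statement from a saved .sql file on each
# notebook run, so the proc always reflects the latest file contents.
def run_sql_file(spark, path):
    with open(path, "r") as f:
        sql_text = f.read()
    return spark.sql(sql_text)

# On Databricks: run_sql_file(spark, "/Workspace/Shared/sp_Remit.sql")
```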

Latest Reply
DM0341
New Contributor II
  • 1 kudos

Thank you. I did find this about an hour after I posted. Thank you Kevin

1 More Replies
SuMiT1
by New Contributor III
  • 655 Views
  • 1 replies
  • 1 kudos

Databricks to snowflake data load

Hi Team, I’m trying to load data from Databricks into Snowflake using the Snowflake Spark connector. I’m using a generic username and password, but I’m unable to log in using these credentials directly. In the Snowflake UI, I can only log in through ...

Latest Reply
nayan_wylde
Esteemed Contributor II
  • 1 kudos

@SuMiT1 The recommended method to connect to Snowflake from Databricks is OAuth with the Client Credentials flow. This method uses a registered Azure AD application to obtain an OAuth token without user interaction. Steps: Register an app in Azure AD and c...
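The connector side of that flow might look roughly like this once a token has been acquired from Azure AD (all option values are placeholders; the option names are the Snowflake Spark connector's OAuth settings):

```python
# Sketch: Snowflake Spark connector options for OAuth token auth.
# `access_token` would come from the Azure AD client-credentials request.
def snowflake_options(access_token):
    return {
        "sfUrl": "<account>.snowflakecomputing.com",  # placeholder
        "sfDatabase": "MY_DB",
        "sfSchema": "PUBLIC",
        "sfWarehouse": "MY_WH",
        "sfAuthenticator": "oauth",
        "sfToken": access_token,
    }

# On Databricks (sketch):
# df.write.format("snowflake").options(**snowflake_options(token)) \
#   .option("dbtable", "TARGET_TABLE").mode("append").save()
```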

StephanieAlba
by Databricks Employee
  • 3543 Views
  • 2 replies
  • 0 kudos

Is it possible to turn off the redaction of secrets? Is there a better way to solve this?

As part of our Azure Data Factory pipeline, we utilize Databricks to run some scripts that identify which files we need to load from a certain source. This list of files is then passed back into Azure Data Factory utilizing the Exit status from the n...

Latest Reply
joanafloresc
New Contributor II
  • 0 kudos

Hello, as of today, is it still not possible to unredact secret names?

1 More Replies
GANAPATI_HEGDE
by New Contributor III
  • 1420 Views
  • 8 replies
  • 1 kudos

Unable to run SQL alert task in a Databricks job as a service principal

I am trying to run a SQL alert task in a Databricks job. A service principal is set as the job's run_as, and the task fails with the error message below. I also checked whether the SPN can be granted permission on the SQL alert and SQL query; it turns out only a user id or...

[screenshot attachment: GANAPATI_HEGDE_0-1761623911949.jpeg]
Latest Reply
GANAPATI_HEGDE
New Contributor III
  • 1 kudos

Unfortunately, only these options are available in my UI. Is this the new UI?

7 More Replies
crami
by New Contributor III
  • 392 Views
  • 1 replies
  • 0 kudos

Resolved! Declarative Pipelines: Can a pipeline or job be deployed with run_as using an asset bundle?

Hi, I have a very interesting scenario. I am trying to use Declarative Pipelines for the first time. The platform team has made workspace artefacts DevOps-based deployments [infra as code], meaning I cannot create compute. I have to create compute with ...

Latest Reply
donna567taylor
New Contributor III
  • 0 kudos

@crami wrote: Hi, I have a very interesting scenario. I am trying to use Declarative Pipelines for the first time. The platform team has made workspace artefacts DevOps-based deployments [infra as code], meaning I cannot create compute. I have to create ...

Alessandro
by New Contributor II
  • 2007 Views
  • 2 replies
  • 0 kudos

Resolved! Update jobs parameter, when running, from API

Hi, when a job is running, I would like to change its parameters with an API call. I know that I can set parameter values from the API when I start a job, and that I can update the default values if the job isn't running, but I didn't find an API c...

Latest Reply
XueChunmei
New Contributor II
  • 0 kudos

Hi Alessandro, I am trying to set job parameter values from the API when I start a job via an API call within a Python notebook. However, it has never succeeded: the job can be triggered, but always with the job parameters' default values instead of the values from the AP...
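One thing worth checking in that situation: with the Jobs 2.1 API, job-level parameters go under the job_parameters field of the run-now payload (notebook_params and friends are the older task-level fields, which don't override job parameters). A sketch of the trigger call, where host, token and job_id are placeholders:

```python
# Sketch: trigger a run with explicit job parameter values via
# POST /api/2.1/jobs/run-now.
import json
import urllib.request

def run_now_payload(job_id, params):
    # Job-level parameters must be passed as `job_parameters`.
    return {"job_id": job_id, "job_parameters": params}

def trigger_run(host, token, job_id, params):
    req = urllib.request.Request(
        f"{host}/api/2.1/jobs/run-now",
        data=json.dumps(run_now_payload(job_id, params)).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```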

1 More Replies
Sakthi0311
by New Contributor
  • 652 Views
  • 2 replies
  • 1 kudos

How to enable Liquid Clustering on an existing Delta Live Table (DLT) and syntax for enabling it

Hi all, I'm working with Delta Live Tables (DLT) and want to enable Liquid Clustering on an existing DLT table that was already created without it. Could someone please clarify: how can I enable Liquid Clustering on an existing DLT table (without recre...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @Sakthi0311, in SQL you can enable LC for materialized views and streaming tables by adding a CLUSTER BY clause to the table definition. If you want to use automatic clustering, use CLUSTER BY AUTO instead.
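For completeness, the SQL shape being described is a CLUSTER BY clause on the streaming table (or materialized view) definition; the table, column and source names below are made up:

```python
# SQL sketches for enabling liquid clustering in a pipeline definition.
ST_EXPLICIT = """
CREATE OR REFRESH STREAMING TABLE orders
CLUSTER BY (order_date)
AS SELECT * FROM STREAM source_table
"""

# Automatic clustering: let Databricks choose the clustering keys.
ST_AUTO = """
CREATE OR REFRESH STREAMING TABLE orders
CLUSTER BY AUTO
AS SELECT * FROM STREAM source_table
"""
```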

1 More Replies
excavator-matt
by Contributor III
  • 1902 Views
  • 6 replies
  • 3 kudos

Resolved! How do you use Databricks Lakeflow Declarative Pipelines on AWS DMS data?

Hi! I am trying to replicate an AWS RDS PostgreSQL database in Databricks. I have successfully managed to enable CDC using AWS DMS, which writes an initial load file and continuous CDC files in Parquet. I have been trying to follow the official guide Repl...

Data Engineering
AUTO CDC
AWS DMS
declarative pipelines
LakeFlow
Latest Reply
mmayorga
Databricks Employee
  • 3 kudos

Hey @excavator-matt, let's remember that the Bronze layer is for raw ingestion; this provides a baseline for auditing and a starting point for the transformations required by the different use cases you need to serve. Systems and their requirements change...

5 More Replies
Adam_Borlase
by New Contributor III
  • 1980 Views
  • 4 replies
  • 5 kudos

Resolved! Quota Limit Exhausted Error when Creating Data Ingestion with SQL Server Connector (Azure)

Good day all, I am having an issue with our first data ingestion pipeline. I want to connect to our Azure SQL Server with our Unity connector (I can access the data in Unity Catalog). When I am on step 3 of the process (Source), when it is scann...

Latest Reply
Adam_Borlase
New Contributor III
  • 5 kudos

Thank you for all of your assistance!

3 More Replies
ghofigjong
by New Contributor
  • 14555 Views
  • 5 replies
  • 3 kudos

Resolved! How does partition pruning work on a merge into statement?

I have a delta table that is partitioned by Year, Date and Month. I'm trying to merge data into it on all three partition columns plus an extra column (an ID). My merge statement is below: MERGE INTO delta.<path of delta table> oldData USING df newData ...

Latest Reply
Umesh_S
New Contributor II
  • 3 kudos

Isn't the suggested idea only filtering the input dataframe (resulting in a smaller amount of data to match across the whole delta table), rather than pruning the delta table down to the relevant partitions to scan?
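That reading matches how pruning on MERGE generally works: the target is only pruned statically when the ON clause contains literal predicates on the partition columns; equality to the source's columns alone doesn't let the planner skip target partitions up front. A sketch with explicit literals added (the path placeholder is from the thread; the values are examples):

```python
# Sketch: merge with literal partition predicates so the target scan
# can be pruned to a single partition before matching.
PRUNED_MERGE_SQL = """
MERGE INTO delta.`<path of delta table>` oldData
USING df newData
ON  oldData.Year  = 2024 AND oldData.Month = 6 AND oldData.Date = 15
AND oldData.Year  = newData.Year
AND oldData.Month = newData.Month
AND oldData.Date  = newData.Date
AND oldData.ID    = newData.ID
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
"""
```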

4 More Replies
yit
by Databricks Partner
  • 821 Views
  • 3 replies
  • 3 kudos

Resolved! Does Auto Loader support loading PDF files?

I need to process PDF files already ingested. Based on the documentation, Auto Loader does not support PDFs, or am I missing something? Also, I've found this sparkPDF library in other discussions in the community, but from what I see it's only for bat...

Latest Reply
yit
Databricks Partner
  • 3 kudos

Any suggestions how to handle PDFs? @szymon_dybczak 
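One option for the ingestion half, at least: Auto Loader can pick PDFs up as raw bytes with the binaryFile format, leaving actual parsing (e.g. a PDF library inside a UDF) as a separate step. A sketch, with a hypothetical path argument:

```python
# Sketch: stream PDFs in as binary with Auto Loader; parsing comes later.
def read_pdfs(spark, path):
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "binaryFile")
        .option("pathGlobFilter", "*.pdf")
        # Yields path, modificationTime, length and content (bytes) columns.
        .load(path)
    )
```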

2 More Replies
Filip
by New Contributor II
  • 8983 Views
  • 7 replies
  • 0 kudos

How to assign a user-assigned managed identity to a DBR cluster so I can use it for querying ADLS Gen2?

Hi, I'm trying to figure out if we can switch from Entra ID SPNs to user-assigned managed identities. Everything works except that I can't figure out how to access the lake files from a Python notebook. I've tried the code below, running it on a ...

Latest Reply
Coffee77
Honored Contributor II
  • 0 kudos

Besides, this only works on dedicated clusters, not on shared ones. Why? No idea at all. In the latest case, IMDS (Instance Metadata Service) is used by Azure to inject a token endpoint inside resources as a unique, secure and valid channel to get tokens ...
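For the original question, the Hadoop ABFS driver does ship a managed-identity token provider; a sketch of the relevant Spark conf keys (storage account, tenant and client IDs are placeholders, and on shared clusters these settings may be blocked, as noted above):

```python
# Sketch: Spark conf keys for ADLS Gen2 access via a user-assigned
# managed identity (Hadoop ABFS OAuth settings).
def msi_conf(account, tenant_id, client_id):
    prefix = "fs.azure.account"
    suffix = f"{account}.dfs.core.windows.net"
    return {
        f"{prefix}.auth.type.{suffix}": "OAuth",
        f"{prefix}.oauth.provider.type.{suffix}":
            "org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider",
        f"{prefix}.oauth2.msi.tenant.{suffix}": tenant_id,
        f"{prefix}.oauth2.client.id.{suffix}": client_id,
    }

# On a cluster:
# for k, v in msi_conf("mylake", "<tenant>", "<msi-client-id>").items():
#     spark.conf.set(k, v)
```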

6 More Replies