Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

asrivas
by New Contributor II
  • 1266 Views
  • 3 replies
  • 0 kudos

Azure Databricks – Lakehouse Federation MySQL Connection Fails but Works in Notebook

I am trying to set up a Lakehouse Federation connection to an Azure MySQL database. When I connect from a Databricks notebook using Python (mysql.connector) on the same cluster, it works fine. But when I set up the Lakehouse Federation connection and test i...

Latest Reply
WiliamRosa
Databricks Partner
  • 0 kudos

Hi @asrivas, I've been trying to simulate this on my side, and in my case I was able to complete the connection, but I believe in your case the issue comes from the MySQL setting --require_secure_transport=ON. In the notebook it works because the dri...
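One way to confirm this theory from the same notebook is to attempt a connection with SSL explicitly disabled: if the server enforces --require_secure_transport=ON, the plain connection is rejected while the default (TLS-negotiating) one succeeds. A hedged sketch; the host and credentials are placeholders:

```python
import mysql.connector
from mysql.connector import Error

# Placeholders - substitute your Azure MySQL host and credentials
HOST = "myserver.mysql.database.azure.com"
USER = "admin_user"
PASSWORD = "<password>"

try:
    # Force a non-TLS connection; this fails if require_secure_transport=ON
    conn = mysql.connector.connect(
        host=HOST, user=USER, password=PASSWORD, ssl_disabled=True
    )
    print("plain TCP accepted -> secure transport is NOT enforced")
    conn.close()
except Error as e:
    print(f"plain TCP rejected ({e}) -> server likely enforces SSL/TLS")

# mysql.connector negotiates TLS by default, which is why the notebook test passes
conn = mysql.connector.connect(host=HOST, user=USER, password=PASSWORD)
conn.close()
```

This only diagnoses the server side; whether the Federation connection can be made to satisfy the TLS requirement depends on the connection options available in your workspace.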

2 More Replies
jeremy98
by Honored Contributor
  • 2442 Views
  • 7 replies
  • 10 kudos

Resolved! How to overwrite a job parameter inside a job task

Hi community, how can I overwrite a job parameter inside a job task? It seems that the job parameter has a higher priority than the task parameter, even when the task parameter tries to override it.

Latest Reply
jeremy98
Honored Contributor
  • 10 kudos

Hi Pilsner, thanks for your response. The issue is that I need to know the value beforehand. In this case, we need to set the task values inside a notebook, for example. I want to be able to set it as a task value; I don't think Databricks provides that.
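For reference, the pattern being discussed (setting a value in one notebook task and reading it in a downstream task of the same job run) looks roughly like this with task values; the task key and value names here are made up:

```python
# In the upstream notebook task (task key: "prepare"):
dbutils.jobs.taskValues.set(key="run_mode", value="incremental")

# In a downstream task of the same job run:
run_mode = dbutils.jobs.taskValues.get(
    taskKey="prepare",      # task that set the value
    key="run_mode",
    default="full",         # used if the upstream task did not set the key
    debugValue="full",      # used when running the notebook interactively
)
print(run_mode)
```

Note that task values are a separate mechanism from job/task parameters, which is the limitation jeremy98 is pointing out: they do not change the resolution order of job parameters themselves.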

6 More Replies
km1837
by Databricks Partner
  • 850 Views
  • 1 replies
  • 0 kudos

DLT Pipeline from Streaming Table

Hi, I have a bronze table with Product_id, *, start_at, end_at, which is a streaming SCD Type 2 table, meaning any change in product_attributes inserts a new row with end_at as null. So when we take this table with end_at as null, the tabl...

Latest Reply
ilir_nuredini
Honored Contributor
  • 0 kudos

Hi @km1837, instead of trying to implement a streaming table on top of a streaming table, I think a materialized view on the next child table would be the best choice for your use case. For example: @dlt.table(name="workspace.silver.current_product") def sample_trips...
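The suggestion above, sketched out with illustrative table and column names: a @dlt.table that reads the bronze SCD2 table as a batch source (rather than a stream) is maintained by the pipeline as a materialized view, recomputed from the current rows on each update:

```python
import dlt
from pyspark.sql import functions as F

@dlt.table(
    name="current_product",
    comment="Current (open) SCD2 rows from the bronze table",
)
def current_product():
    # Batch read (spark.table, not a stream) -> DLT maintains this
    # as a materialized view over the bronze SCD2 history
    return (
        spark.table("bronze.product_scd2")        # placeholder source table
        .filter(F.col("end_at").isNull())          # keep only the open version
    )
```

This sidesteps the problem of streaming from a source that receives updates, at the cost of recomputation on refresh.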

ganapati
by New Contributor III
  • 1730 Views
  • 9 replies
  • 3 kudos

Resolved! Issue updating DLT pipeline configurations using the Databricks SDK

I am updating DLT pipeline configs with the job_id, run_id, and run_datetime of the job, so that I can access these values inside the DLT pipeline. Below is the code I am using to do that: # Databricks notebook source import sys import logging from databricks....
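For context, the general shape of pushing job context into a pipeline's configuration with the Python SDK looks roughly like this (the pipeline ID and config keys are placeholders; note that update replaces the pipeline settings, so existing spec fields should be carried over in real use):

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
pipeline_id = "<pipeline-id>"  # placeholder

# Read the current settings so existing configuration keys are not dropped
current = w.pipelines.get(pipeline_id=pipeline_id)
config = dict(current.spec.configuration or {})
config.update({
    "job_id": "123",                          # illustrative values
    "run_id": "456",
    "run_datetime": "2025-01-01T00:00:00Z",
})

# Caution: update replaces the pipeline spec; in practice you would pass the
# other fields from current.spec (name, libraries, clusters, ...) as well.
w.pipelines.update(pipeline_id=pipeline_id, configuration=config)
```

Inside the pipeline, these keys are then readable via spark.conf.get("job_id") and so on.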

Latest Reply
ganapati
New Contributor III
  • 3 kudos

Hi, just tested it out and it works! Thanks again for helping out.

8 More Replies
liu
by Databricks Partner
  • 696 Views
  • 2 replies
  • 1 kudos

Can Databricks serverless compute install Scala packages?

I need to use the spark-sftp package, but it seems that serverless is different from all-purpose compute, and I can only install Python packages? There is another question: I can use p...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

No Scala; you can't even run Scala notebooks. About the SFTP: serverless compute is much more limited than general-purpose clusters. Which folder can't be found, DBFS or S3?

1 More Replies
dbx_user
by New Contributor II
  • 2932 Views
  • 8 replies
  • 0 kudos

Intermittent error: "Command failed because warehouse <<warehouse id>> was stopped."

The error "Command failed because warehouse <<warehouse id>> was stopped." has started popping up during deployment runs. Sometimes the error correlates with the serverless warehouse cluster count dropping to zero while a query is running, sometimes it ...

Latest Reply
ADbksUser
New Contributor II
  • 0 kudos

Hey all, having the same issue here. Just doing some development work connected to a serverless SQL warehouse from dbt. Suddenly getting the error "Command failed because warehouse <warehouse_id> was stopped." Nothing's changed between those runs.

7 More Replies
tabinashabir
by New Contributor II
  • 1673 Views
  • 5 replies
  • 3 kudos

AutoLoader options includeExistingFiles and modifiedAfter not working

I'm using this code to read data from an ADLS Gen2 location. There are txt files present in sub-folders in the container. df_stream = spark.readStream \ .format("cloudFiles") \ .option("cloudFiles.format", "text") \ .optio...

Latest Reply
ManojkMohan
Honored Contributor II
  • 3 kudos

Root cause: includeExistingFiles is only evaluated the first time the stream is started with a fresh checkpoint. If the stream is restarted or the checkpoint folder is reused, changing this option will have no effect on subsequent runs; old files previ...
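The point above in code form (paths and table names are placeholders): cloudFiles options such as includeExistingFiles are captured when the stream first starts against a checkpoint, so changing them requires pointing the stream at a new checkpoint location:

```python
# Options like cloudFiles.includeExistingFiles are locked in at the first start
# of a stream against a given checkpoint; to change them, use a NEW checkpoint
# directory (or delete the old one, accepting possible reprocessing).
df_stream = (
    spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "text")
    .option("cloudFiles.includeExistingFiles", "false")
    # only consider files modified after this timestamp (UTC)
    .option("modifiedAfter", "2025-01-01T00:00:00.000000Z")
    .load("abfss://container@account.dfs.core.windows.net/input/")  # placeholder
)

(
    df_stream.writeStream
    # fresh checkpoint path so the new options take effect
    .option("checkpointLocation", "/Volumes/cat/sch/vol/checkpoints/txt_v2")
    .toTable("bronze.raw_text")  # placeholder target table
)
```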

4 More Replies
ManojkMohan
by Honored Contributor II
  • 2459 Views
  • 5 replies
  • 5 kudos

Resolved! Extracting PDFs and using AI queries | best practices

Problem I am solving: upload a PDF → available in /Volumes/<catalog>/<schema>/<volume>/; extract text with pdfplumber (or OCR if scanned); store in a Delta table for governance; parse intelligently using ai_query() with Databricks LLMs for flexible JSON outp...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 5 kudos

Hi @ManojkMohan, maybe you're using the wrong endpoint name. Try databricks-meta-llama-3-3-70b-instruct. In your case you're calling an endpoint named databricks-meta-llama-3-70b-instruct, which I guess has a small typo.
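The fix is just the extra "3" in the model name (Llama 3.3 vs. Llama 3); a minimal sketch of the corrected call, with a made-up prompt:

```python
# Wrong:  databricks-meta-llama-3-70b-instruct    (missing the second "3")
# Right:  databricks-meta-llama-3-3-70b-instruct  (Meta Llama 3.3 70B)
result = spark.sql("""
    SELECT ai_query(
        'databricks-meta-llama-3-3-70b-instruct',
        'Summarize this contract clause in one sentence: ...'
    ) AS summary
""")
result.show(truncate=False)
```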

4 More Replies
cchiaramelli
by Databricks Partner
  • 723 Views
  • 3 replies
  • 5 kudos

Resolved! Unable to Delete Failed Databricks Job VMs in Azure

My job compute had trouble starting the cluster, reporting "Unexpected failure while waiting for the cluster (xxxx) to be ready: Cluster 'xxxx' is unhealthy". After multiple retries, a new error message appeared: "Operation could not be completed as it...

Latest Reply
cchiaramelli
Databricks Partner
  • 5 kudos

UPDATE: Before opening the support ticket, the machines suddenly disappeared. I deleted the job definitions along with their job cluster definitions, and maybe that solved it, or the machines were cleaned up after a few hours. Not sure what cleaned it up. Also I n...

2 More Replies
TechExplorer
by New Contributor II
  • 2448 Views
  • 3 replies
  • 1 kudos

Resolved! Unable to unpack or read rar file

Hi everyone, I'm encountering an issue with the following code when trying to unpack or read a RAR file in Databricks: with rarfile.RarFile(s3_path) as rf: for file_info in rf.infolist(): with rf.open(file_info) as file: file_c...

Latest Reply
Upendra_Dwivedi
Databricks Partner
  • 1 kudos

Hi @Walter_C, I am also using the unrar utility, but the problem is that it is proprietary software; I am working for a client, and its license could cause issues. What is the alternative to unrar, so that we eliminate any legal-compliance risk?

2 More Replies
Datalight
by Contributor
  • 1757 Views
  • 5 replies
  • 1 kudos

Resolved! How to design Airship Integration with Azure Databricks

Hello, I have to push data from Airship and persist it to Delta tables. I think we can use SFTP. Could someone please help me design the inbound part, using SFTP on the Airship end to push files to ADLS Gen2? Networking and security consideratio...

Latest Reply
ManojkMohan
Honored Contributor II
  • 1 kudos

Inbound flow design: enable SFTP on the ADLS Gen2 (or Azure Blob Storage) account; generate and register an SSH public/private key pair with Airship; enter your SFTP endpoint credentials (username, host, port, key) in Airship's settings to authenticate ...

4 More Replies
Khaja_Zaffer
by Esteemed Contributor
  • 2270 Views
  • 10 replies
  • 5 kudos

Resolved! CONTAINER_LAUNCH_FAILURE

Hello everyone! I need some help; I'm unable to get a cluster up and running. I tried creating classic compute, but it fails. Is there any limit on using Databricks Community Edition? Error here: { "reason": { "code": "CONTAINER_LAUNCH_FAILURE", "type...

Latest Reply
Khaja_Zaffer
Esteemed Contributor
  • 5 kudos

To all: legacy Community Edition works fine if you use DBR <= 15.4 for both general and ML modes. I think legacy Community Edition is still far better than Free Edition. I was selecting DBR > 15.4. Thank you.

9 More Replies
SiarheiSintsou
by New Contributor
  • 557 Views
  • 2 replies
  • 0 kudos

Serverless performance_target option is not available for one-time jobs

Why is this option https://docs.databricks.com/api/workspace/jobs/create#performance_target not available for one-time runs (https://docs.databricks.com/api/workspace/jobs/submit)?

Latest Reply
Advika
Community Manager
  • 0 kudos

Hello @SiarheiSintsou! The performance_target isn’t currently supported in the SubmitRun API. However, it would be helpful if you could submit a feature request here.

1 More Replies
yvishal519
by Contributor
  • 3940 Views
  • 2 replies
  • 0 kudos

Identifying Full Refresh vs. Incremental Runs in Delta Live Tables

Hello Community, I am working with a Delta Live Tables (DLT) pipeline that primarily operates in incremental mode. However, there are specific scenarios where I need to perform a full refresh of the pipeline. I am looking for an efficient and reliable...

Latest Reply
Takuya-Omi
Valued Contributor III
  • 0 kudos

Hello, there are two ways to determine whether a DLT pipeline is running in full-refresh or incremental mode. DLT event log schema: the details column in the DLT event log includes information on "full_refresh". You can use this to identify whethe...
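The event-log option can be queried roughly like this; a hedged sketch assuming the event_log() table-valued function and a placeholder table name (the exact JSON path under details may vary by DLT release, so verify against your own event log):

```python
# Inspect whether pipeline updates were full refreshes by reading the DLT
# event log; 'create_update' events carry the full_refresh flag in `details`.
df = spark.sql("""
    SELECT
        timestamp,
        details:create_update.full_refresh::boolean AS full_refresh
    FROM event_log(TABLE(my_catalog.my_schema.my_streaming_table))  -- placeholder
    WHERE event_type = 'create_update'
    ORDER BY timestamp DESC
""")
df.show()
```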

1 More Replies
zyang
by Contributor II
  • 1176 Views
  • 2 replies
  • 0 kudos

Resolved! ModuleNotFoundError: No module named 'databricks.sdk.service.database'

Hi, the module from https://learn.microsoft.com/en-gb/azure/databricks/oltp/sync-data/sync-table?source=docs#python-sdk cannot be found. The cluster is as in the screenshot and the code is from the docs. Best regards,

Latest Reply
WiliamRosa
Databricks Partner
  • 0 kudos

The current version is the following: 
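A likely explanation (an assumption based on the module name, not confirmed in the thread): databricks.sdk.service.database only ships in recent databricks-sdk releases, so an older SDK pinned on the cluster raises ModuleNotFoundError. Upgrading the library and re-testing the import may resolve it:

```shell
# Upgrade the SDK on the cluster (or use %pip inside the notebook),
# then verify the submodule is importable.
pip install --upgrade databricks-sdk
python -c "import databricks.sdk.service.database; print('module found')"
```

Check the databricks-sdk release notes to confirm the minimum version that includes the database service.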

1 More Replies