Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

yzhang
by Contributor
  • 3116 Views
  • 10 replies
  • 3 kudos

iceberg with partitionedBy option

I am able to create a Unity Catalog iceberg-format table: df.writeTo(full_table_name).using("iceberg").create(). However, if I add the partitionedBy option, I get an error: df.writeTo(full_table_name).using("iceberg").partitionedBy("ingest_dat...

Latest Reply
LazyGenius
New Contributor III
  • 3 kudos

@Sanjeeb2024 If your question is for me, then I would say it depends on the use case. If you have very large data to be ingested into the table, you would prefer creating the table first and then ingesting data into it using simultaneous jobs.

9 More Replies
Ajay-Pandey
by Databricks MVP
  • 4053 Views
  • 9 replies
  • 2 kudos

Databricks Job cluster for continuous run

Hi All, I have a situation where I want to run a job with a continuous trigger using a job cluster, but the cluster terminates and is re-created on every run within the continuous trigger. I just wanted to know if we have any option where I can use the same job cluster...

Latest Reply
mukul1409
Contributor II
  • 2 kudos

Hi @Ajay-Pandey, the only solution for you: 1. Create an all-purpose cluster called, for example, continuous-job-cluster, and disable auto-termination or set it to a large value. 2. Configure the job to use existing_cluster_id. In the Jobs UI or DAB YAML: exi...
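The reply's two steps could look roughly like this in a Databricks Asset Bundle job definition (a sketch: the job name, cluster-ID variable, and notebook path are placeholders, not from the thread):

```yaml
# Hypothetical DAB snippet; existing_cluster_id points at the
# all-purpose cluster from step 1 (auto-termination disabled),
# so the continuous trigger reuses it instead of re-creating
# a job cluster on every run.
resources:
  jobs:
    continuous_job:
      name: continuous-job
      continuous:
        pause_status: UNPAUSED
      tasks:
        - task_key: main
          existing_cluster_id: ${var.continuous_job_cluster_id}
          notebook_task:
            notebook_path: ./notebooks/main.py
```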

8 More Replies
parth_db
by New Contributor III
  • 980 Views
  • 5 replies
  • 7 kudos

Resolved! AutoLoader Type Widening

I have a few doubts regarding AutoLoader behavior and capabilities. Please check and correct wherever my assumptions or understanding are incorrect; much appreciated. Below is my specific code. Example scenario: Target Managed Delta Table (Type Widenin...

Latest Reply
Sanjeeb2024
Valued Contributor
  • 7 kudos

Thank you @nayan_wylde for the details. This is really useful.

4 More Replies
vamsi_simbus
by Databricks Partner
  • 840 Views
  • 2 replies
  • 3 kudos

Resolved! Databricks Apps - Auto Terminate Option

Hi Everyone, I'm exploring Databricks Apps and have two questions: Is there a way to automatically terminate an app after a certain period of inactivity? Does Databricks provide any scheduling mechanism for apps, similar to how Jobs can be scheduled? Any...

Latest Reply
Sanjeeb2024
Valued Contributor
  • 3 kudos

Hi @vamsi_simbus - One option you can explore is starting and stopping apps using the Databricks API. Have a look at the document link below: https://docs.databricks.com/api/workspace/apps/stop

1 More Replies
slangenborg
by Databricks Partner
  • 536 Views
  • 3 replies
  • 1 kudos

Resolved! DAB Job - Serverless Cluster using configured base environment

I have configured a base serverless environment for my workspace that includes libraries from a private repository. This base environment has been set to default, and it behaves as expected when running notebooks manually in the workspace with Serverless ...

Latest Reply
mukul1409
Contributor II
  • 1 kudos

Hi @slangenborg, according to the official Databricks Jobs REST API documentation, notebook tasks use the notebook environment only implicitly when no environment_key is provided. The API lets you explicitly configure environments only via an environ...

2 More Replies
tonkol
by New Contributor II
  • 341 Views
  • 1 reply
  • 0 kudos

Migrate on-premise delta tables to Databricks (Azure)

Hi There, I have the situation that we've decided to migrate our on-premise delta lake to Azure Databricks. Because of networking, I can only "push" the data from on-prem to the cloud. What would be the best way to replicate all tables: schema + partitioning i...

Latest Reply
mukul1409
Contributor II
  • 0 kudos

The correct solution is not SQL-based. Delta tables are defined by the contents of the delta log directory, not by CREATE TABLE statements. That is why SHOW CREATE TABLE cannot reconstruct partitions, properties, or constraints. The only reliable migrat...
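A minimal local illustration of the point above (hypothetical paths; in practice the copy would go to cloud storage with a tool such as azcopy): because the table definition lives in _delta_log next to the data files, copying the entire table directory carries schema and partitioning with it.

```python
import os
import shutil
import tempfile

# Build a stand-in Delta table directory: data files plus _delta_log.
src = tempfile.mkdtemp()
os.makedirs(os.path.join(src, "_delta_log"))
open(os.path.join(src, "_delta_log", "00000000000000000000.json"), "w").close()
open(os.path.join(src, "part-00000.snappy.parquet"), "w").close()

# Copying the whole directory replicates the log, and with it the
# table's schema, partitioning, properties, and constraints.
dst = os.path.join(tempfile.mkdtemp(), "table")
shutil.copytree(src, dst)
```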

dikla
by New Contributor II
  • 877 Views
  • 4 replies
  • 1 kudos

Resolved! Issues Creating Genie Space via API Join Specs Are Not Persisted

Hi, I'm experimenting with the new API to create a Genie Space. I'm able to successfully create the space, but the join definitions are not created, even though I'm passing a join_specs object in the same format returned by GET /spaces/{id} for an exis...

Latest Reply
mtaran
Databricks Employee
  • 1 kudos

The serialized space JSON is incorrect. It has `join_specs` and `sql_snippets` nested under `data_sources`, but they should be nested under `instructions` instead. There they apply as expected.
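Per the reply, the corrected serialized-space shape would be (a sketch showing only the nesting; all other fields are omitted):

```json
{
  "data_sources": {},
  "instructions": {
    "join_specs": [],
    "sql_snippets": []
  }
}
```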

3 More Replies
Maxrb
by New Contributor III
  • 526 Views
  • 1 reply
  • 1 kudos

Resolved! Import functions in databricks asset bundles using source: WORKSPACE

Hi, we are using Databricks asset bundles, and we create functions which we import in notebooks, for instance: from utils import helpers, where utils is just a folder in our root. When running this with source: WORKSPACE, it will fail to resolve the impo...

Latest Reply
iyashk-DB
Databricks Employee
  • 1 kudos

In Git folders, the repo root is auto-added to the Python path, so imports like from utils import helpers work, while in workspace folders, only the notebook’s directory is on the path, which is why it breaks. The quick fix is a tiny bootstrap that a...
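The bootstrap mentioned in the reply might look like this (an assumption about its shape; how the bundle root relates to the notebook's working directory varies by deployment):

```python
import os
import sys

# Hypothetical bootstrap cell: put the bundle root on sys.path so
# `from utils import helpers` resolves in workspace folders too,
# not only in Git folders where the repo root is added automatically.
bundle_root = os.path.abspath(os.path.join(os.getcwd(), ".."))
if bundle_root not in sys.path:
    sys.path.insert(0, bundle_root)
```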

ramsai
by New Contributor II
  • 659 Views
  • 3 replies
  • 3 kudos

Resolved! Serverless Compute Access Restriction Not Supported at User Level

The requirement is to disable serverless compute access for specific users while allowing them to use only their assigned clusters, without restricting serverless compute at the workspace level. After reviewing the available configuration options, th...

Latest Reply
Masood_Joukar
Contributor
  • 3 kudos

Hi @ramsai, how about a workaround? Set budget policies at the account level: Attribute usage with serverless budget policies | Databricks on AWS

2 More Replies
RyanHager
by Contributor
  • 725 Views
  • 2 replies
  • 2 kudos

Resolved! Liquid Clustering and S3 Performance

Are there any performance concerns when using liquid clustering with AWS S3? I believe all the parquet files go in the same folder (prefix, in AWS S3 terms) versus folders per partition when using "partition by". And there is this note on S3 performa...

Latest Reply
iyashk-DB
Databricks Employee
  • 2 kudos

Even though liquid clustering removes Hive-style partition folders, it typically doesn’t cause S3 prefix performance issues on Databricks. Delta tables don’t rely on directory listing for reads; they use the transaction log to locate exact files. In ...

1 More Replies
EdemSeitkh
by New Contributor III
  • 9435 Views
  • 6 replies
  • 0 kudos

Resolved! Pass catalog/schema/table name as a parameter to sql task

Hi, I am trying to pass a catalog name as a parameter into a query for a SQL task, and it pastes it with single quotes, which results in an error. Is there a way to pass a raw value, or are there other possible workarounds? Query: INSERT INTO {{ catalog }}.pas.product_snap...

Latest Reply
detom
New Contributor II
  • 0 kudos

This works: USE CATALOG IDENTIFIER({{ catalog_name }}); USE SCHEMA IDENTIFIER({{ schema_name }});
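Why this works: SQL task parameters are substituted as quoted string literals, which is invalid where an identifier is expected; IDENTIFIER() turns the literal back into a name. A tiny pure-Python model of the substitution (an illustration, not the actual SQL task engine):

```python
def substitute(template, params):
    # Model: SQL task parameters arrive as quoted string literals.
    for name, value in params.items():
        template = template.replace("{{ " + name + " }}", "'" + value + "'")
    return template

# A quoted literal where an identifier belongs is a syntax error:
bad = substitute("INSERT INTO {{ catalog }}.pas.t SELECT 1", {"catalog": "main"})
# IDENTIFIER() accepts the literal and resolves it as a name:
ok = substitute("USE CATALOG IDENTIFIER({{ catalog }})", {"catalog": "main"})
```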

5 More Replies
Gilad-Shai
by New Contributor III
  • 1068 Views
  • 12 replies
  • 12 kudos

Resolved! Creating Serverless Cluster

Hi everyone, I am trying to create a cluster in Databricks Free Edition, but I keep getting the following error: "Cannot create serverless cluster, please try again later." I have attempted this on different days and at different times, but the issue pe...

Latest Reply
Gilad-Shai
New Contributor III
  • 12 kudos

Thank you all (@Sanjeeb2024, @JAHNAVI, @Manoj12421), it works! It was not a Databricks Free Edition, as @Masood_Joukar said.

11 More Replies
Sainath368
by Contributor
  • 557 Views
  • 4 replies
  • 2 kudos

Migrating from directory-listing to Autoloader Managed File events

Hi everyone, we are currently migrating from a directory-listing-based streaming approach to managed file events in Databricks Auto Loader for processing our data in structured streaming. We have a function that handles structured streaming where we ar...

Latest Reply
Raman_Unifeye
Honored Contributor III
  • 2 kudos

Yes, for your setup, Databricks Auto Loader will create a separate event queue for each independent stream running with the cloudFiles.useManagedFileEvents = true option. As you are running 1 stream per table, 1 unique directory per stream, and 1 uni...

3 More Replies
halsgbs
by New Contributor III
  • 389 Views
  • 3 replies
  • 2 kudos

Alerts V2 Parameters

Hi, I'm working on using the Databricks Python SDK to create an alert from a notebook, but it seems that with V1 there is no way to add subscribers, and with V2 there is no option for adding parameters. Is my understanding correct, or am I missing something? A...

Latest Reply
iyashk-DB
Databricks Employee
  • 2 kudos

Alerts V2 (Public Preview) do not support query parameters yet. This is a documented limitation. Legacy alerts (V1) do support parameters and will use the default values defined in the SQL editor. For notifications, both legacy alerts and Alerts V2 a...

2 More Replies
lziolkow2
by Databricks Partner
  • 1016 Views
  • 4 replies
  • 5 kudos

Resolved! Strange DELTA_MULTIPLE_SOURCE_ROW_MATCHING_TARGET_ROW_IN_MERGE error

I use the Databricks 17.3 runtime. I try to run the following code: CREATE OR REPLACE TABLE default.target_table (key1 INT, key2 INT, key3 INT, val STRING) USING DELTA; INSERT INTO target_table(key1, key2, key3, val) VALUES (1, 1, 1, 'a'); CREATE OR REPLACE TABLE de...

Latest Reply
emma_s
Databricks Employee
  • 5 kudos

Hi, you need to put all of the keys in the ON part of the clause rather than in the WHERE condition. This code works: MERGE INTO target_table AS target USING source_table AS source ON target.key1 = source.key1 AND target.key2 = source.key2 AND target...
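To see where the error comes from, here is a tiny pure-Python model of MERGE matching (an illustration with made-up rows, not Spark's implementation): when only part of the key is in ON, several source rows can match the same target row, which Delta rejects.

```python
def matching_source_rows(target_row, source_rows, on_keys):
    """Source rows that match a target row on the given join keys."""
    return [s for s in source_rows
            if all(s[k] == target_row[k] for k in on_keys)]

target = {"key1": 1, "key2": 1, "key3": 1, "val": "a"}
source = [
    {"key1": 1, "key2": 1, "key3": 1, "val": "b"},
    {"key1": 1, "key2": 1, "key3": 2, "val": "c"},
]

# ON only key1: two source rows match one target row, which is the
# DELTA_MULTIPLE_SOURCE_ROW_MATCHING_TARGET_ROW_IN_MERGE situation.
ambiguous = matching_source_rows(target, source, ["key1"])
# ON all three keys: exactly one match, so the MERGE is well-defined.
unique = matching_source_rows(target, source, ["key1", "key2", "key3"])
```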

3 More Replies