Data Engineering

Forum Posts

by dc-rnc • Contributor

05-06-2025 8:53:19 AM

3121 Views
2 replies
2 kudos

Issue pulling Docker Image on Databricks Cluster through Azure Container Registry

Hi Community. Essentially, we're using ACR to push our custom Docker image, and then we would like to pull it to create a Databricks cluster. However, during cluster creation, we got the following error: I'm convinced we tried to authenticate in al...

Latest Reply

mark_ott
Databricks Employee

Friday

2 kudos

You are experiencing an authentication issue when trying to use a custom Docker image from Azure Container Registry (ACR) with Databricks clusters, despite successfully using admin tokens and service principals with acrpull permissions in other conte...

1 More Replies

by jeremy98 • Honored Contributor

04-02-2025 10:50:48 AM

3383 Views
1 reply
0 kudos

Hydra configuration and job parameters of DABs

Hello Community, I'm trying to create a job pipeline in Databricks that runs a spark_python_task, which executes a Python script configured with Hydra. The script's configuration file defines parameters, such as id. How can I pass this parameter at the...

Latest Reply

mark_ott
Databricks Employee

Friday

0 kudos

You can pass and override configuration parameters for Hydra in a Databricks spark_python_task by specifying job-level parameters (as arguments) and using environment variables or Hydra’s command line overrides. For accessing secrets with dbutils.sec...
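As a rough illustration of the override approach mentioned in this reply: Hydra accepts command-line overrides of the form key=value, so the parameters list of a spark_python_task can carry them through to the script. The sketch below only builds that argument list with the standard library. The helper name build_hydra_overrides and the env.name key are hypothetical, and values containing spaces or special characters may need quoting according to Hydra's override grammar.

```python
from typing import List, Mapping


def build_hydra_overrides(params: Mapping[str, object]) -> List[str]:
    """Turn a mapping of config values into Hydra-style ``key=value`` overrides.

    Hypothetical helper for illustration; not part of Hydra or Databricks.
    """
    overrides = []
    for key, value in params.items():
        # Reject keys that could not form a simple ``key=value`` token.
        if not key or "=" in key or " " in key:
            raise ValueError(f"invalid override key: {key!r}")
        # Hydra/YAML booleans are lowercase, unlike Python's True/False.
        if isinstance(value, bool):
            value = str(value).lower()
        overrides.append(f"{key}={value}")
    return overrides


# Example: the list that would be placed in the task's ``parameters`` field.
task_parameters = build_hydra_overrides({"id": 123, "env.name": "dev"})
print(task_parameters)
```

On the script side, a function decorated with @hydra.main reads its overrides from the command line, so cfg.id would resolve to 123 here, assuming the job passes this list as the task's parameters.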

by siddharthsomni • New Contributor

05-13-2025 1:03:45 AM

2576 Views
2 replies
0 kudos

Databricks Bundle Asset - Notebook-based bundling alternative to CLI approach

Hello All - I have a scenario where we want to do the entire bundling and packaging in a notebook to deploy jobs using Databricks Asset Bundles, without using the CLI or VS Code. I didn't find any material or reference that provides insights. Any input would be ...

Latest Reply

mark_ott
Databricks Employee

Friday

0 kudos

Deploying Databricks Asset Bundles entirely from a notebook—without using the CLI or VS Code—is not a standard workflow but can be orchestrated using newer features in the Databricks workspace UI and by leveraging programmatic workspace operations. D...

1 More Replies

by Marcus_S • New Contributor

05-26-2025 7:55:38 AM

2738 Views
1 reply
0 kudos

Change in UNRESOLVED_COLUMN error behavior in Runtime 14.3 LTS

I've noticed a change in how Databricks handles unresolved column references in PySpark when using All-purpose compute (not serverless). In Databricks Runtime 14.3 LTS, referencing a non-existent column like this: df = spark.table('default.example').se...

Latest Reply

mark_ott
Databricks Employee

Friday

0 kudos

Databricks has recently changed how unresolved column references are handled in PySpark on All-purpose compute clusters. In earlier Databricks Runtime (DBR) 14.3 LTS builds, referencing a non-existent column, such as: df = spark.tabl...

by Michał • New Contributor III

09-03-2025 6:41:10 AM

1238 Views
6 replies
3 kudos

how to process a streaming lakeflow declarative pipeline in batches

Hi, I've got a problem and I have run out of ideas as to what else I can try. Maybe you can help? I've got a Delta table with hundreds of millions of records on which I have to perform relatively expensive operations. I'd like to be able to process some...

Latest Reply

Michał
New Contributor III

Friday

3 kudos

thanks @mmayorga

5 More Replies

by deng_dev • New Contributor III

Friday

70 Views
3 replies
1 kudos

Databricks Apps pricing

Hi everyone! I was investigating Databricks Apps as a solution for my task and didn't fully understand the pricing. I have found this page and it indicates it will cost 75$ / DBU for the Premium subscription plan when using AWS cloud. Is it the full cost or will the...

Latest Reply

ManojkMohan
Honored Contributor

Friday

1 kudos

@deng_dev The $75 per DBU Premium subscription plan price for Databricks Apps on AWS shown on the Databricks Apps pricing page reflects the charge from Databricks itself. https://www.databricks.com/product/pricing/databricks-apps However, this is not ...

2 More Replies

by JameDavi_51481 • Contributor

01-17-2024 7:30:00 AM

9867 Views
11 replies
13 kudos

Can we add tags to Unity Catalog through Terraform?

We use Terraform to manage most of our infrastructure, and I would like to extend this to Unity Catalog. However, we are extensive users of tagging to categorize our datasets, and the only programmatic method I can find for adding tags is to use SQL ...

Latest Reply

jlieow
Databricks Employee

Friday

13 kudos

In case anyone comes across this, have a look at databricks_entity_tag_assignment and see if it suits your needs.

10 More Replies

by DataGirl • New Contributor

09-08-2022 5:41:51 PM

16338 Views
7 replies
2 kudos

Multi value parameter on Power BI Paginated / SSRS connected to databricks using ODBC

Hi All, I'm wondering if anyone has had any luck setting up multi-value parameters on SSRS using an ODBC connection to Databricks? I'm getting a "Cannot add multi value query parameter" error every time I change my parameter to multi-value. In the query s...

Latest Reply

kashti123
New Contributor

Friday

2 kudos

Hi, I am also trying to set multi-value parameters using a dynamic SQL expression. However, the report gives an error saying that multi-value parameters are not supported by the data extension. Any help on this would be highly appreciated. Thanks, Drishti

6 More Replies

by kcyugesh • New Contributor

Thursday

67 Views
1 reply
1 kudos

Resolved! Delta live table not showing in workspace (Azure databricks with premium plan)

- I have a premium plan and owner level access

Latest Reply

szymon_dybczak
Esteemed Contributor III

Thursday

1 kudos

Hi @kcyugesh, they changed the name from DLT to Lakeflow Declarative Pipelines, so you won't find the DLT name in the UI. Click Jobs & Pipelines and then ETL pipeline to access the declarative pipeline editor.

by DE5 • New Contributor

Thursday

44 Views
1 reply
1 kudos

Unable to see the Assistant suggested code and current code side by side

Hi, I'm unable to see the Assistant-suggested code and my current code side by side. Previously I was able to see my code and the Assistant-suggested code side by side, which helped me understand the changes. Please suggest if there is any way to do this. T...

Latest Reply

ManojkMohan
Honored Contributor

Thursday

1 kudos

@DE5 Some recent updates moved comparison features into the SQL Editor side panel or rely on “Cell Actions,” where you can generate code or format it and then see differences before applying changes. https://www.databricks.com/blog/introducing-new-sql-...

by Dhruv-22 • Contributor II

2 weeks ago

352 Views
7 replies
6 kudos

Reading empty json file in serverless gives error

I ran a Databricks notebook to do incremental loads from files in the raw layer to bronze layer tables. Today, I encountered a case where the delta file was empty. I tried running it manually on serverless compute and encountered an error. df = spark....

Latest Reply

K_Anudeep
Databricks Employee

a week ago

6 kudos

Hello @Dhruv-22, can you share the schema of the df? Do you have a _corrupt_record column in your dataframe? If yes, where are you getting it from, given that you said it's an empty file? By design, Spark blocks queries that only referen...

6 More Replies

by dbdev • Contributor

Tuesday

151 Views
3 replies
1 kudos

Lakehouse Federation - fetch size parameter for optimization

Hi, we use Lakehouse Federation to connect to a database. A performance recommendation is to use 'fetchSize': Lakehouse Federation performance recommendations - Azure Databricks | Microsoft Learn. SELECT * FROM mySqlCatalog.schema.table WITH ('fetchSiz...

Latest Reply

Louis_Frolio
Databricks Employee

Thursday

1 kudos

Hello @dbdev, I did some digging and here are some suggestions. The `fetchSize` parameter in Lakehouse Federation is currently only available through SQL syntax using the `WITH` clause, as documented in the performance recommendations. Unfortunately...

2 More Replies

by databricksero • New Contributor II

Thursday

126 Views
2 replies
3 kudos

Resolved! Databricks Bundle Validation Error After CLI Upgrade (0.274.0 → 0.276.0)

After upgrading the Databricks CLI from version 0.274.0 to 0.276.0, bundle validation is failing with an error indicating that my configuration is formatted for "open-source Spark Declarative Pipelines" while the CLI now only supports "Lakeflow Decla...

Latest Reply

szymon_dybczak
Esteemed Contributor III

Thursday

3 kudos

Hi @databricksero, it's a bug. I've checked, and the PR fixing it has already been merged to the main branch. Check the GitHub thread below, and once a new release is built, just update the Databricks CLI (a version without the bug should be released soon). Fix oss...

1 More Replies

by Y_WANG • New Contributor

Thursday

72 Views
1 reply
0 kudos

Want to use DataFrame equality functions but also Numpy >= 2.0

In my team, we have a lot of data science workflows using Spark and Pandas. To ensure the stability of these workflows, we need to implement unit tests. Recently, I found out about the DataFrame equality test functions introduced in Spark 3.5 which se...

Latest Reply

ManojkMohan
Honored Contributor

Thursday

0 kudos

@Y_WANG The AttributeError you face when importing assertDataFrameEqual from pyspark.testing in Spark 3.5 is caused by Spark's code using the deprecated np.NaN attribute, which was removed in NumPy 2.0 (replaced by np.nan). This break...
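One workaround sometimes used for this kind of breakage, sketched here under the assumption that the missing np.NaN name is the only incompatibility, is to restore the alias before importing pyspark.testing. This is a monkeypatch and not necessarily what the reply above recommends (the excerpt is truncated). The helper name restore_legacy_nan_alias is hypothetical, and the stand-in object is there only so the sketch runs without NumPy installed.

```python
import math
from types import SimpleNamespace


def restore_legacy_nan_alias(module) -> bool:
    """Add a ``NaN`` attribute pointing at ``module.nan`` if it is missing.

    Returns True if the alias was added, False if it already existed.
    Hypothetical helper for illustration; not part of NumPy or PySpark.
    """
    if hasattr(module, "NaN"):
        return False
    module.NaN = module.nan
    return True


# Stand-in for a NumPy >= 2.0 module, which exposes ``nan`` but not ``NaN``.
fake_numpy = SimpleNamespace(nan=float("nan"))
added = restore_legacy_nan_alias(fake_numpy)
print(added, math.isnan(fake_numpy.NaN))
```

In a real test module the call would be restore_legacy_nan_alias(np) after import numpy as np and before from pyspark.testing import assertDataFrameEqual. Pinning NumPy below 2.0, or moving to a PySpark release that no longer references np.NaN if one is available for your runtime, avoids the patch altogether.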

by der • Contributor II

Tuesday

161 Views
5 replies
1 kudos

EXCEL_DATA_SOURCE_NOT_ENABLED Excel data source is not enabled in this cluster

I want to read an Excel xlsx file on DBR 17.3. On the cluster, the library dev.mauch:spark-excel_2.13:4.0.0_0.31.2 is installed. The V1 implementation works fine: df = spark.read.format("dev.mauch.spark.excel").schema(schema).load(excel_file) display(df) V2...

Latest Reply

mmayorga
Databricks Employee

Thursday

1 kudos

Hi @der, first of all, thank you for your patience and for providing more information about your case. Use of ".format("excel")": I replicated your cluster config in Azure. Without installing any library, I was able to run and load the xlsx fil...

4 More Replies
