Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

5UDO
by New Contributor II
  • 2975 Views
  • 6 replies
  • 4 kudos

Databricks warehouse table optimization

Hi everyone, I just started using Databricks and wanted to evaluate read speeds when using the Databricks warehouse. So I've generated a dataset of 100M records, which contains name, surname, date of birth, phone number, and an address. Dat...

Latest Reply
5UDO
New Contributor II
  • 4 kudos

Hi Brahmareddy and AndrewN, thank you for your answers. I first need to apologize, as I mistakenly wrote that I got 270 ms by hashing the date of birth, surname, and name and then using Z-ordering. I actually achieved around 290 ms with hashing...
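The composite-hash approach discussed in this thread can be sketched in plain Python. The field names and the SHA-256 choice are illustrative assumptions, not the poster's exact code:

```python
import hashlib

def hash_key(name: str, surname: str, dob: str) -> str:
    """Derive a deterministic hash from the identity fields.

    Such a derived key can serve as a single high-cardinality
    ZORDER/clustering column instead of three separate ones.
    """
    raw = f"{name}|{surname}|{dob}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()

# Same inputs always produce the same 64-character hex key.
key = hash_key("Ada", "Lovelace", "1815-12-10")
```

The delimiter between fields avoids collisions between, e.g., ("ab", "c") and ("a", "bc").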

5 More Replies
jtjohnson
by New Contributor II
  • 1778 Views
  • 4 replies
  • 0 kudos

API Definition File

Hello. We are in the process of setting up Azure APIM in front of the Databricks REST APIs. Is there an official definition file available for download? Any help would be greatly appreciated.

Latest Reply
jtjohnson
New Contributor II
  • 0 kudos

Thank you for the feedback. The Postman collection would be ideal, but the link is no longer active.

3 More Replies
harika5991
by New Contributor II
  • 1652 Views
  • 1 reply
  • 0 kudos

Unable to create a metastore for Unity Catalog as I don't have Account Admin rights

Hello guys, I just started learning Databricks. I created a Databricks workspace via the Azure Portal using the Trial (Premium - 14-Days Free DBUs) plan. The workspace name is `easewithdata-adb`. However, I do not currently see the option to create a Un...

Latest Reply
lingareddy_Alva
Esteemed Contributor
  • 0 kudos

Hi @harika5991, you're right about the root cause of your issue. Creating a Unity Catalog metastore requires Account Admin privileges, which is separate from just creating a workspace in Azure. These are options you can try: When you create a Databricks...

Louis_Frolio
by Databricks Employee
  • 7890 Views
  • 4 replies
  • 4 kudos

Resolved! What are your most impactful use cases for schema evolution in Databricks?

  Data Engineers, Share Your Experiences with Delta Lake Schema Evolution! We're calling on all data engineers to share their experiences with the powerful schema evolution feature in Delta Lake. This feature allows for seamless adaptation to changin...

Latest Reply
Louis_Frolio
Databricks Employee
  • 4 kudos

Outstanding!

3 More Replies
flashmav
by New Contributor II
  • 1143 Views
  • 1 reply
  • 0 kudos

Resolved! ConcurrentDeleteDeleteException in liquid cluster table

I am doing a merge into a table in parallel via 2 jobs. The table is a liquid clustered table with the following properties: delta.enableChangeDataFeed=true, delta.enableDeletionVectors=true, delta.enableRowTracking=true, delta.feature.changeDataFeed=supported...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

Hey @flashmav, keep in mind that operations in Delta Lake often occur at the file level rather than the row level. For example, if two sessions attempt to update data in the same file (even if they’re not updating the same row), you may encounter a...
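A common mitigation for file-level conflicts like this is to retry the merge with backoff. The sketch below is plain Python with a simulated merge; the exception class is a hypothetical stand-in, since in a real Databricks job the conflict error comes from the Delta runtime:

```python
import random
import time

class ConcurrentDeleteDeleteException(Exception):
    """Hypothetical stand-in for Delta's concurrent-modification error."""

def merge_with_retry(merge_fn, max_attempts=5, base_delay=0.01):
    """Retry a merge when two writers touch the same files.

    Delta resolves conflicts at the file level, so two jobs merging into
    the same table can collide even when they touch different rows;
    retrying with exponential backoff is the usual mitigation.
    """
    for attempt in range(max_attempts):
        try:
            return merge_fn()
        except ConcurrentDeleteDeleteException:
            if attempt == max_attempts - 1:
                raise
            # Exponential backoff with jitter before retrying the merge.
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))

# Simulated merge that fails twice before succeeding.
attempts = {"n": 0}
def fake_merge():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConcurrentDeleteDeleteException()
    return "merged"
```

Partitioning or clustering the two jobs so they write disjoint key ranges reduces how often the retry path is hit at all.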

SoniSole
by New Contributor II
  • 9967 Views
  • 6 replies
  • 6 kudos

Issue with Docker Image connection

Hello, I have created and pushed a Docker image to Azure Container Registry. I used that image to start the cluster in Databricks. But the cluster doesn't start, and so when I try to run a Databricks job using that cluster, I get this error bel...

Latest Reply
jeremy98
Honored Contributor
  • 6 kudos

We have the same issue right now. What is the problem?

5 More Replies
vgupta
by New Contributor II
  • 10807 Views
  • 6 replies
  • 4 kudos

DLT | Cluster terminated by System-User | INTERNAL_ERROR: Communication lost with driver. Cluster 0312-140502-k9monrjc was not reachable for 120 seconds

Dear Community, hope you are doing well. For the last couple of days I have been seeing very strange issues with my DLT pipeline: every 60-70 minutes it fails in continuous mode with the error INTERNAL_ERROR: Communication lost with driver. Clu...

Latest Reply
Rahiman
Databricks Partner
  • 4 kudos

We had a similar error for one of our DLT pipelines. This can sometimes be caused by compute size; we increased the compute size in our DLT pipeline, but still saw this error while processing a very large file. We then added the below para...

5 More Replies
aswinvishnu
by New Contributor II
  • 1703 Views
  • 3 replies
  • 1 kudos

Exporting table to GCS bucket using job

Hi all. Use case: I want to send the result of a query to a GCS bucket location in JSON format. Approach: from my Java-based application I create a job, and that job runs a notebook. The notebook will have something like this: ```query = "SELECT * FR...

Latest Reply
LorelaiSpence
New Contributor II
  • 1 kudos

Consider using GCS signed URLs or access tokens for secure access.

2 More Replies
Maverick1
by Valued Contributor II
  • 5971 Views
  • 6 replies
  • 6 kudos

How to infer the online feature store table via an MLflow registered model deployed to a SageMaker endpoint?

Can an MLflow registered model automatically infer the online feature store table if that model is trained and logged via a Databricks Feature Store table and the table is pushed to an online feature store (like AWS RDS)?

Latest Reply
Janifer45
New Contributor II
  • 6 kudos

Thanks for this

5 More Replies
BrianLind
by New Contributor II
  • 1083 Views
  • 2 replies
  • 0 kudos

Need access to browse onprem SQL data

Our BI team has started using Databricks and would like to browse our local (on-prem) SQL database servers from within Databricks. I'm not sure if that's even possible. So far, I've set up Databricks Secure Cluster Connectivity (SCC), created a privat...

Latest Reply
Renu_
Valued Contributor II
  • 0 kudos

Hi, based on what you’ve shared, it seems you’ve already completed many of the necessary steps. Just a few things to double-check as you move forward: SQL Warehouses used for BI tools need to run in Pro mode, not serverless, since only Pro or Classic ...

1 More Replies
muano_makhokha
by New Contributor II
  • 1849 Views
  • 1 reply
  • 1 kudos

Resolved! Row filtering and Column masking not working even when the requirements are met

I have been trying to use the Row filtering and Column masking feature to redact columns and filter rows based on the group a user is in. I have all the necessary permissions and I've used clusters with version 15.4 and higher. When I run the fo...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Here are some things to consider/try:   The UnityCatalogServiceException error you are encountering, ABORTED.UC_DBR_TRUST_VERSION_TOO_OLD, generally indicates that the Databricks Runtime (DBR) version you are using no longer supports the operation, s...

meret
by Databricks Partner
  • 1576 Views
  • 1 reply
  • 0 kudos

Column Default Propagation

Hi, today I found a somewhat strange behavior when it comes to default values in columns. Apparently, column defaults are propagated to a new table when you select the column without any operation on it. This is a bit unexpected for me. Here a short...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

The behavior you described regarding the propagation of default column values is expected and is tied to the specific usage of the delta.feature.allowColumnDefaults table property in Delta Lake. Here’s an explanation: Default Propagation Without Tra...

Dinesh6351
by New Contributor II
  • 1006 Views
  • 2 replies
  • 3 kudos
Latest Reply
amos
New Contributor III
  • 3 kudos

This error occurs when your Azure account exceeds the regional quota of available cores, preventing the cluster from being created in Databricks. This means the cluster tried to use more resources than your region allows. 1 - Review the configuration of...

1 More Replies
jonhieb
by New Contributor III
  • 4548 Views
  • 6 replies
  • 3 kudos

Resolved! [Databricks Asset Bundles] Triggering Delta Live Tables

I would like to know how to schedule a DLT pipeline using DABs. I'm trying to trigger a Delta Live Tables pipeline using Databricks Asset Bundles. Below is my YAML code: resources: pipelines: data_quality_pipelines: name: data_quality_pipeline...

Latest Reply
Walter_C
Databricks Employee
  • 3 kudos

As of now, Databricks Asset Bundles do not support direct scheduling of DLT pipelines using cron expressions within the bundle configuration. Instead, you can achieve scheduling by creating a Databricks job that triggers the DLT pipeline and then sch...
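The workaround described above can be sketched in bundle YAML: wrap the pipeline in a job and schedule the job. The pipeline resource name comes from the thread; the job name, cron expression, and timezone below are assumptions for illustration:

```yaml
# Sketch only: a DAB job that triggers the DLT pipeline on a schedule,
# since pipelines themselves take no cron expression in the bundle config.
resources:
  pipelines:
    data_quality_pipelines:
      name: data_quality_pipelines
      # ... pipeline settings from the original bundle ...
  jobs:
    run_data_quality_pipeline:          # hypothetical job name
      name: run_data_quality_pipeline
      schedule:
        quartz_cron_expression: "0 0 6 * * ?"   # daily at 06:00 (assumed)
        timezone_id: "UTC"
      tasks:
        - task_key: refresh_dlt
          pipeline_task:
            pipeline_id: ${resources.pipelines.data_quality_pipelines.id}
```

After `databricks bundle deploy`, both resources are created and the job's schedule drives the pipeline refresh.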

5 More Replies
LasseL
by Databricks Partner
  • 4086 Views
  • 4 replies
  • 1 kudos

How to use change data feed when schema is changing between delta table versions?

How to use change data feed when the delta table schema changes between delta table versions? I tried to read the change data feed in parts (in the code snippet I read version 1372, because the 1371 and 1373 schema versions are different), but I'm getting the error Unsupporte...

Latest Reply
lingareddy_Alva
Esteemed Contributor
  • 1 kudos

@LasseL When you read from the change data feed in batch mode, Delta Lake always uses a single schema: by default, it uses the latest table version’s schema, even if you’re only reading an older version. On Delta Runtime ≥ 12.2 LTS with column mapping e...

3 More Replies