Data Engineering

Forum Posts

by Marcus_S • New Contributor II

05-26-2025 7:55:38 AM

4389 Views
2 replies
0 kudos

Change in UNRESOLVED_COLUMN error behavior in Runtime 14.3 LTS

I've noticed a change in how Databricks handles unresolved column references in PySpark when using All-purpose compute (not serverless). In Databricks Runtime 14.3 LTS, referencing a non-existent column like this: df = spark.table('default.example').se...

Latest Reply

mark_ott
Databricks Employee

11-07-2025 8:28:24 AM

0 kudos

Databricks has recently changed how unresolved column references are handled in PySpark on All-purpose compute clusters. In earlier Databricks Runtime (DBR) 14.3 LTS builds, referencing a non-existent column, such as: df = spark.tabl...

by Asaph • New Contributor

01-22-2025 6:21:22 PM

5834 Views
8 replies
1 kudos

Issue with databricks.sdk - AccountClient Service Principals API

Hi everyone, I’ve been trying to work with the databricks.sdk Python library to manage service principals programmatically. However, I’m running into an issue when attempting to create a service principal using the AccountClient class. Below is the co...

Latest Reply

MarlonFojas
New Contributor II

11-12-2025 10:41:21 PM

1 kudos

I am using the Python SDK, and to authenticate I am using an SP and a secret. Here is the code that worked for me in an Azure Databricks notebook: from databricks.sdk import AccountClient acct_client = AccountClient( host="https://accounts.azuredatabr...

by Ramana • Valued Contributor

09-11-2025 12:50:07 PM

1754 Views
6 replies
1 kudos

Resolved! Serverless Compute - Spark - Jobs failing with Max iterations (1000) reached for batch Resolution

Hello Community, We have been trying to migrate our jobs from Classic Compute to Serverless Compute. As part of this process, we face several challenges, and this is one of them. When we try to execute the existing jobs with Serverless Compute, if the ...

Latest Reply

Ramana
Valued Contributor

11-12-2025 2:09:21 PM

1 kudos

In Serverless Version 4, Databricks fixed this issue.

by akuma643 • New Contributor II

02-18-2025 6:48:44 AM

4903 Views
3 replies
1 kudos

The authentication value "ActiveDirectoryManagedIdentity" is not valid.

Hi Team, I am trying to connect to a SQL Server instance hosted on an Azure VM using Entra ID authentication from Databricks: ("authentication", "ActiveDirectoryManagedIdentity"). Below is the notebook script I am using. driver = "com.microsoft.sqlserver.jdbc.SQLServe...

Latest Reply

mark_ott
Databricks Employee

10-31-2025 8:21:01 AM

1 kudos

You are encountering an error because the default SQL Server JDBC driver bundled with Databricks may not fully support the authentication value "ActiveDirectoryManagedIdentity"—this option requires at least version 10.2.0 of the Microsoft SQL Server ...

by cdn_yyz_yul • Contributor II

11-10-2025 12:53:21 PM

1810 Views
4 replies
1 kudos

Resolved! Delta as streaming source: can the reader read only newly appended rows?

Hello everyone, In our implementation of Medallion Architecture, we want to stream changes with Spark Structured Streaming. I would like some advice on how to use a Delta table as a source correctly, and whether there is a performance (memory usage) concern in t...

Latest Reply

mark_ott
Databricks Employee

11-12-2025 8:53:28 AM

1 kudos

In your scenario using Medallion Architecture with Delta tables as both streaming source and sink, it is important to understand Spark Structured Streaming behavior and performance characteristics, especially with joins and memory usage. Here is a di...

by Shubhankar_123 • New Contributor

11-10-2025 7:12:01 PM

782 Views
1 replies
0 kudos

Internal error 500 on databricks vector search endpoint

We are facing an internal 500 error when accessing the vector search endpoint through a Streamlit application. If I refresh the application, the error sometimes goes away, but it has now become a regular occurrence. If I try to query the endpoint using...

Latest Reply

mark_ott
Databricks Employee

11-12-2025 8:55:47 AM

0 kudos

The intermittent Internal 500 errors you’re experiencing when accessing the vector search endpoint through a Streamlit app on Databricks—while direct console queries work—suggest an issue with the interaction between your Streamlit app’s environment ...

by SumitB14 • New Contributor

11-11-2025 12:43:11 AM

403 Views
1 replies
0 kudos

Databricks Nested Json Flattening

Hi Databricks Community, I am facing an issue while exploding nested JSON data. In the content column, I have dynamic nested JSON, and I am using the below approach to parse and explode it. from pyspark.sql import SparkSession from pyspark.sql.functions ...

Latest Reply

mark_ott
Databricks Employee

11-12-2025 8:54:54 AM

0 kudos

You are encountering an AttributeError related to strip, which likely means that some entries for activity.value are not strings (maybe None or dicts) and your code expects all to be strings before calling .strip(). This kind of problem can arise if ...
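
The failure mode described in this reply, calling .strip() on values that are not always strings, can be guarded against with a small type check. The following is a minimal sketch in plain Python, not the original poster's code; the helper name `safe_strip` and the sample values are hypothetical and chosen only for illustration.

```python
from typing import Any, Optional


def safe_strip(value: Any) -> Optional[str]:
    """Return a stripped string, or None when the value is not a string.

    Hypothetical helper: non-string inputs (None, dicts, numbers) are
    passed over instead of raising AttributeError on .strip().
    """
    if isinstance(value, str):
        return value.strip()
    return None


# Hypothetical mixed values, standing in for entries that may be
# strings, None, or nested dicts.
samples = ["  login  ", None, {"nested": "value"}, "logout"]
cleaned = [safe_strip(v) for v in samples]
```

In a PySpark job, a check like this would presumably sit inside whatever function touches the raw value, so that rows with unexpected types can be skipped or routed aside rather than failing the task.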

by ShivangiB1 • New Contributor III

11-10-2025 10:22:27 AM

974 Views
2 replies
0 kudos

Resolved! DATABRICKS LAKEFLOW SQL SERVER INGESTION PIPELINE ERROR

Hey Team, I am getting the below error while creating a pipeline: com.databricks.pipelines.execution.extensions.managedingestion.errors.ManagedIngestionNonRetryableException: [INGESTION_GATEWAY_DDL_OBJECTS_MISSING] DDL objects missing on table 'coedb.dbo.so...

Latest Reply

mark_ott
Databricks Employee

11-12-2025 8:49:34 AM

0 kudos

The error you are seeing means Databricks cannot capture DDL (table definition) changes, even though CDC (Change Data Capture) and CT (Change Tracking) are enabled. You must run the specific DDL support objects script for Databricks ingestion and the...

by shubham007 • Databricks Partner

11-09-2025 8:28:15 PM

830 Views
2 replies
0 kudos

Resolved! Urgency: How to do a Data Migration task using the Databricks Lakebridge tool?

Dear community expert, I have completed two phases of Databricks Lakebridge, Analyzer and Converter, but I am stuck at migrating data from source to target using Lakebridge. I have watched the BrickBites series on Lakebridge but did not find how to migrate data...

Latest Reply

mark_ott
Databricks Employee

11-12-2025 8:48:11 AM

0 kudos

To migrate tables and views from Snowflake (source) to Databricks (target) using Lakebridge, you must export your data from Snowflake into a supported cloud storage (usually as Parquet files), then import these files into Databricks Delta tables. Lak...

by Ajay-Pandey • Databricks MVP

05-27-2023 11:07:44 AM

9373 Views
8 replies
0 kudos

How can we send Databricks logs to Azure Application Insights?

Hi All, I want to send Databricks logs to Azure Application Insights. Is there any way we can do it? Any blog or doc will help me.

Latest Reply

loic
Contributor

11-12-2025 6:34:27 AM

0 kudos

Hello, I finally used the AppInsights agent from OpenTelemetry, which is documented in the official Microsoft documentation here: https://learn.microsoft.com/en-us/azure/azure-monitor/app/opentelemetry-enable?tabs=java Below is an adaptation of this "Get ...

by pooja_bhumandla • Databricks Partner

10-27-2025 6:37:00 AM

4507 Views
3 replies
3 kudos

When to Use and when Not to Use Liquid Clustering?

Hi everyone, I’m looking for some practical guidance and experiences around when to choose Liquid Clustering versus sticking with traditional partitioning + Z-ordering. From what I’ve gathered so far: For small tables (<10TB), Liquid Clustering gives s...

Latest Reply

mark_ott
Databricks Employee

11-11-2025 2:59:14 AM

3 kudos

Deciding between Liquid Clustering and traditional partitioning with Z-ordering depends on table size, query patterns, number of clustering columns, and file optimization needs. For tables under 10TB with queries consistently filtered on 1–2 columns,...

by DatabricksEngi1 • Contributor

11-11-2025 2:36:15 AM

898 Views
4 replies
0 kudos

Resolved! MERGE operation not performing data skipping with liquid clustering on key columns

Hi, I need some help understanding a performance issue. I have a table that reads approximately 800K records every 30 minutes in an incremental manner. Let’s say its primary key is: timestamp, x, y. This table is overwritten every 30 minutes and serves ...

Latest Reply

bianca_unifeye
Databricks MVP

11-11-2025 8:56:52 AM

0 kudos

MERGE is not a pure read-plus-filter operation. Even though Liquid Clustering organizes your data by key ranges and writes min/max stats, the MERGE engine has to identify both matches and non-matches. That means the query planner must: Scan all candidate...

by turagittech • Contributor

04-06-2025 9:59:20 PM

3109 Views
3 replies
1 kudos

Resolved! Schema updating with CI/CD development in SQL

Hi all, I am working to resolve how to build tables in a development workspace catalog and then easily migrate the code to a production catalog without manually altering the schema name. For those unaware, you can't have the same catalog names in deve...

Latest Reply

evanc
New Contributor II

11-11-2025 4:38:58 PM

1 kudos

I would like to see an example from Databricks of how Alembic and Databricks work together, especially with schema evolution. I think that once the schema is changed by schema evolution, the Alembic version is no longer valid. I am wondering how to handle this properly. ...

by petitregny • New Contributor II

04-16-2025 2:46:56 PM

4700 Views
5 replies
2 kudos

Reading from an S3 bucket using boto3 on serverless cluster

Hello All, I am trying to read a CSV file from my S3 bucket in a notebook running on serverless. I am using the two standard functions below, but I get a credentials error (Error reading CSV from S3: Unable to locate credentials). I don't have this issu...

Latest Reply

Ramana
Valued Contributor

11-11-2025 2:16:47 PM

2 kudos

Boto3 with an access/secret key worked. I will try Service Credentials. If the Databricks documentation is right, instance profiles with Serverless should work to establish a Boto3 connection, but, unfortunately, setting up instance profiles on Serverles...

by shubham007 • Databricks Partner

11-09-2025 8:47:03 PM

1637 Views
2 replies
0 kudos

Urgency: Getting Lakebridge installation failed in our organization environment (laptop)

Dear community expert, I’m reaching out for assistance with installing Databricks Lakebridge on my organization's laptop. I have confirmed the stated prerequisites are installed: Java 22+, Python 3.11+, and the latest Databricks CLI, but the installer f...

Latest Reply

WiliamRosa
Databricks Partner

11-11-2025 11:37:38 AM

0 kudos

Additionally, you can check the following documentation and links that I found, which may help you: 1) To view the available Python versions and guidance on using a virtual environment (virtualenv or conda): https://www.piwheels.org/project/databricks-...
