Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

GANAPATI_HEGDE
by New Contributor III
  • 51 Views
  • 2 replies
  • 0 kudos

Unable to configure custom compute for DLT pipeline

I am trying to configure a custom cluster for a pipeline as shown in the attached screenshots; however, DLT keeps using the small default cluster as usual. How can I resolve this?

[two screenshots attached]
Latest Reply
GANAPATI_HEGDE
New Contributor III
  • 0 kudos

I updated my CLI and redeployed the job, but I still don't see the cluster updates in the pipeline.

1 More Replies
lecarusin
by Visitor
  • 30 Views
  • 1 reply
  • 0 kudos

Help regarding a python notebook and s3 file structure

Hello all, I am new to this forum, so please forgive me if I am posting in the wrong location (I'd appreciate it if mods moved the post, or told me where to post). I am looking for help with optimizing some Python code I have. This Python notebook...

Latest Reply
K_Anudeep
Databricks Employee
  • 0 kudos

Hello @lecarusin , You can absolutely make Databricks only read the dates you care about. The trick is to constrain the input paths  (so Spark lists only those folders) instead of reading the whole directory.   Build the exact S3 prefixes for your da...
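The approach described in this reply can be sketched in Python (the bucket name, folder layout, and date range are hypothetical placeholders; adjust to your actual S3 structure):

```python
from datetime import date, timedelta

# Hypothetical layout: s3://my-bucket/events/YYYY/MM/DD/ -- the point is to
# hand Spark an explicit list of prefixes so it lists only those folders
# instead of scanning the whole directory tree.
def build_prefixes(start: date, end: date) -> list[str]:
    """Return one S3 prefix per day in the inclusive [start, end] range."""
    days = (end - start).days + 1
    return [
        f"s3://my-bucket/events/{d:%Y/%m/%d}/"
        for d in (start + timedelta(n) for n in range(days))
    ]

paths = build_prefixes(date(2024, 1, 1), date(2024, 1, 3))
# In a Databricks notebook you would then read only those prefixes:
# df = spark.read.parquet(*paths)
```

Passing explicit paths this way avoids a full listing of the bucket; partition-filter pushdown on a properly partitioned table is another option if the data is registered as a table.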

sparmar
by New Contributor
  • 3543 Views
  • 1 reply
  • 0 kudos

I am Getting SSLError(SSLEOFError) error while triggering Azure DevOps pipeline from Databricks

While triggering an Azure DevOps pipeline from Databricks, I am getting the error below: An error occurred: HTTPSConnectionPool(host='dev.azure.com', port=443): Max retries exceeded with url: /XXX-devops/XXXDevOps/_apis/pipelines/20250224.1/runs?api-version...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

The error you’re seeing (SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1147)')) while triggering the Azure DevOps pipeline from Databricks indicates an issue with the SSL/TLS handshake, not the firewall or certificate itself. This is ...
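One way to probe a handshake problem like this from the Databricks side, using only the Python standard library (the URL, auth header, and TLS 1.2 floor are illustrative assumptions, not a confirmed fix for this environment):

```python
import ssl
import urllib.request

# Build a client context that insists on TLS 1.2 or newer. A proxy or
# TLS-inspection appliance that cannot negotiate this will now fail with a
# clear handshake error instead of an opaque mid-handshake EOF.
ctx = ssl.create_default_context()
ctx.minimum_version = ssl.TLSVersion.TLSv1_2

def post_json(url: str, body: bytes, auth_header: str):
    """POST a JSON body to Azure DevOps using the pinned TLS context."""
    req = urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": auth_header,  # e.g. a Basic header built from a PAT
        },
        method="POST",
    )
    return urllib.request.urlopen(req, context=ctx, timeout=30)
```

If the pinned context also fails from the cluster but works from a laptop, that points at egress infrastructure (NAT/firewall/proxy) rather than the client code.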

Amit_Dass_Chmp
by New Contributor III
  • 3024 Views
  • 1 reply
  • 0 kudos

Query on Databricks Arc: ARC will not work on 13.x or greater runtime

I have a question about Databricks Arc. Is this statement true: "Databricks Runtime requirements for implementing Arc: ARC requires Databricks ML Runtime 12.2 LTS. ARC will not work on 13.x or greater runtimes."

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

The statement is true: Databricks Arc requires the Databricks ML Runtime 12.2 LTS and will not work on 13.x or greater runtimes. This requirement is confirmed by multiple Databricks Community discussions and documentation, which specifically state th...

j_h_robinson
by New Contributor II
  • 3098 Views
  • 1 reply
  • 0 kudos

GitHub CI/CD Best Practices

Using GitHub, what are some best-practice CI/CD approaches to use specifically with the silver and gold medallion layers? We want to create the bronze, silver, and gold layers in Databricks notebooks. Also, is using notebooks in production a "best pra...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

For Databricks projects using the medallion architecture (bronze, silver, gold layers), effective CI/CD strategies on GitHub include strict version control, environment isolation, automated testing and deployments, and careful notebook management—all...

SObiero
by New Contributor
  • 3323 Views
  • 1 reply
  • 0 kudos

Passing Microsoft MFA Auth from Databricks to MSSQL Managed Instance in a Databricks FastAPI App

I have a Databricks App built using FastAPI. Users access this App after authenticating with Microsoft MFA on Databricks Azure Cloud. The App connects to an MSSQL Managed Instance (MI) that also supports Microsoft MFA. I want the authenticated user's ...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

It is not possible in Databricks to seamlessly pass each authenticated user's Azure/MS identity from a web app running on Databricks to MSSQL MI for per-user MFA authentication, in the way your development code does. This limitation stems from how id...

kanikeom
by New Contributor II
  • 3606 Views
  • 2 replies
  • 1 kudos

Asset Bundle API update issues

I was working on a proof of concept (POC) using the asset bundle. My job configuration in the .yml file worked yesterday, but it threw an error today during a demo to the team. The error was likely due to an update to the Databricks API. After some t...

Latest Reply
mark_ott
Databricks Employee
  • 1 kudos

Unexpected breaking changes to APIs—especially from cloud platforms like Databricks—can disrupt projects and demos. Proactively anticipating and rapidly adapting to such updates requires a combination of monitoring, process improvements, and technica...

1 More Replies
jeremy98
by Honored Contributor
  • 3320 Views
  • 2 replies
  • 0 kudos

if else condition task doubt

Hi community, can the if/else condition task be used as a real if condition? It seems that if the condition evaluates to false, the entire job stops. Is this the expected behaviour?

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

In Databricks workflows, the "if-else" condition and depends_on logic do not behave exactly like standard programming if-else statements. If a task depends on another task's outcome and that outcome does not match (for example, the condition is false...
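The behaviour described in this reply can be sketched as a hypothetical job YAML fragment (task names, the task-value reference, and the condition are placeholders): only the branch whose `outcome` matches actually runs, and the non-matching branch is skipped rather than failed.

```yaml
# Hedged sketch, not a confirmed configuration for the poster's job.
tasks:
  - task_key: check_rows
    condition_task:
      op: GREATER_THAN
      left: "{{tasks.load.values.row_count}}"   # value set by an upstream task
      right: "0"

  - task_key: publish            # runs only when the condition is true
    depends_on:
      - task_key: check_rows
        outcome: "true"

  - task_key: notify_empty       # runs only when the condition is false
    depends_on:
      - task_key: check_rows
        outcome: "false"
```

Giving the false outcome its own downstream task is what makes the workflow behave like a real if/else instead of appearing to stop when the condition is false.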

1 More Replies
Carl_B
by New Contributor II
  • 3759 Views
  • 1 reply
  • 0 kudos

ImportError: cannot import name 'override' from 'typing_extensions'

Hello, I'm facing an ImportError when trying to run my OpenAI-based summarization script in Databricks. The error message is: ImportError: cannot import name 'override' from 'typing_extensions' (/databricks/python/lib/python3.10/site-packages/typing_extensions.py)...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

This error is caused by a version mismatch between the OpenAI Python package and the typing_extensions library in your Databricks environment. The 'override' symbol is relatively new and only exists in typing_extensions version 4.5.0 and above; some ...
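A minimal diagnostic for this, assuming the minimum version stated in the reply (4.5.0); the `needs_upgrade` helper is hypothetical, not part of any library:

```python
# Compare dotted version strings without any third-party packages.
def needs_upgrade(installed: str, minimum: str = "4.5.0") -> bool:
    """True when `installed` is older than `minimum` (numeric parts only)."""
    def to_tuple(v: str):
        return tuple(int(p) for p in v.split(".") if p.isdigit())
    return to_tuple(installed) < to_tuple(minimum)

# In a Databricks notebook, something like:
#   import importlib.metadata
#   ver = importlib.metadata.version("typing_extensions")
#   if needs_upgrade(ver):
#       %pip install -U typing_extensions
#       # then restart the Python process so the new version is picked up
```

Restarting the Python process after the upgrade matters, because the already-imported older module otherwise keeps shadowing the new one.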

SQLBob
by New Contributor II
  • 3502 Views
  • 2 replies
  • 0 kudos

Unity Catalog Python UDF to Send Messages to MS Teams

Good Morning All - This didn't seem like such a daunting task until I tried it. Of course, it's my very first function in Unity Catalog. Attached are images of both the UDF and example usage I created to send messages via the Python requests library ...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

You're encountering a common limitation when trying to use an external HTTP request (like the Python requests library) inside a Unity Catalog UDF in Databricks. While your code is correct for a regular notebook environment, Unity Catalog UDFs (and, s...
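Since the UDF sandbox blocks outbound HTTP, the same call can live in a notebook or job task instead. A hedged sketch with the standard library only (the webhook URL is a placeholder, and the simple MessageCard shape is an assumption about your Teams incoming webhook):

```python
import json
from urllib import request

def teams_payload(title: str, text: str) -> dict:
    """Build a simple MessageCard body for a Teams incoming webhook."""
    return {
        "@type": "MessageCard",
        "@context": "http://schema.org/extensions",
        "title": title,
        "text": text,
    }

def send_to_teams(webhook_url: str, title: str, text: str) -> int:
    """POST the message and return the HTTP status code."""
    body = json.dumps(teams_payload(title, text)).encode("utf-8")
    req = request.Request(
        webhook_url, data=body,
        headers={"Content-Type": "application/json"}, method="POST")
    with request.urlopen(req, timeout=10) as resp:
        return resp.status

# send_to_teams("https://example.webhook.office.com/...", "Alert", "Job failed")
```

Keeping the UDF for pure computation and doing the notification at the notebook/job layer sidesteps the sandbox restriction entirely.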

1 More Replies
jash281098
by New Contributor II
  • 3024 Views
  • 2 replies
  • 0 kudos

Issues when adding keystore spark config for pyspark to mongo atlas X.509 connectivity

Steps followed:
Step 1: Add an init script that copies the keystore file to the tmp location.
Step 2: Add Spark config in cluster advanced options: spark.driver.extraJavaOptions -Djavax.net.ssl.keyStore=/tmp/keystore.jks -Djavax.net.ssl.keyStorePa...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

To achieve MongoDB Atlas X.509 connectivity from Databricks using PySpark, the standard keystore configuration may fail due to certificate, configuration, or driver method issues. The recommended approach involves several key steps, including properl...

1 More Replies
Surya-Prathap
by Visitor
  • 36 Views
  • 1 reply
  • 0 kudos

Output Not Displaying in Databricks Notebook on All-Purpose Compute Cluster

Hello all, I'm encountering an issue where output from standard Python commands such as print() or display(df) is not showing up correctly when running notebooks on an All-Purpose Compute cluster.
Cluster details:
Cluster type: All-Purpose Compute
Runtime...

Latest Reply
Sahil_Kumar
Databricks Employee
  • 0 kudos

Hi Surya, do you face this issue only with DBR 17.3 all-purpose clusters? Did you try with lower DBRs? If not, please try and let me know. Also, from the Run menu, try "Clear state and outputs," then re-run the cell on the same cluster to rule out st...

ShivangiB1
by New Contributor III
  • 17 Views
  • 0 replies
  • 0 kudos

Databricks Lakeflow SQL Server ingestion pipeline error

Hey team, I am getting the error below while creating a pipeline: com.databricks.pipelines.execution.extensions.managedingestion.errors.ManagedIngestionNonRetryableException: [INGESTION_GATEWAY_DDL_OBJECTS_MISSING] DDL objects missing on table 'coedb.dbo.so...

