Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

ShivangiB1
by New Contributor III
  • 6 Views
  • 0 replies
  • 0 kudos

DATABRICKS LAKEFLOW SQL SERVER INGESTION PIPELINE ERROR

Hey Team, I am getting the error below while creating a pipeline: com.databricks.pipelines.execution.extensions.managedingestion.errors.ManagedIngestionNonRetryableException: [INGESTION_GATEWAY_DDL_OBJECTS_MISSING] DDL objects missing on table 'coedb.dbo.so...

der
by Contributor II
  • 173 Views
  • 6 replies
  • 2 kudos

EXCEL_DATA_SOURCE_NOT_ENABLED Excel data source is not enabled in this cluster

I want to read an Excel xlsx file on DBR 17.3. The library dev.mauch:spark-excel_2.13:4.0.0_0.31.2 is installed on the cluster. The V1 implementation works fine:
df = spark.read.format("dev.mauch.spark.excel").schema(schema).load(excel_file)
display(df)
V2...

Latest Reply
mmayorga
Databricks Employee
  • 2 kudos

Hi @der, first of all, thank you for your patience and for providing more information about your case. Use of ".format("excel")": I replicated your cluster config in Azure. Without installing any library, I was able to run and load the xlsx fil...

5 More Replies
GJ2
by New Contributor II
  • 10443 Views
  • 12 replies
  • 2 kudos

Install the ODBC Driver 17 for SQL Server

Hi, I am not a data engineer, but I want to connect to SSAS. It looks like this can be done through pyodbc; however, it looks like I need to install "ODBC Driver 17 for SQL Server" using the following command. How do I install the driver on the cluster an...

Latest Reply
Coffee77
Contributor
  • 2 kudos

If you only need to interact with your cloud SQL database, I recommend using simple code like that shown below to run select queries. Writing would be very similar. Take a look here: https://learn.microsoft.com/en-us/sql/connect/spark/connecto...

11 More Replies
GANAPATI_HEGDE
by New Contributor III
  • 36 Views
  • 1 replies
  • 0 kudos

Unable to configure custom compute for DLT pipeline

I am trying to configure the cluster for a pipeline as shown in the attached images. However, DLT keeps using the small cluster as usual. How can I resolve this?

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @GANAPATI_HEGDE, your YAML definition looks correct. I used the following definition and it worked as expected. Which version of the Databricks CLI are you using? In my case, with CLI v0.272.0, I didn't encounter any issues.

73334
by New Contributor II
  • 3690 Views
  • 3 replies
  • 1 kudos

Dedicated Access Mode Interactive Cluster with a Service Principal

Hi, I am wondering whether it is possible to set up an interactive cluster in dedicated access mode where the dedicated user is a machine user. I've tried the cluster creation API, /api/2.1/clusters/create, and set the user name to the service principal na...

Latest Reply
Coffee77
Contributor
  • 1 kudos

It turns out that it is now possible to deploy interactive clusters and SQL warehouses with Databricks Asset Bundles, so you can include a YAML file similar to this one to deploy that type of interactive cluster: Definition of Interactive ...

2 More Replies
TomDeas
by New Contributor II
  • 2000 Views
  • 2 replies
  • 2 kudos

Resolved! Resource Throttling; Large Merge Operation - Recent Engine Change?

Morning all, hope you can help as I've been stumped for weeks. Question: have there been recent changes to the Databricks query engine, or Photon (etc.), which may impact large sort operations? I have a Jobs pipeline that runs a series of notebooks which...

Labels: Data Engineering, MERGE, Performance Optimisation, Photon, Query Plan, serverless
Latest Reply
mark_ott
Databricks Employee
  • 2 kudos

There have indeed been recent changes to the Databricks query engine and Photon, especially during the June 2025 platform releases, which may influence how large sort operations and resource allocation are handled in SQL pipelines similar to yours. S...

1 More Replies
feliximmanuel
by New Contributor II
  • 1367 Views
  • 1 replies
  • 1 kudos

Error: oidc: fetch .well-known: Get "https://%E2%80%93host/oidc/.well-known/oauth-authorization-serv

I'm trying to authenticate to Databricks from WSL but am suddenly getting this error:
/databricks-asset-bundle$ databricks auth login –host https://<XXXXXXXXX>.12.azuredatabricks.net
Databricks Profile Name: <XXXXXXXXX>
Error: oidc: fetch .well-known: Get "ht...

Latest Reply
code-vj
Visitor
  • 1 kudos

It looks like the issue is caused by the dash before host. The command uses an en dash (–) instead of regular hyphens (--), which breaks the URL parsing. Try running this instead:
databricks auth login --host https://<your-instance>.azuredatabrick...
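The `%E2%80%93` in the error URL is the percent-encoded UTF-8 form of the en dash (U+2013), which suggests the CLI treated `–host` as the host value itself. A minimal Python sketch of how such look-alike dashes could be spotted and repaired in a pasted command follows; the helper names and the `example.azuredatabricks.net` host are hypothetical and not part of the Databricks CLI.

```python
from urllib.parse import unquote

# Unicode characters that word processors and chat tools often
# substitute for the ASCII hyphen-minus used in CLI options.
DASH_LOOKALIKES = {
    "\u2013": "en dash",
    "\u2014": "em dash",
    "\u2212": "minus sign",
}


def find_lookalike_dashes(command: str) -> list[tuple[int, str]]:
    """Return (index, name) for every look-alike dash in the command."""
    return [
        (i, DASH_LOOKALIKES[ch])
        for i, ch in enumerate(command)
        if ch in DASH_LOOKALIKES
    ]


def fix_option_dashes(command: str) -> str:
    """Replace look-alike dashes at the start of an argument with '--'."""
    lookalikes = "".join(DASH_LOOKALIKES)
    fixed = []
    for arg in command.split(" "):
        if arg and arg[0] in DASH_LOOKALIKES:
            arg = "--" + arg.lstrip(lookalikes)
        fixed.append(arg)
    return " ".join(fixed)


# Decoding the host part of the error URL reveals the en dash.
decoded_host = unquote("%E2%80%93host")

# Hypothetical command with the same mistake as in the post.
bad_command = "databricks auth login \u2013host https://example.azuredatabricks.net"
problems = find_lookalike_dashes(bad_command)
good_command = fix_option_dashes(bad_command)
```

The sketch only rewrites dashes at the start of an argument, so dashes inside URLs or values are left alone. It assumes a long option (`--`) was intended, as in the corrected command from the reply.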

Coffee77
by Contributor
  • 83 Views
  • 6 replies
  • 2 kudos

Resolved! Databricks Asset Bundles - High Level Diagrams Flow

Hi guys! I have recently been working on fully understanding (and helping others understand) Databricks Asset Bundles (DAB), and having fun creating some high-level diagrams of the DAB flow. The first one shows the flow for a simple deployment to PROD and the second one contains...

Latest Reply
Coffee77
Contributor
  • 2 kudos

I will go with only the latest version then, which can be applied to any other lower environment for QA or testing.

5 More Replies
QuanSun
by New Contributor II
  • 1491 Views
  • 6 replies
  • 3 kudos

How to select performance mode for Databricks Delta Live Tables

Hi everyone, based on the official link: For triggered pipelines, you can select the serverless compute performance mode using the Performance optimized setting in the pipeline scheduler. When this setting is disabled, the pipeline uses standard perfor...

Latest Reply
mimimon
New Contributor II
  • 3 kudos

May I know whether this was turned on automatically for all DLT tables? How do we monitor when it was turned on and off, and the ID of the user who did it? Or is it configured automatically?

5 More Replies
Anonymous
by Not applicable
  • 11369 Views
  • 9 replies
  • 8 kudos

Resolved! data frame takes unusually long time to write for small data sets

We have configured the workspace with our own VPC. We need to extract data from DB2 and write it in Delta format. We tried this with 550k records and 230 columns, and it took 50 minutes to complete the task. 15 million records take more than 18 hours. Not sure why this takes suc...
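One common cause of slow JDBC extracts is that the read runs on a single partition; the excerpt does not confirm this is the cause here, so treat the following as a sketch under that assumption. It builds the standard Spark JDBC options for a partitioned read (`partitionColumn`, `lowerBound`, `upperBound`, `numPartitions`, `fetchsize`). The helper name, host, table, and column names are hypothetical.

```python
def jdbc_parallel_read_options(
    url: str,
    table: str,
    partition_column: str,
    lower_bound: int,
    upper_bound: int,
    num_partitions: int = 8,
    fetch_size: int = 10_000,
) -> dict[str, str]:
    """Build Spark JDBC options that split a read across partitions.

    The partition column must be numeric, date, or timestamp; the
    bounds only decide the partition stride and do not filter rows.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    if lower_bound >= upper_bound:
        raise ValueError("lower_bound must be smaller than upper_bound")
    return {
        "url": url,
        "dbtable": table,
        "partitionColumn": partition_column,
        "lowerBound": str(lower_bound),
        "upperBound": str(upper_bound),
        "numPartitions": str(num_partitions),
        "fetchsize": str(fetch_size),
    }


# Hypothetical connection details, for illustration only.
options = jdbc_parallel_read_options(
    url="jdbc:db2://db2-host.example.com:50000/SAMPLE",
    table="MYSCHEMA.ORDERS",
    partition_column="ORDER_ID",
    lower_bound=1,
    upper_bound=15_000_000,
    num_partitions=16,
)

# On a cluster with a DB2 JDBC driver installed, the options could be
# used like this (credentials would be added as further options):
#   df = spark.read.format("jdbc").options(**options).load()
#   df.write.format("delta").mode("overwrite").saveAsTable("orders_bronze")
```

Because Spark evaluates lazily, a slow "write" often reflects time spent in the upstream read, which is why the sketch targets the read side.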

Latest Reply
Sown7
New Contributor II
  • 8 kudos

Facing the same issue: I have ~700k rows and am trying to write this table, but it takes forever to write. Previously it once took only about 5 seconds to write, but since then, whenever we update the analysis and rewrite the table, it takes very long an...

8 More Replies
