Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

HW413
by New Contributor II
  • 883 Views
  • 4 replies
  • 3 kudos

Copy into checkpoint location not able to find

Hi All, I have been using COPYINTO for ingesting the data from managed volumes  and my destination is a managed delta table .I would like to know where is it storing the metadata information or a checkpoint location to maintain its idempotent feature...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 3 kudos

Hi @HW413, you won't find a checkpoint. COPY INTO does not use checkpoints the way Auto Loader or Spark Structured Streaming do. The COPY INTO command retrieves metadata about all files in the specified source directory/prefix. So, every time you run copy int...

3 More Replies
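To make the reply concrete: COPY INTO records which source files it has already loaded in the target Delta table's own metadata, so re-running the same command is idempotent without any user-visible checkpoint location. A minimal hedged sketch (catalog, schema, volume, and option values are all hypothetical):

```sql
-- Hypothetical names; COPY INTO skips files it has already loaded into this target
COPY INTO my_catalog.my_schema.my_delta_table
FROM '/Volumes/my_catalog/my_schema/my_volume/landing/'
FILEFORMAT = CSV
FORMAT_OPTIONS ('header' = 'true')
COPY_OPTIONS ('mergeSchema' = 'true');
```

Setting `COPY_OPTIONS ('force' = 'true')` overrides the skip behavior and reloads all files.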
DatabricksEngi1
by Contributor
  • 2121 Views
  • 7 replies
  • 1 kudos

run a Databricks notebook on serverless environment version 4 with Asset Bundles

Hi everyone, I’m working with Databricks Asset Bundles and running jobs that use notebooks (.ipynb). According to the documentation, it should be possible to set an environment version for serverless jobs. I want to force all of my notebook tasks to ru...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @DatabricksEngi1, when you're defining a job in DAB you're using the job mapping. One of the keys of the job mapping is called environments. This is the one you're looking for: Databricks Asset Bundles resources - Azure Databricks | Microsoft Learn

6 More Replies
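For reference, a minimal databricks.yml sketch of the environments mapping the reply points to. Job, task, key, and path names here are made up, and the exact spec fields may differ by CLI version, so verify against the current DAB documentation:

```yaml
# Hypothetical bundle fragment: pin serverless environment version 4 for a task
resources:
  jobs:
    my_serverless_job:
      environments:
        - environment_key: default
          spec:
            environment_version: "4"
      tasks:
        - task_key: run_notebook
          environment_key: default
          notebook_task:
            notebook_path: ../notebooks/my_notebook.ipynb
```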
aonurdemir
by Contributor
  • 1320 Views
  • 2 replies
  • 2 kudos

Resolved! Creating an SCD Type 2 Table with Auto CDC API (One-Time Load + Ongoing Updates)

Hello everyone, I’m working with two CDC tables: table_x with 23,467,761 rows (and growing), and table_y with 27,868,173,722 rows. My goal is to build an SCD Type 2 table (table_z) using the Auto CDC API. The workflow I’d like to achieve is: Initial Load: Populate table...

Latest Reply
aonurdemir
Contributor
  • 2 kudos

I have solved it with the name parameter, like this:
dlt.create_streaming_table(name="table_z")
dlt.create_auto_cdc_flow(name="backfill", target="table_z", source="table_y", keys=["user_id"], sequence_by=col("source_ts_ms"), ignore_null_updates=False, apply_as_...

1 More Replies
saicharandeepb
by Contributor
  • 1116 Views
  • 3 replies
  • 1 kudos

How to get Spark run-time and structured metrics before job completion?

Hi all, I’m trying to get Spark run-time metrics and Structured Streaming metrics by enabling cluster logging, and I now see several log folders. What I noticed is that the eventlog folder only gets populated after a job has completed. That makes it d...

Latest Reply
ManojkMohan
Honored Contributor II
  • 1 kudos

Did you try the above solution? Keep us updated.

2 More Replies
lizou1
by New Contributor III
  • 864 Views
  • 3 replies
  • 1 kudos

Resolved! serverless workflow Compute became unresponsive. Compute is likely out of memory.

I set up 10 notebooks to run at the same time in a serverless workflow and got this error: "serverless workflow Compute became unresponsive. Compute is likely out of memory." Is there a quota in serverless compute I can set in Azure Databricks? These notebooks are pr...

Latest Reply
lizou1
New Contributor III
  • 1 kudos

The issue is new, and the Azure cloud provider is also not quite sure of the details; we will get more info later.

2 More Replies
BenDataBricks
by New Contributor II
  • 3701 Views
  • 1 reply
  • 2 kudos

Register more redirect URIs for OAuth U2M

I am following this guide on allowing OAuth U2M for Azure Databricks. When I get to Step 2, I make a request to account.azuredatabricks.net and specify a redirect URI to receive a code. The redirect URI in the example is localhost:8020. If I change thi...

Latest Reply
AFox
Contributor
  • 2 kudos

You have to register a new OAuth application. See: Enable or disable partner OAuth applications and API: Create Custom OAuth App Integration

naveens
by Databricks Partner
  • 2502 Views
  • 1 reply
  • 1 kudos

Resolved! Power BI Service – OAuth2 Databricks Authentication Failing After Tenant Migration

Hi, we are working on a Power BI migration from INFY to TATA. I have a user TATA.nato@tata.com. a. With this user I am able to connect to Azure Databricks using Power BI Desktop in the INFY tenant. b. I am logged in as TATA.nato@tata.com and switched to tenant I...

Latest Reply
nayan_wylde
Esteemed Contributor II
  • 1 kudos

@naveens here are a few things that you can try. 1. Re-authenticate in Power BI Service: go to Power BI Service → Settings → Data Sources, locate the Databricks data source, click Edit Credentials, then choose OAuth2 and re-authenticate using the correct Azure AD...

eballinger
by Contributor
  • 1302 Views
  • 2 replies
  • 3 kudos

Resolved! Databricks shared folder area permissions issue

We have some notebook code that I would like to share with our team only in the "shared folder" area of Databricks. I know by default this area is meant as an area to share stuff with the entire organization, but from what I have read you should be abl...

Latest Reply
Isi
Honored Contributor III
  • 3 kudos

Hello @eballinger, in Databricks the users group (sometimes shown in the UI as "All workspace users") has default permissions that cannot be revoked at the top-level Shared folder (docs). So it looks like it’s not possible to create a folder under /Shared th...

1 More Replies
David_M
by Databricks Partner
  • 705 Views
  • 1 reply
  • 0 kudos

Databricks Lakeflow Connector for PostgreSQL on GCP Cloud

Lakeflow connection for Postgres. Hi all, I hope this message finds you well. I am currently trying to create a Lakeflow connection in Databricks for a PostgreSQL database hosted on Google Cloud Platform (GCP). However, when testing the connection, I am e...

Latest Reply
Isi
Honored Contributor III
  • 0 kudos

Hello @David_M, to better support you we’d need to clarify a few points. PostgreSQL location: is this PostgreSQL deployed inside a private VPC in GCP, or is it exposed through a public IP accessible from the internet? This is key to understanding what type ...

tyhatwar785
by Databricks Partner
  • 531 Views
  • 1 reply
  • 1 kudos

Solution Design Recommendation on Databricks

Hi Team, we need to design a pipeline in Databricks to: 1. Call a metadata API (returns XML per keyword), parse, and consolidate into a combined JSON. 2. Use this metadata to generate dynamic links for a second API, download ZIPs, unzip, and extract spe...

Latest Reply
nikhilmohod-nm
New Contributor III
  • 1 kudos

Hi @tyhatwar785. 1. Should metadata and file download be separate jobs/notebooks or combined? Keep them in separate notebooks, but orchestrate them under a single Databricks Job for better error handling and retries. 2. Cluster recommendations: start wit...

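As a rough sketch of step 1 of the pipeline above (parse per-keyword XML responses and consolidate them into one combined JSON document), using only the Python standard library. The element names, keywords, and URLs here are invented for illustration, not taken from the actual API:

```python
import json
import xml.etree.ElementTree as ET

def parse_metadata_xml(xml_text: str) -> dict:
    """Flatten one keyword's XML metadata response into a plain dict."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}

def consolidate(responses: dict) -> str:
    """Merge per-keyword metadata dicts into a single JSON string."""
    combined = {kw: parse_metadata_xml(xml) for kw, xml in responses.items()}
    return json.dumps(combined, indent=2)

# Stand-ins for the per-keyword API responses (hypothetical schema)
responses = {
    "alpha": "<meta><id>1</id><link>https://example.com/a.zip</link></meta>",
    "beta": "<meta><id>2</id><link>https://example.com/b.zip</link></meta>",
}

combined_json = consolidate(responses)
print(combined_json)
```

In a real job, the combined JSON would then drive step 2 (building the download links for the second API).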
MGAutomation
by New Contributor
  • 647 Views
  • 2 replies
  • 0 kudos

How to connect to a local instance of SQL Server

How can I connect my Databricks AWS account to a local instance of SQL Server?

Latest Reply
Isi
Honored Contributor III
  • 0 kudos

Hello @MGAutomation @szymon_dybczak, you may also need to open the firewall of your on-premises SQL Server to the CIDR range of your Databricks VPC. This ensures that the EC2 instances used by Databricks have valid IPs that can reach your database. If...

1 More Replies
pshuk
by New Contributor III
  • 1447 Views
  • 3 replies
  • 0 kudos

Access Databricks Volume through CLI

Hi, I am able to connect to DBFS and transfer files there or download from there. But when I change the path to Volumes, it doesn't work. Even though I created the volume, I still get this error message: Error: no such directory: /Volumes/bgem_dev/text_...

Latest Reply
nisarg0
New Contributor II
  • 0 kudos

@arpit 

2 More Replies
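For anyone hitting the same thing: with recent versions of the Databricks CLI, `databricks fs` commands address Unity Catalog volumes through the `dbfs:/Volumes/...` scheme rather than a bare `/Volumes/...` path. A hedged example with made-up catalog/schema/volume names (verify the syntax for your CLI version with `databricks fs --help`):

```shell
# Hypothetical names; note the dbfs:/ prefix in front of /Volumes
databricks fs ls dbfs:/Volumes/my_catalog/my_schema/my_volume/
databricks fs cp ./local_file.txt dbfs:/Volumes/my_catalog/my_schema/my_volume/local_file.txt
```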
tana_sakakimiya
by Contributor
  • 848 Views
  • 1 reply
  • 2 kudos

Resolved! What is "External tables backed by Delta Lake"?

Goal: event-driven updates without implementing a job triggered on file arrival. I see hope to incrementally update materialized views which have external tables as their sources. This is quite a game changer if it works for various data formats (since MV starte...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 2 kudos

Hi @tana_sakakimiya, yes, only external tables that are in Delta format are supported. Databricks supports other table formats, but to be able to use this particular feature your table needs to be in Delta format. But if you have Parquet files it's ...

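To make the Delta-only requirement concrete, here is a hedged SQL sketch (table names and storage paths are invented): files that already carry a Delta transaction log can be registered directly as an external table, while plain Parquet would first need converting.

```sql
-- Works: the files at LOCATION already have a Delta transaction log
CREATE TABLE my_catalog.my_schema.events_ext
USING DELTA
LOCATION 'abfss://data@myaccount.dfs.core.windows.net/events_delta';

-- Plain Parquet lacks that log; one option is converting it in place first
CONVERT TO DELTA parquet.`abfss://data@myaccount.dfs.core.windows.net/events_parquet`;
```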
andr3s
by New Contributor II
  • 43899 Views
  • 8 replies
  • 2 kudos

SSL_connect: certificate verify failed with Power BI

Hi, I'm getting an SSL_connect: certificate verify failed error with Power BI. Any ideas? Thanks in advance, Andres

Latest Reply
GaneshKrishnan
New Contributor II
  • 2 kudos

In a proxy setup, Power BI is not aware of the process a browser uses to fetch the intermediate certificate, hence it fails. Recent Power BI versions come with additional options such as "Automatic Proxy Discovery (Optional): Enabled" and "Implementation (optional): 2.0" (be...

7 More Replies
cpayne_vax
by New Contributor III
  • 28820 Views
  • 16 replies
  • 9 kudos

Resolved! Delta Live Tables: dynamic schema

Does anyone know if there's a way to specify an alternate Unity Catalog schema in a DLT workflow using the @dlt.table syntax? In my case, I’m looping through folders in Azure Data Lake Storage to ingest data. I’d like those folders to get created in different...

Latest Reply
surajitDE
Contributor
  • 9 kudos

If you add these settings in the pipeline JSON, the issue should get fixed:
"pipelines.setMigrationHints" = "true"
"pipelines.enableDPMForExistingPipeline" = "true"
I tried it on my side, and now it no longer throws the materialization error.

15 More Replies
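For clarity, those two keys belong in the pipeline's configuration map within its JSON settings. A minimal hedged fragment (the rest of the pipeline spec is omitted, and key availability may vary by release):

```json
{
  "configuration": {
    "pipelines.setMigrationHints": "true",
    "pipelines.enableDPMForExistingPipeline": "true"
  }
}
```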