Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Johannes_E (New Contributor III)
  • 4742 Views
  • 1 reply
  • 1 kudo

Resolved! How to develop with Databricks Connect smoothly?

We are working with Databricks Connect and Visual Studio Code in our project. We mainly want to program in the IDE (VS Code) so that we can use the advantages of the IDE compared to notebooks. Therefore, we write most of the code in .py files and actu...

Latest Reply: ChrisChieu (Databricks Employee)

You can set breakpoints and debug within notebook cells. There's an example in this DAIS talk at 15:27. I recommend the entire talk as a demo. To complete the point, here is additional documentation about notebook cell debugging with Databricks ...

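For readers looking at the Databricks Connect side of this workflow, here is a minimal sketch of a local .py file that runs against a remote cluster, assuming databricks-connect (the v2 client for DBR 13+) is installed and a Databricks CLI profile or DATABRICKS_* environment variables are configured; VS Code breakpoints can then be set on these lines as discussed in the reply.

from databricks.connect import DatabricksSession

# Build a Spark session backed by the remote cluster; connection details are
# resolved from the local Databricks configuration (profile or environment).
spark = DatabricksSession.builder.getOrCreate()

df = spark.range(10)   # the query itself executes remotely on the cluster
print(df.count())      # a VS Code breakpoint can be placed here and stepped through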
dbuserng (New Contributor II)
  • 2912 Views
  • 1 reply
  • 0 kudos

Trigger Databricks Workflow when other workflows succeeded

Hi, I have 3 separate workflows with 3 different triggers, and what I would like to achieve is this: after all of these 3 jobs have completed and succeeded, I would like to trigger another job. Is it possible? These 3 jobs have to stay separate (I cannot combi...

Latest Reply: Alberto_Umana (Databricks Employee)

Hi @dbuserng, It is possible, but it requires a custom code setup based on your use case. You can use the Jobs REST API: https://docs.databricks.com/api/workspace/jobs. Create a monitoring job: set up a job that will monitor the completion status of the thre...

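As a rough illustration of the approach in the reply, the sketch below polls the Jobs REST API for three upstream jobs and triggers a downstream job once all of them have a successful latest run. The host, token handling, and job IDs are placeholders, not values from the thread.

import os
import requests

HOST = os.environ["DATABRICKS_HOST"]          # e.g. https://<workspace>.azuredatabricks.net
HEADERS = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}
UPSTREAM_JOB_IDS = [111, 222, 333]            # the three independent jobs (hypothetical IDs)
DOWNSTREAM_JOB_ID = 444                       # the job to trigger afterwards (hypothetical ID)

def latest_run_succeeded(job_id: int) -> bool:
    """Return True if the most recent completed run of job_id ended in SUCCESS."""
    resp = requests.get(
        f"{HOST}/api/2.1/jobs/runs/list",
        headers=HEADERS,
        params={"job_id": job_id, "limit": 1, "completed_only": "true"},
    )
    resp.raise_for_status()
    runs = resp.json().get("runs", [])
    return bool(runs) and runs[0]["state"].get("result_state") == "SUCCESS"

# Trigger the downstream job only when all three upstream jobs have succeeded.
if all(latest_run_succeeded(j) for j in UPSTREAM_JOB_IDS):
    requests.post(
        f"{HOST}/api/2.1/jobs/run-now",
        headers=HEADERS,
        json={"job_id": DOWNSTREAM_JOB_ID},
    ).raise_for_status()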
priyansh (Databricks Partner)
  • 8903 Views
  • 10 replies
  • 0 kudos

Error in migration with UCX tool

Hey folks! I am facing an issue while migrating the tables from Hive to UC using the UCX tool; after completely running the setup and getting the assessment overview, we ran the following command, i.e. "databricks labs ucx create-table-mapping", but after...

Latest Reply: Akash_Wadhankar (Databricks Partner)

While running the UCX tool in the workspace, we are not able to access the tables that were created in the Hive metastore in the default schema. When we run the migrate-table workflow, we get the error that the tables are not accessible. We get the following error...

9 More Replies
zmsoft (Contributor)
  • 1863 Views
  • 4 replies
  • 0 kudos

Resolved! How to load a Power BI dataset into Databricks

Hi there, I would like to know how to load a Power BI dataset into Databricks. Thanks & Regards, zmsoft

Latest Reply: jack533 (New Contributor III)

I don't think it's possible. While loading a table from Databricks into a Power BI dataset is possible, the opposite is not.

3 More Replies
Prashanth24 (New Contributor III)
  • 3914 Views
  • 3 replies
  • 1 kudo

Error connecting Databricks Notebook using managed identity from Azure Data Factory

I am trying to connect to a Databricks notebook using the managed identity authentication type from Azure Data Factory. Below are the settings I used. The error message is appended at the bottom of this message. With the same settings but with a different authenticat...

Latest Reply: Hesareal (New Contributor II)

Did you manage to solve it? I am getting the same error calling the Databricks REST API from ADF with a system-assigned managed identity. The Databricks workspace is added as a linked service, and there is no problem running notebooks.

2 More Replies
subhas_hati (New Contributor)
  • 2509 Views
  • 1 reply
  • 2 kudos

Change Data Capture (CDC)

May I know what Change Data Capture is?

Latest Reply: szymon_dybczak (Esteemed Contributor III)

Hi @subhas_hati, Change data capture (CDC) is a data integration pattern that captures changes made to data in a source system, such as inserts, updates, and deletes. These changes, represented as a list, are commonly referred to as a CDC feed. You c...

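To make the pattern concrete, here is a minimal sketch of consuming a CDC feed in Databricks using Delta Lake's change data feed; the table name is hypothetical, and the table property must be enabled before changes start being recorded.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Enable the change data feed on a (hypothetical) source table.
spark.sql("""
    ALTER TABLE main.sales.orders
    SET TBLPROPERTIES (delta.enableChangeDataFeed = true)
""")

# Read the inserts, updates, and deletes captured since table version 1.
changes = (
    spark.read.format("delta")
    .option("readChangeFeed", "true")
    .option("startingVersion", 1)
    .table("main.sales.orders")
)

# Each row carries _change_type (insert, update_preimage, update_postimage, delete).
changes.select("_change_type", "_commit_version", "_commit_timestamp").show()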
self-employed (Contributor)
  • 15265 Views
  • 9 replies
  • 8 kudos

The login and password reset functions in the Community Edition do not work

I want to register a Databricks account. I already set up my account and received the email to set my password. However, I cannot use my password to log in to the Community Edition account; I can use it to log in to my standard account. I also clicked the reset th...

Latest Reply: Anonymous

Hello, @lawrance Zhang - I wanted you to know that this isn't the first time we've heard of this recently. Thank you for opening a ticket. We've also escalated this to the team. We'll get there.

8 More Replies
210573 (New Contributor)
  • 4166 Views
  • 4 replies
  • 2 kudos

Unable to stream from Google Pub/Sub

I am trying to run the code below to subscribe to a Pub/Sub topic, but it throws this exception: java.lang.NoClassDefFoundError: org/apache/spark/sql/sources/v2/DataSourceV2. I have tried using all versions of https://mvnrepository.com/artifact/com.google...

Latest Reply: davidkhala-ms (New Contributor II)

I see some issues when using Pub/Sub as a source: in the writeStream, neither .foreach nor .foreachBatch gets called when stream data arrives.

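The foreachBatch pattern mentioned in the reply looks roughly like the sketch below; the built-in rate source is used here only as a stand-in, since the Pub/Sub connector setup (and whether it works at all) depends on the Databricks Runtime version.

from pyspark.sql import SparkSession, DataFrame

spark = SparkSession.builder.getOrCreate()

# Stand-in streaming source; replace with the actual Pub/Sub source for your runtime.
stream = spark.readStream.format("rate").option("rowsPerSecond", 5).load()

def process_batch(batch_df: DataFrame, batch_id: int) -> None:
    # Called once per micro-batch with a static DataFrame of the new records.
    print(f"batch {batch_id}: {batch_df.count()} rows")

query = stream.writeStream.foreachBatch(process_batch).start()
query.awaitTermination(timeout=30)   # run briefly for this demo
query.stop()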
3 More Replies
sujitmk77 (New Contributor II)
  • 1640 Views
  • 2 replies
  • 0 kudos

PySpark JSON read with strict schema check and mark the valid and invalid records based on the non-n

Hi, I have a use case where I have to read JSON files from the "/data/json_files/" location with the schema enforced. For completeness, we want to mark the invalid records. The invalid records may be the ones where the mandatory field(s) are null, data t...

Latest Reply: Alberto_Umana (Databricks Employee)

Hi @sujitmk77, To ensure that valid records are processed while invalid records are marked appropriately, you can use the following PySpark code. It reads the JSON files with schema enforcement and handles invalid records by marking t...

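Since the reply's code is cut off, here is a hedged sketch of the general approach it describes: read the JSON with an enforced schema in PERMISSIVE mode, keep unparseable rows in a corrupt-record column, and flag rows whose mandatory fields are null. The schema and field names below are assumptions, not the original poster's.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

spark = SparkSession.builder.getOrCreate()

schema = StructType([
    StructField("id", IntegerType(), nullable=False),       # mandatory (assumed field)
    StructField("name", StringType(), nullable=False),      # mandatory (assumed field)
    StructField("comment", StringType(), nullable=True),    # optional (assumed field)
    StructField("_corrupt_record", StringType(), True),     # holds rows that fail to parse
])

df = (
    spark.read
    .schema(schema)
    .option("mode", "PERMISSIVE")                      # keep bad rows instead of failing
    .option("columnNameOfCorruptRecord", "_corrupt_record")
    .json("/data/json_files/")
)

# A record is valid only if it parsed cleanly and every mandatory field is present.
flagged = df.withColumn(
    "is_valid",
    F.col("_corrupt_record").isNull()
    & F.col("id").isNotNull()
    & F.col("name").isNotNull(),
)

valid = flagged.filter("is_valid")
invalid = flagged.filter("NOT is_valid")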
1 More Reply
Rita (New Contributor III)
  • 11720 Views
  • 7 replies
  • 6 kudos

How to connect Cognos 11.1.7 to Azure Databricks

We are trying to connect Cognos 11.1.7 to Azure Databricks, but with no success. Can you please help or guide us on how to connect Cognos 11.1.7 to Azure Databricks? This is very critical to our user community. Can you please help or guide us how to connect Co...

Latest Reply: Hans2 (New Contributor II)

Has anyone got the Simba JDBC driver going with CA 11.1.7? The ODBC driver works fine, but I can't get the JDBC driver running. Regards

6 More Replies
prathameshJoshi (Databricks Partner)
  • 9941 Views
  • 10 replies
  • 7 kudos

Resolved! How to obtain the server URL for using Spark's REST API

Hi, I want to access the stage and job information (usually available through the Spark UI) through the REST API provided by Spark: http://<server-url>:18080/api/v1/applications/[app-id]/stages. More information can be found at the following link: https://spa...

Latest Reply: prathameshJoshi (Databricks Partner)

Hi @Retired_mod and @menotron, Thanks a lot; your solutions are working. I apologise for the delay, as I had some issues logging in.

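For reference, calling the monitoring endpoint mentioned in the question looks roughly like the sketch below; server_url and app_id are placeholders, and on Databricks the Spark history server is typically not exposed directly on port 18080, so the reachable URL depends on the deployment.

import requests

server_url = "your-spark-history-server"   # placeholder
app_id = "app-20240101000000-0000"         # placeholder application id

# Spark monitoring REST API: list the stages of one application.
resp = requests.get(f"http://{server_url}:18080/api/v1/applications/{app_id}/stages")
resp.raise_for_status()

for stage in resp.json():
    print(stage["stageId"], stage["status"], stage["name"])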
9 More Replies
jeremy98 (Honored Contributor)
  • 2446 Views
  • 1 reply
  • 0 kudos

Resolved! How to read a particular data type from Postgres into Databricks through JDBC

Hi Community, I need to load data from PostgreSQL into Databricks through JDBC without changing the data type of a VARCHAR[] column in PostgreSQL, which should remain an array of strings in Databricks. Previously, I used psycopg2, and it worked, but ...

Latest Reply: jeremy98 (Honored Contributor)

Hi community, yesterday I found a solution: query through JDBC from Postgres, creating two columns that are manageable in Databricks. Here is the code: query = f"""(SELECT *, array_to_string(columns_to_export, ',') AS columns_to_export_strin...

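Completing the truncated pattern from the reply, a sketch of the full round trip might look like this: push a query down to Postgres that serializes the VARCHAR[] column with array_to_string, then split it back into an array on the Databricks side. The connection URL, credentials, and table name are placeholders; the column name comes from the reply.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Pushdown query executed inside Postgres: serialize the array to a delimited string.
pushdown_query = """
    (SELECT *,
            array_to_string(columns_to_export, ',') AS columns_to_export_string
     FROM public.my_table) AS src
"""

df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://db-host:5432/mydb")   # placeholder connection
    .option("dbtable", pushdown_query)
    .option("user", "username")
    .option("password", "password")
    .option("driver", "org.postgresql.Driver")
    .load()
)

# Rebuild an array<string> column on the Databricks side and drop the helper column.
df = (
    df.withColumn("columns_to_export", F.split("columns_to_export_string", ","))
      .drop("columns_to_export_string")
)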
asurendran (New Contributor III)
  • 2379 Views
  • 7 replies
  • 2 kudos

Some records are missing after window function

While loading data from one layer to another using a PySpark window function, I noticed that some data is missing. This happens when the data is huge; it does not happen for small quantities. Has anyone come across this issue before?

Latest Reply: asurendran (New Contributor III)

Is there a way that caching the DataFrame helps to fix this issue?

6 More Replies
busuu (New Contributor II)
  • 2152 Views
  • 3 replies
  • 1 kudo

Failed to checkout Git repository: RESOURCE_DOES_NOT_EXIST: Attempted to move non-existing node

I'm having issues checking out a Git repo in Workflows. Databricks can access files from commit `a` but fails to check out the branch when attempting to access commit `b`. The error occurs specifically when trying to check out commit `b`, and Databr...

[screenshot attached: busuu_0-1738776211583.png]
Latest Reply: Augustus (Databricks Partner)

I didn't do anything to fix it. Databricks support did something to my workspace to fix the issue. 

2 More Replies
ohnomydata (New Contributor)
  • 3450 Views
  • 1 reply
  • 0 kudos

Accidentally deleted files via API

Hello, I'm hoping you might be able to help me. I have accidentally deleted some workspace files via the API (an Azure DevOps code deployment pipeline). I can't see the files in my Trash folder. Are they gone forever, or is it possible to recover them on ...

Latest Reply: Alberto_Umana (Databricks Employee)

Hello @ohnomydata, Unfortunately files deleted via APIs or the Databricks CLI are permanently deleted and do not move to the Trash folder. The Trash folder is a UI-only feature, and items deleted through the UI can be recovered from the Trash within ...
