Data Engineering

Forum Posts

by Prashant777 • New Contributor II

05-15-2023 12:09:12 AM

6353 Views
4 replies
0 kudos

Error in SQL statement: UnsupportedOperationException: Cannot perform Merge as multiple source rows matched and attempted to modify the same

My code: CREATE OR REPLACE TEMPORARY VIEW preprocessed_source AS SELECT Key_ID, Distributor_ID, Customer_ID, Customer_Name, Channel FROM integr_masterdata.Customer_Master; -- Step 2: Perform the merge operation using the preprocessed source table M...

Latest Reply

Tread
New Contributor II

01-10-2024 6:56:04 AM

0 kudos

Hey, as previously stated, you could drop the duplicates in the columns that contain them (the code is easy to find online). I have had this problem myself; it came up when creating a temporary view from a dataframe, the dataframe ...
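The suggestion above (drop duplicate source rows before the merge) can be sketched in plain Python so it runs without a cluster. This is only an illustration of the idea, not the poster's code: the sample values are invented, treating Key_ID as the merge key is an assumption, and so is the rule that the last row seen for a key wins. In PySpark the same step is usually done with DataFrame.dropDuplicates on the key columns.

```python
from collections import Counter

# Hypothetical sample rows standing in for the source view. The column names
# follow the thread; the values are invented for illustration.
source_rows = [
    {"Key_ID": 1, "Customer_Name": "Acme", "Channel": "Retail"},
    {"Key_ID": 2, "Customer_Name": "Bolt", "Channel": "Online"},
    {"Key_ID": 1, "Customer_Name": "Acme Ltd", "Channel": "Retail"},
]


def duplicated_keys(rows, key):
    """Return the key values that occur on more than one source row."""
    counts = Counter(row[key] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


def drop_duplicates(rows, key):
    """Keep one row per key; the last row seen wins (an assumption)."""
    latest = {}
    for row in rows:
        latest[row[key]] = row
    return list(latest.values())


# Keys listed here are the ones that would make a merge ambiguous.
print(duplicated_keys(source_rows, "Key_ID"))

deduped = drop_duplicates(source_rows, "Key_ID")
print(duplicated_keys(deduped, "Key_ID"))
```

Checking for duplicated keys first is useful on its own: it shows which rows conflict, so you can decide deliberately which version to keep rather than dropping one arbitrarily.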

by Prashant777 • New Contributor II

05-15-2023 12:07:37 AM

3234 Views
2 replies
0 kudos

UnsupportedOperationException: Cannot perform Merge as multiple source rows matched and attempted to modify the same

My code: CREATE OR REPLACE TEMPORARY VIEW preprocessed_source AS SELECT Key_ID, Distributor_ID, Customer_ID, Customer_Name, Channel FROM integr_masterdata.Customer_Master; -- Step 2: Perform the merge operation using the preprocessed source table...

Latest Reply

Anonymous
Not applicable

05-23-2023 2:20:16 AM

0 kudos

Hi @Prashant Joshi, hope everything is going great. Just wanted to check in to see whether you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us s...

by sudhanshu1 • New Contributor III

05-22-2023 6:15:19 AM

1435 Views
0 replies
0 kudos

SCD Type-2

Hi all, I have 22 Postgres tables and I need to implement SCD Type 2 and create an Azure Databricks pipeline. However, my project team doesn't want to use Delta tables. Has anyone implemented this? Below is how I planned to do it: try: df_src = ...
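For readers unfamiliar with the pattern, here is a minimal plain-Python sketch of the SCD Type 2 rule itself, independent of Delta or any storage engine. It is not the poster's elided code. The column names (customer_id, city, valid_from, valid_to, is_current) and the sample values are hypothetical, and the sketch assumes at most one incoming row per key.

```python
from datetime import date

# Hypothetical column names (customer_id, city, valid_from, valid_to,
# is_current); none of them come from the thread.


def apply_scd2(dimension, incoming, key, tracked, today):
    """Return a new dimension list with SCD Type 2 changes applied."""
    current = {r[key]: r for r in dimension if r["is_current"]}
    result = [dict(r) for r in dimension]  # copy so the input is not mutated
    for new in incoming:
        old = current.get(new[key])
        if old is not None and all(old[c] == new[c] for c in tracked):
            continue  # unchanged: keep the current version as-is
        if old is not None:
            # Changed: close out the current version instead of overwriting it.
            for r in result:
                if r[key] == new[key] and r["is_current"]:
                    r["valid_to"] = today
                    r["is_current"] = False
        # New key or changed attributes: append a fresh current version.
        result.append(
            {**new, "valid_from": today, "valid_to": None, "is_current": True}
        )
    return result


dim = [{"customer_id": 1, "city": "Pune", "valid_from": date(2023, 1, 1),
        "valid_to": None, "is_current": True}]
src = [{"customer_id": 1, "city": "Delhi"}, {"customer_id": 2, "city": "Goa"}]

for row in apply_scd2(dim, src, "customer_id", ["city"], date(2023, 5, 22)):
    print(row)
```

The key property is that a changed record is never updated in place: the old version is end-dated and a new current version is appended, so history stays queryable.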

by pc • New Contributor II

02-01-2023 5:53:28 AM

3028 Views
4 replies
0 kudos

Error in SQL statement: AnalysisException: The query operator `UpdateCommandEdge` contains one or more unsupported expression types Aggregate, Window or Generate.

com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: org.apache.spark.sql.AnalysisException: The query operator `UpdateCommandEdge` contains one or more unsupported expression types Aggregate, Window or Generate. Invalid expres...

Latest Reply

Anonymous
Not applicable

04-08-2023 9:41:34 PM

0 kudos

Hi @Pradeep Chauhan, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answe...

by VVill_T • Contributor

12-14-2022 4:30:59 PM

4309 Views
4 replies
7 kudos

How to write a Delta Live Tables (DLT) pipeline output to Databricks SQL directly

Hi, I am trying to see if it is possible to set up a direct connection from a DLT pipeline to a table in Databricks SQL by configuring the Target Schema: with poc being a location of schema like "dbfs:/***/***/***/poc.db". The error message was just a...

Latest Reply

youssefmrini
Databricks Employee

12-21-2022 1:24:18 AM

7 kudos

Whenever you store a Delta table in the Hive metastore, the table will be available in the Databricks SQL workspace (Data Explorer) under the hive_metastore catalog.

by MC006 • New Contributor III

12-19-2022 4:45:42 AM

7171 Views
4 replies
2 kudos

Resolved! java.lang.NoSuchMethodError after upgrade to Databricks Runtime 11.3 LTS

Hi, I am using Databricks and want to upgrade to Databricks Runtime 11.3 LTS, which now uses Spark 3.3. Current system environment: Operating System: Ubuntu 20.04.4 LTS; Java: Zulu 8.56.0.21-CA-linux64; Python: 3.8.10; Delta Lake: 1.1.0. Target system ...

Latest Reply

Meghala
Valued Contributor II

12-26-2022 6:22:10 AM

2 kudos

Hi everyone, this information helped me, thanks.

by gauthamchettiar • New Contributor II

12-13-2022 5:57:27 AM

1942 Views
0 replies
1 kudos

Spark always performing broadcasts irrespective of spark.sql.autoBroadcastJoinThreshold during streaming merge operation with DeltaTable.

I am trying to do a streaming merge between Delta tables using this guide: https://docs.delta.io/latest/delta-update.html#upsert-from-streaming-queries-using-foreachbatch Our code sample (Java): Dataset<Row> sourceDf = sparkSession ...

by Harsh1 • New Contributor II

10-09-2022 11:22:46 PM

1074 Views
0 replies
0 kudos

Issues in Metastore Migration using Databricks Migration Tool

Hi Team, I'm performing a Databricks workspace migration, and during the Metastore migration I'm facing the issue below. As we found differences in the Metastore table count between the Legacy and Target workspaces, we checked the error logs. After going through Failed...

by Harsh1 • New Contributor II

08-23-2022 9:36:02 AM

1600 Views
2 replies
1 kudos

Query on DBFS migration

We are doing a DBFS migration. As part of it, we have a folder 'user' in the root DBFS of the legacy workspace holding 5.8 TB of data. We ran AWS CLI sync/cp from Legacy to Target, and then ran the same from the Target bucket to the Target DBFS. While implemen...

Latest Reply

Harsh1
New Contributor II

08-24-2022 5:43:27 AM

1 kudos

Thanks for the quick response. Regarding the suggested AWS data sync approach: we have tried data sync in multiple ways, but it creates folders in the S3 bucket itself, not on DBFS, whereas our task is to copy from the bucket to DBFS. It seems that it only supports ...

by BradSheridan • Valued Contributor

07-27-2022 6:13:27 AM

4721 Views
9 replies
4 kudos

Resolved! How to use cloudFiles to completely overwrite the target

Hey there Community!! I have a client that will produce a CSV file daily that needs to be moved from Bronze -> Silver. Unfortunately, this source file will always be a full set of data... not incremental. I was thinking of using AutoLoader/cloudFil...

Latest Reply

BradSheridan
Valued Contributor

08-12-2022 10:44:42 AM

4 kudos

I "upvoted" all of @werners' suggestions b/c they are all very valid ways of addressing my need (the true power/flexibility of the Databricks UDAP!!!). However, it turns out I'm going to end up getting incremental data after all :). So now the flow wi...

by AmanSehgal • Honored Contributor III

04-26-2022 5:15:17 AM

7485 Views
1 replies
10 kudos

Resolved! How to merge all the columns into one column as JSON?

I have a task to transform a dataframe. The task is to collect all the columns in a row and embed them into a JSON string as a column. Source DF: Target DF:

Latest Reply

AmanSehgal
Honored Contributor III

04-27-2022 12:14:26 AM

10 kudos

I was able to do this by converting the df to an rdd and then applying a map function to it. rdd_1 = df.rdd.map(lambda row: (row['ID'], row.asDict())) ...
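The per-row step in that reply can be illustrated in plain Python, with dictionaries standing in for Spark rows. This is a sketch rather than the poster's full solution: the reply is cut off after the map call, so serializing with json.dumps is an assumption about the elided part, and every column other than 'ID' is invented. In PySpark itself, to_json(struct(...)) from pyspark.sql.functions builds the same kind of JSON column without dropping to an RDD.

```python
import json

# Hypothetical rows; the 'ID' column name comes from the reply, the other
# columns and all values are invented for illustration.
rows = [
    {"ID": 1, "name": "a", "score": 3.5},
    {"ID": 2, "name": "b", "score": 4.0},
]


def row_to_json_pair(row):
    """Mirror the reply's map step: (ID, all columns as one JSON string)."""
    return row["ID"], json.dumps(row, sort_keys=True)


pairs = [row_to_json_pair(r) for r in rows]
for row_id, payload in pairs:
    print(row_id, payload)
```

Sorting the keys makes the JSON string deterministic, which helps if the column is later compared or hashed.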
