cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

Prashant777
by New Contributor II
  • 5415 Views
  • 4 replies
  • 0 kudos

Error in SQL statement: UnsupportedOperationException: Cannot perform Merge as multiple source rows matched and attempted to modify the same

My code:- CREATE OR REPLACE TEMPORARY VIEW preprocessed_source ASSELECT  Key_ID,  Distributor_ID,  Customer_ID,  Customer_Name,  ChannelFROM integr_masterdata.Customer_Master;-- Step 2: Perform the merge operation using the preprocessed source tableM...

  • 5415 Views
  • 4 replies
  • 0 kudos
Latest Reply
Tread
New Contributor II
  • 0 kudos

Hey as previously stated you could drop the duplicates of the columns that contain the said duplicates(code you can find online pretty easily), I have had this problem myself and it came when creating a temporary view from a dataframe, the dataframe ...

  • 0 kudos
3 More Replies
Prashant777
by New Contributor II
  • 2558 Views
  • 2 replies
  • 0 kudos

UnsupportedOperationException: Cannot perform Merge as multiple source rows matched and attempted to modify the same

My Code:-- CREATE OR REPLACE TEMPORARY VIEW preprocessed_source ASSELECT  Key_ID,  Distributor_ID,  Customer_ID,  Customer_Name,  ChannelFROM integr_masterdata.Customer_Master;-- Step 2: Perform the merge operation using the preprocessed source table...

  • 2558 Views
  • 2 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Prashant Joshi​ Hope everything is going great.Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us s...

  • 0 kudos
1 More Replies
sudhanshu1
by New Contributor III
  • 1261 Views
  • 0 replies
  • 0 kudos

SCD Type-2

Hi All, I have 22 postgress tables and i need to implement SCD type 2 and create azure Databricks pipeline . However my project team doesn't want to use delta tables concept . Have anyone implemented this ? below is how i planned to do try: df_src = ...

  • 1261 Views
  • 0 replies
  • 0 kudos
pc
by New Contributor II
  • 2554 Views
  • 4 replies
  • 0 kudos

Error in SQL statement: AnalysisException: The query operator `UpdateCommandEdge` contains one or more unsupported expression types Aggregate, Window or Generate.

com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: org.apache.spark.sql.AnalysisException: The query operator `UpdateCommandEdge` contains one or more unsupportedexpression types Aggregate, Window or Generate.Invalid expres...

  • 2554 Views
  • 4 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Pradeep Chauhan​ Thank you for posting your question in our community! We are happy to assist you.To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answe...

  • 0 kudos
3 More Replies
VVill_T
by Contributor
  • 3503 Views
  • 4 replies
  • 7 kudos

How to write a Delta Live Table(dlt) pipeline output to Databricks SQL directly

Hi,I am trying to see if it is possible to setup a direct connection from dlt pipeline to a table in Databricks SQL by configuring the Target Schema: with poc being a location of schema like "dbfs:/***/***/***/poc.db The error message was just a...

image image
  • 3503 Views
  • 4 replies
  • 7 kudos
Latest Reply
youssefmrini
Databricks Employee
  • 7 kudos

When ever you store a Delta Table to Hive Metastore. This table will be available in Databricks SQL Workspace ( Data Explorer ) under hive_metastore catalog.

  • 7 kudos
3 More Replies
MC006
by New Contributor III
  • 6167 Views
  • 4 replies
  • 2 kudos

Resolved! java.lang.NoSuchMethodError after upgrade to Databricks Runtime 11.3 LTS

Hi,  I am using Databricks and want to upgrade to Databricks runtime version 11.3 LTS which uses Spark 3.3 now. Current system enviroment:Operating System: Ubuntu 20.04.4 LTSJava: Zulu 8.56.0.21-CA-linux64Python: 3.8.10Delta Lake: 1.1.0Target system ...

  • 6167 Views
  • 4 replies
  • 2 kudos
Latest Reply
Meghala
Valued Contributor II
  • 2 kudos

Hi everyone this data was helped me thanks ​

  • 2 kudos
3 More Replies
gauthamchettiar
by New Contributor II
  • 1705 Views
  • 0 replies
  • 1 kudos

Spark always performing broad casts irrespective of spark.sql.autoBroadcastJoinThreshold during streaming merge operation with DeltaTable.

I am trying to do a streaming merge between delta tables using this guide - https://docs.delta.io/latest/delta-update.html#upsert-from-streaming-queries-using-foreachbatchOur Code Sample (Java): Dataset<Row> sourceDf = sparkSession ...

BroadCastJoin 1M
  • 1705 Views
  • 0 replies
  • 1 kudos
Harsh1
by New Contributor II
  • 944 Views
  • 0 replies
  • 0 kudos

Issues in Metastore Migration using Databricks Migration Tool

Hi Team,As I'm performing the Databricks workspace migration, during Metastore migration I'm facing below issue.As we found differences in the Metastore table count between Legacy and Target workspace, we checked error logs.After going through Failed...

  • 944 Views
  • 0 replies
  • 0 kudos
Harsh1
by New Contributor II
  • 1317 Views
  • 2 replies
  • 1 kudos

Query on DBFS migration

We are doing DBFS migration. In that we have a folder 'user' in Root DBFS having data 5.8 TB in legacy workspace. We performed AWS CLi Sync/cp between Legacy to Target and again performed the same between Target bucket to Target dbfs   While implemen...

  • 1317 Views
  • 2 replies
  • 1 kudos
Latest Reply
Harsh1
New Contributor II
  • 1 kudos

Thanks for the quick response.Regarding the suggested AWS data sync approach, we have tried data sync in multiple ways, it is creating folders in s3 bucket itself not on DBFS. As our task is to copy from bucket to DBFS.It seems that it only supports ...

  • 1 kudos
1 More Replies
BradSheridan
by Valued Contributor
  • 4023 Views
  • 9 replies
  • 4 kudos

Resolved! How to use cloudFiles to completely overwrite the target

Hey there Community!! I have a client that will produce a CSV file daily that needs to be moved from Bronze -> Silver. Unfortunately, this source file will always be a full set of data....not incremental. I was thinking of using AutoLoader/cloudFil...

  • 4023 Views
  • 9 replies
  • 4 kudos
Latest Reply
BradSheridan
Valued Contributor
  • 4 kudos

I "up voted'" all of @werners suggestions b/c they are all very valid ways of addressing my need (the true power/flexibility of the Databricks UDAP!!!). However, turns out I'm going to end up getting incremental data afterall :). So now the flow wi...

  • 4 kudos
8 More Replies
AmanSehgal
by Honored Contributor III
  • 6595 Views
  • 1 replies
  • 10 kudos

Resolved! How to merge all the columns into one column as JSON?

I have a task to transform a dataframe. The task is to collect all the columns in a row and embed it into a JSON string as a column.Source DF:Target DF: 

image image
  • 6595 Views
  • 1 replies
  • 10 kudos
Latest Reply
AmanSehgal
Honored Contributor III
  • 10 kudos

I was able to do this by converting df to rdd and then by applying map function to it.rdd_1 = df.rdd.map(lambda row: (row['ID'], row.asDict() ) )   ...

  • 10 kudos
Labels