Hi @Ales ventus , Yes, you can use the MERGE command when the source data is in Parquet format and the destination is a Delta table. The merge target must be a Delta table, but the merge source can be any Spark DataFrame, including one read directly from a Parquet file.
In your code snippet you first convert the Parquet file to Delta format before performing the merge. That conversion step isn't necessary: you can read the Parquet file into a DataFrame and pass it straight to the merge as the source, since the source side of a merge is just a DataFrame and needs no format conversion.
Here's an updated version of your code that merges a Parquet source directly into a Delta destination table:
from delta.tables import DeltaTable

# Load the destination as a Delta table
deltaTablePeople = DeltaTable.forPath(spark, 'abfss://destination-delta')

# Read the Parquet source directly into a DataFrame.
# DeltaTable.forPath only works on Delta tables, so the Parquet
# source must be read with spark.read.parquet instead.
dfUpdates = spark.read.parquet('abfss://source-parquet')
deltaTablePeople.alias('people') \
.merge(
dfUpdates.alias('updates'),
'people.id = updates.id'
) \
.whenMatchedUpdate(set=...) \
.whenNotMatchedInsert(values=...) \
.execute()
Make sure to replace set=... and values=... with the update and insert mappings you want to apply during the merge.
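For illustration, here is a minimal sketch of what those mappings could look like, assuming hypothetical columns id and name exist in both the source and the destination (adjust the column names to your actual schema):

# Illustrative only: assumes both sides have columns id and name
deltaTablePeople.alias('people') \
    .merge(
        dfUpdates.alias('updates'),
        'people.id = updates.id'
    ) \
    .whenMatchedUpdate(set={'name': 'updates.name'}) \
    .whenNotMatchedInsert(values={'id': 'updates.id', 'name': 'updates.name'}) \
    .execute()

If the source and destination schemas match exactly, you can use .whenMatchedUpdateAll() and .whenNotMatchedInsertAll() instead of spelling out each column.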
Remember to include the necessary dependencies and configurations to work with Delta Lake and Parquet files in your Spark environment.
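If you are running outside a managed environment such as Databricks or Synapse (where Delta Lake is preconfigured), a minimal session setup looks something like the sketch below. The two config keys are the standard ones from the Delta Lake documentation; the app name is just a placeholder, and you also need the Delta Lake package (e.g. via spark-submit --packages) with a version matching your Spark installation:

from pyspark.sql import SparkSession

# Minimal sketch: enables Delta Lake support on a plain Spark session.
# The Delta package itself must already be on the classpath.
spark = SparkSession.builder \
    .appName("parquet-to-delta-merge") \
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension") \
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog") \
    .getOrCreate()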