PySpark: merge Parquet and Delta files

alesventus
New Contributor III

Is it possible to use the merge command when the source file is Parquet and the destination file is Delta? Or must both files be Delta files?

Currently, I'm using this code: I transform the Parquet file into Delta first, and it works. But I want to avoid this transformation.

Thanks

from delta.tables import *
 
# Destination: a Delta table
deltaTablePeople = DeltaTable.forPath(spark, 'abfss://destination-delta')
# Source: the Parquet data, already converted to Delta (the step I want to avoid)
deltaTablePeopleUpdates = DeltaTable.forPath(spark, 'abfss://source-parquet')
 
dfUpdates = deltaTablePeopleUpdates.toDF()
 
deltaTablePeople.alias('people') \
  .merge(
    dfUpdates.alias('updates'),
    'people.id = updates.id'
  ) \
  .whenMatchedUpdate(set =...

2 REPLIES

Kaniz
Community Manager

Hi @Ales ventus, yes, it is possible. In a Delta Lake merge, only the destination has to be a Delta table; the source can be any Spark DataFrame, including one read straight from a Parquet file.

In your code snippet, you convert the Parquet file to Delta before performing the merge, but that step isn't required. Instead of loading the source with DeltaTable.forPath, read the Parquet file into a DataFrame with spark.read.parquet and pass that DataFrame to merge.

Here's an updated version of your code that merges a Parquet source into a Delta destination without the conversion step:

from delta.tables import DeltaTable
 
# Destination must be a Delta table
deltaTablePeople = DeltaTable.forPath(spark, 'abfss://destination-delta')
 
# Source can stay in Parquet: read it as an ordinary DataFrame
dfUpdates = spark.read.parquet('abfss://source-parquet')
 
deltaTablePeople.alias('people') \
  .merge(
    dfUpdates.alias('updates'),
    'people.id = updates.id'
  ) \
  .whenMatchedUpdate(set=...) \
  .whenNotMatchedInsert(values=...) \
  .execute()

Make sure to replace set = ... and values = ... with the appropriate update and insert operations you want to perform during the merge.
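For illustration, here is a sketch assuming hypothetical columns id, name, and amount; substitute the columns your tables actually contain:

deltaTablePeople.alias('people') \
  .merge(
    dfUpdates.alias('updates'),
    'people.id = updates.id'
  ) \
  .whenMatchedUpdate(set={
      'name': 'updates.name',        # overwrite matched rows with the incoming values
      'amount': 'updates.amount'
  }) \
  .whenNotMatchedInsert(values={
      'id': 'updates.id',            # insert rows not yet present in the target
      'name': 'updates.name',
      'amount': 'updates.amount'
  }) \
  .execute()

If the source and target schemas match exactly, whenMatchedUpdateAll() and whenNotMatchedInsertAll() achieve the same result without listing each column.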

On a Databricks cluster, Delta Lake and Parquet support are preconfigured. In a standalone Spark environment, remember to add the Delta Lake dependency and session configuration yourself.
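A minimal sketch of that standalone setup, assuming Delta Lake 2.4.0 on Spark 3.4 with Scala 2.12 (match the delta-core version to your Spark build; this is not needed on Databricks):

from pyspark.sql import SparkSession
 
spark = (
    SparkSession.builder
    .appName('parquet-to-delta-merge')
    # Pull in the Delta Lake package; 2.4.0 is an assumed version for Spark 3.4
    .config('spark.jars.packages', 'io.delta:delta-core_2.12:2.4.0')
    # Enable Delta's SQL extension and catalog implementation
    .config('spark.sql.extensions', 'io.delta.sql.DeltaSparkSessionExtension')
    .config('spark.sql.catalog.spark_catalog', 'org.apache.spark.sql.delta.catalog.DeltaCatalog')
    .getOrCreate()
)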

Anonymous
Not applicable

Hi @Ales ventus

We haven't heard from you since the last response from @Kaniz Fatma, and I was checking back to see if her suggestions helped you.

If you found another solution, please share it with the community, as it can be helpful to others.

Also, please don't forget to click the "Select As Best" button whenever the information provided helps resolve your question.
