Does Delta refresh DF cache automatically after a delete?
06-18-2021 12:49 PM
2 REPLIES
06-18-2021 12:52 PM
Yes. Delta explicitly refreshes the DataFrame cache after performing a delete.
You can use this code to test it:
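import spark.implicits._  // needed for toDF; already in scope in a Databricks notebook
// Create a small Delta table with three rows
Seq((0, "A"), (1, "B"), (2, "C")).toDF("id", "value")
  .write.format("delta").mode("overwrite").saveAsTable("target_tbl")
val df = spark.sql("select * from target_tbl")
df.persist()    // mark the DataFrame for caching
df.show()       // first action materializes the cache
spark.sql("delete from target_tbl where id = 2")
df.show()       // same cached DataFrame, shown again after the delete
df.unpersist()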
Output:
+---+-----+
| id|value|
+---+-----+
|  0|    A|
|  2|    C|
|  1|    B|
+---+-----+

+---+-----+
| id|value|
+---+-----+
|  0|    A|
|  1|    B|
+---+-----+
06-18-2021 02:11 PM
What about updates and inserts? Does the cache refresh automatically for those too?
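One way to check this is to repeat the delete experiment with an update and an insert. The following is only a minimal sketch, not a confirmed answer: it assumes Delta Lake is on the classpath and builds a local session with the standard Delta extension and catalog settings (in a Databricks notebook `spark` already exists, so the builder can be dropped). The object name `DeltaCacheCheck`, the app name, and the sample values `'Z'` and `(3, 'D')` are made up for illustration. Comparing the three `show()` outputs tells you whether the cached DataFrame picks up each change.

```scala
import org.apache.spark.sql.SparkSession

object DeltaCacheCheck {
  def main(args: Array[String]): Unit = {
    // Assumption: local session with Delta enabled; not needed in a notebook
    val spark = SparkSession.builder()
      .appName("delta-cache-check")
      .master("local[*]")
      .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
      .config("spark.sql.catalog.spark_catalog",
        "org.apache.spark.sql.delta.catalog.DeltaCatalog")
      .getOrCreate()
    import spark.implicits._

    // Same starting table as in the delete test
    Seq((0, "A"), (1, "B"), (2, "C")).toDF("id", "value")
      .write.format("delta").mode("overwrite").saveAsTable("target_tbl")

    val df = spark.sql("select * from target_tbl")
    df.persist()
    df.show() // materializes the cache with the original three rows

    // Update one row, then look at the cached DataFrame again
    spark.sql("update target_tbl set value = 'Z' where id = 1")
    df.show()

    // Insert a new row, then look at the cached DataFrame again
    spark.sql("insert into target_tbl values (3, 'D')")
    df.show()

    df.unpersist()
    spark.stop()
  }
}
```

If the second and third outputs reflect the new value and the new row, the cache was refreshed for that operation in your environment; if they still show the original rows, it was not.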