@Direo Direo :
When you write a DataFrame to a Delta table with the DataFrame writer, for example df.write.format("delta").save(path), Delta performs a single write operation internally. (The DeltaTable class itself has no write() method.) This operation performs two actions:
- It writes the new data to disk in the Delta format, and
- It atomically updates the table metadata in the transaction log.
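The two actions above can be sketched with a toy model in plain Python. This is only an illustration, not Delta Lake's real implementation: the function name toy_delta_write is hypothetical, the data file is JSON instead of Parquet, and the commit contains only a trimmed-down commitInfo and add action. What it keeps from the real design is the ordering (data file first, numbered commit file second) and the fact that one write yields one commit with one operation name.

```python
import json
import os
import tempfile
import uuid


def toy_delta_write(table_dir, rows, operation="WRITE"):
    """Toy model of one Delta-style commit (hypothetical helper, not the real code)."""
    log_dir = os.path.join(table_dir, "_delta_log")
    os.makedirs(log_dir, exist_ok=True)

    # Action 1: write the new data file. Until a commit references it,
    # readers of the table would ignore it.
    data_name = f"part-{uuid.uuid4().hex}.json"  # real Delta writes Parquet files
    with open(os.path.join(table_dir, data_name), "w") as f:
        json.dump(rows, f)

    # Action 2: publish the commit as the next numbered file in the log.
    version = len([n for n in os.listdir(log_dir) if n.endswith(".json")])
    actions = [
        {"commitInfo": {"operation": operation}},
        {"add": {"path": data_name, "dataChange": True}},
    ]
    commit_path = os.path.join(log_dir, f"{version:020d}.json")
    # Mode "x" raises if the file already exists, so two writers cannot
    # both claim the same version number.
    with open(commit_path, "x") as f:
        f.write("\n".join(json.dumps(a) for a in actions))
    return version


# Two writes produce two commits, each carrying a single operation name.
table_dir = tempfile.mkdtemp()
first = toy_delta_write(table_dir, [{"id": 1}])
second = toy_delta_write(table_dir, [{"id": 2}])
log_files = sorted(os.listdir(os.path.join(table_dir, "_delta_log")))
print(first, second, log_files)
```

If the process dies between the two actions, the data file exists but no commit points to it, so the table's visible contents do not change.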
The CREATE OR REPLACE TABLE AS SELECT statement creates a table, or replaces an existing one, with the data returned by a query. In Delta Lake, the resulting table is a Delta table.
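A WRITE entry in the Delta table history is one commit in the transaction log, typically produced by a plain DataFrame write or a SQL INSERT. The commit is recorded only after the data files are on disk, and it lists the files that were added. A write that fails before committing therefore leaves the table unchanged, and the table's current state can always be reconstructed by replaying the log.
A CREATE OR REPLACE TABLE AS SELECT entry is likewise a single commit, not a second half of a WRITE. It appears when the table is created or replaced from a query, either by running that SQL statement or, typically, by calling saveAsTable in overwrite mode, and it covers both the new data files and the new table metadata.
In summary, each row in the history is one commit, and its operation column names the command that produced it. A single write produces a single row, so if you see both WRITE and CREATE OR REPLACE TABLE AS SELECT in the history, the table was most likely written to by both kinds of command.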
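To check which operation name a given commit recorded, you can read the commitInfo action from its file under _delta_log, where each line is one JSON action. Below is a minimal sketch, assuming the commit file's text has already been loaded into a string. The helper name commit_operation and the sample content are made up for illustration, and the sample is trimmed to the fields the sketch needs.

```python
import json


def commit_operation(commit_text):
    """Return commitInfo.operation from the text of one commit file, or None."""
    for line in commit_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        action = json.loads(line)  # one JSON action per line
        if "commitInfo" in action:
            return action["commitInfo"].get("operation")
    return None


# Made-up sample commit: one commitInfo action followed by one add action.
sample_commit = "\n".join([
    json.dumps({"commitInfo": {"operation": "WRITE",
                               "operationParameters": {"mode": "Append"}}}),
    json.dumps({"add": {"path": "part-00000.snappy.parquet", "dataChange": True}}),
])

print(commit_operation(sample_commit))
```

In practice it is easier to run DESCRIBE HISTORY on the table, or call history() on a DeltaTable object, and look at the operation and operationParameters columns for each version.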