PySpark Lazy Evaluation

joggiri

PySpark Lazy Evaluation - Why does my logging function seem to execute without an explicit action in Databricks?

Hello everyone,
I was scrolling and found a Medium post on PySpark (https://medium.com/@sudeepwrites/pyspark-secrets-no-one-talks-about-but-every-data-engineer-should-k...) and have a question about lazy evaluation. It says that transformations are lazy and will not execute until an action is called (which I know).
The article has some code that I tried to run. According to the article, it should not print 'Logging something...'. However, it is printing.

from pyspark.sql.functions import col

def log_step(df):
    print("Logging something...")
    return df

df = spark.sql("select * from delta.`s3://abc` limit 10")
log_step(df.filter(col("flag_source_system_delete") == "yes"))


Even without an explicit action like .show() or .count() on the final line, the print() statement inside log_step executes and I see the output in my notebook. My understanding is that the filter is a transformation and should not trigger the code.
Can someone please explain why this is happening? Is there an implicit action being triggered by the Databricks notebook environment, or am I fundamentally misunderstanding something about lazy evaluation with functions?
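
To make the question concrete, here is a small pure-Python sketch of how I currently picture it, with no Spark involved. `LazyFrame`, `is_deleted`, and the sample rows are made up for illustration (they are not PySpark APIs): `filter` only records a step, and `collect` is the only call that runs it. If this mental model is roughly right, the print in `log_step` is ordinary Python that runs when the function is called, whether or not any action ever executes, but I would appreciate confirmation that real Spark behaves the same way.

```python
class LazyFrame:
    """Hypothetical stand-in for a Spark DataFrame; not a PySpark API."""

    def __init__(self, rows, steps=None):
        self.rows = rows
        self.steps = steps or []  # recorded transformations, not yet run

    def filter(self, predicate):
        # Transformation: only records the predicate and returns a new plan.
        return LazyFrame(self.rows, self.steps + [predicate])

    def collect(self):
        # Action: the recorded predicates are evaluated only here.
        result = self.rows
        for predicate in self.steps:
            result = [row for row in result if predicate(row)]
        return result


evaluated = []  # records which rows the predicate actually saw


def is_deleted(row):
    evaluated.append(row)
    return row["flag_source_system_delete"] == "yes"


def log_step(df):
    print("Logging something...")  # plain Python: runs as soon as log_step is called
    return df


rows = [
    {"id": 1, "flag_source_system_delete": "yes"},
    {"id": 2, "flag_source_system_delete": "no"},
]

plan = log_step(LazyFrame(rows).filter(is_deleted))  # prints immediately
rows_seen_before_action = len(evaluated)  # the filter itself has not run yet
result = plan.collect()                   # the "action" runs the filter
rows_seen_after_action = len(evaluated)
```

In this sketch the message is printed while the plan is being built, yet the predicate touches no rows until `collect()` is called, which matches what I see in the notebook.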

Thank you!