I am looking at the following documentation:
https://docs.databricks.com/en/structured-streaming/delta-lake.html#specify-initial-position
I am unclear on the exact difference (if any) between
"startingVersion: The Delta Lake version to start from. Databricks recommends omitting this option for most workloads. When not set, the stream starts from the latest available version including a complete snapshot of the table at that moment."
and
"To return only the latest changes, specify latest."
The two scenarios are:
- no startingVersion
- startingVersion = latest
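
To make the two scenarios concrete, here is a minimal PySpark sketch of what I mean. The function names and the `table_path` argument are placeholders of mine, not anything from the docs, and `spark` is assumed to be an existing SparkSession with Delta Lake available.

```python
def read_without_starting_version(spark, table_path):
    # Scenario 1: startingVersion is not set at all.
    return (
        spark.readStream
        .format("delta")
        .load(table_path)
    )


def read_from_latest(spark, table_path):
    # Scenario 2: startingVersion is explicitly set to "latest".
    return (
        spark.readStream
        .format("delta")
        .option("startingVersion", "latest")
        .load(table_path)
    )
```

The only difference between the two readers is the `startingVersion` option, so any difference in behavior should come from that setting alone.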
Are the two scenarios equivalent? Or does the first one consume the latest snapshot of the table first and then whatever comes next, while the second one consumes only whatever comes after the latest snapshot? I am confused about the two.
Could anyone clarify this for me, please?