You have hit a very specific, known behavioral gap in how Apache Spark and Delta Lake interact.
To answer your question directly: yes, the Observation API is effectively incompatible with Delta Lake MERGE operations when used directly.
### Why It Hangs Indefinitely
The deadlock you are experiencing boils down to how Delta Lake plans its queries versus how Spark listens for metrics:
- How Observation works: The pyspark.sql.Observation object relies on standard Spark actions (like .collect(), .count(), or a save through .write()) to complete. When such an action finishes, Spark fires a query-execution listener event in the background, and that event populates your observation object.
- How Delta MERGE works: A Delta MERGE is not processed as a standard Spark action. Internally, the Delta engine intercepts the logical plan, heavily modifies it to figure out matching/non-matching rows, and executes custom physical writes.
- The Clash: During this plan rewriting, the logical node attached by .observe() often gets stripped out or fails to trigger the expected listener event.
- The Hang: Observation.get has no timeout. It simply blocks until the listener event delivers the metrics, and because that event never fires for the rewritten Delta plan, it waits forever.
This is magnified inside foreachBatch because you are dealing with a static micro-batch DataFrame, but the core issue remains the Delta execution plan itself.
Feel free to add more info if I have misunderstood your issue, or ask if you would like a workaround.