
WAL for structured streaming

Brad
Contributor II

Hi, 

I cannot find a deep dive on this in the latest links. My understanding so far is:

Previously, SS (Structured Streaming) copied and cached the data in the WAL (write-ahead log). Since some version, with the receiver-less approach, SS no longer copies the data to the WAL and stores only the offsets, so the WAL is no longer used and recovery depends only on the checkpoint. Is this understanding right?
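
To make the "only stores offsets" idea concrete, here is a toy sketch in plain Python. It is not Spark code and not how Spark implements it internally; `OffsetLog`, `plan`, `commit`, `pending`, and `demo` are hypothetical names made up for illustration. The only thing borrowed from Spark is the naming of the `offsets` and `commits` subdirectories, which, as far as I can tell, is what a Structured Streaming checkpoint location contains. The sketch assumes the source is replayable (Kafka-like), so the log only needs offset ranges and never the data itself.

```python
import json
import os
import tempfile


class OffsetLog:
    """Toy offset-only write-ahead log: records offset ranges, never the data."""

    def __init__(self, checkpoint_dir):
        self.offsets_dir = os.path.join(checkpoint_dir, "offsets")
        self.commits_dir = os.path.join(checkpoint_dir, "commits")
        os.makedirs(self.offsets_dir, exist_ok=True)
        os.makedirs(self.commits_dir, exist_ok=True)

    def plan(self, batch_id, start, end):
        # Written BEFORE the batch runs: this is the write-ahead step.
        with open(os.path.join(self.offsets_dir, str(batch_id)), "w") as f:
            json.dump({"start": start, "end": end}, f)

    def commit(self, batch_id):
        # Written AFTER the batch output has been produced.
        open(os.path.join(self.commits_dir, str(batch_id)), "w").close()

    def pending(self):
        """Return (batch_id, start, end) for planned but uncommitted batches."""
        planned = {int(name) for name in os.listdir(self.offsets_dir)}
        committed = {int(name) for name in os.listdir(self.commits_dir)}
        result = []
        for batch_id in sorted(planned - committed):
            with open(os.path.join(self.offsets_dir, str(batch_id))) as f:
                rng = json.load(f)
            result.append((batch_id, rng["start"], rng["end"]))
        return result


def demo():
    # Stand-in for a replayable source: offsets are simply list indices.
    source = list(range(10))
    with tempfile.TemporaryDirectory() as checkpoint_dir:
        log = OffsetLog(checkpoint_dir)
        log.plan(0, 0, 4)
        log.commit(0)
        log.plan(1, 4, 8)  # simulated crash: batch 1 is planned, never committed

        # "Restart": a fresh object reads the same checkpoint directory and
        # re-reads the pending range from the source instead of from a data WAL.
        recovered = OffsetLog(checkpoint_dir)
        return [(b, source[s:e]) for b, s, e in recovered.pending()]
```

In this toy model, `demo()` returns `[(1, [4, 5, 6, 7])]`: after the simulated crash, only the uncommitted batch is replayed, using nothing but the stored offset range and the replayable source.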

1 REPLY

Thanks Kaniz. 

In theory, even without the WAL, everything can be recovered from the checkpoint, right? Does the WAL exist only for performance reasons? For example, within one micro-batch Spark might run several smaller batches, with the WAL used to record the state of each of those sub-batches?
