For a data processing pipeline I use Structured Streaming with arbitrary stateful processing. When using applyInPandasWithState, do I have to account for the data being partitioned across several worker nodes, i.e. for the state potentially being updated from different worker nodes (e.g. by using a lock)? Or is that handled automatically by PySpark and Databricks and abstracted away?
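For context, here is a minimal sketch of the kind of stateful operation I mean (the running-count logic, the `events` stream, and all column/schema names are illustrative, not my actual pipeline):

```python
from typing import Iterator, Tuple

import pandas as pd
from pyspark.sql.streaming.state import GroupState, GroupStateTimeout

# Illustrative schemas, expressed as DDL strings.
output_schema = "device_id STRING, event_count LONG"
state_schema = "event_count LONG"

def update_count(
    key: Tuple[str],
    pdfs: Iterator[pd.DataFrame],
    state: GroupState,
) -> Iterator[pd.DataFrame]:
    # Read the existing per-key state, or start from zero.
    (count,) = state.get if state.exists else (0,)
    # Add the number of new rows seen for this key in this micro-batch.
    for pdf in pdfs:
        count += len(pdf)
    # Write the updated count back to the per-key state.
    state.update((count,))
    yield pd.DataFrame({"device_id": [key[0]], "event_count": [count]})

result = (
    events  # a streaming DataFrame, assumed to exist
    .groupBy("device_id")
    .applyInPandasWithState(
        update_count,
        output_schema,
        state_schema,
        "update",
        GroupStateTimeout.NoTimeout,
    )
)
```

In other words: could two invocations of `update_count` for the same key ever run concurrently on different executors, so that I would need to synchronise the `state.get`/`state.update` sequence myself?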
Thank you