Are Concurrent State Updates from Worker Nodes Possible?

For a data processing pipeline I use Structured Streaming with arbitrary stateful processing. When using applyInPandasWithState, the data is partitioned across several worker nodes, so I was wondering whether I have to account for the state being updated from different worker nodes (e.g. by using a lock). Or is that handled automatically by PySpark and Databricks and abstracted away?
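
To make the question concrete, here is a minimal sketch of the kind of per-key update function meant here. The names `count_events` and `with_running_counts` and the columns `user_id` and `count` are hypothetical and only for illustration; they are not taken from the actual pipeline. The function reads and writes only the state object handed to it for its own key and takes no lock, which is the pattern the question is about.

```python
from typing import Any, Iterator, Tuple

import pandas as pd


def count_events(
    key: Tuple[Any, ...],
    pdf_iter: Iterator[pd.DataFrame],
    state,  # a pyspark.sql.streaming.state.GroupState when run by Spark
) -> Iterator[pd.DataFrame]:
    """Keep a running event count per key (hypothetical example)."""
    (user_id,) = key
    # Read the previous count for this key, or start from zero.
    running = state.get[0] if state.exists else 0
    # The rows for one key may arrive in several pandas chunks.
    for pdf in pdf_iter:
        running += len(pdf)
    # Write the new count back; no explicit lock is taken here.
    state.update((running,))
    yield pd.DataFrame({"user_id": [user_id], "count": [running]})


def with_running_counts(events):
    """Wire the function into a streaming DataFrame.

    Assumes `events` is a streaming DataFrame with a string column
    `user_id`; schemas are passed as DDL strings.
    """
    return events.groupBy("user_id").applyInPandasWithState(
        count_events,
        "user_id STRING, count LONG",  # output schema
        "count LONG",                  # state schema
        "update",                      # output mode
        "NoTimeout",                   # GroupStateTimeout.NoTimeout
    )
```

With a streaming DataFrame `events`, the result of `with_running_counts(events)` would then be written out with `writeStream` in update output mode.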

Thank you