@somanath Sankaran:
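Yes, custom state functions like mapGroupsWithState and applyInPandasWithState use the same state store as the built-in aggregations. On Databricks the recommended backend for this state is RocksDB, an embedded, persistent key-value store that keeps state in native memory and on local disk and so copes well with large amounts of state. Depending on your runtime and configuration, RocksDB may need to be enabled explicitly through spark.sql.streaming.stateStore.providerClass; open-source Spark otherwise defaults to an in-memory provider backed by HDFS-compatible files.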
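As a rough sketch of selecting the provider, the helper below sets the provider class on a session before the streaming query starts. The helper name and the on_databricks flag are made up for illustration; the two class names are the ones I know from the Databricks and open-source Spark docs, so please verify them against your runtime version.

```python
# Provider class names as documented for Databricks and for open-source Spark 3.2+.
DATABRICKS_PROVIDER = "com.databricks.sql.streaming.state.RocksDBStateStoreProvider"
OSS_PROVIDER = "org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider"


def use_rocksdb_state_store(spark, on_databricks=True):
    """Hypothetical helper: point the session at the RocksDB state store provider.

    `spark` is an existing SparkSession. Call this before starting the query;
    the provider is fixed once the query (and its checkpoint) is running.
    """
    provider = DATABRICKS_PROVIDER if on_databricks else OSS_PROVIDER
    spark.conf.set("spark.sql.streaming.stateStore.providerClass", provider)
    return provider
```

In a notebook this would be called once as use_rocksdb_state_store(spark) before writeStream ... start().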
The state store is managed by the Databricks runtime and is partitioned by grouping key across the worker nodes in the cluster. Each task reads and updates only the state for the keys in its own partition, which is what lets many partitions be processed in parallel. The state is also fault-tolerant: changes are written to the query's checkpoint location, so the state can be recovered after a node failure or a restart.
When using custom state functions, keep in mind that the amount of state the function maintains has a direct impact on cluster performance and memory usage. State for these operators is not evicted for you: configure a timeout (GroupStateTimeout) and call state.remove() once a key has timed out or is no longer needed, so that old, unused state is cleaned up regularly and you don't run out of memory.
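To make the cleanup part concrete, here is a minimal sketch of an applyInPandasWithState function that keeps a running event count per key and drops the key's state after a period of inactivity. The user_id column, the schemas, the 10-minute timeout and the function names are all assumptions for illustration, not something from your job.

```python
from typing import Iterator, Tuple

import pandas as pd

IDLE_TIMEOUT_MS = 10 * 60 * 1000  # assumed: expire a key after 10 idle minutes


def count_events(key: Tuple, pdf_iter: Iterator[pd.DataFrame], state) -> Iterator[pd.DataFrame]:
    """Running event count per user; the state is removed when the key times out."""
    (user_id,) = key
    if state.hasTimedOut:
        # No new data arrived within the timeout: emit a final row and free the state.
        (final_count,) = state.get
        state.remove()
        yield pd.DataFrame(
            {"user_id": [user_id], "event_count": [final_count], "expired": [True]}
        )
        return

    # Start from the stored count if this key already has state.
    (count,) = state.get if state.exists else (0,)
    for pdf in pdf_iter:
        count += len(pdf)

    state.update((count,))
    # Re-arm the timeout every time the key receives data.
    state.setTimeoutDuration(IDLE_TIMEOUT_MS)
    yield pd.DataFrame(
        {"user_id": [user_id], "event_count": [count], "expired": [False]}
    )


def attach_to_stream(events):
    """Wire the function into a streaming DataFrame (assumed to have a string user_id column)."""
    from pyspark.sql.streaming.state import GroupStateTimeout

    return events.groupBy("user_id").applyInPandasWithState(
        count_events,
        outputStructType="user_id STRING, event_count LONG, expired BOOLEAN",
        stateStructType="event_count LONG",
        outputMode="update",
        timeoutConf=GroupStateTimeout.ProcessingTimeTimeout,
    )
```

The important bit is the hasTimedOut branch: without the state.remove() call there, state for inactive keys stays in the store for the lifetime of the query.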