Streaming foreachBatch _jdf jvm attribute not supported
06-17-2024 01:52 AM - edited 06-17-2024 02:00 AM
I'm trying to perform a merge inside a streaming foreachBatch using the command:
The stream runs fine on a Personal cluster, but on a Shared cluster it fails with the following error:
org.apache.spark.api.python.PythonException: Found error inside foreachBatch Python process: Traceback (most recent call last):
pyspark.errors.exceptions.base.PySparkAttributeError: [JVM_ATTRIBUTE_NOT_SUPPORTED] Attribute `_jdf` is not supported in Spark Connect as it depends on the JVM. If you need to use this attribute, do not use Spark Connect when creating your session. Visit https://spark.apache.org/docs/latest/sql-getting-started.html#starting-point-sparksession for creating regular Spark Session in detail.
Any ideas?
Thanks