I'm trying to perform a MERGE inside a streaming foreachBatch using the command:
microBatchDF._jdf.sparkSession().sql(self.merge_query)
Streaming runs fine on a Personal cluster, but on a Shared cluster it fails with the following error:
org.apache.spark.api.python.PythonException: Found error inside foreachBatch Python process: Traceback (most recent call last):
pyspark.errors.exceptions.base.PySparkAttributeError: [JVM_ATTRIBUTE_NOT_SUPPORTED] Attribute `_jdf` is not supported in Spark Connect as it depends on the JVM. If you need to use this attribute, do not use Spark Connect when creating your session. Visit https://spark.apache.org/docs/latest/sql-getting-started.html#starting-point-sparksession for creating regular Spark Session in detail.
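From the error, `_jdf` is a JVM-backed handle that doesn't exist when the session runs through Spark Connect (which Shared clusters use). A minimal sketch of the Connect-friendly pattern I'm considering instead — registering the micro-batch as a temp view and running the MERGE through the DataFrame's own session via the `sparkSession` property. Table and column names here (`target_table`, `id`) are placeholders, not my real query:

```python
def upsert_to_delta(microBatchDF, batch_id):
    # Register the micro-batch so the MERGE can reference it by name.
    microBatchDF.createOrReplaceTempView("updates")

    # Hypothetical MERGE; in my real code this is self.merge_query.
    merge_query = """
        MERGE INTO target_table t
        USING updates s
        ON t.id = s.id
        WHEN MATCHED THEN UPDATE SET *
        WHEN NOT MATCHED THEN INSERT *
    """

    # Use the DataFrame's sparkSession property instead of the
    # JVM-backed _jdf handle, which Spark Connect does not support.
    microBatchDF.sparkSession.sql(merge_query)
```

Would this be the right way to get the session inside foreachBatch on a Shared cluster?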
Any idea?
Thanks