Hi
I am using the Databricks extension for VS Code and have been running into an issue for the past two days; before that, it worked fine. I get an error when I try to use Pandas-on-Spark while debugging:
from databricks.connect import DatabricksSession
spark = DatabricksSession.builder.getOrCreate()
df = spark.sql('select 1')
print(df.count())  # prints 1
df.pandas_api().describe()  # this is the call that raises the error
Looking deeper, it seems that SPARK_REMOTE is not set, and that is why the error is thrown.
Can anyone pinpoint what I did wrong?
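One way to confirm this from inside the debug session is to inspect the environment of the Python process directly. This is only a rough sketch: `spark_remote_status` is a made-up helper name for illustration, and it assumes SPARK_REMOTE is read from the process environment of the debugged process.

```python
import os


def spark_remote_status(env=None):
    """Report whether SPARK_REMOTE is present in the given environment.

    `env` defaults to the environment of the current process. The value
    itself is never returned, since a connection string may contain a token.
    """
    env = os.environ if env is None else env
    value = env.get("SPARK_REMOTE")
    if not value:
        # Covers both a missing variable and an empty string.
        return "SPARK_REMOTE is not set"
    return "SPARK_REMOTE is set"


# Run this in the same debug session that raises the error.
print(spark_remote_status())
```

Running the same check in a normal (non-debug) run and in the debugger should show whether the two are launched with different environments.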
I have already tried the following:
- reinstalling databricks-connect
- reinstalling the extension
- using a different cluster
- using a different cluster with DBR 13.1 (and the corresponding databricks-connect version)