Databricks-connect VSCode debugging pandas_api not working

FabriceDeseyn
Contributor

Hi

I am using the Databricks extension for VSCode and have been running into an issue for the past two days; before that, it worked fine. I receive an error when I try to use Pandas-on-Spark during debugging.

from databricks.connect import DatabricksSession
spark = DatabricksSession.builder.getOrCreate()

df = spark.sql('select 1')
print(df.count()) # ---> printing 1

df.pandas_api().describe()

[Screenshot: FabriceDeseyn_0-1689838667900.png]

Looking deeper, it seems that SPARK_REMOTE is not set, and therefore the error is thrown.

[Screenshot: FabriceDeseyn_1-1689838868477.png]
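To confirm the SPARK_REMOTE observation from inside the debug session, a minimal sketch like the one below could be run before calling pandas_api(). The helper name describe_spark_remote is hypothetical and not part of databricks-connect. It also assumes the value, when present, is a Spark Connect URL of the form sc://host:port/;key=value, so only the part before the first semicolon is shown and any token stays hidden.

```python
import os


def describe_spark_remote(env=None):
    """Report whether SPARK_REMOTE is set (hypothetical diagnostic helper)."""
    env = os.environ if env is None else env
    value = env.get("SPARK_REMOTE")
    if not value:
        return "SPARK_REMOTE is not set"
    # Assumption: the value looks like sc://host:port/;token=...;...
    # Keep only the part before the first ';' so credentials are not printed.
    return "SPARK_REMOTE is set: " + value.split(";", 1)[0]


# Run this in the same debug session that raises the error.
print(describe_spark_remote())
```

If this reports that the variable is not set when debugging but set in a normal run, that would point at how the debug session is launched rather than at the cluster.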

Can anyone pinpoint what I did wrong?
I have already tried the following steps:

  • reinstalling databricks-connect
  • reinstalling the extension
  • using another cluster
  • using another cluster with DBR 13.1 (and the corresponding databricks-connect)
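For the last step, a quick way to see which databricks-connect version is actually installed in the interpreter VSCode uses is sketched below. It assumes that "corresponding" means the same major.minor version as the cluster runtime (for example 13.1); the helper names major_minor and client_matches_runtime are hypothetical.

```python
from importlib import metadata


def major_minor(version):
    """Return the 'major.minor' prefix of a version string, e.g. '13.1.0' -> '13.1'."""
    return ".".join(version.split(".")[:2])


def client_matches_runtime(runtime_version, package="databricks-connect"):
    """Compare the installed package version with the cluster runtime version.

    Returns None if the package is not installed in this interpreter.
    Assumption: a matching major.minor is what 'corresponding' means here.
    """
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return major_minor(installed) == major_minor(runtime_version)


# "13.1" is the DBR version mentioned in the list above.
print(client_matches_runtime("13.1"))
```

A result of None would suggest the debugger is using a different Python environment than the one where databricks-connect was reinstalled.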

 

1 REPLY

FabriceDeseyn
Contributor

Additional info:
It seems that the issue comes from version 1.1.0 of the Databricks extension for VSCode.
Downgrading to 1.0.0 solves my issue.
