Hi!
As suggested by Databricks, we are working with Databricks from VSCode, using Databricks Asset Bundles for deployment and the VSCode Databricks extension together with Databricks Connect during development.
However, there are some limitations that we are seeing (and that hopefully can be fixed). One of them comes up when working with files in Unity Catalog volumes using native Python.
E.g., using this code:
with open(my_file, 'r', encoding='utf-8') as f:
    content = f.read()
When I run this in the Databricks Workspace, it works and the file at the following path is read:
/Volumes/<my catalog>/<my schema>/<my volume path>/<my file>.xsl
However, when I run it from VSCode, I get:
No such file or directory: /Volumes/<my catalog>/<my schema>/<my volume path>/<my file>.xslx
I know that the extension works so that Spark commands are executed on the attached cluster, while native Python code runs on the local machine (see the quick illustration below). However, shouldn't there be a way to force this code to run on the cluster as well? It makes no sense to run it locally when I am trying to read a volume path.
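Just to make the split concrete, this is roughly what I mean (a minimal sketch with the same placeholder path; I'm assuming spark is the Databricks Connect session provided by the extension):

import os

# Plain Python executes on my local machine, so the Unity Catalog volume path is not visible there:
print(os.path.exists("/Volumes/<my catalog>/<my schema>/<my volume path>/<my file>.xsl"))  # False locally

# Spark calls go through Databricks Connect and execute on the attached cluster, so this works:
print(spark.sql("SELECT current_catalog()").collect())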
I know that I can run the entire file with "Run as a workflow in Databricks", but I would prefer to be able to run it cell by cell locally. I also know that if I change my code to use Spark commands, e.g. spark.read(...), it would work (roughly as in the sketch below) - but I don't think I should be forced to write my code differently just because I want to develop in VSCode, as suggested by Databricks.
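For reference, this is roughly what the Spark-based variant looks like (just a sketch; it assumes the file is plain UTF-8 text, like in my open() example, and uses the same placeholder path):

# With the VSCode extension, `spark` is already a Databricks Connect session;
# otherwise it can be created explicitly with databricks-connect:
# from databricks.connect import DatabricksSession
# spark = DatabricksSession.builder.getOrCreate()

# The read runs on the attached cluster, so the /Volumes path is resolvable there.
df = spark.read.text("/Volumes/<my catalog>/<my schema>/<my volume path>/<my file>.xsl")
content = "\n".join(row.value for row in df.collect())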