Context:
IDE: IntelliJ 2023.3.2
Library: databricks-connect 13.3
Python: 3.10
Description:
I develop notebooks and Python scripts locally in the IDE and connect to the Spark cluster via databricks-connect for a better developer experience.
I download a file from the public internet and want to store it in an external Unity Catalog Volume (hosted on S3). I would like to upload the file using a volume path rather than uploading it directly to S3 with AWS credentials.
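For reference, my local Spark session is created roughly like this (a minimal sketch; the connection details come from my local Databricks config profile):

from databricks.connect import DatabricksSession

# Remote Spark session via databricks-connect; host, token and cluster id
# are resolved from my local Databricks configuration / environment.
spark = DatabricksSession.builder.getOrCreate()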
Everything works fine in a Databricks notebook, e.g.:
dbutils.fs.cp("<local/file/path>", "/Volumes/<path>")
or:
source_file = ...
with open("/Volumes/<path>", 'wb') as destination_file:
    destination_file.write(source_file)
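For completeness, the full notebook-side flow looks roughly like this (the URL and the catalog/schema/volume names are placeholders):

import requests

# Download a file from the public internet (placeholder URL) ...
response = requests.get("https://example.com/data.csv")
response.raise_for_status()

# ... and write it to the external volume through its /Volumes path.
with open("/Volumes/<catalog>/<schema>/<volume>/data.csv", "wb") as destination_file:
    destination_file.write(response.content)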
I can't figure out how to do the same thing locally from my IDE.
Using dbutils:
dbutils.fs.cp("file:/<local/path>", "/Volumes/<path>")
I get the error:
databricks.sdk.errors.mapping.InvalidParameterValue: Path must be absolute: \Volumes\<path>
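For context, the full local call that raises this error looks roughly like this (a sketch; I obtain dbutils through the SDK's WorkspaceClient, and the paths are placeholders):

from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # authenticates via my local Databricks config profile

# Copy a local file to the volume path; this is the call that fails.
w.dbutils.fs.cp("file:/<local/path>", "/Volumes/<path>")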
Using Python's with statement won't work either, because the Unity Catalog Volume is not mounted on my local machine.
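I.e. the same write that works in the notebook fails immediately on my machine (sketch with placeholder path):

# Raises FileNotFoundError locally: /Volumes/... is not a local filesystem path.
with open("/Volumes/<path>", 'wb') as destination_file:
    destination_file.write(source_file)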
Is there a way to upload files from the local machine or memory into Unity Catalog Volumes?