Background:
I am trying to set up the Google Cloud SDK on Databricks. The end goal is to use the SDK to transfer files from a Google Cloud Storage bucket to Azure Blob Storage from Databricks. (If you have any other ideas for this transfer, please feel free to share them. I do not want to use Azure Data Factory.)
I also have Unity Catalog enabled, if that makes a difference.
As a first step, I am trying to extract the Google Cloud SDK archive in DBFS after moving it to the following location. I know the file exists there:
%fs
ls dbfs:/tmp/google_sdk
Returns:
dbfs:/tmp/google_sdk/google_cloud_sdk_352_0_0_linux_x86_64_tar.gz
I have tried each of the following calls to open the file with tarfile. None of them worked:
import tarfile
tar = tarfile.open('dbfs:/tmp/google_sdk/google_cloud_sdk_352_0_0_linux_x86_64_tar.gz', mode="r|gz")
tar = tarfile.open('/dbfs/tmp/google_sdk/google_cloud_sdk_352_0_0_linux_x86_64_tar.gz', mode="r|gz")
tar = tarfile.open('/tmp/google_sdk/google_cloud_sdk_352_0_0_linux_x86_64_tar.gz', mode="r|gz")
tar = tarfile.open('/dbfs/dbfs/tmp/google_sdk/google_cloud_sdk_352_0_0_linux_x86_64_tar.gz', mode="r|gz")
tar = tarfile.open('dbfs/tmp/google_sdk/google_cloud_sdk_352_0_0_linux_x86_64_tar.gz', mode="r|gz")
tar = tarfile.open('tmp/google_sdk/google_cloud_sdk_352_0_0_linux_x86_64_tar.gz', mode="r|gz")
All of them fail with an error saying that no such file or directory exists, but I know the file does. What am I missing here? Why am I not able to open this file?
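To narrow this down, here is a minimal sketch of a check that can be run on the driver to see which of the candidate paths Python's local file APIs can actually see. The helper name `probe_paths` is hypothetical (it is not part of any Databricks API), and the sketch assumes nothing about which path form is correct; it only reports what `os.path.exists` and `tarfile.open` observe for each string.

```python
import os
import tarfile


def probe_paths(candidates):
    """Hypothetical helper: report how each path looks to local Python file APIs.

    Returns a dict mapping each path to "missing", "ok", or "unreadable: <error>".
    """
    results = {}
    for path in candidates:
        # os.path.exists only sees the local filesystem of the machine running
        # this code, so URI-style strings such as 'dbfs:/...' are treated as
        # ordinary (relative) local paths here.
        if not os.path.exists(path):
            results[path] = "missing"
            continue
        try:
            # "r:gz" opens a gzip-compressed archive with seeking allowed.
            with tarfile.open(path, mode="r:gz") as tar:
                tar.next()  # read the first member header to validate the archive
            results[path] = "ok"
        except (tarfile.TarError, OSError) as exc:
            results[path] = f"unreadable: {exc}"
    return results


# Two of the path forms from the attempts above; the others can be added the same way.
name = "google_cloud_sdk_352_0_0_linux_x86_64_tar.gz"
for path, status in probe_paths([
    f"dbfs:/tmp/google_sdk/{name}",
    f"/dbfs/tmp/google_sdk/{name}",
]).items():
    print(path, "->", status)
```

If every candidate comes back as "missing", the problem is with how the path is resolved on the driver rather than with the archive itself.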
Thanks for any help!