Upload to Volume

ruoyuqian
New Contributor II

How can I programmatically upload Parquet files from Azure Data Lake to a Unity Catalog volume?

source_path = "abfss://datalake-raw-dev@xxx.dfs.core.windows.net/xxxxx/saxxles/xx/source/ETL/transformed_data/parquet/"

# Define the path to your Unity Catalog Volume
destination_path = "dbfs:/Volumes/xxx/xxx/transformed_parquet"

# Read the Parquet files from the source into a DataFrame
df = spark.read.parquet(source_path)
print('so far okay')
# Write the DataFrame to the Unity Catalog Volume
df.write.mode("overwrite").parquet(destination_path)

print(f"Data successfully copied to {destination_path}")

 

I tried the method above, but it says I cannot access the volume this way. How can I do this programmatically, without using the UI?

Ajay-Pandey
Databricks MVP

Hi @ruoyuqian 

Please use dbutils.fs.cp(source_path, destination_path), which can copy the data into the volume.
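As a rough sketch of that suggestion: the wrapper below is hypothetical (the name copy_to_volume is not part of any Databricks API) and assumes it is called from a notebook or job where Databricks provides the `dbutils` object. Because the source in the question is a directory of Parquet files, it passes `recurse=True` so the whole directory is copied rather than a single file.

```python
def copy_to_volume(dbutils, source_path: str, destination_path: str) -> None:
    """Copy a file or directory into a Unity Catalog volume.

    `dbutils` is the helper object Databricks injects into notebooks; it is
    passed in explicitly (an assumption of this sketch) so the function can
    also live in an imported module.
    """
    # recurse=True copies every file under source_path, not just one file.
    dbutils.fs.cp(source_path, destination_path, recurse=True)


# Usage inside a Databricks notebook, with the paths from the question:
# copy_to_volume(dbutils, source_path, destination_path)
```

The copy only succeeds if the identity running the code can read the source location and write to the target volume.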

If you still have issues, please check the access permissions when running it via a job.

Ajay Kumar Pandey

Witold
Databricks Partner

Also, when accessing volumes, you don't need to include the dbfs: scheme. The path alone is enough: `/Volumes/xxx/xxx/transformed_parquet`
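If existing code already builds paths with the dbfs: prefix, a small helper can normalize them to the plain form. This is only a minimal sketch: to_volume_path is a hypothetical name, and it only touches paths that start with dbfs:/Volumes/, leaving everything else unchanged.

```python
def to_volume_path(path: str) -> str:
    """Return a volume path without a leading 'dbfs:' scheme."""
    prefix = "dbfs:"
    # Only strip the scheme from volume paths; other dbfs: paths are left alone.
    if path.startswith(prefix + "/Volumes/"):
        return path[len(prefix):]
    return path


destination_path = to_volume_path("dbfs:/Volumes/xxx/xxx/transformed_parquet")
print(destination_path)  # /Volumes/xxx/xxx/transformed_parquet
```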