Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Upload to Volume

ruoyuqian
New Contributor II

How do I programmatically upload Parquet files from Azure Data Lake to a Unity Catalog Volume?

source_path = "abfss://datalake-raw-dev@xxx.dfs.core.windows.net/xxxxx/saxxles/xx/source/ETL/transformed_data/parquet/"

# Define the path to your Unity Catalog Volume
destination_path = "dbfs:/Volumes/xxx/xxx/transformed_parquet"

# Read the Parquet files from the source into a DataFrame
df = spark.read.parquet(source_path)
print('so far okay')
# Write the DataFrame to the Unity Catalog Volume
df.write.mode("overwrite").parquet(destination_path)

print(f"Data successfully copied to {destination_path}")

I tried the method above, but it says I cannot access the Volume this way. How can I do this programmatically, without using the UI?

2 REPLIES

Ajay-Pandey
Esteemed Contributor III

Hi @ruoyuqian 

Please use dbutils.fs.cp(source_path, destination_path); that will be able to load the data into the Volume.
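For example, something like the following should work from a notebook (a minimal sketch, assuming dbutils is available as in any Databricks notebook, the running identity can read the external location and write to the Volume, and the xxx placeholders are the redacted values from the original post):

# Copy the Parquet directory from ADLS straight into the Unity Catalog Volume
source_path = "abfss://datalake-raw-dev@xxx.dfs.core.windows.net/xxxxx/saxxles/xx/source/ETL/transformed_data/parquet/"
destination_path = "/Volumes/xxx/xxx/transformed_parquet"

# recurse=True copies the whole directory tree, not just a single file
dbutils.fs.cp(source_path, destination_path, recurse=True)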

If you are still having issues, please check whether the identity the job runs as has access.

Ajay Kumar Pandey

Witold
Honored Contributor

Besides, when accessing Volumes you don't need the dbfs: prefix; just use the path directly: `/Volumes/xxx/xxx/transformed_parquet`
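For illustration, here is the write from the question with only the destination path changed (a sketch, reusing the source_path defined in the original snippet and keeping the xxx placeholders):

# Volume path given directly, without the dbfs:/ prefix
destination_path = "/Volumes/xxx/xxx/transformed_parquet"

df = spark.read.parquet(source_path)  # source_path as defined in the question
df.write.mode("overwrite").parquet(destination_path)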
