Upload to Volume
07-29-2024 01:51 AM
How can I programmatically upload Parquet files from Azure Data Lake to a Unity Catalog volume?
source_path = "abfss://datalake-raw-dev@xxx.dfs.core.windows.net/xxxxx/saxxles/xx/source/ETL/transformed_data/parquet/"
# Define the path to your Unity Catalog Volume
destination_path = "dbfs:/Volumes/xxx/xxx/transformed_parquet"
# Read the Parquet files from the source into a DataFrame
df = spark.read.parquet(source_path)
print('so far okay')
# Write the DataFrame to the Unity Catalog Volume
df.write.mode("overwrite").parquet(destination_path)
print(f"Data successfully copied to {destination_path}")
I tried the method above, but it says I cannot access the volume this way. How can I do this programmatically, without using the UI?
07-29-2024 02:35 AM
Hi @ruoyuqian
Please use dbutils.fs.cp(source_path, destination_path); that should be able to copy the data into the volume.
If you still have issues, please check the access permissions of the identity running the job.
Ajay Kumar Pandey
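A minimal sketch of the suggestion above, assuming it runs in a Databricks notebook or job where `dbutils` is available. The wrapper name `copy_to_volume` is hypothetical and only there to make the call easy to reuse. Since the source in the question is a directory of Parquet files, the sketch passes `recurse=True`, which `dbutils.fs.cp` needs in order to copy a directory rather than a single file.

```python
def copy_to_volume(dbutils, source_path: str, volume_path: str) -> bool:
    """Copy a file or a directory tree into a Unity Catalog volume.

    `dbutils` is the object Databricks provides in notebooks and jobs;
    it is passed in explicitly so the function has no hidden globals.
    """
    # recurse=True copies the whole directory; without it only a single
    # file can be copied.
    return dbutils.fs.cp(source_path, volume_path, recurse=True)
```

In a notebook this could be called as `copy_to_volume(dbutils, source_path, "/Volumes/xxx/xxx/transformed_parquet")`, using the `source_path` from the question. The copy is a plain file copy, so the Parquet files are not read into a DataFrame first.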
07-29-2024 04:05 AM
Also, when accessing volumes, you don't need to provide the dbfs: scheme. Use the path directly: `/Volumes/xxx/xxx/transformed_parquet`
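To illustrate the point about the path, here is a small sketch that drops the `dbfs:` scheme from a volume path before it is used. The helper name `strip_dbfs_scheme` is hypothetical, and the write call in the trailing comment assumes the `df` from the original question.

```python
def strip_dbfs_scheme(path: str) -> str:
    """Return a volume path without the leading 'dbfs:' scheme."""
    prefix = "dbfs:"
    # Only touch paths that point at /Volumes; leave everything else as-is.
    if path.startswith(prefix + "/Volumes/"):
        return path[len(prefix):]
    return path


destination_path = strip_dbfs_scheme("dbfs:/Volumes/xxx/xxx/transformed_parquet")
print(destination_path)  # /Volumes/xxx/xxx/transformed_parquet

# In the notebook from the question, the write would then become:
# df.write.mode("overwrite").parquet(destination_path)
```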