ilir_nuredini
Honored Contributor

Hello @DanielW 

DBFS (which your output_dir variable points to) is now considered a legacy approach. Going forward, the recommended way to store and access data files is Unity Catalog Volumes. FYI: DBFS is disabled in the Free Edition. See below for how you can use a UC Volume to work with files, using CSV as an example.
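For reference, volume paths follow the pattern /Volumes/<catalog>/<schema>/<volume>/<path>. Here is a minimal sketch of a helper that builds such a path. The function name `volume_file_path` is my own and not part of any Databricks API, and the catalog, schema and volume names are just the ones used in this post, so swap in your own:

```python
from pathlib import PurePosixPath


def volume_file_path(catalog: str, schema: str, volume: str, *parts: str) -> str:
    """Build a /Volumes/<catalog>/<schema>/<volume>/<path> string (hypothetical helper)."""
    for name in (catalog, schema, volume):
        # Reject empty names and names that would silently change the path depth
        if not name or "/" in name:
            raise ValueError(f"invalid name component: {name!r}")
    return str(PurePosixPath("/Volumes", catalog, schema, volume, *parts))


# Same location as the examples in this post
volume_path = volume_file_path("workspace", "default", "temp", "output.csv")
print(volume_path)
```

This only builds the string. The volume itself still has to exist, and you need write access to it.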

Example: writing to a UC Volume using Python:

1. Using pandas to save example data as a CSV file:

import pandas as pd

volume_path = "/Volumes/workspace/default/temp/output.csv"

df = pd.DataFrame([
    ["Ilir", 30],
    ["Nuredini", 25]
], columns=["name", "age"])

# Save to a Unity Catalog volume path
df.to_csv(volume_path, index=False)

2. Using with open():

import csv

volume_path = "/Volumes/workspace/default/temp/output2.csv"

rows = [["name", "age"], ["Ilir2", 30], ["Nuredini2", 25]]

# Write CSV using `with open`
with open(volume_path, mode="w", newline="", encoding="utf-8") as file:
    writer = csv.writer(file)
    writer.writerows(rows)

 

Here is an example of how to read a CSV file from a Volume:

df = spark.read.csv("/Volumes/workspace/default/temp/output2.csv", header=True, inferSchema=True)
df.show()
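If you don't need Spark for the read, the same file can also be read back with the standard library. Below is a rough sketch of a write-then-read round trip. The function name `write_then_read_csv` is hypothetical, and it uses a temporary directory as the base so it also runs outside Databricks. On Databricks you would pass your volume directory instead, e.g. "/Volumes/workspace/default/temp":

```python
import csv
import tempfile
from pathlib import Path


def write_then_read_csv(base_dir, filename, rows):
    """Write rows (header first) to base_dir/filename, then read them back as dicts."""
    target = Path(base_dir) / filename
    with open(target, mode="w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)
    with open(target, mode="r", newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


rows = [["name", "age"], ["Ilir2", 30], ["Nuredini2", 25]]

# A temporary directory stands in for the volume directory here (assumption:
# on Databricks, base_dir would be a path such as "/Volumes/workspace/default/temp").
with tempfile.TemporaryDirectory() as base_dir:
    records = write_then_read_csv(base_dir, "output2.csv", rows)

print(records)
```

Note that csv.DictReader returns every value as a string, so age comes back as "30" rather than 30. The Spark read above with inferSchema=True infers the column types for you.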

 

Hope that helps. Let me know if you need a more specific scenario.

Best, Ilir
