
Create CSV and upload to Azure

subhadeep
New Contributor II

Can someone write a SQL query that queries a table like select * from stages.benefit, creates a CSV, and uploads it to Azure?


Isi
Contributor

Hey!

Well, not many details have been provided, but in general, this approach could work for you. Hope it helps!
# Query the table into a DataFrame
df = spark.sql("SELECT * FROM stages.benefit")

# Write it out as CSV (one part file per partition) to the target path
df.write.mode("overwrite").option("header", "true").csv(output_path)
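Here output_path is whatever Azure location you want the files to land in. As a rough sketch (the container, storage account, and folder names below are placeholders, and the cluster must already have access to that storage configured):

# Hypothetical ADLS Gen2 destination -- substitute your own container, storage account, and folder
output_path = "abfss://<container>@<storage-account>.dfs.core.windows.net/exports/stages_benefit"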

🙂

hari-prasad
Valued Contributor II

Hi @subhadeep ,

You can achieve this in SQL, similar to how you write a DataFrame to a table or blob path: create an external table pointing to the blob path (or a mounted blob path). Note that such a table does not support ACID transactions and versioning like a Delta table, but you can set partitions to manage the data. Instead of a single CSV file, you will see multiple part files created as CSVs based on your data inserts.

-- Create an external table in CSV format pointing to the blob location
CREATE TABLE new_external_table_in_blob (
  id INT,
  ...
)
USING CSV
LOCATION '.../path/blob';

-- Insert data into the table
INSERT INTO new_external_table_in_blob
VALUES (1, ...);

To obtain a single file, use a PySpark or Scala DataFrame and coalesce to 1 so the data sits in a single partition, then write it to the blob path. If your data volume is small, you can instead convert the Spark DataFrame into a pandas DataFrame and write it to the blob path as a single CSV file, as sketched below.
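A minimal PySpark sketch of both options, assuming output_path is an Azure location the cluster can write to (for example an abfss:// URI) and, for the pandas variant, that the blob container is mounted under a /dbfs/mnt path (all placeholder names):

df = spark.sql("SELECT * FROM stages.benefit")

# Option 1: coalesce to a single partition so Spark writes exactly one part file
df.coalesce(1).write.mode("overwrite").option("header", "true").csv(output_path)

# Option 2 (small data only): convert to pandas and write one named CSV file
# through the DBFS mount of the blob container
df.toPandas().to_csv("/dbfs/mnt/<mount-point>/exports/benefit.csv", index=False)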

Regards,
Hari Prasad


