Data Engineering

When to use the DataFrame API over Spark SQL

parthibsg
New Contributor II

Hello Experts,

I am new to Databricks and am building data pipelines that handle both batch and streaming data.

Should I use the DataFrame API to read the CSV files, convert them to Parquet format, and then do the transformations?

or

Should I write the CSV data to a table and then use Spark SQL to do the transformations?

I would appreciate the pros and cons of each approach, and advice on which one is better.

Thank you

Rathinam

2 REPLIES

Debayan
Esteemed Contributor III

Hi Rathinam, it would help to understand the pipeline in more detail before recommending one approach. In a few cases, writing the CSV data to a table and then using Spark SQL will be faster than the other approach.
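To make the two options in the question concrete, here is a minimal sketch of each in PySpark. It is an illustration under stated assumptions, not a recommendation: the paths, the table name `raw_events`, and the columns `customer_id` and `amount` are hypothetical, and `spark` is assumed to be an existing `SparkSession` (as provided in a Databricks notebook).

```python
# Sketch only: paths, table name, and column names are hypothetical.

def dataframe_route(spark, csv_path, parquet_path):
    """Option 1: read CSV with the DataFrame API, save as Parquet, then transform."""
    raw = (
        spark.read
        .option("header", "true")       # first line holds column names
        .option("inferSchema", "true")  # let Spark guess column types
        .csv(csv_path)
    )
    # Persist a columnar copy so later steps do not re-parse the CSV.
    raw.write.mode("overwrite").parquet(parquet_path)

    # Transform from the Parquet copy using DataFrame operations.
    events = spark.read.parquet(parquet_path)
    return events.filter("amount > 0").groupBy("customer_id").sum("amount")


def sql_route(spark, csv_path, table_name="raw_events"):
    """Option 2: load the CSV into a table, then transform with Spark SQL."""
    raw = (
        spark.read
        .option("header", "true")
        .option("inferSchema", "true")
        .csv(csv_path)
    )
    # Save the CSV contents as a managed table that SQL can query.
    raw.write.mode("overwrite").saveAsTable(table_name)

    return spark.sql(
        f"SELECT customer_id, SUM(amount) AS total_amount "
        f"FROM {table_name} WHERE amount > 0 GROUP BY customer_id"
    )
```

Both functions return a DataFrame, so the two styles can be mixed in one pipeline. DataFrame operations and SQL queries go through the same Spark optimizer, so for equivalent logic the choice often comes down to which style is easier for the team to read and maintain.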

Kaniz
Community Manager

Hi @Parthib Rathnam, thank you for reaching out!

Let us look into this for you, and we'll follow up with an update.
