Data Engineering

Can we run a pandas DataFrame inside Databricks?

tinendra
New Contributor III

Hi, I want to run

df = pd.read_csv('/dbfs/FileStore/airlines1.csv')

but when I run it I get this error:

FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/FileStore/airlines1.csv'

Could you please help me work out how to use a pandas DataFrame inside Databricks? Or is it not possible to use pandas directly inside Databricks?

7 REPLIES

karthik_p
Esteemed Contributor

@Tinendra Kumar we have been seeing this issue too, but I can't find anything in the documentation that explains why the pandas read fails here. There is a workaround, though:

1. Read your CSV using Spark and store it in a Spark DataFrame.
2. Convert the Spark DataFrame to a pandas DataFrame.

The links below will help:

1. CSV file | Databricks on AWS
2. Convert a spark DataFrame to pandas DF - Stack Overflow
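For reference, the two-step workaround might look roughly like the sketch below. The helper name `read_csv_via_spark` is made up for illustration, and the `header=True` / `inferSchema=True` options are assumptions about the file (a header row, and column types you want guessed). Note that `toPandas()` pulls every row onto the driver, so this only suits data that fits in driver memory.

```python
def read_csv_via_spark(spark, path):
    """Read a CSV with Spark, then collect it to the driver as a pandas DataFrame.

    `spark` is an existing SparkSession; `path` is a Spark-style path.
    """
    # Step 1: read the CSV into a Spark DataFrame.
    # header/inferSchema are assumptions about the file, adjust as needed.
    sdf = spark.read.csv(path, header=True, inferSchema=True)
    # Step 2: convert the Spark DataFrame to a pandas DataFrame (collects all rows).
    return sdf.toPandas()
```

In a Databricks notebook the `spark` session is already defined, so the call would be something like `df = read_csv_via_spark(spark, "dbfs:/FileStore/airlines1.csv")`. Note the `dbfs:/` prefix here: Spark readers take DBFS paths in that form, while pandas reads through the local `/dbfs/...` mount.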

tinendra
New Contributor III

Thanks, Karthik

jose_gonzalez
Databricks Employee

Hi @Tinendra Kumar,

I would like to share the following documentation on the pandas API on Spark: https://docs.databricks.com/languages/pandas-spark.html#pandas-api-on-spark

Pandas does not scale out to big data (runs on driver only). Pandas API on Spark fills this gap by providing pandas equivalent APIs that work on Apache Spark. Pandas API on Spark is useful not only for pandas users but also PySpark users, because pandas API on Spark supports many tasks that are difficult to do with PySpark, for example plotting data directly from a PySpark DataFrame.
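As a rough illustration of what that looks like, here is a minimal sketch that reads the CSV with the pandas API on Spark. It assumes a runtime with Spark 3.2 or later (where `pyspark.pandas` ships with PySpark); the function name is hypothetical, and the file path in the usage note is simply the one from the question.

```python
def read_csv_with_pandas_on_spark(path):
    """Read a CSV into a pandas-on-Spark DataFrame (distributed, pandas-like API)."""
    # Imported lazily so this definition loads even where pyspark is not installed.
    import pyspark.pandas as ps

    return ps.read_csv(path)
```

Usage would be along the lines of `psdf = read_csv_with_pandas_on_spark("dbfs:/FileStore/airlines1.csv")`, after which `psdf.head()` behaves much like pandas. If you need a plain pandas DataFrame afterwards, `psdf.to_pandas()` collects the data to the driver.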

Thank you Jose

Anonymous
Not applicable

Hi,

I don't see anything incorrect in your code; I think it is correct. Could you check that the file actually exists, and that DBFS is enabled in your workspace?
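One way to run that check from a notebook is sketched below. Both helper names are made up for illustration. The sketch assumes the cluster exposes DBFS on the driver under the local `/dbfs` mount, which is what pandas relies on for a path like `/dbfs/FileStore/...`.

```python
import os


def to_local_dbfs_path(path):
    """Map a 'dbfs:/...' URI to the '/dbfs/...' path that local file APIs such as pandas use."""
    if path.startswith("dbfs:/"):
        return "/dbfs/" + path[len("dbfs:/"):].lstrip("/")
    return path


def describe_path(path):
    """Report whether the file exists, and if not, what is visible next to it."""
    local = to_local_dbfs_path(path)
    if os.path.exists(local):
        return f"found: {local}"
    parent = os.path.dirname(local)
    if os.path.isdir(parent):
        # The directory is there, so the file name is probably wrong; list what is present.
        return f"missing file; {parent} contains {sorted(os.listdir(parent))}"
    # The directory itself is not visible, e.g. the /dbfs mount is unavailable.
    return f"missing directory: {parent}"
```

For example, `print(describe_path("/dbfs/FileStore/airlines1.csv"))` tells you whether the problem is a misspelled file name (the directory listing shows what is really there) or a `/dbfs` mount that is not visible to local Python code at all.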

Anonymous
Not applicable

Hi @Tinendra Kumar

Hope all is well! Just wanted to check in to see whether you were able to resolve your issue. If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.

We'd love to hear from you.

Thanks!

tinendra
New Contributor III

Hi @Vidula Khanna

Yes, my query has been resolved and I got the solution. Thanks for your support.

Thanks,

Tinendra
