Fatal error when writing a big pandas DataFrame

chari
Contributor

Hello DB community,

I was trying to write a pandas DataFrame containing 100,000 rows to an Excel file. A few moments into the execution I received a fatal error: "Python kernel is unresponsive."

However, I am not able to increase the number of clusters or make any other infrastructure changes, so my only option is to fix the code.

I am open to trying more than one of the options suggested on this forum, so please share any ideas.

Thanks

Ayushi_Suthar
Databricks Employee

Hi @chari, thanks for bringing up your concerns, always happy to help 😁

We understand that you are facing the following error while writing a pandas DataFrame containing 100,000 rows to Excel:

Fatal error: The Python kernel is unresponsive. The Python process exited with exit code 137 (SIGKILL: Killed). This may have been caused by an OOM error. Check your command's memory usage.

The driver node is running out of memory (OOM), which leads to this error. This can be fixed by:

  • Choosing a higher driver instance size depending on the workload.
  • Splitting the workload across multiple clusters.
  • Moving to a jobs cluster.
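
If changing the cluster is not possible, a code-side workaround may be to check how much memory the DataFrame occupies and to write it as CSV instead of Excel, since Excel writers tend to build the whole workbook in memory before saving. The snippet below is only a minimal sketch under assumptions: the DataFrame contents, the column names, the output path and the chunk size are all hypothetical stand-ins, and whether this avoids the OOM depends on what else is held on the driver.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical stand-in for the real 100,000-row DataFrame.
df = pd.DataFrame(
    {
        "id": np.arange(100_000),
        "value": np.random.rand(100_000),
    }
)

# Estimate the in-memory size of the DataFrame on the driver.
# deep=True also counts the contents of object (string) columns.
mem_mb = df.memory_usage(deep=True).sum() / 1024**2
print(f"DataFrame occupies about {mem_mb:.1f} MB")

# Hypothetical output location; replace with your own path.
out_path = os.path.join(tempfile.mkdtemp(), "export.csv")

# Write as CSV, 10,000 rows at a time, rather than as an Excel workbook.
# chunksize controls how many rows pandas formats and writes per batch.
df.to_csv(out_path, index=False, chunksize=10_000)

# Optional sanity check: read back only the first rows, not the whole file.
preview = pd.read_csv(out_path, nrows=5)
print(preview)
```

If the memory estimate is already close to the driver's limit, reducing the DataFrame itself (dropping unused columns or using smaller dtypes) is likely to matter more than the output format.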

Please let me know if this helps, and leave a like if it does. Follow-ups are appreciated.
Kudos
Ayushi

Hello,

Unfortunately, I can't update my cluster for another six months.

Instead, I would like to use a Spark DataFrame to write the data as a CSV file. Would that help?