<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Re: Fatal error when writing a big pandas DF in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/fatal-error-when-writing-a-big-pandas-df/m-p/59703#M31493</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/92566"&gt;@chari&lt;/a&gt;, thanks for raising your concern; always happy to help&amp;nbsp;&lt;span class="lia-unicode-emoji" title=":beaming_face_with_smiling_eyes:"&gt;😁&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;We understand that you are facing the following error while writing a pandas DataFrame containing 100,000 rows to Excel.&lt;/P&gt;
&lt;P&gt;The reported error is: "Fatal error: The Python kernel is unresponsive. The Python process exited with exit code 137 (SIGKILL: Killed). This may have been caused by an OOM error. Check your command's memory usage."&lt;/P&gt;
&lt;P&gt;The driver node is running out of memory (OOM), which causes this error. It can be addressed by:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Choosing a larger driver instance size, depending on the workload.&lt;/LI&gt;
&lt;LI&gt;Splitting the workload across multiple clusters.&lt;/LI&gt;
&lt;LI&gt;Moving to a jobs cluster.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Please let me know if this helps, and leave a like if it does; follow-ups are appreciated.&lt;BR /&gt;Kudos&lt;BR /&gt;Ayushi&lt;/P&gt;</description>
    <pubDate>Thu, 08 Feb 2024 13:33:53 GMT</pubDate>
    <dc:creator>Ayushi_Suthar</dc:creator>
    <dc:date>2024-02-08T13:33:53Z</dc:date>
    <item>
      <title>Fatal error when writing a big pandas DF</title>
      <link>https://community.databricks.com/t5/data-engineering/fatal-error-when-writing-a-big-pandas-df/m-p/59696#M31486</link>
      <description>&lt;P&gt;Hello DB community,&lt;/P&gt;&lt;P&gt;I was trying to write a pandas DataFrame containing 100,000 rows to Excel. Moments into the execution, I received a fatal error: "Python kernel is unresponsive."&lt;/P&gt;&lt;P&gt;However, I am constrained from increasing the cluster size or making other infrastructure changes, so my only option is to fix the code.&lt;/P&gt;&lt;P&gt;I am hoping to implement more than one of the suggestions from this forum. Please advise.&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Thu, 08 Feb 2024 12:47:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/fatal-error-when-writing-a-big-pandas-df/m-p/59696#M31486</guid>
      <dc:creator>chari</dc:creator>
      <dc:date>2024-02-08T12:47:46Z</dc:date>
    </item>
    <item>
      <title>Re: Fatal error when writing a big pandas DF</title>
      <link>https://community.databricks.com/t5/data-engineering/fatal-error-when-writing-a-big-pandas-df/m-p/59703#M31493</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/92566"&gt;@chari&lt;/a&gt;, thanks for raising your concern; always happy to help&amp;nbsp;&lt;span class="lia-unicode-emoji" title=":beaming_face_with_smiling_eyes:"&gt;😁&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;We understand that you are facing the following error while writing a pandas DataFrame containing 100,000 rows to Excel.&lt;/P&gt;
&lt;P&gt;The reported error is: "Fatal error: The Python kernel is unresponsive. The Python process exited with exit code 137 (SIGKILL: Killed). This may have been caused by an OOM error. Check your command's memory usage."&lt;/P&gt;
&lt;P&gt;The driver node is running out of memory (OOM), which causes this error. It can be addressed by:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Choosing a larger driver instance size, depending on the workload.&lt;/LI&gt;
&lt;LI&gt;Splitting the workload across multiple clusters.&lt;/LI&gt;
&lt;LI&gt;Moving to a jobs cluster.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Please let me know if this helps, and leave a like if it does; follow-ups are appreciated.&lt;BR /&gt;Kudos&lt;BR /&gt;Ayushi&lt;/P&gt;</description>
      <pubDate>Thu, 08 Feb 2024 13:33:53 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/fatal-error-when-writing-a-big-pandas-df/m-p/59703#M31493</guid>
      <dc:creator>Ayushi_Suthar</dc:creator>
      <dc:date>2024-02-08T13:33:53Z</dc:date>
    </item>
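Editor's note: since the asker is constrained from changing infrastructure, a code-side mitigation is worth noting alongside the cluster advice above. The sketch below is illustrative, not from the thread: it writes the frame as several smaller .xlsx files so only one slice is serialized at a time. The sample frame, chunk size, and file names are made up; `openpyxl` is assumed to be available, as on standard Databricks runtimes.

```python
import pandas as pd

# Hypothetical sketch: split one large Excel export into several smaller
# files so only a slice of the DataFrame is serialized at a time.
# The frame, chunk size, and file names here are illustrative.
df = pd.DataFrame({"id": range(10_000), "value": range(10_000)})

chunk_size = 2_500
paths = []
for i, start in enumerate(range(0, len(df), chunk_size)):
    part = df.iloc[start:start + chunk_size]
    path = f"output_part_{i}.xlsx"
    part.to_excel(path, index=False, engine="openpyxl")
    paths.append(path)
```

Writing several files avoids building one giant workbook in memory, which a single `to_excel` call on the full frame would require.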
    <item>
      <title>Re: Fatal error when writing a big pandas DF</title>
      <link>https://community.databricks.com/t5/data-engineering/fatal-error-when-writing-a-big-pandas-df/m-p/60002#M31560</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;Unfortunately, I can't update my cluster for another six months.&lt;/P&gt;&lt;P&gt;However, I could use a Spark DataFrame to write the data as CSV instead. Would that help?&lt;/P&gt;</description>
      <pubDate>Tue, 13 Feb 2024 07:19:11 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/fatal-error-when-writing-a-big-pandas-df/m-p/60002#M31560</guid>
      <dc:creator>chari</dc:creator>
      <dc:date>2024-02-13T07:19:11Z</dc:date>
    </item>
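Editor's note: writing CSV instead of Excel does help, because CSV can be appended incrementally, so the full serialized output never sits in memory at once. A pandas-only sketch of the idea follows (the sample frame, chunk size, and file name are illustrative, not from the thread); on Databricks, converting with `spark.createDataFrame` and using the Spark CSV writer would additionally move the write off the driver to the executors.

```python
import pandas as pd

# Hypothetical sketch: stream the DataFrame to CSV in fixed-size chunks,
# overwriting on the first chunk and appending afterwards, so the whole
# serialized output is never held in memory at once.
df = pd.DataFrame({"id": range(100_000), "value": range(100_000)})

chunk_size = 20_000
for start in range(0, len(df), chunk_size):
    chunk = df.iloc[start:start + chunk_size]
    chunk.to_csv(
        "output.csv",
        mode="w" if start == 0 else "a",  # overwrite first, then append
        header=(start == 0),              # emit the header row only once
        index=False,
    )
```

The header is written only with the first chunk, so the resulting file reads back as a single contiguous table.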
  </channel>
</rss>

