Databricks Community

desertstorm · 01-30-2024

I have a dataframe with about 2 million text rows (1 GB). I partition it into about 700 partitions, as that's the number of cores available on my cluster executors. I run the transformations extracting medical information and then write the results in parquet...
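For a sense of scale, the figures quoted in the post can be turned into a rough per-partition estimate. This is a minimal sketch, assuming 1 GB means 1024**3 bytes and that rows are spread evenly across partitions. Neither assumption comes from the post, and `partition_estimate` is a hypothetical helper, not a Spark API.

```python
# Back-of-the-envelope partition sizing from the figures in the post.
# Assumptions (not from the post): 1 GB = 1024**3 bytes, rows spread evenly.
TOTAL_ROWS = 2_000_000
TOTAL_BYTES = 1 * 1024**3
NUM_PARTITIONS = 700


def partition_estimate(total_rows, total_bytes, num_partitions):
    """Return (rows per partition, bytes per partition) for an even split."""
    return total_rows / num_partitions, total_bytes / num_partitions


rows_per_part, bytes_per_part = partition_estimate(
    TOTAL_ROWS, TOTAL_BYTES, NUM_PARTITIONS
)
print(f"~{rows_per_part:.0f} rows and ~{bytes_per_part / 1024**2:.2f} MiB per partition")
```

Under these assumptions each partition's input is only a few thousand rows. The estimate covers the input text only and says nothing about the size of the columns the transformations add.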

desertstorm · 01-30-2024

That's the number of cores available on the executors. I have tried a driver with 256 GB as well as 128 GB, with the same results.

desertstorm · 01-30-2024

Hi @Lakshay, thanks so much for your reply. I have looked into most of those options and don't see any Python code. It's mostly pipeline.transform. Here is the code where it crashes. I feel it should not bring data to the driver either for withColumn or for...

Driver Crash on processing large dataframe
