- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
02-07-2023 04:29 AM
Thank you for your quick response.
Unfortunately, I have tried all the changes you mentioned but it still doesn't work. I have also been reading more about the cluster size recommendations on your website (https://docs.databricks.com/clusters/cluster-config-best-practices.html) and I have followed them (I have tried with a smaller number of workers but with a larger size of these), but the result is still the same error.
This is the current configuration I have in my cluster.
I have also tried to take a sample of only 5 records (so now our dataframe has 5 records and not 15.000.000) and surprisingly I'm having this error after more than 1h30 running:
Fatal error: The Python kernel is unresponsive.
During the modification of our original dataframe there is no problem, I get the error when I try to copy that data into a table (df.format("delta"). ... ).
In order to check if the problem was that write to the delta table, I have replaced that step by display(df) (which is now only 5 records) and I get exactly the same error.
Any idea what might be going on?
Thank you so much in advanced.