
@Yuliya Valava: Here are several possible threads to think about and implement.

  1. It's possible that the 7 GB of heap memory on the driver is being used to store metadata related to the data being processed.
  2. Iterating through the Python list to create DLTs could be causing this memory issue if the DLTs are being stored in memory. Try using Spark to process your data instead; this distributes the processing across multiple nodes, which can help reduce memory usage on individual nodes.
  3. To reduce memory usage, you could also try creating a single DLT for all the data rather than a new DLT for each iteration of the loop (see the sketch after this list).
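
A minimal sketch combining points 2 and 3, assuming the original loop iterated over a Python list of source paths (the `source_paths` list and the `combined_source` table name here are hypothetical): instead of materializing a separate DLT per list element, define one DLT table whose body reads each source lazily with Spark and unions them, so the executors do the heavy lifting rather than the driver.

```python
import dlt
from functools import reduce
from pyspark.sql import DataFrame

# Hypothetical stand-in for the list the original loop iterated over
source_paths = ["/mnt/raw/source_a", "/mnt/raw/source_b", "/mnt/raw/source_c"]

@dlt.table(
    name="combined_source",  # hypothetical table name
    comment="All sources unioned into a single DLT table",
)
def combined_source():
    # `spark` is provided by the DLT runtime. Reads are lazy: the driver
    # only builds query plans here, while the actual scans and the union
    # run distributed across the executors.
    frames = [spark.read.format("delta").load(path) for path in source_paths]
    return reduce(DataFrame.unionByName, frames)
```

With a single table, the pipeline graph stays small and the driver no longer has to track metadata for one table per list element, which may also relieve the heap pressure described in point 1.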
