Hello @vijamit
Good day!
The error was quite hard to analyze, but here is what I found:
The cause of the error you shared: your query ran out of memory during execution, specifically in the BuildHashedRelation and PartitionedRelation operations.
Running out of memory happens when memory is improperly allocated during query execution. Photon relies on accurate table statistics to optimize query execution and manage memory usage. When the statistics are missing or stale, Photon may allocate insufficient memory for the query, resulting in an out-of-memory error.
Additionally, memory management issues can occur when:
- Queries have multiple joins, subqueries, or aggregations, which increases the complexity of memory management. This makes it more challenging for Photon to accurately estimate memory needs.
- You're working with large datasets, which increases the likelihood of encountering an out-of-memory error because Photon may underestimate the memory required to process the data.
- Your environment includes outdated libraries or incompatible dependency versions, which can also contribute to memory management problems in Photon.
Suggested solutions:
Recompute table statistics so Photon can estimate memory requirements accurately:
ANALYZE TABLE <table-name> COMPUTE STATISTICS;
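As a sketch (the table and column names here are placeholders, not from your query), you can also collect column-level statistics for the columns used in joins and filters, which gives the optimizer better cardinality estimates when sizing hash relations:

```sql
-- Placeholder table/column names; replace with your own.
-- Table-level statistics (row count, size in bytes):
ANALYZE TABLE my_events COMPUTE STATISTICS;

-- Column-level statistics for join/filter columns, which help
-- Photon estimate the size of hashed relations more accurately:
ANALYZE TABLE my_events COMPUTE STATISTICS FOR COLUMNS user_id, event_date;
```

Running these after large writes to the tables involved in the failing join keeps the statistics fresh.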
If possible, simplify complex queries by breaking them down into smaller, more manageable parts. This can help Photon better estimate memory requirements and reduce the likelihood of an out-of-memory error.
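For example (all table and column names below are hypothetical), a large multi-join query can be split by writing a heavy intermediate aggregation to its own table first, so the final join operates on a much smaller input:

```sql
-- Hypothetical names; adjust to your schema.
-- Stage 1: materialize the heavy aggregation as its own table,
-- so it is computed once and its statistics can be collected.
CREATE OR REPLACE TABLE daily_totals AS
SELECT user_id, event_date, COUNT(*) AS events
FROM raw_events
GROUP BY user_id, event_date;

-- Stage 2: join the much smaller intermediate result.
SELECT u.name, d.event_date, d.events
FROM daily_totals d
JOIN users u ON u.id = d.user_id;
```

Note that a temporary view would not help here, since views are evaluated lazily; actually writing the intermediate result out is what reduces the memory pressure of the final query.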
Upgrade to Databricks Runtime 13.3 LTS or above: runtime versions starting with 13.3 LTS include an improvement that helps mitigate this issue.