DLT Pipeline failing (due > 500 tables) any graph tables limitation

venkatgmf — Mon, 22 Jul 2024 14:26:26 GMT

The DLT pipeline is failing with INTERNAL_ERROR: Communication lost with driver. Cluster 0719-162209-rx37csry was not reachable for 120 seconds

Re: DLT Pipeline failing (due > 500 tables) any graph tables limitation

szymon_dybczak — Mon, 22 Jul 2024 15:15:12 GMT

Hi @venkatgmf ,

Yeah, you are right that a high number of tables could be a problem.

If you're experiencing issues with the driver node becoming unresponsive due to garbage collection (GC), it might be a sign that the resources allocated to the driver are insufficient.

To manage the ingestion of a large number of tables, you can consider batching the tables. You can create multiple DLT pipelines, each handling a subset of the tables. This way, you can distribute the load across multiple pipelines, reducing the pressure on a single pipeline and potentially mitigating the GC issue.

In terms of compute type on Azure, you might want to consider using larger VM sizes for your Databricks clusters, especially for the driver node, to handle the load of reading a large number of tables. The choice of VM size would depend on the size and complexity of your tables.

Also, consider tuning the Spark configurations related to memory management and GC. For instance, you can adjust the Spark driver memory, the fraction of memory dedicated to Spark's storage and execution, and the GC settings.
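To make the batching idea concrete, here is a rough sketch in plain Python. It splits a list of table names into fixed-size batches and builds one settings dictionary per batch, each with a larger driver node type than the workers. Everything specific here is an assumption for illustration: the batch size of 100, the VM sizes, the pipeline name prefix, and the `ingest.tables` configuration key are all hypothetical, and the field names (`clusters`, `label`, `node_type_id`, `driver_node_type_id`, `configuration`) reflect my understanding of the DLT pipeline settings JSON, so please check them against the current docs before using them.

```python
import json
from typing import Dict, List


def chunk_tables(tables: List[str], batch_size: int) -> List[List[str]]:
    """Split the table list into consecutive batches of at most batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [tables[i:i + batch_size] for i in range(0, len(tables), batch_size)]


def build_pipeline_settings(
    name_prefix: str,
    batches: List[List[str]],
    driver_node_type: str,
    worker_node_type: str,
) -> List[Dict]:
    """Build one settings dict per batch (a sketch, not a complete pipeline spec)."""
    settings = []
    for idx, batch in enumerate(batches, start=1):
        settings.append({
            "name": f"{name_prefix}_{idx:02d}",
            "clusters": [{
                "label": "default",
                "node_type_id": worker_node_type,
                # Larger driver than workers, since the driver is what became unreachable.
                "driver_node_type_id": driver_node_type,
            }],
            "configuration": {
                # Hypothetical key: the pipeline notebook would read this value
                # and define only the tables listed in it.
                "ingest.tables": ",".join(batch),
            },
        })
    return settings


if __name__ == "__main__":
    # Placeholder table names standing in for the real 500+ tables.
    tables = [f"table_{i:03d}" for i in range(520)]
    batches = chunk_tables(tables, batch_size=100)
    # VM sizes are placeholders; pick them based on your own workload.
    settings = build_pipeline_settings(
        "ingest_batch", batches, "Standard_E16ds_v4", "Standard_D4ds_v5"
    )
    print(f"{len(settings)} pipelines")
    print(json.dumps(settings[0]["clusters"], indent=2))
```

Each dictionary could then be used as the starting point for creating one pipeline, so no single driver has to plan and track all the tables at once. The right batch size is something you would have to find by testing.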

Could you also attach the cluster logs? Also, take a look at the article below to find the most probable cause of this issue:

https://kb.databricks.com/en_US/jobs/driver-unavailable
