Supposedly there are four major types of cluster in Databricks: General Purpose, Storage Optimized, Memory Optimized, and Compute Optimized. However, I can't find detailed information on which cluster type to choose in which scenario. I've read multiple sources but am more confused than ever, especially about the difference between Memory Optimized and Compute Optimized clusters. I'm talking about worker nodes.
And then there are Graviton-series and Delta Cache-enabled clusters! 😿😭
Could someone please explain the differences between these in as much detail, and in as plain layman's terms, as possible?
Also, I have a requirement for a simple lift and shift of 9 GB of Parquet data stored in S3 into a Databricks Delta table in Unity Catalog, with no transformations at all. What should my ideal driver and worker configuration be? (I believe a smaller driver node plus many heavily configured storage- or compute-optimized worker nodes will speed up the process, but any info on this would be greatly helpful!)
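For context, here is my rough back-of-the-envelope sizing for the 9 GB copy, in case it helps frame an answer. It is only a sketch: it assumes Spark's default read split size of 128 MB (`spark.sql.files.maxPartitionBytes`) and ignores file boundaries and per-file open cost, so the real task count may differ. The function names and the 2-worker, 4-core cluster are hypothetical values I picked for illustration, not a recommendation.

```python
import math


def estimate_read_tasks(data_size_gb: float, max_partition_mb: int = 128) -> int:
    """Rough count of read tasks, assuming each task scans at most max_partition_mb."""
    return math.ceil(data_size_gb * 1024 / max_partition_mb)


def estimate_waves(tasks: int, workers: int, cores_per_worker: int) -> int:
    """How many rounds of tasks the cluster needs if every core runs one task at a time."""
    total_cores = workers * cores_per_worker
    return math.ceil(tasks / total_cores)


# Hypothetical cluster: 2 workers with 4 cores each (assumption for illustration only).
tasks = estimate_read_tasks(9)
waves = estimate_waves(tasks, workers=2, cores_per_worker=4)
print(f"approx. read tasks: {tasks}, approx. task waves: {waves}")
```

If this reasoning holds, the job is only a few dozen tasks, which makes me wonder whether a large number of heavy workers is needed at all.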
@derar-alhussein @adipolak @AjayKumar_K_R @simonw @phanindrakuchip