Cluster:
Policy: Shared Compute
Access: Shared
Runtime: 14.1 (includes Apache Spark 3.5.0, Scala 2.12)
Worker type: Standard_L8s_v3 (64 GB Memory, 8 Cores), Workers: 1-60
Driver type: Standard_L8s_v3 (64 GB Memory, 8 Cores)
I added this line to my Python notebook:

spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "true")

which I believe will enable Apache Arrow optimization.
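For context, here is a minimal sketch of where that setting takes effect: Arrow accelerates conversions between Spark and pandas DataFrames, such as toPandas() and createDataFrame(pandas_df). It assumes a Databricks notebook where `spark` is predefined, and the sample DataFrame is purely hypothetical:

# Enable Arrow-based columnar transfer for PySpark <-> pandas conversions.
spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "true")
# Optional: fall back to the non-Arrow path instead of erroring if a column
# type is not supported by Arrow.
spark.conf.set("spark.sql.execution.arrow.pyspark.fallback.enabled", "true")

import pandas as pd

# Hypothetical sample data, just to exercise the Arrow-optimized path.
pdf = pd.DataFrame({"id": range(1000),
                    "value": [float(i) * 0.5 for i in range(1000)]})

# Both directions use Arrow when the flag above is enabled.
sdf = spark.createDataFrame(pdf)   # pandas -> Spark
result = sdf.toPandas()            # Spark -> pandas

Note that the setting only speeds up these conversion paths (and pandas UDFs); it does not change regular Spark SQL execution.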