by Rinat • New Contributor
- 1522 Views
- 0 replies
- 0 kudos
I know you can set "spark.sql.shuffle.partitions" and "spark.sql.adaptive.advisoryPartitionSizeInBytes". The former will not work with adaptive query execution, and the latter only works for the first shuffle for some reason, after which it just uses...
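A minimal PySpark sketch of how these two settings are typically combined with AQE's partition coalescing, assuming a toy aggregation and a hypothetical output path; the values shown are placeholders, not recommendations:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("aqe-partition-sizing")
    # Baseline shuffle partition count; with AQE enabled this acts as the
    # initial value that runtime coalescing can shrink.
    .config("spark.sql.shuffle.partitions", "200")
    # AQE settings: adaptive execution, partition coalescing, and the
    # advisory target size for post-shuffle partitions.
    .config("spark.sql.adaptive.enabled", "true")
    .config("spark.sql.adaptive.coalescePartitions.enabled", "true")
    .config("spark.sql.adaptive.advisoryPartitionSizeInBytes", "64MB")
    .getOrCreate()
)

# A shuffle-producing query so the settings actually matter (toy data).
df = (
    spark.range(10_000_000)
    .withColumn("bucket", F.col("id") % 100)
    .groupBy("bucket")
    .count()
)
df.write.mode("overwrite").parquet("/tmp/aqe_partition_demo")  # hypothetical output path
```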
- 2736 Views
- 3 replies
- 0 kudos
Hello everybody, I recently discovered (the hard way) that when a query plan uses cached data, AQE does not kick in. The result is that you lose the super cool feature of dynamic partition coalescing (no more custom shuffle readers in the DAG). Is ther...
Latest Reply
Hi @Pantelis Maroudis, did you check the physical query plan? Did you check the SQL sub-tab within the Spark UI? It will help you understand better what is happening.
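A hedged sketch of how one might reproduce and inspect this, assuming a hypothetical Parquet input and join column; the idea is to compare the formatted plan before and after an action to see whether the cached scan ends up inside an AdaptiveSparkPlan:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
spark.conf.set("spark.sql.adaptive.enabled", "true")

events = spark.read.parquet("/tmp/events")   # hypothetical input
cached = events.cache()                      # introduces an InMemoryRelation once materialized
# "user_id" is a placeholder join key.
joined = cached.join(cached.select("user_id").distinct(), "user_id")

# Look for AdaptiveSparkPlan vs. a plain InMemoryTableScan in the plan.
joined.explain(mode="formatted")
joined.count()
joined.explain(mode="formatted")             # compare the plan again after execution
```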
- 3006 Views
- 3 replies
- 14 kudos
I have this notebook which is scheduled by Data Factory on a daily basis. It worked fine up to today. All of a sudden I keep getting a NullPointerException when writing the data. After some searching online, I disabled AQE. But this does not help. Th...
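For reference, a minimal sketch of the kind of toggle described here, disabling AQE for a single session while troubleshooting; the source and target table names are hypothetical:

```python
# Disable AQE for this session only, to isolate whether it is involved in the failure.
spark.conf.set("spark.sql.adaptive.enabled", "false")

df = spark.read.table("source_table")                    # hypothetical source
df.write.mode("overwrite").saveAsTable("target_table")   # hypothetical target, the write step in question
```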
Latest Reply
After some tests it seems that if I run the notebook on an interactive cluster, I only get 80% load (Ganglia metrics). If I run the same notebook on a job cluster with the same VM types etc. (so the only difference is interactive vs job), I get over...
- 4098 Views
- 2 replies
- 2 kudos
I have a few fundamental questions about Spark 3 while running a simple Spark app on my local Mac machine (with 6 cores in total). Please help. local[*] runs my Spark application in local mode with all the cores present on my Mac, correct? It also means tha...
Latest Reply
That is a lot of questions in one topic. Let's give it a try: [1] This all depends on the values of the relevant parameters and the program you run (think joins, unions, repartition, etc.). [2] spark.default.parallelism is by default the number of cores *...
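A small sketch illustrating point [2] on a local machine, assuming a local[*] master; the record count is arbitrary:

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .master("local[*]")   # local mode: one JVM, as many worker threads as available cores
    .appName("local-parallelism-check")
    .getOrCreate()
)

# In local mode spark.default.parallelism usually equals the number of cores,
# which also drives the initial partition count of spark.range().
print(spark.sparkContext.defaultParallelism)
print(spark.range(1_000_000).rdd.getNumPartitions())
```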
- 1248 Views
- 1 replies
- 0 kudos
From what I have read about AQE it seems to do a lot of what skew join hints did automatically. So should I still be using skew hints in my queries? Is there harm in using them?
Latest Reply
With AQE, Databricks has the most up-to-date, accurate statistics at the end of a query stage and can opt for a better physical strategy and/or do optimizations that used to require hints. In the case of skew join hints, it is recommended to rely on AQE...
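A minimal sketch of the AQE-based alternative the reply describes, using the standard Spark 3 skew-join settings; the table names, join key, and threshold values are hypothetical:

```python
# Standard Spark 3 AQE skew-join settings (shown with illustrative values).
spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.adaptive.skewJoin.enabled", "true")
spark.conf.set("spark.sql.adaptive.skewJoin.skewedPartitionFactor", "5")
spark.conf.set("spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes", "256MB")

orders = spark.read.table("orders")         # hypothetical
customers = spark.read.table("customers")   # hypothetical

# No SKEW hint needed: AQE detects and splits skewed shuffle partitions at
# runtime based on the actual shuffle statistics.
result = orders.join(customers, "customer_id")
```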
- 897 Views
- 0 replies
- 0 kudos
From the demo notebook located here (https://databricks.com/blog/2020/05/29/adaptive-query-execution-speeding-up-spark-sql-at-runtime.html) it seems like the approach to demonstrate AQE was working was to first calculate the Spark query plan before r...
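A rough sketch of that before/after comparison, assuming a toy aggregation in place of whatever the demo notebook runs:

```python
from pyspark.sql import functions as F

spark.conf.set("spark.sql.adaptive.enabled", "true")

df = (
    spark.range(10_000_000)
    .withColumn("key", F.col("id") % 1000)
    .groupBy("key")
    .count()
)

df.explain(mode="formatted")   # before an action: AdaptiveSparkPlan with isFinalPlan=false

df.collect()                   # execute so AQE can re-plan from runtime shuffle statistics

df.explain(mode="formatted")   # after execution: the same DataFrame now shows the final plan
```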