Tuesday
What is the best way to tell which join strategy (broadcast, shuffle hash, or sort merge) was used for a SQL query? How should the Spark UI or the query plan be interpreted?
Tuesday
Hello @smoortema , here are some helpful tips and tricks.
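One way to check, as a minimal sketch: the physical plan names the join operator directly, so you can search the plan text for `BroadcastHashJoin`, `ShuffledHashJoin`, or `SortMergeJoin`. The helper `find_join_strategies` and the sample plan below are hypothetical and hand-written for illustration, not output from a real cluster. Keep in mind that with AQE enabled, the join picked at runtime can differ from the one in the initial plan, so the plan shown for a finished query is the more reliable one.

```python
import re

# Physical join operator names as they appear in Spark plan text,
# mapped to the strategy names used in the question.
JOIN_OPERATORS = {
    "BroadcastHashJoin": "broadcast hash join",
    "ShuffledHashJoin": "shuffle hash join",
    "SortMergeJoin": "sort merge join",
    "BroadcastNestedLoopJoin": "broadcast nested loop join",
    "CartesianProduct": "cartesian product",
}

_JOIN_PATTERN = re.compile(r"\b(" + "|".join(JOIN_OPERATORS) + r")\b")


def find_join_strategies(plan_text):
    """Return the distinct join strategies named in a plan, sorted by name.

    Hypothetical helper: it only scans text, it does not run Spark.
    """
    found = {JOIN_OPERATORS[name] for name in _JOIN_PATTERN.findall(plan_text)}
    return sorted(found)


# Hand-written, illustrative plan excerpt (an assumption, not real output).
SAMPLE_PLAN = """
== Physical Plan ==
AdaptiveSparkPlan isFinalPlan=false
+- SortMergeJoin [customer_id#10L], [customer_id#20L], Inner
   :- Sort [customer_id#10L ASC NULLS FIRST], false, 0
   :  +- BroadcastHashJoin [region_id#11L], [region_id#30L], Inner, BuildRight
   +- Sort [customer_id#20L ASC NULLS FIRST], false, 0
"""

print(find_join_strategies(SAMPLE_PLAN))
```

In a notebook, the plan text could come from something like `spark.sql("EXPLAIN FORMATTED <your query>").collect()[0][0]`, or you can simply read the output of `df.explain()` and look for the same operator names.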
Hope this helps, Louis.
Wednesday
Thanks for the useful information! I have two additional questions:
1. Your answer looks like it is LLM-generated. If it is, could you share which LLM you used for it?
2. What is the best way to find the query in Spark UI? I am getting there through Compute, then selecting the compute I used, then Spark UI. Here I can find many queries, and it is not always evident which one I am looking for. Also, I can only see the queries that happened since the last restart of the compute. So finding the query becomes especially hard once it has already completed. Is there an easier way to find the query I am looking for?
Wednesday
@smoortema, Spark performance tuning is one of the hardest topics to teach or learn, and it's even tougher to do justice to in a forum thread. That said, I'm really glad to see you asking the question. Tuning is challenging precisely because there are so many moving pieces, which is why AQE was introduced in the first place: to take a large portion of that burden off your shoulders.
If you want to go deeper, structured training is your best path. Databricks offers courses that walk through tuning concepts step by step, and I'm sure platforms like Udemy have solid options as well. A guided approach will give you the most clarity and confidence as you level up your skills.
Hope this helps, Louis.