Technical Blog
PeteStern
New Contributor III



Almost anyone who uses Spark or Databricks is aware of the Spark UI, and we all know that it’s a super powerful tool in the right hands. It can reveal what’s going wrong and any inefficiencies in your Spark jobs, queries, or streams, but it’s useless if you don’t know how to use it.

Fortunately, I’m a Specialist Solutions Architect at Databricks, and I work with customers on a regular basis to figure out how to improve the performance of their workloads. The Spark UI is an important tool for what I do, so I wrote a how-to guide for it. We’ve published the guide on the Databricks docs. It’s written with Databricks in mind, but it can be useful for any flavor of Spark.

The guide is a bit different from other guides on the Spark UI you may have seen. It doesn’t simply explain what each page in the Spark UI does. It shows you how to actually use the Spark UI to find issues. It’s a choose-your-own-adventure tool. The guide tells you what to look for, and then you click into the part of the guide that matches what you actually see. Eventually, you will reach a diagnosis. Based on what you see in the Spark UI, there’s a set of possible issues you might be facing, and the guide will lead you to more information about these problems.

Hopefully, this guide will help you bring better performance to your Spark jobs and demystify the Spark UI. If you find the guide helpful or have any constructive feedback, please leave me a comment!

Check out the guide here!