Technical Blog
Explore in-depth articles, tutorials, and insights on data analytics and machine learning in the Databricks Technical Blog. Stay updated on industry trends, best practices, and advanced techniques.
PeteStern
Databricks Employee



Almost anyone who uses Spark or Databricks is aware of the Spark UI, and we all know that it’s a super powerful tool in the right hands. It can reveal what’s going wrong in your Spark jobs, queries, or streams and where they are inefficient, but it’s useless if you don’t know how to use it.

Fortunately, I’m a Specialist Solutions Architect at Databricks, and I regularly work with customers to figure out how to improve the performance of their workloads. The Spark UI is an important tool for what I do, so I wrote a how-to guide for it, which we’ve published in the Databricks docs. It’s written with Databricks in mind, but it can be useful for any flavor of Spark.

The guide is a bit different from other guides on the Spark UI you may have seen. It doesn’t simply explain what each page in the Spark UI does. It shows you how to actually use the Spark UI to find issues. It’s a choose-your-own-adventure tool. The guide tells you what to look for, and then you click into the part of the guide that matches what you actually see. Eventually, you will reach a diagnosis. Based on what you see in the Spark UI, there’s a set of possible issues you might be facing, and the guide will lead you to more information about these problems.

Hopefully, this guide will help you bring better performance to your Spark jobs and demystify the Spark UI. If you find the guide helpful or have any constructive feedback, please leave me a comment!

Check out the guide here!
