Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Accessing Spark UI in free edition

Hritik_Moon
New Contributor

Hello, is it possible to access the Spark UI in Free Edition? I want to check tasks and stages.

Ultimately, I am trying to work out how to check for data skew.

12 REPLIES

szymon_dybczak
Esteemed Contributor III

Hi @Hritik_Moon ,

Basically, you can't. This is a limitation of serverless compute. What they recommend is to use the query profile instead:

Query profile | Databricks on AWS
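
For the skew question specifically, a rough check is possible without the Spark UI. The following is a minimal sketch, not part of the original reply: it counts rows per partition with PySpark's `spark_partition_id()` and compares the largest partition to the median. The helper names `partition_counts` and `skew_summary` are made up for illustration, and whether the partition counts are meaningful on serverless compute is an assumption you would need to verify.

```python
from statistics import median


def partition_counts(df):
    """Return the row count of each non-empty partition of a PySpark DataFrame.

    Hypothetical helper. pyspark is imported lazily so that the rest of
    this sketch also runs where Spark is not installed.
    """
    from pyspark.sql.functions import spark_partition_id

    rows = df.groupBy(spark_partition_id().alias("pid")).count().collect()
    # Empty partitions produce no group, so they are missing from this list.
    return [row["count"] for row in rows]


def skew_summary(counts):
    """Summarise how unevenly rows are spread across partitions (or keys)."""
    counts = sorted(counts)
    if not counts:
        raise ValueError("no counts given")
    med = median(counts)
    largest = counts[-1]
    return {
        "partitions": len(counts),
        "min": counts[0],
        "median": med,
        "max": largest,
        # A ratio far above 1 suggests one partition holds much more data.
        "max_to_median": largest / med if med else float("inf"),
    }


# Plain-Python demonstration: one partition is ten times the typical size.
example = skew_summary([100, 100, 100, 1000])
print(example["max_to_median"])
```

With a real DataFrame `df`, the call would be `skew_summary(partition_counts(df))`. Skew in joins and aggregations usually comes from a few heavy key values, so feeding the counts from `df.groupBy("some_key").count()` (where `some_key` is a placeholder column name) into the same summary is often the more telling check.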


BS_THE_ANALYST
Esteemed Contributor II

@Hritik_Moon if you're accessing a lab environment for your learning, you may well be using classic compute in there, and you could check the Spark UI that way. Alternatively, I think the Community Edition still has some life in it; perhaps you could sign up for it, as I think it uses classic compute.

If not, could you sign up for a Databricks trial? That should give you access to non-serverless compute.

Are you studying for the Apache Spark Developer certification by any chance, @Hritik_Moon? 🙂

All the best,
BS

Thanks for the suggestions, I will look into it.

As for the certification, I have not decided yet, but I will try for the data engineer one.

Hi @Hritik_Moon, @BS_THE_ANALYST

If @Hritik_Moon hasn't created an account in the past, he won't be able to register for Databricks Community Edition. However, if you need access to the Spark UI completely for free, you can download a Docker container with preconfigured PySpark. That way, you can learn for free and have access to low-level APIs like RDDs.

I will look into Docker. Is it a must for the data engineering certificate?

szymon_dybczak
Esteemed Contributor III

For the data engineering exam, Free Edition will be good enough. You can learn most of what the exam objectives require there. Regarding the Spark UI: if you have access to labs, you can use classic compute, as @BS_THE_ANALYST suggested. If you don't have access and don't want to spend money on it, you can use OSS Spark (the easiest way is a Docker container).

But don't focus too much on analyzing Spark UI plans; on the exam you may get 1-2 questions about this.

And as always, the best resource for preparing is the Data Engineering learning path on Databricks Academy.

@szymon_dybczak Thanks for the advice. 

I will try this on the weekend. I completely forgot that Spark exists outside of Databricks. They've become one in my head 😂

Awesome to know about the Community Edition; I'll be mindful about giving that advice in the future.

All the best,
BS

Haha, that's right. I have the same thing. When someone mentions Spark, my brain replaces it with Databricks 😄 And tbh, you can learn a lot more about the internals if you're using OSS Apache Spark because you have access to the source code. So if someone wants to dive deep into Spark, it's better to build it from source and learn the hard way 🙂

Hritik_Moon
New Contributor

@szymon_dybczak @BS_THE_ANALYST Is there a specific guide or path to becoming a better Databricks data engineer? I am learning topics as they come up.

I'm finding it really difficult to maintain a flow, and I lose track.

szymon_dybczak
Esteemed Contributor III

In my opinion, to get started you need good SQL skills, some Python knowledge, and a basic understanding of data modeling. I have met data engineers who had good technical skills but lacked fundamental knowledge of data modeling.

If you learn fundamental building blocks like SQL and data modeling, you will find it a lot easier to pick up different technologies. It won't matter whether it's Databricks, Snowflake, or Redshift. In the end those are just tools, and if you know what to do you can easily learn the tool along the way on the job 🙂

Is there any course or learning guide you could suggest for data modeling?

szymon_dybczak
Esteemed Contributor III

Here are some books I can recommend:

- Star Schema by Christopher Adamson

- The Data Warehouse Toolkit

And I think the guys from SQLBI have some data modeling courses as well.
