6 hours ago
Hello, is it possible to access the Spark UI in the Free Edition? I want to check tasks and stages.
Ultimately I am working on how to check data skewness.
6 hours ago - last edited 6 hours ago
Hi @Hritik_Moon ,
Basically you can't. This is a limitation of serverless compute. What they recommend is to use the query profile instead:
Query profile | Databricks on AWS
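Since the end goal is checking data skewness, here is a minimal sketch of the usual idea, which does not need the Spark UI: count records per key (or per partition) and compare the largest group with the average. The helper name `skew_report` and the sample keys are made up for illustration. In PySpark the counts would typically come from something like `df.groupBy("key").count()` or `df.groupBy(spark_partition_id()).count()`, with the column name being an assumption about your data.

```python
from collections import Counter
from statistics import mean


def skew_report(keys):
    """Summarise how unevenly records are spread across keys.

    `keys` is any iterable of hashable values, e.g. the join or
    partition key of each record (hypothetical input for this sketch).
    """
    counts = Counter(keys)                 # records per key
    sizes = list(counts.values())
    avg = mean(sizes)                      # average group size
    largest_key, largest = counts.most_common(1)[0]
    return {
        "groups": len(sizes),
        "largest_key": largest_key,
        "largest": largest,
        "mean": avg,
        # A ratio close to 1 means an even spread; a large ratio
        # means one key holds far more rows than the others.
        "skew_ratio": largest / avg,
    }


# Made-up sample: key "a" dominates the data.
sample_keys = ["a"] * 90 + ["b"] * 5 + ["c"] * 5
report = skew_report(sample_keys)
print(report)
```

What counts as "too skewed" depends on the workload, so treat the ratio as a rough signal rather than a fixed threshold.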
3 hours ago - last edited 3 hours ago
@Hritik_Moon if you're accessing a lab environment for your learning, you may well be using classic compute in there. You could check the Spark UI that way. Alternatively, I think the Community Edition still has some life in it. Perhaps you could sign up for it, as I think it uses classic compute.
If not, could you sign up for a Databricks Trial? This should give you access to non-serverless compute.
Are you studying for the Apache Spark Developer Certification by any chance, @Hritik_Moon?
All the best,
BS
3 hours ago
Thanks for the suggestions, I will look into it.
I have not decided on the certification yet, but I will try for the data engineer one.
3 hours ago
Hi @Hritik_Moon , @BS_THE_ANALYST
If @Hritik_Moon hasn't created an account in the past, he won't be able to register for Databricks Community Edition. However, if you need access to the Spark UI completely for free, you can download a Docker container with preconfigured PySpark. That way, you can learn for free and have access to low-level APIs like RDD.
3 hours ago
I will look into Docker. Is it a must for the data engineering certificate?
2 hours ago
For the data engineering exam, the Free Edition will be good enough. You can learn most things required by the exam objectives there. Regarding the Spark UI - if you have access to labs, then you can use classic compute as @BS_THE_ANALYST suggested. If you don't have access and you don't want to spend money on that, you can use OSS Spark (the easiest way is to use a Docker container).
But don't focus too much on analyzing Spark UI plans - on the exam you can get 1-2 questions regarding this.
And as always - the best resource to prepare with is the Data Engineering learning path on Databricks Academy.
2 hours ago - last edited 2 hours ago
@szymon_dybczak Thanks for the advice.
I will try this on the weekend. I completely forgot that Spark exists outside of Databricks. They've become one in my head.
Good to know about the Community Edition, I'll be mindful about giving that advice in the future.
All the best,
BS
2 hours ago
Haha, that's right. I have the same. When someone mentions Spark, my brain replaces it with Databricks. And tbh, you can learn a lot more about the internals if you're using OSS Apache Spark, because you have access to the source code. So if someone wants to dive deep into Spark, it's better to build it from source and learn the hard way.
2 hours ago
@szymon_dybczak @BS_THE_ANALYST Is there a specific guide or flow to follow to become a better Databricks data engineer? I am learning topics as they come up.
I find it really difficult to maintain a flow, and I lose track.
an hour ago
In my opinion, to get started you need good SQL skills, some Python knowledge and a basic understanding of data modeling. I have met data engineers who were good at technical skills but didn't have fundamental knowledge of data modeling.
If you learn fundamental building blocks like SQL and data modeling, then you will find it a lot easier to pick up different technologies. It won't matter if it is Databricks or Snowflake or Redshift. Because in the end those are just tools, and if you know what to do, you can easily learn a tool along the way on the job.
an hour ago
Any course or learning guide you could suggest for data modeling?
an hour ago
Here are some books I can recommend:
- Star Schema by Christopher Adamson
- The Data Warehouse Toolkit
And I think the folks at SQLBI have some data modeling courses as well.