Data Engineering

Can databricks be used locally to learn it or is it cloud only

data_testing1
New Contributor III

I'm tired of telling clients or referrals that I don't know Databricks, but it seems the only option is to have a big AWS account and then use Databricks on that data. Can I download it locally for training and upskilling with Python, or is it only for cloud deployments, so that I have to pay $$$ to host data just to learn it?

7 REPLIES

mrbean
New Contributor II

If you need the platform purely for academic/learning purposes, I'd suggest having a look at Databricks Community Edition. This is different from the 14-day trial of the subscription-based production platform. For the Community Edition, you don't have to enter any AWS/Azure account or card details, and you get a free micro-cluster and a notebook environment to get started with Spark.


Prabakar
Esteemed Contributor III

You cannot run Databricks locally, as it is built on cloud platforms. To learn Databricks, use the Community Edition.

Hubert-Dudek
Esteemed Contributor III

Yes, use the community version. How to register is explained here: https://community.databricks.com/s/feed/0D53f00001ebEasCAE

If you want to install something locally, your only choice is to install Spark. It will not have all the functionality of Databricks, but it is good for learning. If you want to give it a try, here is the link to the Docker image: https://hub.docker.com/r/apache/spark
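
For practicing with Python against a local Spark install, something like the following may be a reasonable starting point. It is only a sketch, assuming PySpark has been installed with `pip install pyspark` and a Java runtime is available; the function names, app name, and sample rows are made up for illustration, and none of this covers Databricks-specific features.

```python
"""Minimal local PySpark sketch (assumes `pip install pyspark` and a Java runtime)."""


def names_older_than(spark, min_age):
    """Return the names of sample people older than min_age (hypothetical helper)."""
    # Made-up sample rows, purely for illustration.
    people = spark.createDataFrame(
        [("alice", 34), ("bob", 45), ("carol", 29)],
        ["name", "age"],
    )
    # Filter on the age column, sort by name, and pull the rows back to the driver.
    rows = people.filter(people.age > min_age).orderBy("name").collect()
    return [row["name"] for row in rows]


def run_local_demo():
    """Start a local Spark session, run the query once, and shut Spark down."""
    # Imported here so this file still loads on a machine without PySpark.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[*]")  # run on the local machine, using all available cores
        .appName("local-spark-practice")  # arbitrary name, an assumption
        .getOrCreate()
    )
    try:
        print(names_older_than(spark, 30))
    finally:
        spark.stop()  # release local resources even if the query fails
```

Calling `run_local_demo()` from a script or a Python shell should start Spark in local mode without any cloud account. The same DataFrame code should generally carry over to a Databricks notebook, where a `spark` session is already provided.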

Anonymous
Not applicable

Thanks for linking directly to the Docker image, @Hubert Dudek! And thanks for the info, @Prabakar Ammeappin and @Amit Nainawati 👍

@Andrew Schell​ Let us know if you have more questions! If not, choose a best answer in this thread and let us know how you get on!

OK, so the community version offers me compute time, for which I can't find the limits, but the storage is 15 GB per node, with a single Databricks node?

Sorry for mixing up my terms; I'm trying to choose between AWS, Databricks, and Google Cloud for a project.

Is there a Python tutorial you could link me to?

Hubert-Dudek
Esteemed Contributor III

OK, so the community version offers me compute time, for which I can't find the limits, but the storage is 15 GB per node, with a single Databricks node?

Yes, that is correct. However, if you want a version without those limits, you must use Databricks on Azure, AWS, or Google Cloud.

Kaniz
Community Manager

Hi @Andrew Schell, we haven't heard from you since the last response from @Hubert Dudek, and I was checking back to see if his suggestions helped you. If you found a solution of your own, please share it with the community, as it can be helpful to others.

Also, please don't forget to click on the "Select As Best" button whenever the information provided helps resolve your question.
