Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Can Databricks be used locally to learn it, or is it cloud only?

data_testing1
New Contributor III

I'm tired of telling clients or referrals that I don't know Databricks, but it seems like the only option is to have a big AWS account and then use Databricks on that data. Can I download it locally for training and upskilling with Python, or is it only for cloud deployments, so that I have to pay $$$ to host data and learn it?

6 REPLIES

mrbean
New Contributor II

If you need the platform purely for academic/learning purposes, I'd suggest having a look at Databricks Community Edition. This is different from the 14-day trial/subscription-based production platform. For the Community Edition, you don't have to enter any AWS/Azure account or card details, and you get a free micro-cluster and a notebook environment to get started with Spark.


Prabakar
Databricks Employee

You cannot run Databricks locally, as it is built on cloud platforms. To learn Databricks, use the Community Edition.

Hubert-Dudek
Esteemed Contributor III

Yes, use the Community Edition. Registration is explained here: https://community.databricks.com/s/feed/0D53f00001ebEasCAE

If you want to install something locally, your only choice is Apache Spark. It will not have all the functionality of Databricks, but it is good for learning. If you want to give it a try, here is the link to the Docker image: https://hub.docker.com/r/apache/spark
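For a first Python exercise against a local Spark install, something like the following may help. It is only a minimal sketch, assuming PySpark has been installed with `pip install pyspark` and a Java runtime is available; the function name `run_local_demo`, the app name, and the sample rows are made up for illustration and are not Databricks-specific.

```python
# Minimal local PySpark sketch (assumes `pip install pyspark` and a Java runtime).
# The function name, app name, and sample data are hypothetical.


def run_local_demo():
    """Count rows per language on a local Spark session.

    Returns a list of (language, count) tuples, or None if Spark cannot start.
    """
    try:
        from pyspark.sql import SparkSession
    except ImportError:
        return None  # PySpark is not installed

    try:
        spark = (
            SparkSession.builder
            .master("local[*]")          # run on this machine, using all cores
            .appName("local-learning")   # hypothetical app name
            .getOrCreate()
        )
    except Exception:
        return None  # most often: no Java runtime was found

    try:
        # Build a tiny DataFrame in memory instead of reading cloud storage.
        df = spark.createDataFrame(
            [("alice", "python"), ("bob", "scala"), ("carol", "python")],
            ["name", "language"],
        )
        rows = df.groupBy("language").count().orderBy("language").collect()
        return [(row["language"], row["count"]) for row in rows]
    finally:
        spark.stop()


if __name__ == "__main__":
    print(run_local_demo())
```

The same DataFrame code should run unchanged in a Databricks notebook, where a `spark` session is already provided, so local practice of this kind carries over.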

Anonymous
Not applicable

Thanks for linking directly to the Docker image, @Hubert Dudek! And thanks for the info, @Prabakar Ammeappin and @Amit Nainawati 👍

@Andrew Schell Let us know if you have more questions! If not, choose a best answer in this thread and let us know how you get on!

OK, so the community version offers me compute time, for which I can't find the limits, but the storage is 15 GB per node, with a single Databricks node?

Sorry for mixing my terms; I'm trying to choose between AWS, Databricks, and Google Cloud for a project.

Is there a Python tutorial you could link me to?

Hubert-Dudek
Esteemed Contributor III

OK, so the community version offers me compute time, for which I can't find the limits, but the storage is 15 GB per node, with a single Databricks node?

Yes, that is correct. However, if you want a version without those limits, you must use Databricks on Azure, AWS, or Google Cloud.
