07-05-2022 05:16 PM
I'm tired of telling clients or referrals that I don't know Databricks, but it seems like the only option is to have a big AWS account and then use Databricks on that data. Can I download it locally for training and upskilling with Python, or is it only for cloud deployments, meaning I have to pay $$$ to host data just to learn it?
07-05-2022 07:41 PM
If you need the platform purely for academic/learning purposes, I'd suggest having a look at Databricks Community Edition. This is different from the 14-day trial/subscription-based production platform. For the community edition, you don't have to key in any AWS/Azure account or your card details and you get a free micro-cluster and a notebook environment to get started with Spark.
07-06-2022 03:51 AM
You cannot run Databricks locally, as it is built on cloud platforms. To learn Databricks, use the Community Edition.
07-06-2022 05:58 AM
Yes, use the Community Edition. Registration is explained here: https://community.databricks.com/s/feed/0D53f00001ebEasCAE
If you want to install something locally, your only choice is to install Spark. It will not have all the functionality of Databricks, but it is good for learning. If you want to give it a try, here is the link to the Docker image: https://hub.docker.com/r/apache/spark
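As a rough idea of what practicing with a local Spark looks like from Python, here is a minimal sketch. It assumes you have installed PySpark yourself (for example with `pip install pyspark`, which also needs a Java runtime); the sample data and the function names are made up for illustration and are not part of Databricks or of the Docker image above. The plain-Python function is only there so you can compare the result with the Spark version.

```python
# Made-up sample data: (language, minutes_studied) pairs.
SAMPLE_ROWS = [("python", 30), ("sql", 20), ("python", 45), ("scala", 15)]


def total_minutes_plain(rows):
    """Pure-Python reference: total minutes per language."""
    totals = {}
    for language, minutes in rows:
        totals[language] = totals.get(language, 0) + minutes
    return totals


def total_minutes_spark(rows):
    """Same aggregation with the Spark DataFrame API on a local session.

    Assumes PySpark and a Java runtime are installed on this machine.
    """
    # Imported here so the plain-Python part works without PySpark.
    from pyspark.sql import SparkSession, functions as F

    # "local[*]" runs Spark inside this process using all local cores.
    spark = (
        SparkSession.builder
        .master("local[*]")
        .appName("local-practice")
        .getOrCreate()
    )
    try:
        df = spark.createDataFrame(rows, ["language", "minutes"])
        result = df.groupBy("language").agg(F.sum("minutes").alias("total"))
        return {row["language"]: row["total"] for row in result.collect()}
    finally:
        spark.stop()
```

Calling `total_minutes_spark(SAMPLE_ROWS)` should give the same totals as `total_minutes_plain(SAMPLE_ROWS)`. Most DataFrame code like this carries over to a Databricks notebook, where a `spark` session is already provided, so you would skip the session setup there.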
07-06-2022 04:09 PM
Thanks for linking directly to the Docker image, @Hubert Dudek! And thanks for the info, @Prabakar Ammeappin and @Amit Nainawati.
@Andrew Schell Let us know if you have more questions! If not, choose a best answer in this thread and let us know how you get on!
07-06-2022 04:57 PM
OK, so the Community Edition offers me compute time, for which I can't find the limits, but the storage is 15 GB per node, with one Databricks node?
Sorry for mixing my terms. I'm trying to choose between AWS, Databricks, and Google Cloud for a project.
Is there a python tutorial you could link me to?
07-07-2022 12:47 AM
OK, so the Community Edition offers me compute time, for which I can't find the limits, but the storage is 15 GB per node, with one Databricks node?
Yes, that is correct. However, if you want a version without those limits, you must go for Databricks on Azure, AWS, or Google Cloud.
07-07-2022 10:59 PM
Hi @Andrew Schell, we haven't heard from you since the last response from @Hubert Dudek, and I was checking back to see if his suggestions helped you. If you have found a solution of your own, please share it with the community, as it can be helpful to others.
Also, please don't forget to click on the "Select As Best" button whenever the information provided helps resolve your question.