09-14-2021 11:00 AM
Hello,
How are you? I hope you are doing well!
I'm trying to use a Databricks image (link: containers/ubuntu/R at master · databricks/containers (github.com)) to run a container when starting a cluster. I need RStudio to be installed in the container. Although the cluster starts just fine, I can't access RStudio.
I think I am meeting the requirements: I disabled table access control, automatic termination, and credential passthrough, and I am running the init script specified in the README.md file on GitHub.
Can you help me? Thank you!
11-10-2021 09:33 AM
Hi @Francesco Camussoni, I tested this with the rbase image and yes, I don't see RStudio enabled for the cluster.
However, I have built an image that works perfectly with RStudio:
prabakar2610/rbase14:v2
You can use this image with the init script that you already have.
You can find the Dockerfile on GitHub:
https://github.com/prabakar2610/Databricks/blob/master/dockerfile
09-15-2021 05:44 AM
Thank you Kaniz! 🙂
09-20-2021 12:56 AM
@Francesco Camussoni: have you tried the steps here: https://docs.databricks.com/spark/latest/sparkr/rstudio.html
In the script you shared, I see that you use RSTUDIO_BIN directly, but I do not see anywhere that you download and install RStudio (which is required).
To check the result of the init script, enable logging on the cluster. That way you can see the stdout and stderr of the init script from all nodes. Note that because $DB_IS_DRIVER is used, the script is expected to run fine only on the driver node, and the worker nodes will report a failure. This is expected, so if you see an error saying that DB_IS_DRIVER is missing, it just means you are looking at a worker node's log.
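As a rough illustration of the check described above, here is a small diagnostic sketch that could be adapted into an init script. It is not the script from the README. The default path /usr/sbin/rstudio-server and the function name rstudio_status are assumptions made for this example. Only DB_IS_DRIVER and RSTUDIO_BIN come from the discussion above.

```shell
#!/usr/bin/env bash
# Diagnostic sketch only: reports what an init script would see on this node.
# The default path below is an assumption; adjust it to match your image.
RSTUDIO_BIN="${RSTUDIO_BIN:-/usr/sbin/rstudio-server}"

# rstudio_status is a hypothetical helper name, not part of any Databricks tooling.
# It prints the node role and whether the given binary exists and is executable.
rstudio_status() {
  local bin="$1"
  local role="worker"
  # ${DB_IS_DRIVER:-} expands to an empty string when the variable is unset,
  # so the test does not abort even if the script runs with "set -u".
  if [ "${DB_IS_DRIVER:-}" = "TRUE" ]; then
    role="driver"
  fi
  if [ -x "$bin" ]; then
    echo "$role: found $bin"
  else
    echo "$role: missing $bin"
  fi
}

rstudio_status "$RSTUDIO_BIN"
```

With cluster logging enabled, the printed line should show up in the init script's stdout for each node. A "missing" line on the driver would suggest that the image does not actually contain RStudio Server at the path the script expects.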
10-25-2021 10:43 AM
Hi noob, RStudio is installed by the Dockerfile. In fact, the init script I used is the one specified on Databricks' GitHub: containers/ubuntu/R at master · databricks/containers (github.com)
11-10-2021 09:56 AM
11-10-2021 10:11 AM
Hi @Francesco Camussoni, I built the image with only the RStudio app in mind. To use the same image to run commands from a notebook, I might need to add the packages that are required to run R commands from the notebook. Let me try this week to build an image that supports running commands in the notebook as well.
11-10-2021 10:17 AM
Thank you a lot! 🙂
11-17-2021 06:40 AM
Hi @Francesco Camussoni, apologies for the delay.
Here is the working image: prabakar2610/rbase14:v3
And the dockerfile for the same.
11-24-2021 09:52 AM
Hi @Francesco Camussoni, I wanted to follow up on this. Is this image working as expected? And were you able to build your own image using the Dockerfile that I shared?
11-24-2021 11:07 AM
Hello @Prabakar Ammeappin, it worked just fine.
Thank you very much!
11-24-2021 12:33 PM
Hello @Francesco Camussoni, thank you for the confirmation. It would be great if you could mark it as the Best Answer, so that the question is marked as answered, appears at the top of search results, and benefits other users.
11-24-2021 09:53 AM
If the issue is resolved, would you be happy to mark the answer as best so that others can quickly find the solution in the future?