Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Databricks rbase container: RStudio doesn't work

francescocamuss
New Contributor III

Hello,

How are you? I hope you are doing well!

I'm trying to use a Databricks image (link: containers/ubuntu/R at master · databricks/containers (github.com)) to run a container when starting a cluster. I need RStudio to be installed in the container. Although the cluster starts just fine, I can't access RStudio.

I think I meet the requirements: I disabled table access control, automatic termination, and credential passthrough, and I am running the init script specified in the README.md file on GitHub.

Can you help me? Thank you!

Accepted Solution

Prabakar
Databricks Employee

Hi @Francesco Camussoni, I tested this with the rbase image and yes, I don't see RStudio enabled for the cluster.

However, I have built an image that works perfectly with RStudio:

prabakar2610/rbase14:v2

You can use this image with the init script that you have.

You can find the Dockerfile on GitHub:

https://github.com/prabakar2610/Databricks/blob/master/dockerfile
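For readers who create the cluster through the Clusters API rather than the UI, the setup discussed in this thread (custom image plus init script, with automatic termination disabled) might look roughly like the sketch below. The field names `docker_image`, `init_scripts`, `cluster_log_conf`, and `autotermination_minutes` are Clusters API fields as far as I know; the cluster name, the `<spark-version>` and `<node-type>` placeholders, and both DBFS paths are assumptions made up for illustration, not values from this thread. `print_cluster_spec` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch only: prints a Clusters API request body for a container-based cluster.
# Placeholders in angle brackets must be replaced with values valid for your workspace.

print_cluster_spec() {
  local image="$1"        # Docker image URL, e.g. the image shared in this thread
  local init_script="$2"  # hypothetical DBFS location of the RStudio init script
  cat <<EOF
{
  "cluster_name": "rstudio-container-test",
  "spark_version": "<spark-version>",
  "node_type_id": "<node-type>",
  "num_workers": 1,
  "autotermination_minutes": 0,
  "docker_image": { "url": "${image}" },
  "init_scripts": [ { "dbfs": { "destination": "${init_script}" } } ],
  "cluster_log_conf": { "dbfs": { "destination": "dbfs:/cluster-logs" } }
}
EOF
}

# Example: write the spec to a file that can be reviewed before it is submitted.
print_cluster_spec "prabakar2610/rbase14:v2" "dbfs:/databricks/scripts/rstudio-start.sh" > cluster-spec.json
```

Setting `autotermination_minutes` to 0 corresponds to the "automatic termination disabled" requirement mentioned in the question, and `cluster_log_conf` makes the init script output available for inspection.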


12 REPLIES

francescocamuss
New Contributor III

Thank you Kaniz! 🙂

User16873042947
New Contributor II

@Francesco Camussoni: have you tried the steps here: https://docs.databricks.com/spark/latest/sparkr/rstudio.html

In the script you shared, I see you use RSTUDIO_BIN directly, but I do not see anywhere that RStudio is downloaded and installed, which is required.

To check the result of the init script, enable logging on the cluster; that way you can see the init script's stdout and stderr from all nodes. Note that because the script uses $DB_IS_DRIVER, it is expected to run successfully only on the driver node, and the worker nodes will report failure. This is expected, so if the logs show an error saying DB_IS_DRIVER is missing, it just means you are looking at a worker node's log.
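The driver-only behaviour described above can be illustrated with a small guard. This is a sketch, not the script from the Databricks repository: `rstudio_action` is a hypothetical helper, and `/usr/sbin/rstudio-server` is an assumption about where the image installs the binary. Reading `DB_IS_DRIVER` with an empty default means the script does not abort when the variable is unset.

```shell
#!/usr/bin/env bash
# Sketch of a driver-only RStudio init script guard (assumptions noted inline).

# Assumed install location of RStudio Server; adjust to match your image.
RSTUDIO_BIN="${RSTUDIO_BIN:-/usr/sbin/rstudio-server}"

# Prints the action this node should take:
#   skip-worker    - not the driver, nothing to do
#   missing-binary - driver, but RStudio Server is not installed in the image
#   start          - driver with RStudio Server available
rstudio_action() {
  local is_driver="${1:-}"
  local bin="$2"
  if [[ "$is_driver" != "TRUE" ]]; then
    echo "skip-worker"
  elif [[ ! -x "$bin" ]]; then
    echo "missing-binary"
  else
    echo "start"
  fi
}

action="$(rstudio_action "${DB_IS_DRIVER:-}" "$RSTUDIO_BIN")"
case "$action" in
  start)
    # Only reached on the driver when the binary exists.
    "$RSTUDIO_BIN" restart || true
    ;;
  missing-binary)
    echo "RStudio Server not found at $RSTUDIO_BIN; install it in the image first" >&2
    ;;
  skip-worker)
    echo "Not the driver node; skipping RStudio start" >&2
    ;;
esac
```

With cluster logging enabled, the stderr messages from the `missing-binary` and `skip-worker` branches make it easier to tell a worker log from a real failure on the driver.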

Hi noob, RStudio is installed in the Dockerfile. In fact, the init script I used is the one specified on Databricks' GitHub: containers/ubuntu/R at master · databricks/containers (github.com)

francescocamuss
New Contributor III

Hi @Prabakar Ammeappin, thank you very much for your help! Can you tell me what you did to solve this problem?

As you said, RStudio works perfectly, but I see that it doesn't work in notebooks. Do you know why?

Prabakar
Databricks Employee

Hi @Francesco Camussoni, I built the image with only the RStudio app in mind. To use the same image to run commands from a notebook, I may need to add the packages required to run R commands from a notebook. Let me try this week to build an image that also supports running commands in notebooks.

Thank you a lot! 🙂

Prabakar
Databricks Employee

Hi @Francesco Camussoni, apologies for the delay.

Here is the working image: prabakar2610/rbase14:v3

And the Dockerfile for the same.

Prabakar
Databricks Employee

Hi @Francesco Camussoni, I wanted to follow up on this. Is this image working as expected? And were you able to build your own image using the Dockerfile that I shared?

Hello @Prabakar Ammeappin, it worked just fine.

Thank you very much!

Hello @Francesco Camussoni, thank you for the confirmation. It would be great if you could mark it as the Best Answer, so that the question is marked as answered, appears at the top of search results, and benefits other users.

Prabakar
Databricks Employee

If the issue is resolved, would you be happy to mark the answer as best so that others can quickly find the solution in the future?
