Community Platform Discussions

Installing R packages for a custom Docker container for compute

BenCCC
New Contributor

Hi,

I'm trying to create a custom Docker image with some R packages pre-installed. However, when I try to use it in a notebook, it can't seem to find the installed packages. The build runs fine.

FROM databricksruntime/rbase:14.3-LTS

## update system libraries
RUN apt-get update && \
apt-get upgrade -y && \
apt-get clean

RUN apt-get -y --no-install-recommends install \
libxml2-dev \
libcairo2-dev \
libsqlite3-dev \
libpq-dev \
libssl-dev \
libcurl4-openssl-dev \
libssh2-1-dev \
unixodbc-dev \
libglpk40


RUN R -e 'install.packages(c(\
"devtools", \
"tidyverse",\
"sparklyr", \
"tidyquant", \
"plotly", \
"quantmod", \
"RTL" \
), \
dependencies = TRUE, \
repos = "https://packagemanager.rstudio.com/cran/__linux_/jammy/latest" \
)'

1 REPLY

Kaniz_Fatma
Community Manager
Hi @BenCCC,

Here are a few things you can check:

  1. Package Installation in Dockerfile:

    • In your Dockerfile, you’re using the RUN R -e 'install.packages(...)' command to install R packages. This approach works, but install.packages() only issues a warning when a package fails to install, so the Docker build can succeed even if some packages are missing.
    • Consider using the install2.r script (part of littler, shipped with the rocker-org images). These images are specifically designed for R and include additional features like error handling dur...
    • Example using rocker/tidyverse image:
      FROM rocker/tidyverse:latest
      # Install R packages
      RUN install2.r --error \
          methods \
          jsonlite \
          tseries
      
    • The --error flag ensures that the build fails if any package installation fails.
  2. Base Image:

  3. Check Package Availability:

    • Verify that the packages you’re installing are available in the specified repository. Sometimes, a package might not be found, leading to installation issues.
    • Double-check the repository URL you’re using in your Dockerfile. In your case, it’s https://packagemanager.rstudio.com/cran/__linux_/jammy/latest.
    • Consider trying a standard CRAN mirror (repos = 'https://cran.rstudio.com/') to rule out a problem with the repository.
  4. Docker Build Logs:

    • Review the build logs when creating the Docker image. Look for any warnings or errors related to package installation.
    • If there are any issues, they should be visible in the logs.
  5. Run R Script via Docker:

    • If you’re still facing issues, consider running an R script inside the container to install packages. This way, you can see any error messages directly.
    • For example:
      FROM rocker/tidyverse:latest
      # Copy your R script into the container
      COPY my_script.R /path/to/script.R
      # Run the script
      CMD ["Rscript", "/path/to/script.R"]
      
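To make the checks above visible in the build itself, one option is to add a verification step after the install.packages() line. The following is only a minimal sketch, assuming the package list from the question and that Rscript is on the PATH in the base image. It prints the library paths R uses inside the image and stops the build if any expected package cannot be loaded.

```dockerfile
# Sketch only: append these steps after the install.packages() step.
# Assumption: the package names mirror the list in the question's Dockerfile.

# Print the library paths R uses inside the image so they appear in the build log.
RUN Rscript -e 'print(.libPaths())'

# Fail the build (non-zero exit status) if any expected package is missing.
RUN Rscript -e 'pkgs <- c("devtools", "tidyverse", "sparklyr", "tidyquant", "plotly", "quantmod", "RTL"); \
    ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE); \
    if (!all(ok)) stop("Missing packages: ", paste(pkgs[!ok], collapse = ", "))'
```

If this step passes but the notebook still cannot find the packages, running .libPaths() in the notebook and comparing it with the paths printed during the build may show whether the notebook is looking in a different library directory.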

Remember to rebuild your Docker image after making any changes to the Dockerfile. Hopefully, one of these approaches will help resolve the issue with package installation in your custom Docker image. If you encounter any specific errors, feel free to share them, and we can dive deeper into troubleshooting! 😊🐳📦

 
