06-01-2023 10:32 AM
What, there has been no answer here?! @Debayan Mukherjee @Vartika Nain
I am running into this same problem. The idea of having to wait 45 minutes for libraries to install is absolutely wild, and I have tried everything short of working with the Docker container.
FROM databricksruntime/standard:9.x
# based on these instructions (avoiding firewall issue for some users):
# https://cran.rstudio.com/bin/linux/ubuntu/#secure-apt
RUN apt-get update \
  && DEBIAN_FRONTEND="noninteractive" apt-get install --yes software-properties-common apt-transport-https \
  && gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9 \
  && gpg -a --export E298A3A825C0D65DFD57CBB651716619E084DAB9 | sudo apt-key add - \
  && add-apt-repository 'deb https://cloud.r-project.org/bin/linux/ubuntu bionic-cran40/' \
  && apt-get update \
  && DEBIAN_FRONTEND="noninteractive" apt-get install --yes \
    libssl-dev \
    r-base \
    r-base-dev \
  && add-apt-repository -r 'deb https://cloud.r-project.org/bin/linux/ubuntu bionic-cran40/' \
  && apt-key del E298A3A825C0D65DFD57CBB651716619E084DAB9 \
  && apt-get clean \
  && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
# UPDATE A SERIES OF PACKAGES
# RUN apt-get update --fix-missing && apt-get install -y ca-certificates libglib2.0-0 libxext6 libsm6 libxrender1 libxml2-dev
# hwriterPlus is used by Databricks to display output in notebook cells
# Rserve allows Spark to communicate with a local R process to run R code
# shiny is used by Databricks interpreter
RUN R -e "install.packages(c('hwriter', 'TeachingDemos', 'htmltools'))"
RUN R -e "install.packages('https://cran.r-project.org/src/contrib/Archive/hwriterPlus/hwriterPlus_1.0-3.tar.gz', repos=NULL, type='source')"
RUN R -e "install.packages('Rserve', repos='http://rforge.net/', type='source')"
RUN R -e "install.packages('shiny', repos='https://cran.rstudio.com/')"
# Added packages for the project that I am currently working on
RUN R -e "install.packages(c('sparklyr', 'remotes', 'plyr', 'dplyr', 'rlist', 'stringr', 'ggplot2', 'patchwork', 'scales', 'Robyn', 'reticulate'))"
# Install nevergrad Python package
RUN python3 -m pip install nevergrad
RUN R -e "library(reticulate); reticulate::py_config()"
RUN R -e "install.packages('devtools', repos='https://cran.rstudio.com/')"
RUN R -e "remotes::install_github('mlflow/mlflow', subdir = 'mlflow/R/mlflow')"
I went with the runtime image because there is a use case for MLflow. I get hit by the Stan issues as well as the issues with mlflow being installed.
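Since `install.packages()` only emits a warning when a package fails to install, a build like the one above can finish successfully while packages are actually missing. Here is a minimal sketch of a build-time check that could be appended to the Dockerfile. It is not part of the original image, and the package list is an assumption to adapt to your project:

```dockerfile
# Hypothetical check (not in the original Dockerfile): stop the build if any
# expected R package is missing. The package list here is an assumption.
RUN R -e "pkgs <- c('sparklyr', 'Robyn', 'reticulate', 'mlflow'); missing <- setdiff(pkgs, rownames(installed.packages())); if (length(missing) > 0) stop('Missing R packages: ', paste(missing, collapse = ', '))"
```

Because `stop()` makes the non-interactive R process exit with a non-zero status, the `RUN` step fails, so the problem shows up during `docker build` rather than later on the cluster.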
It is very clear that R isn't well supported in Databricks: there was a resolved issue whose fix was never merged into main, and it was last updated 10 months ago.
@Navneet Sonak, let me know if you end up solving this with the Docker image. I would be super grateful.