
How can I specify a custom CRAN mirror to be used permanently by default when installing packages within R Notebooks?

niklas
Contributor

When installing Notebook-scoped R libraries I don't want to manually specify the custom CRAN mirror each time like this:

install.packages("diffdf", repos="my_custom_cran_url")

Instead, I want the custom CRAN mirror URL to be used by default, so that I don't have to specify it each time:

install.packages("diffdf")

Normally this is done by adjusting the .Rprofile or Rprofile.site files. Unfortunately, those files only take effect in RStudio sessions and not in the SparkR sessions of R notebooks.
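For reference, the usual Rprofile.site approach looks roughly like the sketch below. The mirror URL is a placeholder (an assumption, not my real URL), and this only helps in sessions that actually read the profile file:

```r
# Sketch of an Rprofile.site / .Rprofile entry that sets the default CRAN mirror.
# "https://my-custom-cran.example.com" is a placeholder URL, not a real mirror.
local({
  repos <- getOption("repos")                         # keep any other configured repos
  repos["CRAN"] <- "https://my-custom-cran.example.com"
  options(repos = repos)                              # install.packages() defaults to getOption("repos")
})
```

Running getOption("repos") in a notebook cell shows which mirror a session is actually using, and file.path(R.home("etc"), "Rprofile.site") gives the location where R looks for the site-wide profile.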

After some trial and error, I figured out that specifying the default CRAN mirror URL under /databricks/spark/R/lib/SparkR/R/SparkR works as desired. However, I can't update this file automatically via a cluster-scoped init script, because lib/SparkR/R/SparkR doesn't exist yet when the init script runs (the path/file is somehow built dynamically at a later time).

Unfortunately I couldn't find any useful information for this particular use case on the internet. Does anyone have an idea?

1 ACCEPTED SOLUTION

Accepted Solutions

niklas
Contributor

I got a solution to this problem on Stack Overflow: https://stackoverflow.com/a/76777228/18082636


2 REPLIES

Anonymous
Not applicable

Hi @Niklas Letz

Great to meet you, and thanks for your question!

Let's see if your peers in the community have an answer to your question. Thanks.
