Louis_Frolio
Databricks Employee

Hi @NW1000 ,

Glad you tried my suggestion, and thanks for sharing the details.

1. Why the init script failed

This message:

Init script failure: Cluster scoped init script ... failed: Script exit status is non-zero

really just means that something inside the bash script returned a non-zero exit code during cluster startup. In other words, the script hit an error and stopped.

The real clue will be in the init script log.

Here is where I would look:

  • Open the cluster details

  • Go to the Event Log or driver logs

  • Find the init script log file

For the script we were discussing, it should be something like:

/tmp/init-r-libs.log

Once you open that log, scroll to the bottom and look for the first real error message. That is usually where the root cause shows up.
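If you have the log available from a web terminal or a %sh cell, a quick way to surface that line is to search for it rather than scroll. This is only a sketch: first_error is a hypothetical helper name, and the search patterns are my guesses at what a typical failure line contains, so adjust them to your script.

```shell
# Hypothetical helper: print the first line (with its line number) that
# looks like an error. The patterns are assumptions, not a fixed list.
first_error() {
  grep -n -i -m 1 -E 'error|failed|cannot' "$1"
}

# Example usage (log path as named above):
#   first_error /tmp/init-r-libs.log
#   tail -n 20 /tmp/init-r-libs.log
```

The tail call is there for the last few lines of context, which is usually what is worth pasting back into the thread.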

In most cases, it tends to be one of these:

  • a typo in a path, such as the Volume path or script path

  • missing execute permissions on the script, which you can fix with:

    chmod +x init-script-RLib.sh

  • an R command inside the script failing, such as install.packages() returning an error, which will cause the whole script to exit non-zero

Once you have the last few lines from that log, it should be much easier to pinpoint exactly what failed and tighten up the script accordingly.
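To make the next failure easier to read, one option is to have the script record which step failed and with what exit code. Here is a minimal sketch of that idea, not the script we discussed: run_step is a hypothetical helper, and the log path simply reuses the example name from above.

```shell
#!/bin/bash
# Assumed log location, matching the example name used earlier in this post.
LOG="${LOG:-/tmp/init-r-libs.log}"

# Hypothetical helper: run one command, append its output to the log,
# and note the exit code in the log if the command fails.
run_step() {
  local desc="$1"
  shift
  local rc=0
  echo "== ${desc}" >> "$LOG"
  "$@" >> "$LOG" 2>&1 || rc=$?
  if [ "$rc" -ne 0 ]; then
    echo "FAILED (exit ${rc}): ${desc}" >> "$LOG"
  fi
  return "$rc"
}

# Example usage inside an init script (the R script path is a placeholder):
#   run_step "install R packages" Rscript /path/to/install-packages.R || exit 1
```

With this pattern the cluster still fails to start when a step fails, but the log names the step instead of leaving you to infer it.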

2. About the default CRAN / Posit Package Manager URL

Yes, the URL you are seeing in the Libraries UI, something like:

https://databricks.packagemanager.posit.co/cran/__linux__/noble/2025-03-20/

is the Databricks-managed Posit Package Manager snapshot used by Databricks runtimes for R packages.

A few important things to know here:

  • Databricks pins R libraries to a specific CRAN snapshot, in this case 2025-03-20, so installs remain reproducible and stable

  • In the __linux__/<codename>/2025-03-20 portion, <codename> reflects the underlying Ubuntu release, such as jammy or noble, and the date is the snapshot

  • Databricks determines that automatically from the runtime OS for newer runtimes, including 17.x and above

  • That URL is intended to be used as the repos= value in install.packages(), not really as a browser-friendly page

  • So if you paste it into a browser and get something like “Invalid request,” that is not necessarily a problem; it can be expected behavior
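If you want to check which Ubuntu codename and snapshot date a given URL encodes, the two trailing path segments can be pulled apart with plain shell parameter expansion. This is just an illustrative sketch; parse_ppm_url is a hypothetical helper, and it assumes the URL ends in .../<codename>/<date> as in the example above.

```shell
# Hypothetical helper: print "<codename> <snapshot-date>" for a URL that
# ends in .../<codename>/<date>, with or without a trailing slash.
parse_ppm_url() {
  local url="${1%/}"           # drop a trailing slash if present
  local snapshot="${url##*/}"  # last path segment, e.g. 2025-03-20
  local rest="${url%/*}"       # everything before the last segment
  local codename="${rest##*/}" # segment before the date, e.g. noble
  printf '%s %s\n' "$codename" "$snapshot"
}

# Example usage:
#   parse_ppm_url "https://databricks.packagemanager.posit.co/cran/__linux__/noble/2025-03-20/"
```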

 

If you want your own scripts to follow the same pattern across runtimes, the safest approach is to detect the OS codename dynamically and construct the URL from there, like this:

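# Ubuntu codename of the runtime OS, e.g. "jammy" or "noble"
release <- system("lsb_release -c --short", intern = TRUE)
# Snapshot date to pin to; change this to match the snapshot you want
snapshot_date <- "2025-03-20"

options(
  # Identify the R version and platform so the package manager can serve
  # matching Linux binaries
  HTTPUserAgent = sprintf(
    "R/%s R (%s)",
    getRversion(),
    paste(getRversion(), R.version["platform"], R.version["arch"], R.version["os"])
  ),
  # Build the repository URL from the detected codename and the snapshot date
  repos = paste0(
    "https://databricks.packagemanager.posit.co/cran/__linux__/",
    release, "/", snapshot_date
  )
)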

That way:

  • if the runtime is on jammy, it uses .../__linux__/jammy/2025-03-20

  • if it is on noble, it uses .../__linux__/noble/2025-03-20

That mirrors how Databricks handles the default CRAN configuration internally.
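If the install is driven from the bash init script rather than from an R session, the same construction can be sketched in shell. Treat this as an assumption-laden example: build_repo_url is a hypothetical helper, and the package name in the usage comment is only a placeholder.

```shell
# Hypothetical helper: build the snapshot URL from an Ubuntu codename
# and a snapshot date.
build_repo_url() {
  local codename="$1"
  local snapshot="$2"
  printf 'https://databricks.packagemanager.posit.co/cran/__linux__/%s/%s\n' \
    "$codename" "$snapshot"
}

# Example usage inside an init script (package name is a placeholder):
#   codename=$(lsb_release -c --short)
#   repo=$(build_repo_url "$codename" "2025-03-20")
#   Rscript -e "install.packages('data.table', repos = '${repo}')"
```

Keeping the date in one variable means a snapshot bump is a one-line change.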

Hope this helps, Louis.

 


 
