
Ray as a cluster library instead of notebook-scoped library

Fed
New Contributor III

This article rightly suggests installing `ray` with `%pip`, but it fails to mention that installing it as a cluster library won't work.

The reason, I think, is that `setup_ray_cluster` uses `sys.executable` (i.e. `/local_disk0/.ephemeral_nfs/envs/pythonEnv-{UUID}/bin/python`) to run `start_ray_node.py`, which in turn calls the `ray` executable.

If `ray` is installed with `%pip`, its executable ends up in the same folder as `sys.executable`, so everything works fine. If `ray` is installed as a cluster library (i.e. in `/local_disk0/.ephemeral_nfs/cluster_libraries/python`), the executable lives in a different `bin` folder and `start_ray_node.py` won't find it.
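One way to test this hypothesis is to check whether a `ray` launcher actually sits next to the interpreter. The sketch below is only a diagnostic: the helper name `launcher_next_to_interpreter` is made up for illustration, and it assumes the launcher is a plain file called `ray` in the same directory as `sys.executable`.

```python
import os
import sys


def launcher_next_to_interpreter(name="ray", python_executable=None):
    """Return the path of `name` in the interpreter's directory, or None.

    `python_executable` defaults to sys.executable; passing it explicitly
    makes the helper easy to try against other environments.
    """
    python_executable = python_executable or sys.executable
    candidate = os.path.join(os.path.dirname(python_executable), name)
    return candidate if os.path.isfile(candidate) else None


# Prints a path if the launcher is beside the interpreter, otherwise None.
print(launcher_next_to_interpreter("ray"))
```

If this prints `None` on a cluster where `ray` was installed as a cluster library, that would be consistent with the explanation above, although it does not prove it.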

I tried adding the cluster library `bin` folder to `sys.path`, but that didn't work:

import sys
 
sys.path.append("/local_disk0/.ephemeral_nfs/cluster_libraries/python/bin")
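Part of why this has no effect is probably that `sys.path` is the module import search path, whereas executables are looked up through the `PATH` environment variable. The sketch below illustrates the difference on Linux with a throwaway directory and a fake executable; the name `fake_ray_tool` is hypothetical and has nothing to do with Ray itself.

```python
import os
import shutil
import stat
import sys
import tempfile

# Create a throwaway directory containing a fake executable (hypothetical name).
bin_dir = tempfile.mkdtemp()
tool = os.path.join(bin_dir, "fake_ray_tool")
with open(tool, "w") as f:
    f.write("#!/bin/sh\necho hello\n")
os.chmod(tool, os.stat(tool).st_mode | stat.S_IXUSR)

# sys.path only influences `import`; executable lookup ignores it.
sys.path.append(bin_dir)
found_via_sys_path = shutil.which("fake_ray_tool")

# PATH is what executable lookup consults.
os.environ["PATH"] = bin_dir + os.pathsep + os.environ.get("PATH", "")
found_via_path = shutil.which("fake_ray_tool")

print(found_via_sys_path, found_via_path)
```

Note that fixing `PATH` alone may not help here either, if the launcher is resolved relative to `sys.executable` as suspected above.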

And some more debugging (in a new session):

import subprocess
import sys
import os
 
print("/local_disk0/.ephemeral_nfs/cluster_libraries/python/bin" in sys.path)  # False
print("/local_disk0/.ephemeral_nfs/cluster_libraries/python/bin" in os.environ["PATH"])  # True
print(subprocess.run(["ray", "--version"], capture_output=True).stdout.decode("utf-8"))  # ray, version 2.3.0
 
 
 

Accepted Solutions

Fed
New Contributor III

Ugly, but this seems to work for now:

import sys
import os
import shutil
from ray.util.spark import setup_ray_cluster, shutdown_ray_cluster
 
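# Copy the `ray` launcher from the cluster-library bin directory into the
# directory that holds sys.executable, where setup_ray_cluster appears to look for it.
shutil.copy(
    "/local_disk0/.ephemeral_nfs/cluster_libraries/python/bin/ray",
    os.path.dirname(sys.executable),
)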
 
setup_ray_cluster(
  num_worker_nodes=4,
  num_cpus_per_node=8,
  collect_log_to_path="/dbfs/ray/logs"
)
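A slightly more defensive variant of the same workaround, as a sketch only: it skips the copy when a launcher is already present and fails loudly when the source file is missing. The helper name `ensure_launcher` is hypothetical, and nothing here has been verified beyond the copy itself.

```python
import os
import shutil
import sys


def ensure_launcher(src, python_executable=None):
    """Copy `src` next to the interpreter unless a file with that name exists.

    Returns the destination path. `python_executable` defaults to
    sys.executable.
    """
    python_executable = python_executable or sys.executable
    dest = os.path.join(os.path.dirname(python_executable), os.path.basename(src))
    if not os.path.exists(dest):
        if not os.path.isfile(src):
            raise FileNotFoundError(f"launcher not found: {src}")
        shutil.copy(src, dest)  # copies file contents and permission bits
    return dest
```

It would be called as `ensure_launcher("/local_disk0/.ephemeral_nfs/cluster_libraries/python/bin/ray")` before `setup_ray_cluster`. Running the cell twice is then harmless, because the second call finds the existing file and does nothing.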

