
Adding extra libraries to Databricks (rosbag)

PiotrU
Contributor

Hello

I have an interesting challenge: I need to install a few libraries that are part of the rosbag packages, to support some data deserialization tasks.

When creating the cluster, I use an init script that installs this software with apt:

sudo apt update && sudo apt install -y curl gnupg2 lsb-release
sudo curl -sSL https://raw.githubusercontent.com/ros/rosdistro/master/ros.asc | sudo apt-key add -
sudo sh -c 'echo "deb [arch=amd64] http://packages.ros.org/ros2/ubuntu $(lsb_release -cs) main" > /etc/apt/sources.list.d/ros2-latest.list'

sudo apt update
sudo apt install -y ros-humble-ros-base ros-humble-rclpy ros-humble-std-msgs python3-argcomplete

/databricks/python/bin/pip install mcap
echo "source /opt/ros/humble/setup.bash" >> ~/.bashrc
source /opt/ros/humble/setup.bash

PYTHON_VERSION=$(/databricks/python3/bin/python3 --version | cut -d' ' -f2 | cut -d'.' -f1-2)
echo "export PYTHONPATH=\$PYTHONPATH:/opt/ros/humble/lib/python${PYTHON_VERSION}/site-packages" >> ~/.bashrc
export PYTHONPATH=$PYTHONPATH:/opt/ros/humble/lib/python${PYTHON_VERSION}/site-packages

 

 

 

 
That part works fine. The trouble starts when I try to use it, whether on the master or a worker node. I picked rclpy as an example library: whenever I run it via "sh" and start by sourcing the variables, it works fine:
[Screenshot: lakime_0-1717597430889.png]

However, when I try to do the same thing natively in a notebook, it does not find the libraries:

[Screenshot: lakime_1-1717597470819.png]

I have a script that reads the variables generated by /opt/ros/humble/setup.bash:

import subprocess
import os

def source_ros_setup():
    # Source the ROS 2 setup script and capture the environment variables
    command = ['bash', '-c', 'source /opt/ros/humble/setup.bash && env']
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, universal_newlines=True)
    env_vars = {}
    for line in proc.stdout:
        key, _, value = line.partition("=")
        env_vars[key] = value.strip()
    proc.communicate()

    # Update os.environ with the new environment variables
    os.environ.update(env_vars)

# Source the ROS 2 setup script
source_ros_setup()

# Verify environment variables (Optional)
print("PYTHONPATH:", os.environ.get('PYTHONPATH'))
print("PATH:", os.environ.get('PATH'))
print("LD_LIBRARY_PATH:", os.environ.get('LD_LIBRARY_PATH'))

# Now run your main code
try:
    import rclpy
    print("rclpy is installed and accessible")
    # Your main ROS 2 related code goes here
except ModuleNotFoundError:
    print("rclpy is not found, ensure the correct PYTHONPATH is set")

 

 

 

But that does not work either. It prints:

"rclpy is not found, ensure the correct PYTHONPATH is set"

Any ideas?

Accepted Solutions

PiotrU
Contributor

Issue mostly solved:

import sys

# Directory where the ROS 2 Humble apt packages put their Python modules
library_path = "/opt/ros/humble/local/lib/python3.10/dist-packages"

# Add it to the module search path of the running interpreter, only once
if library_path not in sys.path:
    sys.path.append(library_path)

That solves the issue of Databricks not finding the libraries. Loading the library still poses some challenges, but they are not related to paths.
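
A likely reason the earlier os.environ approach had no effect: Python reads PYTHONPATH only once, at interpreter startup, so changing the variable inside a running notebook does not alter sys.path. The sketch below combines both ideas from this thread, assuming bash is available on the node. It sources a setup script in a child shell, reads back the resulting environment, and appends each existing PYTHONPATH directory to sys.path. The helper names env_after_sourcing and extend_sys_path are made up for illustration and are not part of ROS or Databricks.

```python
import json
import os
import subprocess
import sys


def env_after_sourcing(setup_script):
    """Return the environment (as a dict) that results from sourcing setup_script."""
    # Source the script in a child bash, then let Python dump the environment
    # as JSON so values containing '=' or newlines are parsed correctly.
    bash_code = (
        'source "$1" >/dev/null && '
        'exec "$2" -c "import json, os; print(json.dumps(dict(os.environ)))"'
    )
    result = subprocess.run(
        ["bash", "-c", bash_code, "bash", setup_script, sys.executable],
        check=True,
        stdout=subprocess.PIPE,
        text=True,
    )
    return json.loads(result.stdout)


def extend_sys_path(env):
    """Append every existing PYTHONPATH directory from env to sys.path."""
    added = []
    for entry in env.get("PYTHONPATH", "").split(os.pathsep):
        if entry and os.path.isdir(entry) and entry not in sys.path:
            sys.path.append(entry)
            added.append(entry)
    return added


# Path taken from this thread; the guard keeps the sketch harmless elsewhere.
ROS_SETUP = "/opt/ros/humble/setup.bash"
if os.path.exists(ROS_SETUP):
    print("added to sys.path:", extend_sys_path(env_after_sourcing(ROS_SETUP)))
```

Compared with a hard-coded path, this picks up whatever directories the setup script exports, but it only fixes the Python module search path. Shared libraries are a separate matter.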


6 REPLIES

shan_chandra
Esteemed Contributor

@PiotrU - Can you please check whether there are any errors in the driver logs regarding this library installation? Do you have the required access to install as a sudo user? It may require a password (it did when I tried locally):

sudo apt install -y ros-humble-ros-base ros-humble-rclpy ros-humble-std-msgs python3-argcomplete

 

There are no issues when I install the packages.
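
One way to confirm from a notebook where the packages actually landed is to search the ROS prefix for the package directory and compare the result with sys.path. This is a minimal sketch, assuming the /opt/ros/humble prefix used in this thread. The helper name find_package_dirs is hypothetical. ROS 2 Humble targets Python 3.10, so the last line also prints the notebook's Python version for comparison.

```python
import glob
import os
import sys


def find_package_dirs(root, package):
    """Return the directories under root that directly contain the package."""
    pattern = os.path.join(root, "**", package, "__init__.py")
    hits = glob.glob(pattern, recursive=True)
    # The directory to put on sys.path is the parent of the package directory.
    return sorted({os.path.dirname(os.path.dirname(hit)) for hit in hits})


for directory in find_package_dirs("/opt/ros/humble", "rclpy"):
    status = "on sys.path" if directory in sys.path else "missing from sys.path"
    print(directory, "->", status)

print("notebook Python: %d.%d" % sys.version_info[:2])
```

If nothing is printed before the version line, the package was not found under that prefix on the node running the cell.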


amandaK
New Contributor II

@PiotrU did adding the path to sys.path resolve all of your ModuleNotFoundErrors? I'm trying to do something similar. Adding the path to sys.path resolved the ModuleNotFoundError for rclpy, but I continue to see others related to ROS.

amandaK
New Contributor II

After working on it a bit, I was able to get rid of all the ModuleNotFoundErrors, but I can't figure out how to resolve this issue:

ImportError: librcl_action.so: cannot open shared object file: No such file or directory

Did you happen to run into this as well? 
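
One possible explanation, offered as an assumption rather than a confirmed diagnosis: on Linux the dynamic loader reads LD_LIBRARY_PATH when the process starts, so setting it from inside an already running notebook does not help the loader find librcl_action.so. A workaround sometimes used in that situation is to preload the shared libraries by absolute path with ctypes before importing the Python package. The sketch below is untested on Databricks, and this thread does not report a fix for the error. The directory /opt/ros/humble/lib and the helper name preload_shared_libraries are assumptions.

```python
import ctypes
import glob
import os


def preload_shared_libraries(lib_dir):
    """Try to load every shared library in lib_dir into the current process.

    Libraries can depend on each other, so failed loads are retried in
    further passes until a pass makes no progress.
    Returns (loaded_paths, failed_paths).
    """
    pending = sorted(glob.glob(os.path.join(lib_dir, "*.so*")))
    loaded = []
    while pending:
        failed = []
        for path in pending:
            try:
                # RTLD_GLOBAL exposes the symbols to libraries loaded later.
                ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)
                loaded.append(path)
            except OSError:
                failed.append(path)
        if len(failed) == len(pending):
            break  # no progress in this pass, stop retrying
        pending = failed
    return loaded, pending


# Assumed location of the ROS 2 Humble shared libraries.
ROS_LIB_DIR = "/opt/ros/humble/lib"
if os.path.isdir(ROS_LIB_DIR):
    loaded, failed = preload_shared_libraries(ROS_LIB_DIR)
    print("loaded:", len(loaded), "failed:", len(failed))
```

Loading every library in the directory is heavy-handed. Once the missing names are known, a narrower variant could load only those.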


@amandaK wrote:

@PiotrU did adding the path to sys.path resolve all of your ModuleNotFoundErrors? I'm trying to do something similar. Adding the path to sys.path resolved the ModuleNotFoundError for rclpy, but I continue to see others related to ROS.


 

Nope. Also, in the end I completely dropped using the ROS environment on Databricks.
