11-14-2024 11:42 AM - edited 11-14-2024 11:44 AM
Hello @TX-Aggie-00,
To install LibreOffice consistently on your Databricks cluster without relying on internet access at startup (downloads can fail intermittently), you can download the necessary packages ahead of time and store them in a Unity Catalog volume or a workspace location. Here’s a step-by-step guide:
- Download the Packages:
- On a local machine, download the .deb packages for LibreOffice, python3-uno, and poppler-utils from a reliable source such as the official repositories or a trusted mirror.
- Upload the Packages to Unity Catalog or Workspace:
- Upload the downloaded .deb files to a Unity Catalog volume or a workspace location (DBFS). You can use the Databricks UI or the Databricks CLI to upload these files. For example, you can use the following CLI command to upload to a Unity Catalog volume:
databricks fs cp local_path_to_deb_file dbfs:/Volumes/your_catalog/your_schema/your_volume/
- Modify the Init Script:
- Update your init script to install the packages from the local volume instead of downloading them from the internet. Here’s an example of how your init script might look:
#!/bin/bash
# Unity Catalog volumes are mounted at /Volumes on cluster nodes (no /dbfs prefix).
echo "----------------INIT SCRIPT---------------"
echo "----------------Installing libreoffice---------------"
dpkg -i /Volumes/your_catalog/your_schema/your_volume/libreoffice.deb
echo "----------------Installing python3-uno---------------"
dpkg -i /Volumes/your_catalog/your_schema/your_volume/python3-uno.deb
echo "----------------Installing poppler-utils---------------"
dpkg -i /Volumes/your_catalog/your_schema/your_volume/poppler-utils.deb
echo "----------INIT SCRIPT COMPLETE------------"
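
In practice LibreOffice pulls in a number of dependency packages, so you may end up with more than three .deb files in the volume. As a rough sketch (not a tested Databricks recipe), the init script could install every .deb it finds in the volume directory with a single dpkg call and stop with an error if the directory or the packages are missing. The function name install_debs and the INSTALL_CMD variable are hypothetical names made up for this example, and it assumes every required dependency .deb has been uploaded next to the three packages above.

```shell
#!/bin/bash
# Sketch only: install_debs and INSTALL_CMD are hypothetical names for this example.

install_debs() {
  local dir="$1"
  # INSTALL_CMD can be overridden (for example with "echo") for a dry run.
  local installer="${INSTALL_CMD:-dpkg -i}"

  if [ ! -d "$dir" ]; then
    echo "Package directory not found: $dir" >&2
    return 1
  fi

  # Collect every .deb in the directory; nullglob leaves the array empty if none match.
  local debs=()
  shopt -s nullglob
  debs=("$dir"/*.deb)
  shopt -u nullglob

  if [ "${#debs[@]}" -eq 0 ]; then
    echo "No .deb files found in $dir" >&2
    return 1
  fi

  echo "Installing ${#debs[@]} package(s) from $dir"
  # Pass all files to one dpkg call so packages that depend on each other
  # can be configured together instead of failing one by one.
  $installer "${debs[@]}"
}
```

To use it, add a call such as install_debs /Volumes/your_catalog/your_schema/your_volume at the end of the init script. Running it with INSTALL_CMD=echo only prints the dpkg arguments, which is a cheap way to check the volume contents from a notebook terminal before restarting the cluster.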