01-14-2025 12:20 AM
Hi,
I need to ingest and transform historical climate data into a Delta table. The data is stored in NetCDF (.nc) format. Working with this format from Python requires native C libraries, along with specific versions of Python libraries (e.g., numpy).
On my local machine, I resolved this using Anaconda, which installed the necessary libraries (xarray, netCDF4) and handled all dependencies seamlessly.
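For reference, the transformation itself is simple once the libraries are available. Here is a minimal sketch of what I intend to run on the cluster; the file path and table name are placeholders, and `spark` is the SparkSession Databricks provides in notebooks:

```python
# Minimal sketch, assuming the .nc files are already on a cluster-accessible path.
import xarray as xr

nc_path = "/dbfs/tmp/climate/sample.nc"       # hypothetical location
delta_table = "climate.historical_weather"    # hypothetical target table

# Open the NetCDF file with xarray (relies on the netCDF4 bindings and native C libraries)
ds = xr.open_dataset(nc_path)

# Flatten the labelled N-dimensional arrays into a tabular pandas DataFrame
pdf = ds.to_dataframe().reset_index()

# Hand the data over to Spark and persist it as a Delta table
sdf = spark.createDataFrame(pdf)
sdf.write.format("delta").mode("overwrite").saveAsTable(delta_table)
```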
However, I'm running into issues trying to achieve the same on a Databricks cluster.
I came across the Databricks Container Service, which seems to allow customizing the cluster environment with custom Docker images:
https://docs.databricks.com/en/compute/custom-containers.html#enable
Is this the only way to install xarray and netCDF4 and to upgrade pre-installed libraries? Are there alternative approaches that handle this without compromising the cluster's stability?
Any help or guidance would be much appreciated!
Thanks!
01-14-2025 02:14 AM
Using a custom container is generally the most stable and flexible approach: all dependencies, including the native C libraries, are managed inside the image and do not interfere with the cluster's pre-installed runtime.
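Once the cluster is running on the custom image, a quick sanity check from a notebook confirms that the versions baked into the container are the ones actually imported (a minimal sketch; adjust to whichever libraries you pin):

```python
# Run in a notebook cell after the cluster starts on the custom image
# to verify the packaged libraries and versions are picked up.
import numpy
import xarray
import netCDF4

print("numpy:", numpy.__version__)
print("xarray:", xarray.__version__)
print("netCDF4:", netCDF4.__version__)
```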
01-14-2025 02:46 AM
Thanks! Will proceed with custom containers then.
01-14-2025 04:38 AM
Great, please let us know if any further assistance is needed.