03-16-2023 01:22 PM
Using the install_libraries API, I installed a custom Python whl file on a running cluster. For certain types of requests, we have a requirement to install a different version of the same custom whl file in the running cluster. My problem is that uninstalling the previous version does not take effect until the cluster is restarted. So when I install the new version of the library, is there a way for me to force the cluster to use the "newly installed version" instead of the "uninstalled, pending restart" version?
03-24-2023 11:37 PM
@Priya K :
When you install a custom library using the install_libraries API in Databricks, the installed version is cached on the worker nodes until the cluster is restarted. Uninstalling the library will remove it from the cache, but it will still be loaded in memory by any running tasks until the cluster is restarted.
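As a rough illustration of the "uninstalled, pending restart" state described above, the sketch below builds the request bodies for swapping one wheel for another and picks out libraries still waiting on a restart from a cluster-status response. It assumes the Databricks Libraries REST endpoints (`/api/2.0/libraries/uninstall`, `/api/2.0/libraries/install`, `/api/2.0/libraries/cluster-status`) and the `UNINSTALL_ON_RESTART` status value. The function names and the wheel paths are hypothetical, and no HTTP call is made here.

```python
def swap_payloads(cluster_id, old_whl, new_whl):
    """Build JSON bodies for uninstalling one wheel and installing another.

    The bodies are meant for POST /api/2.0/libraries/uninstall and
    POST /api/2.0/libraries/install respectively (assumed endpoints).
    """
    uninstall_body = {"cluster_id": cluster_id, "libraries": [{"whl": old_whl}]}
    install_body = {"cluster_id": cluster_id, "libraries": [{"whl": new_whl}]}
    return uninstall_body, install_body


def pending_uninstall(status_response):
    """Return the libraries that are only marked for removal at the next restart.

    `status_response` is the parsed JSON of
    GET /api/2.0/libraries/cluster-status?cluster_id=... (assumed shape).
    """
    return [
        entry["library"]
        for entry in status_response.get("library_statuses", [])
        if entry.get("status") == "UNINSTALL_ON_RESTART"
    ]
```

If `pending_uninstall` returns the old wheel, the cluster has not actually dropped it yet, which matches the behaviour described in the question.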
To force the cluster to use the newly installed version of the library, you can try a few options:
However, keep in mind that these options may have some limitations and risks depending on your use case. For example, reloading the module or restarting the Python interpreter may cause some inconsistencies or conflicts with the existing tasks running on the cluster. Additionally, creating a new cluster may increase your overall costs and may not be feasible for all use cases.
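For the module-reload approach mentioned above, a minimal sketch using only the Python standard library might look like the following. The helper name `force_reimport` is hypothetical. It only affects the Python process it runs in (for example the notebook's driver process), not Python workers on other nodes, and objects created from the old version keep their old classes.

```python
import importlib
import sys


def force_reimport(package_name):
    """Forget a package and its submodules, then import it again from disk."""
    # Collect the package itself plus anything imported from inside it.
    stale = [
        name
        for name in sys.modules
        if name == package_name or name.startswith(package_name + ".")
    ]
    for name in stale:
        del sys.modules[name]
    # Make the import system re-scan directories that changed on disk.
    importlib.invalidate_caches()
    return importlib.import_module(package_name)


# Hypothetical usage after the new wheel has been installed:
# my_pkg = force_reimport("my_pkg")
```

Dropping the entries from `sys.modules` rather than calling `importlib.reload` on the top-level module is a deliberate choice here: `reload` re-executes only the one module it is given, so submodules would otherwise stay on the old version.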
03-27-2023 09:15 AM
Thank you for the response. Are there any ways to mitigate the risks of options 2 and 3? Is it possible to check whether the existing tasks have completed and the cluster is in an idle state before we attempt to reload the module or restart the interpreter?
04-05-2023 10:19 AM
@Suteja Kanuri Any thoughts on the above question?
04-05-2023 09:14 PM
@Priya K :
Yes, there are some ways to mitigate the risks of options 2 and 3:
Regarding your second question, it is possible to check whether the existing tasks are completed and the cluster is idle before attempting to reload the module or restart the interpreter. One way to do this is by monitoring the active task count and cluster utilization using the Databricks REST API or the Databricks CLI. You can also use the Databricks Jobs API to schedule the module reload or interpreter restart during a specific time window when there are no running tasks on the cluster.
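One way to approximate the idle check described above is sketched below. It assumes the Clusters API (`/api/2.0/clusters/get`) and the Jobs API (`/api/2.1/jobs/runs/list` with `active_only`), and that each run reports its cluster under `cluster_instance.cluster_id`. The helper names are hypothetical. It sees only job runs, not interactive notebook commands, and it ignores pagination of the runs list, so treat it as a starting point rather than a guarantee of idleness.

```python
import json
import urllib.parse
import urllib.request


def api_get(host, token, path, params):
    """Issue an authenticated GET against the workspace and parse the JSON reply."""
    url = f"{host}{path}?{urllib.parse.urlencode(params)}"
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def runs_on_cluster(runs_response, cluster_id):
    """Filter a runs/list response down to the runs attached to one cluster."""
    return [
        run
        for run in runs_response.get("runs", [])
        if run.get("cluster_instance", {}).get("cluster_id") == cluster_id
    ]


def looks_idle(cluster_info, runs_response, cluster_id):
    """True when the cluster is up and no active job run is attached to it."""
    is_running = cluster_info.get("state") == "RUNNING"
    return is_running and not runs_on_cluster(runs_response, cluster_id)


def check_cluster(host, token, cluster_id):
    """Fetch cluster state and active runs (first page only), then apply looks_idle."""
    cluster = api_get(host, token, "/api/2.0/clusters/get", {"cluster_id": cluster_id})
    runs = api_get(host, token, "/api/2.1/jobs/runs/list", {"active_only": "true"})
    return looks_idle(cluster, runs, cluster_id)
```

A caller could poll `check_cluster` and only trigger the reload or interpreter restart once it returns `True`, accepting that a new task may still arrive between the check and the restart.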
04-06-2023 05:21 PM
@Suteja Kanuri Thanks for your suggestions. Please combine both of your responses, and I will mark that as the best answer.
04-06-2023 06:45 PM
@Priya K : Glad that my suggestions are helping you! You could go ahead and mark both as the best answers 🙂 This works just fine.
03-25-2023 03:49 AM
Hi @Priya K
Thank you for posting your question in our community! We are happy to assist you.
To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?
This will also help other community members who may have similar questions in the future. Thank you for your participation and let us know if you need any further assistance!