Multiple versions of custom libraries on the cluster

priyak
New Contributor III

Using the install_libraries API, I installed a custom Python whl file on a running cluster. For certain types of requests, we have a requirement to install a different version of the same custom whl file in the running cluster. My problem is that uninstalling the previous version does not take effect until the cluster is restarted. So when I install the new version of the library, is there a way for me to force the cluster to use the "newly installed version" instead of the "uninstalled, pending restart" version?
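For reference, the install step described above can be driven with a small REST call against the Libraries API (POST /api/2.0/libraries/install). This is a hedged sketch: the host, token, cluster ID, and wheel path are placeholders, not values from this thread.

```python
import json
import urllib.request

def build_install_payload(cluster_id, whl_path):
    """Request body for POST /api/2.0/libraries/install."""
    return {"cluster_id": cluster_id, "libraries": [{"whl": whl_path}]}

def install_whl(host, token, cluster_id, whl_path):
    """Install a wheel on a running cluster via the Libraries API."""
    body = json.dumps(build_install_payload(cluster_id, whl_path)).encode()
    req = urllib.request.Request(
        f"{host}/api/2.0/libraries/install",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:  # raises on non-2xx status
        return resp.status

# Hypothetical usage -- substitute your workspace URL, token, cluster ID,
# and DBFS path to the wheel:
# install_whl("https://example.cloud.databricks.com", "dapi-REDACTED",
#             "0123-456789-abcdef", "dbfs:/FileStore/wheels/mylib-2.0-py3-none-any.whl")
```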

1 ACCEPTED SOLUTION


Anonymous
Not applicable

@Priya K​ :

When you install a custom library using the install_libraries API in Databricks, the wheel is loaded onto the cluster's nodes and cached there. Uninstalling the library only marks it for removal: the uninstall does not take effect, and the old version remains loaded for any running tasks, until the cluster is restarted.

To force the cluster to use the newly installed version of the library, you can try a few options:

  1. Use a new cluster: Create a new cluster and install the new version of the library there. This guarantees the new version is used without any conflict with the old one.
  2. Reload the module: If the new version is already installed and you want to pick it up without restarting the cluster, reload the module with importlib.reload(). Python re-executes the module, so subsequent calls use the new code.
  3. Restart the Python interpreter: You can also restart the Python interpreter, for example with os.execv(), so that the fresh interpreter imports the new version on startup.
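As a self-contained illustration of option 2, the snippet below uses a throwaway module written to a temp directory as a stand-in for the custom wheel (the name mylib is hypothetical). After the code on disk changes, importlib.reload() makes the running interpreter pick up the new version:

```python
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True  # skip .pyc caching so reload always reads source

# Create a throwaway module standing in for the custom library.
tmp = tempfile.mkdtemp()
pathlib.Path(tmp, "mylib.py").write_text("__version__ = '1.0'\n")
sys.path.insert(0, tmp)

import mylib
assert mylib.__version__ == "1.0"

# Simulate installing a new wheel by changing the code on disk...
pathlib.Path(tmp, "mylib.py").write_text("__version__ = '2.0'\n")

# ...then reload: the running interpreter now sees the new version.
importlib.reload(mylib)
assert mylib.__version__ == "2.0"
```

Note that objects already created from the old module (classes, functions held by running tasks) keep their old definitions after a reload, which is one source of the inconsistencies the caveat below describes.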

However, keep in mind that these options may have some limitations and risks depending on your use case. For example, reloading the module or restarting the Python interpreter may cause some inconsistencies or conflicts with the existing tasks running on the cluster. Additionally, creating a new cluster may increase your overall costs and may not be feasible for all use cases.


7 REPLIES


priyak
New Contributor III

Thank you for the response. Are there any ways to mitigate the risks of options 2 and 3? Is it possible to check whether the existing tasks have completed and the cluster is in an idle state before we attempt to reload the module or restart the interpreter?

priyak
New Contributor III

@Suteja Kanuri​ Any thoughts on the above question?

Anonymous
Not applicable

@Priya K​ :

Yes, there are some ways to mitigate the risks of options 2 and 3:

  1. To mitigate the risks of option 2 (reloading the module), limit the reload to a point in your code where no running tasks still use the old version of the library. You can also wrap the reload in a try-except block to catch any exception raised during the process and retry later, once the cluster is idle.
  2. To mitigate the risks of option 3 (restarting the Python interpreter), schedule the restart for a maintenance window when no tasks are running on the cluster. The same try-except-and-retry approach as in option 2 applies to the restart.

Regarding your second question, it is possible to check whether the existing tasks are completed and the cluster is idle before attempting to reload the module or restart the interpreter. One way to do this is by monitoring the active task count and cluster utilization using the Databricks REST API or the Databricks CLI. You can also use the Databricks Jobs API to schedule the module reload or interpreter restart during a specific time window when there are no running tasks on the cluster.
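As a sketch of that idle check: the Jobs API runs/list endpoint (GET /api/2.1/jobs/runs/list?active_only=true) reports active runs, which can be filtered by cluster ID. Field names such as cluster_instance should be verified against the Jobs API docs, and the host and token are placeholders:

```python
import json
import urllib.request

def runs_on_cluster(runs, cluster_id):
    """Filter a Jobs API runs list down to runs executing on one cluster."""
    return [
        r for r in runs
        if r.get("cluster_instance", {}).get("cluster_id") == cluster_id
    ]

def cluster_is_idle(host, token, cluster_id):
    """True when no active job runs are attached to the given cluster."""
    req = urllib.request.Request(
        f"{host}/api/2.1/jobs/runs/list?active_only=true",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        runs = json.load(resp).get("runs", [])
    return not runs_on_cluster(runs, cluster_id)
```

Note that this only covers job runs; interactive notebook commands on the cluster would need a separate check.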

priyak
New Contributor III

@Suteja Kanuri​ Thanks for your suggestions. Please combine both of your responses, and I will mark that as the best answer.

Anonymous
Not applicable

@Priya K​ : Glad that my suggestions are helping you! You could go ahead and mark both as the best answers 🙂 This works just fine.

Anonymous
Not applicable

Hi @Priya K​ 

Thank you for posting your question in our community! We are happy to assist you.

To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?

This will also help other community members who may have similar questions in the future. Thank you for your participation and let us know if you need any further assistance! 
