I am trying to update the libraries on a Databricks cluster by uninstalling the existing libraries and installing new ones. When I make an API call to uninstall the libraries and then restart the cluster, the libraries first show as being uninstalled, but after the cluster returns to the Running state, the uninstalled libraries are installed on the cluster again. I have seen the same behavior when I uninstall and restart from the Databricks UI. The behavior is intermittent: in some instances the libraries stay uninstalled.
Notes:
1. There are no init_scripts on the cluster.
2. "is_library_for_all_clusters" is false for these libraries in the libraries/cluster-status response.
3. This happens for both Python and Scala libraries.
4. The runtime in use is 12.2 LTS.
As you can see in the attached screenshots, I have 3 libraries installed initially. I uninstall both PyPI libraries and restart. After the restart, one of the libraries is installed on the cluster again.
Maybe you can add a delay between uninstalling the libraries and restarting the cluster:
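import time

uninstall_libraries()  # placeholder for your call to the libraries uninstall API
time.sleep(60)  # wait for the uninstall request to register before restarting
restart_cluster()  # placeholder for your call to restart the cluster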
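If a fixed sleep turns out to be unreliable, a possible alternative is to poll the library status and restart only once every removed library is either gone or flagged for removal. The sketch below is a guess at how that could look, not a confirmed fix for the intermittent behavior. It assumes that each entry returned by libraries/cluster-status has a "library" and a "status" field and that a pending removal is reported as "UNINSTALL_ON_RESTART"; please check both against the API reference for your workspace. The names library_key, wait_for_uninstall_mark, and fetch_statuses are hypothetical helpers, not part of any Databricks SDK.

```python
import time
from typing import Callable, Iterable

# Assumed status name for a library that will be removed on the next restart.
PENDING_REMOVAL = "UNINSTALL_ON_RESTART"


def library_key(library: dict) -> str:
    """Build a comparable key such as 'pypi:requests' or 'jar:dbfs:/x.jar'."""
    kind, spec = next(iter(library.items()))
    if isinstance(spec, dict):
        # PyPI/CRAN entries carry 'package'; Maven entries carry 'coordinates'.
        spec = spec.get("package") or spec.get("coordinates") or str(spec)
    return f"{kind}:{spec}"


def wait_for_uninstall_mark(
    fetch_statuses: Callable[[], list],
    targets: Iterable[str],
    timeout_s: float = 300.0,
    interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until every target is gone or marked for removal on restart.

    Returns True when it is reasonable to restart, False on timeout.
    """
    targets = set(targets)
    deadline = time.monotonic() + timeout_s
    while True:
        # Libraries that are still listed and not yet flagged for removal.
        still_active = {
            library_key(entry["library"])
            for entry in fetch_statuses()
            if entry.get("status") != PENDING_REMOVAL
        } & targets
        if not still_active:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

Here fetch_statuses would be a small function of your own that calls libraries/cluster-status for the cluster and returns the list of per-library status entries. You would then call restart_cluster() only when wait_for_uninstall_mark(...) returns True, and log the remaining libraries otherwise.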