07-06-2023 03:25 AM
Hi All,
Yesterday (05.07.2023) Databricks experienced a near-total outage in the West Europe and North Europe regions.
It took the full working day, but they posted an update in the late afternoon stating that a workaround had been put in place and service was back to 100%.
Unfortunately, there are now some new issues. My clusters and notebooks are accessible again and they start, but a lot of Python libraries are now failing to install due to "user is not the owner of the resource" - although I am the owner.
There are additional issues too, such as all my drivers being broken and no longer installed properly. I've tried a bunch of things (rough sketch of the commands below), for example:
- Reconfigure the dpkg database
- Force-install the package
- Remove the bad package
- Clean out unused packages
- Remove the leftover post-install files
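For reference, this is roughly what those steps look like when run from a notebook cell. It is only a sketch: the package name is a placeholder (I used the one that broke for me), and the exact commands will depend on what dpkg reports as broken on your cluster.

```python
import subprocess

# Placeholder package name - swap in whichever package dpkg reports as broken
PKG = "msodbcsql17"

steps = [
    "dpkg --configure -a",                             # reconfigure the dpkg database
    "apt-get install -f -y",                           # force/fix broken installs
    f"dpkg --remove --force-remove-reinstreq {PKG}",   # remove the bad package
    "apt-get autoremove -y && apt-get clean",          # clean out unused packages
    f"rm -f /var/lib/dpkg/info/{PKG}.postinst /var/lib/dpkg/info/{PKG}.prerm",  # drop leftover post-install files
]

for cmd in steps:
    print(f">>> {cmd}")
    # check=False so one failing step doesn't stop the others
    subprocess.run(["bash", "-c", f"sudo {cmd}"], check=False)
```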
Basically, service is back, but the libraries and drivers are broken on every cluster (it was all fine before the outage).
I am going to try cloning a cluster, running a script to install the libraries on the new cluster, and, if that works, doing the same for the other clusters.
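The reinstall script itself is nothing fancy. A minimal sketch of what I have in mind (the package list here is made up; substitute whatever your jobs actually need):

```python
import subprocess
import sys

# Hypothetical package list - replace with the libraries your jobs actually need
LIBS = ["pyodbc", "pandas", "requests"]

for lib in LIBS:
    # --force-reinstall so nothing half-broken left over from the outage gets reused
    subprocess.run(
        [sys.executable, "-m", "pip", "install", "--force-reinstall", lib],
        check=True,
    )
```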
I want to ask if anyone else is having this issue, and if there's a better way of resolving it. I also don't want to have to do this every time an outage happens.
The outage only affected Databricks in Azure.
Accepted Solutions
07-06-2023 05:12 AM
Dear All, I have resolved the issue by re-installing all libraries and drivers. The reason the drivers were causing me issues was that my runtime version had been updated without my knowledge. This is probably what caused all the issues in the first place.
I assume that, as Databricks implemented a "workaround mitigation" to bring service back to 100% following bad weather destroying fiber cables between datacenters, whatever their solution was seems to have changed my runtime version, causing compatibility issues for all my drivers and many of my libraries. They should warn us about this possibility in future outages, as our service was down longer than necessary. I hope this helps someone else if they experience the same thing.
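If you want to check whether your runtime version changed, this is the quick check I used from a notebook. The DATABRICKS_RUNTIME_VERSION environment variable should be set on the cluster; compare its value against what your cluster configuration says it should be.

```python
import os

# Compare this against the runtime version your cluster is configured to use
print(os.environ.get("DATABRICKS_RUNTIME_VERSION"))
```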
07-06-2023 03:33 AM
Additional info: the Python libraries can be reinstalled and then they load, so I have reinstalled them all and got my cluster loading with them.
The driver issues are trickier; an example of a driver that is not working is the msodbcsql package.
Error: ('01000', "[01000] [unixODBC][Driver Manager]Can't open lib 'ODBC Driver 17 for SQL Server' : file not found (0) (SQLDriverConnect)")
I have tried removing it, installing it, and updating it - I always get an error.
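In case it helps anyone else, this is the kind of fix I am experimenting with from a notebook: re-adding the Microsoft apt repository, reinstalling msodbcsql17, and then checking whether unixODBC can actually see the driver. It is only a sketch and assumes an Ubuntu 20.04 based runtime; the prod.list URL may need adjusting for your cluster.

```python
import subprocess

# Sketch only: re-add the Microsoft apt repo and reinstall the ODBC driver.
# Assumes an Ubuntu 20.04 based runtime - adjust the prod.list URL if yours differs.
install = """
set -e
curl -s https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add -
curl -s https://packages.microsoft.com/config/ubuntu/20.04/prod.list | sudo tee /etc/apt/sources.list.d/mssql-release.list
sudo apt-get update
sudo ACCEPT_EULA=Y apt-get install -y --reinstall msodbcsql17
"""
subprocess.run(["bash", "-c", install], check=True)

import pyodbc

# If the reinstall worked, 'ODBC Driver 17 for SQL Server' should appear in this list
print(pyodbc.drivers())
```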

