We observed the following error in a notebook that was running as part of a Databricks workflow:
ModuleNotFoundError: No module named '<python package>'
The error message speaks for itself: it couldn't find the Python package. What is peculiar is that this is a library we had explicitly specified for installation at the job cluster level. And when we checked the job cluster settings of the failed run (via the "Edit Details" button under "Compute", then the "Libraries" tab), we verified that the Python package (Type "PyPI", for whatever that's worth) is indeed listed there.
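In case it's useful context: one check we could run from inside the notebook next time this happens is to ask the environment directly whether the distribution is installed at all. This is only a minimal diagnostic sketch; "<python package>" is just the placeholder from the error above, and note that a PyPI distribution name can differ from the module's import name:

    from importlib.metadata import version, PackageNotFoundError

    # Check whether the environment knows about the distribution at all,
    # independently of whether "import <python package>" succeeds.
    try:
        print(version("<python package>"))
    except PackageNotFoundError:
        print("distribution not found in this notebook's Python environment")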
We are using Databricks Runtime 14.2 (Apache Spark 3.5.0, Scala 2.12).
Our job runs daily and normally completes without issue, and it has also run fine since this failure, so the error appears to have been a one-off.
Has anyone else run into this issue? Is this a known issue with Databricks, or with distributed computing in general? Is there any way to prevent it?
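For completeness, the only mitigation we've thought of so far is a defensive guard at the top of the notebook that falls back to a notebook-scoped install if the cluster-level library didn't attach. This is just a sketch under the assumption that the cluster can reach PyPI; "<python package>" remains a placeholder, and again the import name may differ from the distribution name:

    import importlib.util
    import subprocess
    import sys

    # If the cluster-level library failed to attach, retry with a
    # notebook-scoped pip install before the first real import.
    if importlib.util.find_spec("<python package>") is None:
        subprocess.check_call(
            [sys.executable, "-m", "pip", "install", "<python package>"]
        )

We'd rather understand and fix the root cause than rely on a guard like this, though.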