Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Training Job Failure (Driver Error)

jonathanhodges
New Contributor II

We have a new model training job that was running fine for a few days and then started failing. I have attached images for more details.

I am wondering if 'can't reach driver cluster' is a red herring. It says the driver is healthy right before execution.

When I look into the logs, it looks like a library problem, potentially with NumPy.

ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
Traceback (most recent call last):
from pandas._libs.interval import Interval
File "pandas/_libs/interval.pyx", line 1, in init pandas._libs.interval

 

Has anyone seen this before, or does anyone have ideas or suggestions?

4 REPLIES

jonathanhodges
New Contributor II

Sorry, here is the full stack trace and one additional screenshot:

ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
Traceback (most recent call last):
File "/databricks/python_shell/scripts/db_ipykernel_launcher.py", line 37, in <module>
from dbruntime.PipMagicOverrides import PipMagicOverrides
File "/databricks/python_shell/dbruntime/PipMagicOverrides.py", line 8, in <module>
from pyspark.sql.connect.session import SparkSession as RemoteSparkSession
File "/databricks/spark/python/pyspark/sql/connect/session.py", line 19, in <module>
check_dependencies(__name__)
File "/databricks/spark/python/pyspark/sql/connect/utils.py", line 33, in check_dependencies
require_minimum_pandas_version()
File "/databricks/spark/python/pyspark/sql/pandas/utils.py", line 27, in require_minimum_pandas_version
import pandas
File "/databricks/python/lib/python3.10/site-packages/pandas/__init__.py", line 22, in <module>
from pandas.compat import is_numpy_dev as _is_numpy_dev
File "/databricks/python/lib/python3.10/site-packages/pandas/compat/__init__.py", line 15, in <module>
from pandas.compat.numpy import (
File "/databricks/python/lib/python3.10/site-packages/pandas/compat/numpy/__init__.py", line 4, in <module>
from pandas.util.version import Version
File "/databricks/python/lib/python3.10/site-packages/pandas/util/__init__.py", line 1, in <module>
from pandas.util._decorators import ( # noqa:F401
File "/databricks/python/lib/python3.10/site-packages/pandas/util/_decorators.py", line 14, in <module>
from pandas._libs.properties import cache_readonly # noqa:F401
File "/databricks/python/lib/python3.10/site-packages/pandas/_libs/__init__.py", line 13, in <module>
from pandas._libs.interval import Interval
File "pandas/_libs/interval.pyx", line 1, in init pandas._libs.interval

Charuvil
New Contributor II

@jonathanhodges By any chance did you manage to solve this issue?

We are also having the same issue.

jonathanhodges
New Contributor II

In our case, we needed to correct our dependent libraries. We had an incorrect path referenced.
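One way to check which copy of a library a job would pick up is to print where each package resolves from before importing it. The snippet below is a minimal sketch and not something posted in this thread: `module_origin` is a hypothetical helper name, and it relies only on the standard library's `importlib.util.find_spec`, which locates a top-level package without importing it (so it should not trigger the binary incompatibility error itself).

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None if it is not found.

    Hypothetical helper for illustration; `name` is assumed to be a top-level
    package name such as "numpy" or "pandas".
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # Compare these paths with the site-packages paths shown in the traceback.
    for pkg in ("numpy", "pandas"):
        print(pkg, "->", module_origin(pkg))
```

If the two packages resolve from different locations than expected, that may point at a misconfigured dependent library path like the one described above.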

Same in my case as well. It was related to NumPy version 2. Downgrading NumPy fixed the issue. The error messages from Databricks were pretty confusing.
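A quick way to see whether NumPy 2 has ended up in the environment is to read the installed version from package metadata rather than importing NumPy or pandas. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the version string is assumed to start with a plain integer major version, and whether a `numpy<2` pin is the right fix depends on which NumPy the installed pandas was built against, which this thread does not state.

```python
from importlib import metadata


def major_version(version_string):
    """Parse the leading major number from a version string such as '2.0.1'."""
    return int(version_string.split(".")[0])


def is_numpy_2_or_later(version_string):
    """Return True when the given NumPy version string has major version 2 or higher."""
    return major_version(version_string) >= 2


def installed_numpy_version():
    """Read the installed NumPy version from package metadata without importing NumPy."""
    try:
        return metadata.version("numpy")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_numpy_version()
    if found is None:
        print("numpy is not installed in this environment")
    elif is_numpy_2_or_later(found):
        # Assumption: the installed pandas expects NumPy 1.x, as reported in this thread.
        print(f"numpy {found}: consider pinning numpy<2 in the job's libraries")
    else:
        print(f"numpy {found}: major version is below 2")
```

If this reports a 2.x version, pinning the dependency in the job's library list (for example with the requirement string `numpy<2`) matches the fix reported here.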
