Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

ApplyInPandas failing at a particular grouped item

mbejarano89
New Contributor III

Hello,

I have code that runs a forecast for 21k items in parallel. It looks like this:

from statsmodels.tsa.exponential_smoothing.ets import ETSModel

def forward_forecast(data):
    # window_data, start_index and end_index come from code omitted from this post
    model = ETSModel(window_data, error='add', trend='add', seasonal=None)
    fitted_model = model.fit(disp=0)
    # Forecast the missing value
    forecast = fitted_model.forecast(steps=1).values[0]
    # Replace the missing value with the forecasted value
    data.loc[start_index:end_index-1, 'y_hat'] = forecast
    return data

result = data.groupBy("item").applyInPandas(forward_forecast, schema)

When I run this code with a couple of items it runs fine, but when I run it on all 21k items, it fails at one item with this error: "unsupported operand type(s) for -: 'NoneType' and 'int'"

I am trying to figure out how to troubleshoot this and find out which item my function is failing on.
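One way to narrow this down, as a minimal sketch rather than a definitive fix: wrap the grouped function so that any exception is re-raised with the item key in the message. The helper name `with_item_context` and its `key_col` parameter are hypothetical, and the sketch assumes the `item` column is present in each group's DataFrame. The error text is consistent with an expression such as `end_index-1` where `end_index` is `None`, but only the omitted code can confirm that.

```python
import functools


def with_item_context(func, key_col="item"):
    """Wrap a grouped-map function so a failure names the group it was processing.

    Hypothetical helper: the name and the key_col parameter are illustrative only.
    """
    @functools.wraps(func)
    def wrapper(pdf):
        # Each call receives one group, so the key column holds a single value.
        key = pdf[key_col].iloc[0] if len(pdf) else None
        try:
            return func(pdf)
        except Exception as exc:
            # Re-raise with the group key so it shows up in the Spark error trace.
            raise RuntimeError(
                f"{func.__name__} failed for {key_col}={key!r}: {exc!r}"
            ) from exc
    return wrapper


# Usage with the code from this post (not executed here):
# result = data.groupBy("item").applyInPandas(with_item_context(forward_forecast), schema)
```

Once the item is known, filtering the data down to that single item and calling `forward_forecast` on its pandas DataFrame directly should make it easier to inspect the intermediate values.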

Thanks

