
ApplyInPandas failing at a particular grouped item

mbejarano89
New Contributor III

Hello,

I have code that runs a forecast for 21k items in parallel. It looks like this:

from statsmodels.tsa.exponential_smoothing.ets import ETSModel  # assuming statsmodels' ETSModel

def forward_forecast(data):
    # window_data, start_index and end_index come from code not shown in this post
    model = ETSModel(window_data, error='add', trend='add', seasonal=None)
    fitted_model = model.fit(disp=0)
    # Forecast the missing value
    forecast = fitted_model.forecast(steps=1).values[0]
    # Replace the missing value with the forecasted value
    data.loc[start_index:end_index-1, 'y_hat'] = forecast
    return data

result = data.groupBy("item").applyInPandas(forward_forecast, schema)

When I run this code with a couple of items it runs fine, but when I run it on all 21k items, it fails at one item with this error: "unsupported operand type(s) for -: 'NoneType' and 'int'"

I am trying to figure out how to troubleshoot this and find out which item my function is failing on.

Thanks
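
One possible way to find the failing item, as a minimal sketch rather than a definitive fix: wrap the per-group function so that any exception is re-raised with the group's key in the message. The names `with_item_context` and `key_col` are hypothetical helpers introduced here, not part of PySpark or pandas. The error text is consistent with a value such as `end_index` being `None` for that group (as in `end_index-1`), but that is only a guess from the snippet.

```python
import pandas as pd


def with_item_context(func, key_col="item"):
    """Wrap a per-group pandas function so a failure names the group key.

    Hypothetical helper: `key_col` is assumed to be the grouping column.
    """
    def wrapper(pdf: pd.DataFrame) -> pd.DataFrame:
        # Every row in a group shares the same key, so the first row is enough.
        key = pdf[key_col].iloc[0] if len(pdf) else None
        try:
            return func(pdf)
        except Exception as exc:
            # Re-raise with the key attached; the original error is chained.
            raise RuntimeError(
                f"{func.__name__} failed for {key_col}={key!r} "
                f"({len(pdf)} rows): {exc}"
            ) from exc
    return wrapper


# Assumed usage with the code from the post (requires a Spark session):
# result = data.groupBy("item").applyInPandas(
#     with_item_context(forward_forecast), schema
# )
```

The wrapped error message should then show up in the Python traceback of the failed Spark job. Once the item is known, it may be easier to debug locally by pulling just that group to the driver, for example with `data.filter(data.item == bad_item).toPandas()` (where `bad_item` is the key from the message), and calling `forward_forecast` on it directly.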
