Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

IndexOutOfBoundsException thrown in PySpark

Santhanalakshmi
New Contributor II

Hello All,

I am trying to read data, group it, and pass each group to the model's predict function through a @F.pandas_udf grouped-map function.

import pickle

from pyspark.sql import functions as F, types as T

# Load the pickled model on the driver; it is captured by the UDF closure below.
with open(filepath, 'rb') as f:
    pkl_model = pickle.load(f)

# Build the output schema: every input column plus the two prediction columns.
filter_schema = [
    T.StructField("anomaly_prediction", T.IntegerType(), True),
    T.StructField("anomaly_score", T.DoubleType(), True),
]
return_schema = T.StructType(df.schema.fields + filter_schema)

@F.pandas_udf(return_schema, F.PandasUDFType.GROUPED_MAP)
def inferdata(data):
    # data is a pandas DataFrame holding all rows of one 'filename' group
    dt = data[labelnames].to_numpy()
    #dt = np.asarray(dt).astype('float64')
    score, pred = pkl_model.predict(dt)
    print('score and prediction is ', score, pred)
    data["anomaly_prediction"] = pred
    data["anomaly_score"] = score
    return data

df = df.groupby('filename').apply(inferdata)
df.show(2)  # show() prints the rows itself and returns None

But it throws this error:

"java.lang.IndexOutOfBoundsException: index: 16384, length: 4 (expected: range(0, 16384))"

[Attached images: error_db, error_2_db, error_3_db]

I have attached the code snippet and error images for your reference. I have been stuck on this problem for a week.

Could anybody please help me to resolve this issue?
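
The grouped-map function in the post can also be exercised on plain pandas, outside Spark, to check that the columns and dtypes it returns line up with the declared schema. The following is only a minimal sketch under stated assumptions: `DummyModel` is a hypothetical stand-in for the pickled model, and the feature columns `f1` and `f2` and the sample rows are made up for illustration. It does not diagnose the exception itself; it only helps rule out problems in the function body.

```python
import numpy as np
import pandas as pd


class DummyModel:
    """Hypothetical stand-in for the pickled model (assumption).

    Returns (score, pred) in the same order as the predict call in the post.
    """

    def predict(self, x):
        score = x.sum(axis=1).astype("float64")  # maps to DoubleType
        pred = (score > 10.0).astype("int32")    # maps to IntegerType
        return score, pred


labelnames = ["f1", "f2"]  # assumed feature column names
model = DummyModel()


def infer_group(data: pd.DataFrame) -> pd.DataFrame:
    # Work on a copy so the caller's frame is not modified in place.
    data = data.copy()
    dt = data[labelnames].to_numpy()
    score, pred = model.predict(dt)
    data["anomaly_prediction"] = pred
    data["anomaly_score"] = score
    return data


# Made-up sample data with the same grouping column as the post.
sample = pd.DataFrame({
    "filename": ["a", "a", "b"],
    "f1": [1.0, 8.0, 3.0],
    "f2": [2.0, 5.0, 4.0],
})

# Hand the function one group at a time, as a grouped-map UDF would.
result = pd.concat([infer_group(g) for _, g in sample.groupby("filename")])

print(result.dtypes)
```

If the printed dtypes and column order agree with `return_schema`, the same plain function could, on Spark 3.x, be passed to `df.groupby('filename').applyInPandas(infer_group, schema=return_schema)`, which is the newer form of the grouped-map pandas UDF.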

3 REPLIES 3

AmanSehgal
Honored Contributor III

You might have to share the code above the cell. Please paste the code using the code editor, not as an image.

Thanks, I have updated the code in the cell.

Vindhya
New Contributor II

@Santhanalakshmi Manoharan, was this issue resolved? I am also getting the same error; any guidance would be a great help.

Appreciate your help.
