Databricks Community

Mr_K · ‎05-19-2023

Hello,

from pyspark.sql.functions import lit
from pyspark.sql.types import TimestampType

forecast_date = '2017-12-01'
spark.conf.set('spark.sql.shuffle.partitions', 500)

# generate forecast for this data
forecasts = (
  history
    .where(history.date < forecast_date) # limit training data to prior to our forecast date
    .groupBy('store', 'item', lit(30).alias('days_to_forecast'))
    .applyInPandas(get_forecast, "store integer, item integer, date timestamp, sales float, sales_pred_mean float, sales_pred_lower float, sales_pred_upper float")
    .withColumn('forecast_date', lit(forecast_date).cast(TimestampType()))
  ).cache()

# evaluate each store-item forecast against the actuals
forecast_evals = (
  forecasts
    .select('forecast_date', 'store', 'item', 'sales', 'sales_pred_mean')
    .where(forecasts.date < forecasts.forecast_date)
    .groupBy('forecast_date', 'store', 'item')
    .applyInPandas(evaluate_forecast, "forecast_date timestamp, store integer, item integer, mse float, rmse float, mae float, mape float")
  )

# cross-validated evaluation per store-item
forecast_evals_cv = (
  forecasts
    .select('forecast_date', 'store', 'item', 'sales', 'sales_pred_mean')
    .where(forecasts.date < forecasts.forecast_date)
    .groupBy('forecast_date', 'store', 'item', lit(30).alias('days_to_forecast'))
    .applyInPandas(evaluate_forecast_cv, "forecast_date timestamp, store integer, item integer, horizon integer, mse float, rmse float, mae float, mape float, mdape float, coverage float")
  )

forecasts.createOrReplaceTempView('forecasts_tmp')
forecast_evals.createOrReplaceTempView('forecast_evals_tmp')
forecast_evals_cv.createOrReplaceTempView('forecast_evals_cv_tmp')

When I run the above code, it throws the following error:

AnalysisException: [UC_COMMAND_NOT_SUPPORTED] Spark higher-order functions are not supported in Unity Catalog.;

I'm using a shared cluster with Databricks Runtime 12.2 LTS, and Unity Catalog is enabled.

I'm also getting a similar error with the faker package.

User16502773013 · ‎08-09-2023

Hello @Mr_K ,

Running applyInPandas on a UC-enabled cluster is not currently supported.

As an interim alternative, we suggest implementing the forecast function as a Spark UDF.

For more information on currently supported Python UDFs, please check the release notes.

Regards

Tharun-Kumar · ‎08-10-2023

@Mr_K

applyInPandas is treated as a higher-order function in PySpark. As of now, we do not support higher-order functions in Unity Catalog. We do support direct calls to Python UDFs.

Here is an example of how to reference UDFs in UC - https://docs.databricks.com/en/udf/python.html
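
For anyone landing here later, the "Spark UDF instead of applyInPandas" suggestion could look roughly like the sketch below. It is only an illustration under some assumptions: the names evaluate_series and evaluate_with_udf are made up for this example, only the evaluation step is shown (not the forecast itself), and it assumes the runtime on your cluster allows scalar Python UDFs, which you should confirm in the release notes. The idea is to collect each group's rows into an array with collect_list and pass that array to an ordinary Python UDF.

```python
import math


def evaluate_series(pairs):
    """Error metrics for one (store, item) series.

    `pairs` is a list of (actual, predicted) tuples. Spark Row objects
    are tuples too, so the collected structs can be passed in directly.
    Returns (mse, rmse, mae), or None when there is nothing to score.
    """
    clean = [(a, p) for a, p in pairs if a is not None and p is not None]
    if not clean:
        return None
    n = len(clean)
    mse = sum((a - p) ** 2 for a, p in clean) / n
    mae = sum(abs(a - p) for a, p in clean) / n
    return (mse, math.sqrt(mse), mae)


def evaluate_with_udf(forecasts):
    """Hypothetical replacement for the applyInPandas evaluation step.

    Assumes `forecasts` has the columns from the original post:
    forecast_date, store, item, date, sales, sales_pred_mean.
    """
    # Imported here so the pure-Python part above runs without Spark.
    from pyspark.sql import functions as F
    from pyspark.sql.types import DoubleType, StructField, StructType

    metrics_schema = StructType([
        StructField('mse', DoubleType()),
        StructField('rmse', DoubleType()),
        StructField('mae', DoubleType()),
    ])
    eval_udf = F.udf(evaluate_series, metrics_schema)

    return (
        forecasts
        .where(F.col('date') < F.col('forecast_date'))
        .groupBy('forecast_date', 'store', 'item')
        # Keep actual and prediction together in one struct so they stay aligned.
        .agg(F.collect_list(F.struct('sales', 'sales_pred_mean')).alias('obs'))
        .withColumn('m', eval_udf('obs'))
        .select('forecast_date', 'store', 'item', 'm.*')
    )
```

Keeping the metric logic in a plain Python function means it can be checked locally, e.g. evaluate_series([(1.0, 2.0), (3.0, 1.0)]), before wiring it into Spark. Note that collect_list pulls each group's rows into a single array on one executor, so this only makes sense when the individual series are small.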
