Unable to use SQL UDF

Avinash_Narala
New Contributor III
Hello,
 
I want to create an sql udf as follows:
%sql
CREATE OR REPLACE FUNCTION get_type(s STRING)
  RETURNS STRING
  LANGUAGE PYTHON
  AS $$
    def get_type(table_name):
      from pyspark.sql.functions import col
      from pyspark.sql import SparkSession

      spark = SparkSession.builder.getOrCreate()
      return spark.sql(f'DESCRIBE EXTENDED {table_name}').filter(col('col_name') == 'Type').select('data_type').collect()[0]['data_type']
    return get_type(s) if s else None
  $$
 
I can verify that the get_type function is created in my Unity Catalog, but when I call it, it throws an error and does not work as expected. I am attaching the error message as an attachment. Can you please help me with this?
 
 
1 REPLY

Kaniz
Community Manager

Hi @Avinash_Narala, the error message indicates that execution of your user-defined function (UDF) get_type failed. This could happen for a variety of reasons.

Here are a few things you could check:

  1. Data Type Mismatch: Ensure that the data type of the input parameter s in your function matches the data type of object_id#407. If there’s a mismatch, it could cause the function to fail.

  2. Runtime Errors in UDF: Your UDF is written in Python and uses PySpark functions. Make sure there are no runtime errors in your UDF; for example, check that the table_name passed to get_type is valid and that the table exists. Note also that Python UDFs registered in Unity Catalog execute in an isolated environment without access to a Spark session, so calling SparkSession.builder.getOrCreate() and spark.sql() inside the UDF body is not supported and will fail at runtime.

  3. Spark Session: The error message indicates that the task failed on the executor. Make sure that your Spark session is correctly initialized and that there are no issues with the executor.

  4. Resource Availability: The error message also mentions that the task failed 4 times. This could be due to resource contention. Check if your Spark job has enough resources (like CPU, memory) to run successfully.

If none of these suggestions help, please provide more details about your Spark job and the context in which this function is being used. This will help me give more specific advice. Remember to redact any sensitive information before sharing it.
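One way to narrow down suggestion 2 is to unit-test the lookup logic separately from Spark. Below is a minimal pure-Python sketch of the same "find the row where col_name is 'Type' and return its data_type" step the UDF body performs; the sample rows are illustrative stand-ins for DESCRIBE EXTENDED output, not real catalog data:

```python
def get_type_from_rows(rows):
    """Return the 'data_type' of the row whose 'col_name' is 'Type'.

    Mirrors the filter/select/collect chain in the UDF body, but over
    plain dicts so it can be tested without a Spark session.
    """
    for row in rows:
        if row["col_name"] == "Type":
            return row["data_type"]
    return None  # no 'Type' row found (the UDF would raise IndexError here)

# Illustrative rows shaped like DESCRIBE EXTENDED output.
sample_rows = [
    {"col_name": "id", "data_type": "bigint"},
    {"col_name": "Type", "data_type": "MANAGED"},
]

print(get_type_from_rows(sample_rows))  # MANAGED
print(get_type_from_rows([]))           # None
```

Once the parsing logic is confirmed, any remaining failure points to the environment (table name resolution, permissions, or the Spark-session restriction noted above) rather than the lookup itself.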
