Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

PythonException: 'RuntimeError: The length of output in Scalar iterator pandas UDF should be the same with the input's; however, the length of output was 1 and the length of input was 2.'.

Ancil
Contributor II

I have a pandas_udf that works for one row, but when I run it on more than one row I get the error below.

PythonException: 'RuntimeError: The length of output in Scalar iterator pandas UDF should be the same with the input's; however, the length of output was 1 and the length of input was 2.'.

Code

@func.pandas_udf(StringType())
def find_data(inputs: Iterator[pd.Series]) -> Iterator[pd.Series]:
    for input in inputs:
        # doing logic here (for loops, if statements, etc.)
        yield pd.Series(str(result_json))

df = df.withColumn("outData", find_data("inputData"))

1 ACCEPTED SOLUTION

Accepted Solutions

Hubert-Dudek
Esteemed Contributor III

I tested it, and your function is correct. So the error must come from the inputData type (are all values strings?) or from result_json. Please also check the runtime version; I was using 11 LTS.


3 REPLIES

Thanks @Hubert Dudek. Let me check the version.

Hi @Hubert Dudek,

I tried with a hard-coded DataFrame as input, and it works as expected.

But if I load the same data from a file, I get the error mentioned above. Do you have any idea why?
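
A possible explanation, offered as an assumption since the thread never confirms the cause: the error message says the number of output rows must equal the number of input rows. In the code from the question, pd.Series(str(result_json)) has length 1 no matter how many rows the incoming batch holds, so a batch of two rows would produce one output row, which matches the reported lengths. Data loaded from a file may simply arrive in larger batches than a small hard-coded DataFrame. The sketch below uses plain pandas (no Spark session) to show one way of returning one result per input row. The names process_value and find_data_batches are hypothetical and stand in for the logic elided in the question.

```python
from typing import Iterator

import pandas as pd


def process_value(value: str) -> str:
    # Hypothetical stand-in for the per-row logic elided in the question.
    return str({"input": value})


def find_data_batches(batches: Iterator[pd.Series]) -> Iterator[pd.Series]:
    # Yield one output Series per input batch, with the same length as the
    # batch, so the output and input row counts always match.
    for batch in batches:
        yield batch.map(process_value)


# Simulate batches of different sizes, as Spark may deliver them.
batches = [pd.Series(["a"]), pd.Series(["b", "c"])]
outputs = list(find_data_batches(iter(batches)))

# Pair each input batch length with its output length.
lengths = [(len(b), len(o)) for b, o in zip(batches, outputs)]
print(lengths)
```

To try this in Spark, the same generator body would go under the @func.pandas_udf(StringType()) decorator from the question.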