cancel
Showing results forย 
Search instead forย 
Did you mean:ย 
Warehousing & Analytics
Engage in discussions on data warehousing, analytics, and BI solutions within the Databricks Community. Share insights, tips, and best practices for leveraging data for informed decision-making.
cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

null object while running a query against parquet

Shaimaa
New Contributor II

I am running this query against parquet:

SELECT 
SUM(CASE WHEN match_result.ecommerce.has_online_payments THEN 1 ELSE 0 END)
FROM parquet.`s3://folder_path/*`

when all the values of the object `match_result.ecommerce` are null, I get the following error:

[INVALID_EXTRACT_BASE_FIELD_TYPE] Can't extract a value from "match_result.ecommerce". Need a complex type [STRUCT, ARRAY, MAP] but got "INT". SQLSTATE: 42000; line 2, pos 14

How can I fix this issue?

5 REPLIES 5

Kaniz_Fatma
Community Manager
Community Manager

Hi @Shaimaa,  The column "match_result.ecommerce" is not a complex type (like a struct, array, or map), but the query is trying to extract a value from it. Since itโ€™s not a complex type, you canโ€™t directly access its subfields.

To address this issue, you can modify your query to handle the null values in the "match_result.ecommerce" column. One approach is to use a CASE statement to check for null values before attempting to extract any subfields.

giuseppegrieco
New Contributor III

Without additional context, my assumption is that since all the values are null, it's impossible to determine the column's type. Do you have a default value? You could consider coalescing with that value. For example, if the default value is false:

 

SELECT 
  SUM(
    CASE WHEN COALESCE(
      CAST(match_result.ecommerce.has_online_payments AS BOOLEAN),
      FALSE
    ) THEN
      1 
    ELSE 
      0 
    END
  )
FROM
  parquet.`s3://folder_path/*`

Otherwise you can handle the null values as @Kaniz_Fatma suggested.

 

Shaimaa
New Contributor II

@Anonymous Thank you but the issue persists even if I add a null check:

SELECT 
SUM(CASE WHEN match_result IS NOT NULL AND match_result.ecommerce IS NOT NULL AND match_result.ecommerce.has_online_payments THEN 1 ELSE 0 END) 
FROM parquet.`s3://folder_path/*`

the values in this particular file for "match_result.ecommerce" are null but they are not always null for all files that I run the query against. How can I tell specify that the condition should evaluate to 0 if the value of "match_result.ecommerce" is null instead of erroring out?

Hi @ShaimaaI apologize for the oversight. Letโ€™s modify the query to handle null values in a way that evaluates to 0 when "match_result.ecommerce" is null. We can achieve this by using the COALESCE function, which returns the first non-null value from a list of expressions

SELECT 
  SUM(CASE WHEN COALESCE(match_result.ecommerce.has_online_payments, 0) = 1 THEN 1 ELSE 0 END)
FROM parquet.`s3://folder_path/*`

In this modified query:

  • The COALESCE(match_result.ecommerce.has_online_payments, 0) expression checks if match_result.ecommerce.has_online_payments is null. If itโ€™s null, it returns 0; otherwise, it returns the actual value.
  • We then compare the result with 1 to determine whether online payments are available.
  • If the condition is met, we add 1 to the sum; otherwise, we add 0.

This approach ensures that null values are treated as 0, preventing errors. Feel free to test this query, and let me know if it resolves the issue! ๐Ÿ˜Š๐Ÿš€

If you have any further questions or need additional assistance, feel free to ask!

Shaimaa
New Contributor II

None of these solutions with coalesce work because it's "match_result.ecommerce" that is null not "match_result.ecommerce.has_online_payments". So it's still trying to extract a value from a null. Help me modify the query accordingly please. 

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you wonโ€™t want to miss the chance to attend and share knowledge.

If there isnโ€™t a group near you, start one and help create a community that brings people together.

Request a New Group