null object while running a query against parquet
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-24-2024 10:39 AM
I am running this query against parquet:
SELECT
SUM(CASE WHEN match_result.ecommerce.has_online_payments THEN 1 ELSE 0 END)
FROM parquet.`s3://folder_path/*`
when all the values of the object `match_result.ecommerce` are null, I get the following error:
[INVALID_EXTRACT_BASE_FIELD_TYPE] Can't extract a value from "match_result.ecommerce". Need a complex type [STRUCT, ARRAY, MAP] but got "INT". SQLSTATE: 42000; line 2, pos 14
How can I fix this issue?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-24-2024 01:08 PM
Without additional context, my assumption is that since all the values are null, it's impossible to determine the column's type. Do you have a default value? You could consider coalescing with that value. For example, if the default value is false:
SELECT
SUM(
CASE WHEN COALESCE(
CAST(match_result.ecommerce.has_online_payments AS BOOLEAN),
FALSE
) THEN
1
ELSE
0
END
)
FROM
parquet.`s3://folder_path/*`
Otherwise you can handle the null values as @Retired_mod suggested.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-24-2024 01:09 PM
@Anonymous Thank you but the issue persists even if I add a null check:
SELECT
SUM(CASE WHEN match_result IS NOT NULL AND match_result.ecommerce IS NOT NULL AND match_result.ecommerce.has_online_payments THEN 1 ELSE 0 END)
FROM parquet.`s3://folder_path/*`
the values in this particular file for "match_result.ecommerce" are null but they are not always null for all files that I run the query against. How can I tell specify that the condition should evaluate to 0 if the value of "match_result.ecommerce" is null instead of erroring out?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-25-2024 04:44 AM
None of these solutions with coalesce work because it's "match_result.ecommerce" that is null not "match_result.ecommerce.has_online_payments". So it's still trying to extract a value from a null. Help me modify the query accordingly please.

