How does one access/use SparkSQL functions like array_size?
05-29-2022 04:51 PM
The following doesn't work for me:
%sql
SELECT user_id, array_size(education) AS edu_cnt
FROM users
ORDER BY edu_cnt DESC
LIMIT 10;
I get an error saying:
Error in SQL statement: AnalysisException: Undefined function: array_size. This function is neither a built-in/temporary function, nor a persistent function that is qualified as spark_catalog.default.array_size.; line 1 pos 16
The documentation pretty clearly says this function exists. Help?
05-30-2022 12:17 AM
The documentation page for array_size lists "Since: Databricks Runtime 10.5":
https://docs.databricks.com/spark/latest/spark-sql/language-manual/functions/array_size.html
So you might want to check which Databricks Runtime version your cluster is running.
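If you can't move to a newer runtime, here is a minimal sketch of a workaround, assuming the same `users` table and `education` array column as in the question. It relies on the older built-in `size` function, which also counts array elements. One difference to keep in mind: with default settings `size` returns -1 for a NULL array, while `array_size` returns NULL. The `version()` call reports the Apache Spark version rather than the Databricks Runtime version, so treat it only as a rough hint of how old the cluster is.

```sql
-- Shows the Apache Spark version of the attached cluster (available in Spark 3.0+).
-- Run this as its own cell; it does not print the Databricks Runtime version.
SELECT version();

-- Same query as in the question, using size() instead of array_size().
-- Assumption: education is an ARRAY column, as the original query implies.
SELECT user_id, size(education) AS edu_cnt
FROM users
ORDER BY edu_cnt DESC
LIMIT 10;
```

In a notebook whose default language is not SQL, each statement would go in its own `%sql` cell, as in the original post.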
05-30-2022 10:11 AM
Thanks! I had wrongly assumed that Community Edition clusters would run whatever the latest runtime was. That's not the case: what I was getting was an older runtime, dating back a year or so to when I last worked with Spark. I can use the functions now. 🙂
07-28-2022 10:32 AM
Hey there @Michael Carey
Hope everything is going great!
We are glad to hear that you were able to find a solution to your question. Would you mind marking an answer as best so that other members can find the solution more quickly?
Cheers!

