Hello,
I am trying to use the getArgument() function in a spark.sql query. It works fine when I run the notebook on an interactive cluster, but it throws an error when the notebook is executed as a job run on an instance pool.
Query:
OPTIMIZE <table>
where date = replace(regexp_extract(getArgument('folderPath'), '=\\\d{8}', 0), '=', '')
Error:
Exception: Undefined function: getArgument. This function is neither a built-in/temporary function, nor a persistent function that is qualified as spark_catalog.default.getargument.; line 2 pos 49
I can't figure out why this error occurs or how to solve it. Since the notebook that runs the query is shared by multiple jobs, I would like to resolve the issue without changing the notebook itself; changing the query passed to spark.sql() is fine, though. Could someone help me out?
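For context, the expression in the query is only meant to pull the 8-digit date out of the folderPath value. Here is the same extraction logic in plain Python (the path below is a made-up example value, not one from my actual jobs):

```python
import re

# Hypothetical example of the value passed in via the 'folderPath' argument.
folder_path = "/mnt/raw/mytable/date=20230101"

# Same logic as the SQL: regexp-extract '=YYYYMMDD', then strip the '='.
match = re.search(r"=\d{8}", folder_path)
date = match.group(0).replace("=", "")

print(date)  # 20230101
```

So any suggestion that gets this value into the OPTIMIZE ... WHERE clause on a job cluster would work for me.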
Interactive cluster settings:
Driver: Standard_DS3_v2
Workers: Standard_DS3_v2
10.4LTS (includes Apache Spark 3.2.1, Scala 2.12)
Pool settings:
Driver: Standard_DS3_v2
Workers: Standard_DS3_v2
10.4LTS (includes Apache Spark 3.2.1, Scala 2.12)