Hi @RobinK,
I've tested your code and I was able to reproduce the error. Unfortunately, I haven't found a pure SQL alternative for selecting the results of the LIST command as part of a subquery or CTE and creating an alert based on that.
Fortunately, the contents of a Volume (with the info you need, i.e. path, modification time, etc.) can be obtained by other means.
My assumption here is that not every Workflow run ingests a new Excel sheet into the Volume, and that is why you need additional logic to know when to send the alert. If that is the case, then my suggestion is the following:
First, you can use Databricks Utilities (dbutils) to list the contents of the Volume in a Python notebook:
files_in_volume = dbutils.fs.ls("/Volumes/<catalog>/<schema>/<volume>/")
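Each element returned by dbutils.fs.ls is a FileInfo object which, on recent runtimes, exposes path, name, size (in bytes) and modificationTime (epoch milliseconds), so you can quickly inspect the listing, e.g.:

for f in files_in_volume:
    print(f.name, f.size, f.modificationTime)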
Then, you can iterate through the files and collect each element's information into a list of dictionaries:
data = [{"path": x.path, "name": x.name, "size": x.size, "modificationTime": x.modificationTime} for x in files_in_volume]
And finally, you can create a DataFrame from this list and save it as a table:
df = spark.createDataFrame(data)
df.write.mode("overwrite").saveAsTable("<catalog_name>.<schema_name>.<table_name>")
This could be run as an additional step in your Databricks Workflow, so that every time the workflow runs, the table containing the Volume's contents is updated as well.
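Putting the steps together, here is a minimal sketch of what that notebook task could look like. The placeholder catalog/schema/volume/table names are assumptions to replace with your own, and I've added two things on top of the snippets above: an explicit schema (so the job doesn't fail on an empty Volume) and a cast of modificationTime to a proper timestamp, which makes the alert query easier to write later:

from pyspark.sql import functions as F
from pyspark.sql import types as T

# List the Volume's contents (replace the placeholders with your names).
files_in_volume = dbutils.fs.ls("/Volumes/<catalog>/<schema>/<volume>/")

# Collect each file's metadata into a list of dictionaries.
data = [
    {"path": f.path, "name": f.name, "size": f.size, "modificationTime": f.modificationTime}
    for f in files_in_volume
]

# Explicit schema, so an empty Volume doesn't break schema inference.
schema = T.StructType([
    T.StructField("path", T.StringType()),
    T.StructField("name", T.StringType()),
    T.StructField("size", T.LongType()),
    T.StructField("modificationTime", T.LongType()),
])

# Build the DataFrame, convert epoch milliseconds to a timestamp,
# and overwrite the tracking table.
df = (
    spark.createDataFrame(data, schema)
    .withColumn("modificationTime", (F.col("modificationTime") / 1000).cast("timestamp"))
)
df.write.mode("overwrite").saveAsTable("<catalog_name>.<schema_name>.<table_name>")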
Once the results are loaded into a table, you can run SELECT queries on it, and therefore you can set alerts on it as well.
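For instance, assuming the table name and the timestamp conversion from the sketch above, a query that only returns files modified in the last 24 hours could serve as the alert's query (the 24-hour threshold is just an illustration). You can test it from a notebook with spark.sql; the SQL string itself is what you would paste into the Databricks SQL editor when defining the alert:

recent_files = spark.sql("""
    SELECT name, path, modificationTime
    FROM <catalog_name>.<schema_name>.<table_name>
    WHERE modificationTime >= current_timestamp() - INTERVAL 24 HOURS
""")
recent_files.show()

The alert can then be configured to trigger whenever the query returns at least one row.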
If, on the contrary, a new Excel sheet is loaded into the Volume every time the Workflow runs, then you could simply set an alert on success directly within your workflow.
Hope this helps.
Thanks,
Gab