Hey all,
I'm curious: how do other teams manage Databricks alerts?
My use case: I have around 10 Spark workflows and need to validate their output tables.
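For context, a typical validation is a simple row-count or freshness query against the output table, along these lines (table name is made up):

```sql
-- Alert if yesterday's partition of the output table is empty
SELECT COUNT(*) AS row_count
FROM my_catalog.my_schema.orders_daily
WHERE event_date = current_date() - INTERVAL 1 DAY
```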
My first iteration was to create each alert manually: define the SQL, the evaluation criteria, the notification email, the schedule, and so on. Whenever anything had to change, I would go in and modify the alert by hand.
As you might imagine, this approach doesn't scale. With more people on the team and more workflows, alert management became chaotic.
I'm now looking at the DABs approach, codifying alerts and deploying them through CI/CD, but it lacks ergonomics in my opinion. The alert definition has the notification text and the SQL embedded inline, which makes changes awkward; a big blob of JSON is hardly readable.
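To make that concrete, here is roughly the shape I end up with. This is a minimal sketch, not a verified working bundle; the resource and field names are approximated from the SQL alerts API payload:

```yaml
# Sketch only -- resource/field names approximated, not a tested config
resources:
  alerts:
    orders_daily_rowcount:
      display_name: "orders_daily row count"
      warehouse_id: ${var.warehouse_id}
      # the SQL is embedded inline...
      query_text: |
        SELECT COUNT(*) AS row_count
        FROM my_catalog.my_schema.orders_daily
        WHERE event_date = current_date() - INTERVAL 1 DAY
      # ...and so are the evaluation criteria
      evaluation:
        source:
          name: row_count
        comparison_operator: EQUAL
        threshold:
          value:
            double_value: 0
      # plus the custom notification subject/body and a schedule
```

Multiply that by every check on every table and the diffs become painful to review.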
I'd also like to understand whether I can create/update/delete alerts without interfering with other teams, since they run their workflows and alerts in the same workspace. If I remove one of my alerts from the repository, will the DABs deploy command detect that and safely remove it from the workspace? I'm aiming for a "single source of truth" model, where whatever is in the repo is what exists in Databricks.
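For reference, the flow I'm picturing is just the standard bundle commands:

```bash
databricks bundle validate -t prod   # sanity-check the bundle config
databricks bundle deploy   -t prod   # push repo state to the workspace
databricks bundle summary  -t prod   # inspect the deployed resources
```

The open question is whether `deploy` also deletes the alert I removed from the repo, rather than leaving it orphaned in the workspace.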
I'd also like to avoid hardcoding the warehouse ID in the alert definitions. It would be great to select the warehouse by tag or by size instead.
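The closest mechanism I've found is a bundle lookup variable, but that resolves a warehouse by name rather than by tag or size (the warehouse name below is made up):

```yaml
variables:
  warehouse_id:
    description: "Warehouse used by the alert queries"
    lookup:
      warehouse: "shared-etl-warehouse"  # resolved to an ID at deploy time
```

Picking by tag or size would presumably still need a script against the warehouses API before deploying. Is there a cleaner way?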
Could you please share your experience managing alerts in your team?