
Trigger a Databricks Job When there is an insert to a Snowflake Table?

abelian-grape
New Contributor III

I need to automatically trigger a Databricks job whenever a new row is inserted into a Snowflake table. Additionally, I need the job to receive the exact details of the newly inserted row as parameters.

What are the best approaches to achieve this? I'm considering options like Snowflake Streams, AWS EventBridge, Lambda, or any other efficient method that integrates well with Databricks.

Has anyone implemented something similar?

Any guidance or best practices would be greatly appreciated!

5 REPLIES

ashraf1395
Honored Contributor

I think a Lambda function + EventBridge would be a good way. You can query your Snowflake table there, add logic to detect newly inserted rows (for example via CDC or a Snowflake stream), and then send a job trigger using the Databricks API / Databricks SDK, passing the newly inserted rows as job parameters (rough sketch below).
https://docs.databricks.com/api/workspace/jobs/runnow
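
A rough sketch of that Lambda approach, with plenty of assumptions: the function is invoked on an EventBridge schedule (or a Snowflake notification), the packages snowflake-connector-python and requests are bundled with the Lambda, and MY_STREAM plus all the environment variable names are placeholders for your own setup.

```python
import json
import os

import requests
import snowflake.connector


def handler(event, context):
    # Pull rows inserted since the last poll. A Snowflake stream
    # (CREATE STREAM MY_STREAM ON TABLE MY_TABLE) is one way to track the delta;
    # note its offset only advances once it is consumed by a DML statement.
    conn = snowflake.connector.connect(
        account=os.environ["SNOWFLAKE_ACCOUNT"],
        user=os.environ["SNOWFLAKE_USER"],
        password=os.environ["SNOWFLAKE_PASSWORD"],
        warehouse=os.environ["SNOWFLAKE_WAREHOUSE"],
        database=os.environ["SNOWFLAKE_DATABASE"],
        schema=os.environ["SNOWFLAKE_SCHEMA"],
    )
    try:
        cur = conn.cursor(snowflake.connector.DictCursor)
        cur.execute("SELECT * FROM MY_STREAM WHERE METADATA$ACTION = 'INSERT'")
        new_rows = cur.fetchall()
    finally:
        conn.close()

    if not new_rows:
        return {"status": "no new rows"}

    # Trigger the Databricks job via the Jobs run-now API, passing the inserted
    # rows as a JSON string (job parameter values must be strings). This assumes
    # the job defines a job-level parameter named "inserted_rows"; notebook_params
    # is the older alternative.
    resp = requests.post(
        f"{os.environ['DATABRICKS_HOST']}/api/2.1/jobs/run-now",
        headers={"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"},
        json={
            "job_id": int(os.environ["DATABRICKS_JOB_ID"]),
            "job_parameters": {"inserted_rows": json.dumps(new_rows, default=str)},
        },
        timeout=30,
    )
    resp.raise_for_status()
    return {"status": "triggered", "run_id": resp.json()["run_id"]}
```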

I don't know the exact purpose of the job, but if you want to stay completely in the Databricks environment you can also try Lakehouse Federation for Snowflake:
- this way you can bring your table in as a foreign catalog in Databricks, and
- then maybe create a materialized view on top of it with CDF enabled,
- and set a file trigger on the location the materialized view writes to, so that whenever anything is written to it your job will trigger.
I don't know exactly how this will work, but if you want to stay in the Databricks ecosystem, Lakehouse Federation with Snowflake can be given a try (rough sketch below); otherwise the first option is better.
https://docs.databricks.com/aws/en/query-federation/snowflake
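
A minimal sketch of the federation setup part, run from a Databricks notebook, purely as an illustration: the connection/catalog names, host, warehouse, and the secret scope are all placeholders, and the DDL follows the Snowflake query-federation docs linked above.

```python
# Create a Unity Catalog connection to Snowflake and expose it as a foreign catalog.
# All names (snowflake_conn, snowflake_cat, MY_*) and the secret scope are placeholders.
spark.sql("""
    CREATE CONNECTION IF NOT EXISTS snowflake_conn TYPE snowflake
    OPTIONS (
      host 'myaccount.snowflakecomputing.com',
      port '443',
      sfWarehouse 'MY_WAREHOUSE',
      user secret('my_scope', 'snowflake_user'),
      password secret('my_scope', 'snowflake_password')
    )
""")

spark.sql("""
    CREATE FOREIGN CATALOG IF NOT EXISTS snowflake_cat
    USING CONNECTION snowflake_conn
    OPTIONS (database 'MY_DB')
""")

# The Snowflake table is now addressable with a three-level name and can be
# queried (read-only) like any other Unity Catalog table.
spark.table("snowflake_cat.MY_SCHEMA.MY_TABLE").show(5)
```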

Is there any reason not to simply use a Databricks workflow with a continuous trigger to consume from a Snowflake stream?
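
A rough sketch of what that continuous-trigger approach could look like, assuming the built-in Spark Snowflake connector and a job configured with a continuous trigger; MY_STREAM, the sf* options, and the target table are placeholders. Also note that just SELECTing from a Snowflake stream does not advance its offset; consuming it still needs a DML statement on the Snowflake side.

```python
# Notebook task for a job with a continuous trigger: each run polls a Snowflake
# stream via the Spark Snowflake connector and appends any inserts to a Delta
# table; the continuous trigger restarts the run as soon as it finishes.
sf_options = {
    "sfUrl": "myaccount.snowflakecomputing.com",
    "sfUser": dbutils.secrets.get("my_scope", "snowflake_user"),
    "sfPassword": dbutils.secrets.get("my_scope", "snowflake_password"),
    "sfDatabase": "MY_DB",
    "sfSchema": "MY_SCHEMA",
    "sfWarehouse": "MY_WAREHOUSE",
}

new_rows = (
    spark.read.format("snowflake")
    .options(**sf_options)
    .option("query", "SELECT * FROM MY_STREAM WHERE METADATA$ACTION = 'INSERT'")
    .load()
)

if new_rows.head(1):
    # Land the inserted rows in Delta (or run whatever processing the job does).
    new_rows.write.mode("append").saveAsTable("main.default.snowflake_inserts")
```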

According to https://docs.databricks.com/aws/en/views/materialized

Hi @abelian-grape, my bad - it will be either a streaming table or a managed table.

What do you mean by "set a file trigger on the location the materialized view writes to, so that whenever anything is written to it your job will trigger"?

Isn't the file trigger for external locations such as S3? Will it work for a Snowflake foreign catalog?