
Trigger a Databricks Job When there is an insert to a Snowflake Table?

abelian-grape
New Contributor III

I need to automatically trigger a Databricks job whenever a new row is inserted into a Snowflake table. Additionally, I need the job to receive the exact details of the newly inserted row as parameters.

What are the best approaches to achieve this? I'm considering options like Snowflake Streams, AWS EventBridge, Lambda, or any other efficient method that integrates well with Databricks.

Has anyone implemented something similar?

Any guidance or best practices would be greatly appreciated!

5 REPLIES

ashraf1395
Honored Contributor

I think a Lambda function + EventBridge would be a good way. You can query your Snowflake table there, add logic to detect newly inserted rows (for example via CDC or a Snowflake stream), and then send a job trigger using the Databricks API / Databricks SDK, passing the newly inserted rows as job parameters (rough sketch below).
https://docs.databricks.com/api/workspace/jobs/runnow
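
A rough sketch of that Lambda approach, with plenty of assumptions: the function is invoked on an EventBridge schedule (or a Snowflake notification), the packages snowflake-connector-python and requests are bundled with the Lambda, and MY_STREAM plus all the environment variable names are placeholders for your own setup.

```python
import json
import os

import requests
import snowflake.connector


def handler(event, context):
    # Pull rows inserted since the last poll. A Snowflake stream
    # (CREATE STREAM MY_STREAM ON TABLE MY_TABLE) is one way to track the delta;
    # note its offset only advances once it is consumed by a DML statement.
    conn = snowflake.connector.connect(
        account=os.environ["SNOWFLAKE_ACCOUNT"],
        user=os.environ["SNOWFLAKE_USER"],
        password=os.environ["SNOWFLAKE_PASSWORD"],
        warehouse=os.environ["SNOWFLAKE_WAREHOUSE"],
        database=os.environ["SNOWFLAKE_DATABASE"],
        schema=os.environ["SNOWFLAKE_SCHEMA"],
    )
    try:
        cur = conn.cursor(snowflake.connector.DictCursor)
        cur.execute("SELECT * FROM MY_STREAM WHERE METADATA$ACTION = 'INSERT'")
        new_rows = cur.fetchall()
    finally:
        conn.close()

    if not new_rows:
        return {"status": "no new rows"}

    # Trigger the Databricks job via the Jobs run-now API, passing the inserted
    # rows as a JSON string (job parameter values must be strings). This assumes
    # the job defines a job-level parameter named "inserted_rows"; notebook_params
    # is the older alternative.
    resp = requests.post(
        f"{os.environ['DATABRICKS_HOST']}/api/2.1/jobs/run-now",
        headers={"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"},
        json={
            "job_id": int(os.environ["DATABRICKS_JOB_ID"]),
            "job_parameters": {"inserted_rows": json.dumps(new_rows, default=str)},
        },
        timeout=30,
    )
    resp.raise_for_status()
    return {"status": "triggered", "run_id": resp.json()["run_id"]}
```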

I don't know the exact purpose of the job, but if you want to stay completely in the Databricks environment you can also try Lakehouse Federation for Snowflake:
- this way you can bring your table in as a foreign catalog in Databricks, and
- then maybe create a materialized view on top of it with CDF enabled,
- and set a file trigger on the location the materialized view writes to, so that whenever anything is written to it your job will trigger.
I don't know exactly how this will work, but if you want to stay in the Databricks ecosystem, Lakehouse Federation with Snowflake can be given a try (rough sketch below); otherwise the first option is better.
https://docs.databricks.com/aws/en/query-federation/snowflake
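
A minimal sketch of the federation setup part, run from a Databricks notebook, purely as an illustration: the connection/catalog names, host, warehouse, and the secret scope are all placeholders, and the DDL follows the Snowflake query-federation docs linked above.

```python
# Create a Unity Catalog connection to Snowflake and expose it as a foreign catalog.
# All names (snowflake_conn, snowflake_cat, MY_*) and the secret scope are placeholders.
spark.sql("""
    CREATE CONNECTION IF NOT EXISTS snowflake_conn TYPE snowflake
    OPTIONS (
      host 'myaccount.snowflakecomputing.com',
      port '443',
      sfWarehouse 'MY_WAREHOUSE',
      user secret('my_scope', 'snowflake_user'),
      password secret('my_scope', 'snowflake_password')
    )
""")

spark.sql("""
    CREATE FOREIGN CATALOG IF NOT EXISTS snowflake_cat
    USING CONNECTION snowflake_conn
    OPTIONS (database 'MY_DB')
""")

# The Snowflake table is now addressable with a three-level name and can be
# queried (read-only) like any other Unity Catalog table.
spark.table("snowflake_cat.MY_SCHEMA.MY_TABLE").show(5)
```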

Is there any reason not to simply use a Databricks workflow with a continuous trigger to consume from a Snowflake stream?
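
A rough sketch of what that continuous-trigger approach could look like, assuming the built-in Spark Snowflake connector and a job configured with a continuous trigger; MY_STREAM, the sf* options, and the target table are placeholders. Also note that just SELECTing from a Snowflake stream does not advance its offset; consuming it still needs a DML statement on the Snowflake side.

```python
# Notebook task for a job with a continuous trigger: each run polls a Snowflake
# stream via the Spark Snowflake connector and appends any inserts to a Delta
# table; the continuous trigger restarts the run as soon as it finishes.
sf_options = {
    "sfUrl": "myaccount.snowflakecomputing.com",
    "sfUser": dbutils.secrets.get("my_scope", "snowflake_user"),
    "sfPassword": dbutils.secrets.get("my_scope", "snowflake_password"),
    "sfDatabase": "MY_DB",
    "sfSchema": "MY_SCHEMA",
    "sfWarehouse": "MY_WAREHOUSE",
}

new_rows = (
    spark.read.format("snowflake")
    .options(**sf_options)
    .option("query", "SELECT * FROM MY_STREAM WHERE METADATA$ACTION = 'INSERT'")
    .load()
)

if new_rows.head(1):
    # Land the inserted rows in Delta (or run whatever processing the job does).
    new_rows.write.mode("append").saveAsTable("main.default.snowflake_inserts")
```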

According to https://docs.databricks.com/aws/en/views/materialized

Hi @abelian-grape, my bad - it will be either a streaming table or a managed table.

What do you mean by "set a file trigger on the location the materialized view writes to, so that whenever anything is written to it your job will trigger"?

Isn't the file trigger for external locations such as S3? Will it work for a Snowflake foreign catalog?