Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Databricks SDK vs bundles

Dali1
New Contributor III

Hello,

In this article: https://www.databricks.com/blog/from-airflow-to-lakeflow-data-first-orchestration

I understand that if I want to create and deploy an ML pipeline in production, the recommendation is to use Databricks Asset Bundles. 
But comparing the two, the Databricks SDK is easier to write. Why not use the Databricks SDK directly to create ML pipelines with Lakeflow Jobs, and use it in production as well?

1 ACCEPTED SOLUTION

Accepted Solutions

szymon_dybczak
Esteemed Contributor III

Hi @Dali1 ,

When you deploy with Asset Bundles, DABs keep track of what’s already been deployed and what has changed. That means:

  • it updates only what needs updating,

  • it detects drift between your desired state and the workspace,

  • it lets you generate plans/diffs,

  • and it reduces deployment errors.
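As a minimal sketch of what that looks like (the bundle name, target, host, and notebook path below are hypothetical placeholders, not from this thread), a bundle declares the desired state in a `databricks.yml`:

```yaml
# databricks.yml - hypothetical minimal bundle; names and paths are placeholders
bundle:
  name: ml_pipeline_demo

targets:
  dev:
    mode: development
    workspace:
      host: https://<your-workspace>.cloud.databricks.com

resources:
  jobs:
    train_model:
      name: train-model
      tasks:
        - task_key: train
          notebook_task:
            notebook_path: ./notebooks/train.py
```

You then run `databricks bundle validate` and `databricks bundle deploy -t dev`; the CLI compares this desired state with what’s already deployed and applies only the changes.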

If you've worked with Terraform, it's the same concept (in fact, under the hood DABs use Terraform).

SDK calls by themselves are stateless: if you run the same API calls over and over, you're responsible for tracking what exists or what has changed, and this becomes complex as your pipelines grow.
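To make that concrete, here's a hedged sketch in plain Python (no real SDK calls; `list_jobs`/`create_job`/`update_job` are stand-ins for whatever API you'd actually call) of the create-or-update bookkeeping you end up writing yourself:

```python
# Hand-rolled "reconcile" logic you'd need around stateless SDK calls.
# The workspace is faked with a dict; in real code these three helpers
# would wrap API calls (e.g. listing, creating, and updating jobs).

workspace = {}  # fake remote state: job name -> settings


def list_jobs():
    return dict(workspace)


def create_job(name, settings):
    workspace[name] = settings


def update_job(name, settings):
    workspace[name] = settings


def reconcile(desired):
    """Create missing jobs, update changed ones, skip unchanged ones."""
    existing = list_jobs()
    actions = []
    for name, settings in desired.items():
        if name not in existing:
            create_job(name, settings)
            actions.append(("create", name))
        elif existing[name] != settings:
            update_job(name, settings)
            actions.append(("update", name))
        else:
            actions.append(("skip", name))
    return actions


# First run: everything is created.
print(reconcile({"train": {"notebook": "train.py"}}))     # [('create', 'train')]
# Re-run after a change: only the diff is applied.
print(reconcile({"train": {"notebook": "train_v2.py"}}))  # [('update', 'train')]
```

Even this toy version skips drift detection, deleting removed jobs, and concurrency; with DABs, Terraform state handles all of that for you.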

