
ML model promotion from Databricks dev workspace to prod workspace

datastones
New Contributor

Hi everybody. I am relatively new to Databricks. I am working on an ML model promotion process between different Databricks workspaces. I am aware that best practice is deployment as code (e.g. export the whole training pipeline and model registration from the dev workspace to the prod workspace, either via Terraform export or Databricks Asset Bundles). However, since the model training process is extremely expensive, we only want to promote and register the model in the prod workspace and build our inference pipeline there. I have been considering the following options:

  • Use the open-source MLflow Export-Import tool: this method didn't pass our firm's security review because of the shared DBFS.

  • I am aware that models registered in Databricks Unity Catalog (UC) in the prod workspace can be loaded from dev workspace for model comparison/debugging. But to comply with best practices, we restrict access to assets in UC in the dev workspace from prod workspace.

  • Use a remote registry as the means to share models: this method is part of the legacy approach. The current suggestion from Databricks is to use Unity Catalog (UC) for managing the ML lifecycle.

My current workaround, run via GitHub Actions, is as follows:

  1. Use Terraform's experimental resource exporter to export the configuration of the model registered in UC in the dev workspace. This config gives me the S3 location of the model that I want to promote to the prod workspace.

  2. Using the S3 location output from step 1, I can have another workflow copy the model object from dev's S3 bucket to prod's S3 bucket.

  3. Once the model is copied into prod's S3 bucket, I can then use Terraform to register the model (as a resource) in prod's UC.
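For illustration, step 2 could be sketched roughly like this in Python with boto3. This is only a sketch under assumptions: the bucket names and prefixes in the usage note are hypothetical, the helper names are made up for this example, and it assumes the workflow's credentials can read the dev bucket and write to the prod bucket.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/prefix' into (bucket, prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def destination_key(src_key, src_prefix, dst_prefix):
    """Re-root src_key from under src_prefix to under dst_prefix."""
    if not src_key.startswith(src_prefix):
        raise ValueError(f"{src_key!r} is not under {src_prefix!r}")
    relative = src_key[len(src_prefix):].lstrip("/")
    dst_prefix = dst_prefix.strip("/")
    return f"{dst_prefix}/{relative}" if dst_prefix else relative


def copy_model_artifacts(src_uri, dst_uri):
    """Copy every object under src_uri to dst_uri; return the number copied."""
    import boto3  # imported lazily so the pure helpers work without boto3

    src_bucket, src_prefix = split_s3_uri(src_uri)
    dst_bucket, dst_prefix = split_s3_uri(dst_uri)
    s3 = boto3.client("s3")
    copied = 0
    # list_objects_v2 returns at most 1000 keys per call, so paginate.
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=src_bucket, Prefix=src_prefix):
        for obj in page.get("Contents", []):
            s3.copy(
                {"Bucket": src_bucket, "Key": obj["Key"]},
                dst_bucket,
                destination_key(obj["Key"], src_prefix, dst_prefix),
            )
            copied += 1
    return copied
```

A call such as `copy_model_artifacts("s3://dev-bucket/models/v1", "s3://prod-bucket/models/v1")` (both URIs hypothetical) would mirror the artifact folder found in step 1 into the prod bucket.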

I am wondering if there is any simpler way to directly promote an ML object from one workspace to another. Thank you very much for your help!

1 REPLY

amr
Contributor III
  • I am aware that models registered in Databricks Unity Catalog (UC) in the prod workspace can be loaded from dev workspace for model comparison/debugging. But to comply with best practices, we restrict access to assets in UC in the dev workspace from prod workspace.


It should be the opposite: the prod workspace can see the catalogs of the dev workspace. Even better, push these models to a dedicated catalog such as model_registery.models, make it visible in both dev and prod, and keep nothing in it but the models, no data.
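As a rough sketch of the shared-catalog idea, assuming MLflow 2.8 or later (where `MlflowClient.copy_model_version` is available) and assuming the caller has the needed UC privileges on both the source and the destination: the model names below are hypothetical, and `uc_model_uri` and `promote_model_version` are helper names made up for this example.

```python
def uc_model_uri(full_name, version):
    """Build a models:/ URI for a Unity Catalog model version."""
    if full_name.count(".") != 2:
        raise ValueError("expected <catalog>.<schema>.<model>")
    return f"models:/{full_name}/{version}"


def promote_model_version(src_name, src_version, dst_name):
    """Copy one registered model version into another UC model."""
    import mlflow  # imported lazily so uc_model_uri works without mlflow
    from mlflow import MlflowClient

    # Point the MLflow client at the Unity Catalog model registry.
    mlflow.set_registry_uri("databricks-uc")
    client = MlflowClient()
    return client.copy_model_version(
        src_model_uri=uc_model_uri(src_name, src_version),
        dst_name=dst_name,
    )
```

For example, `promote_model_version("dev.ml.churn_model", 3, "model_registery.models.churn_model")` would copy version 3 into the shared catalog, and the prod workspace could then load it with `mlflow.pyfunc.load_model` using the corresponding `models:/` URI. No retraining and no manual S3 copy would be needed in that setup.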
