Data Engineering


MadelynM
New Contributor III

Thanks to everyone who joined the Best Practices for Your Data Architecture session on Getting Workloads to Production using CI/CD. You can access the on-demand session recording here, and the code in the Databricks Labs CI/CD Templates Repo.

Posted below is a subset of the questions asked and answered throughout the session. Please feel free to ask follow-up questions or add comments as threads.

Q: What are examples of scheduling Notebooks with Airflow? 

Check out the blog detailing the integration between Databricks and Airflow, and read the docs with examples (AWS | Azure | GCP). Also take a look at Multitask Jobs, a Databricks-native job scheduler.
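
For a rough idea of what scheduling a notebook from Airflow can look like, here is a minimal sketch rather than the example from the session. It assumes Airflow 2.4 or later with the `apache-airflow-providers-databricks` package installed and a connection named `databricks_default`. The notebook path, DAG id, task id, and cluster settings are hypothetical placeholders, and `notebook_run_payload` and `build_dag` are illustrative helper names.

```python
from datetime import datetime


def notebook_run_payload(notebook_path: str, num_workers: int = 2) -> dict:
    """Build a one-time run spec that runs a notebook on a new job cluster."""
    return {
        "new_cluster": {
            # Placeholder values: use a runtime and node type valid in your workspace.
            "spark_version": "13.3.x-scala2.12",
            "node_type_id": "i3.xlarge",
            "num_workers": num_workers,
        },
        "notebook_task": {"notebook_path": notebook_path},
    }


def build_dag():
    """Create a daily DAG with a single notebook task (Airflow is imported lazily)."""
    from airflow import DAG
    from airflow.providers.databricks.operators.databricks import (
        DatabricksSubmitRunOperator,
    )

    with DAG(
        dag_id="databricks_notebook_example",  # hypothetical DAG id
        start_date=datetime(2021, 8, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        DatabricksSubmitRunOperator(
            task_id="run_etl_notebook",
            databricks_conn_id="databricks_default",
            json=notebook_run_payload("/Repos/project/etl_notebook"),
        )
    return dag
```

In an actual DAG file you would call `dag = build_dag()` at module level so the scheduler can discover it.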

Q: Will AWS MWAA also work with notebooks?

Yes, the docs show that the Databricks connection is available for AWS MWAA.

Q: Unit testing and integration testing: are there frameworks for testing notebooks?

The session has an example leveraging a framework using Nutter and pytest. Here are a couple of links to the documentation for you to take a look at:

1. https://github.com/microsoft/nutter [integration testing]

2. https://docs.pytest.org/en/6.2.x/ [unit testing]

There certainly are other frameworks depending on what code you're testing and the nature of the tests you are conducting, but we like these frameworks due to the tools’ simplicity and open source nature.
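
To make the unit-testing half concrete, here is a minimal pytest-style sketch. It is not the example from the session, and it assumes the notebook's logic has been factored into a plain Python function that can be imported outside the notebook. The function `normalize_column_name` and the test names are hypothetical. pytest collects any function whose name starts with `test_` and reports a failure when an `assert` does not hold.

```python
import re


def normalize_column_name(name: str) -> str:
    """Lower-case a column name and replace runs of non-alphanumerics with '_'."""
    cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip())
    return cleaned.strip("_").lower()


def test_normalize_column_name_replaces_spaces_and_symbols():
    # Spaces and punctuation collapse into single underscores.
    assert normalize_column_name(" Order ID (USD) ") == "order_id_usd"


def test_normalize_column_name_keeps_clean_names():
    # A name that is already normalized passes through unchanged.
    assert normalize_column_name("customer_id") == "customer_id"
```

Saved as a file such as `test_columns.py`, this runs with the `pytest` command. Integration tests with Nutter follow a different pattern, since they execute the notebook itself on a cluster.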

Q: Is it possible to integrate MLflow to deploy model artifacts within this CI/CD process?

Yes, please take a look at this blog, Using MLOps with MLflow and Azure.
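
As a rough illustration of one way a CI/CD step might hand a trained model to MLflow, the sketch below registers a model that a training run has already logged. It is an assumption about how such a step could look, not the approach described in the blog. It assumes the `mlflow` package is installed, the tracking URI is already configured, and the model was logged under the artifact path `model`. The helper names `run_model_uri` and `register_model_from_run` are hypothetical.

```python
def run_model_uri(run_id: str, artifact_path: str = "model") -> str:
    """Build the 'runs:/' URI that points at a model logged in an MLflow run."""
    return f"runs:/{run_id}/{artifact_path}"


def register_model_from_run(run_id: str, model_name: str, artifact_path: str = "model"):
    """Register the run's logged model in the Model Registry (MLflow imported lazily)."""
    import mlflow

    # Creates the registered model if needed and adds a new version to it.
    return mlflow.register_model(
        model_uri=run_model_uri(run_id, artifact_path),
        name=model_name,
    )
```

A pipeline step could call `register_model_from_run(run_id, "my_model")` after training succeeds, where both arguments come from the pipeline's own context.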

Add your follow-up questions to threads!

2 REPLIES

Chris_Shehu
Valued Contributor III

Would it be possible to get the PowerPoint that was used for this? There are several embedded links that would be beneficial but cannot be accessed from a video. Thanks!

MadelynM
New Contributor III

Here's the list of embedded links!

Jobs scheduling and orchestration

Development interface resources

Testing Code

Source code repository resources

Code promotion resources
