Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.


MadelynM
Databricks Employee

Thanks to everyone who joined the Best Practices for Your Data Architecture session on Getting Workloads to Production using CI/CD. You can access the on-demand session recording here, and the code in the Databricks Labs CI/CD Templates Repo.

Posted below is a subset of the questions asked and answered throughout the session. Please feel free to ask follow-up questions or add comments as threads.

Q: What are examples of scheduling Notebooks with Airflow? 

Check out the blog detailing the integration between Databricks and Airflow, and read the docs with examples (AWS | Azure | GCP). Also take a look at Multitask Jobs, a Databricks-native job scheduler.
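As a rough illustration of what scheduling a notebook from Airflow involves, here is a minimal sketch that builds the run specification an Airflow task would submit to Databricks. The helper name `build_notebook_run_payload`, the notebook path, the parameter, and the cluster values are all hypothetical placeholders, not taken from the session. The dictionary keys follow the Databricks Jobs runs-submit format, and the commented lines at the end show how such a payload is usually passed to `DatabricksSubmitRunOperator` from the Airflow Databricks provider.

```python
from typing import Any, Dict, Optional


def build_notebook_run_payload(
    notebook_path: str,
    parameters: Optional[Dict[str, str]] = None,
    num_workers: int = 2,
) -> Dict[str, Any]:
    """Build a runs-submit style payload for a single notebook run."""
    return {
        "new_cluster": {
            # Placeholder cluster settings (assumptions); replace them with
            # values that are valid in your own workspace.
            "spark_version": "<spark-version>",
            "node_type_id": "<node-type>",
            "num_workers": num_workers,
        },
        "notebook_task": {
            "notebook_path": notebook_path,
            # Parameters are exposed to the notebook as widget values.
            "base_parameters": dict(parameters or {}),
        },
    }


payload = build_notebook_run_payload(
    "/Repos/example/etl/daily_load",  # hypothetical notebook path
    parameters={"run_date": "2021-08-01"},
)

# Inside an Airflow DAG, the payload would typically be handed to the
# operator from the Databricks provider package, roughly like this:
#
#   from airflow.providers.databricks.operators.databricks import (
#       DatabricksSubmitRunOperator,
#   )
#   run_notebook = DatabricksSubmitRunOperator(
#       task_id="run_daily_load",
#       databricks_conn_id="databricks_default",
#       json=payload,
#   )
```

Keeping the payload in a plain function means it can be checked in CI without an Airflow installation; the operator wiring itself still needs to be verified against the provider version you run.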

Q: Will AWS MWAA also work with notebooks?

Yes, the docs show that the Databricks connection is available for AWS MWAA.

Q: Unit Testing and Integration testing - are there frameworks for testing notebooks?

The session includes an example that uses Nutter and pytest. Here are links to the documentation for each:

1. https://github.com/microsoft/nutter [integration testing]

2. https://docs.pytest.org/en/6.2.x/ [unit testing]

There are certainly other frameworks, depending on what code you're testing and the nature of the tests you are conducting, but we like these two because they are simple and open source.
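To make the unit-testing half concrete, here is a minimal sketch of the usual pattern: move notebook logic into a plain Python function, then cover it with pytest-style tests. The function `normalize_column_name` and both test cases are hypothetical examples, not code from the session or the CI/CD Templates repo.

```python
import re


def normalize_column_name(name: str) -> str:
    """Convert a raw column header to snake_case."""
    # Replace every run of non-alphanumeric characters with one underscore.
    cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip())
    # Drop underscores left at either end, then lowercase the result.
    return cleaned.strip("_").lower()


# pytest collects any function whose name starts with "test_".
def test_spaces_and_case_are_normalized():
    assert normalize_column_name("  Order ID ") == "order_id"


def test_punctuation_collapses_to_single_underscore():
    assert normalize_column_name("Total ($) -- Net") == "total_net"
```

Saved as a file such as `test_columns.py` (the name is an assumption), this can be run with the `pytest` command in a CI job. Integration tests that execute whole notebooks on a cluster are the part Nutter is meant to cover.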

Q: Is it possible to integrate MLflow to deploy model artifacts within this CI/CD process?

Yes, please take a look at this blog, Using MLOps with MLflow and Azure.

Add your follow-up questions to threads!

2 Replies

Chris_Shehu
Valued Contributor III

Would it be possible to get the PowerPoint that was used for this? There are several embedded links that would be beneficial but cannot be accessed from a video. Thanks!

MadelynM
Databricks Employee

Here's the embedded links list!

Jobs scheduling and orchestration

Development interface resources

Testing Code

Source code repository resources

Code promotion resources
