Data Engineering
Pytest on Notebook

vinayaka_pallak
New Contributor

 

I am currently exploring testing methodologies for Databricks notebooks and would like to inquire whether it's possible to write pytest tests for notebooks that contain code not encapsulated within functions or classes.
***********************
a = 4
b = 5
print(a + b)
***********************

Given this scenario, I am interested in understanding whether it's feasible to write pytest tests to validate the functionality of such code blocks within Databricks notebooks.

Any insights or guidance you could provide on this matter would be greatly appreciated.

NOTE: We must not modify the notebook code.
Is it possible to write a test case for the code above as it stands?

 

1 REPLY

Kaniz_Fatma
Community Manager
Hi @vinayaka_pallak, Testing Databricks Notebooks is essential to ensure the correctness and reliability of your code. While notebooks are often used for exploratory analysis and prototyping, it’s still possible to write tests for code blocks within them.
 
Let’s explore some approaches:
  1. Splitting Code into Functions:

    • To make code in the notebook testable, consider splitting it into separate functions. Create a dedicated notebook where you define these functions. This notebook will serve as your library of reusable code.
    • For example, you can create a function that performs the calculation a + b and call it from your main notebook.
    • This separation allows you to write unit tests specifically for these functions.
  2. Create a Separate Notebook for Tests:

    • In addition to the main notebook, create another notebook specifically for writing tests. This notebook will contain your test cases.
    • In this test notebook, you can use the pytest framework to write and execute tests. pytest is a popular testing framework for Python that provides concise and expressive syntax for writing test cases.
  3. Explicitly Define and Execute Test Suites:

    • In your test notebook, define test functions that exercise different aspects of your code. For instance, you can write a test function that checks whether the result of a + b matches the expected value.
    • Execute the test suite using pytest. It will discover and run all the test functions you’ve defined.
    • If any test fails, pytest will provide detailed information about the failure, making it easier to debug.
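The three steps above can be sketched roughly as follows. This is only an illustration, not a prescribed Databricks layout: the function name `add_numbers` and the test names are hypothetical, and in practice the function would sit in its own notebook or `.py` file with the tests in another.

```python
# Hypothetical library code: the notebook's calculation wrapped in a function.
def add_numbers(a, b):
    """Return the sum of a and b."""
    return a + b


# Hypothetical test code: pytest collects functions whose names start with "test_".
def test_add_numbers_matches_notebook_values():
    # Same inputs as the notebook snippet in the question (a = 4, b = 5).
    assert add_numbers(4, 5) == 9


def test_add_numbers_handles_negatives():
    assert add_numbers(-1, 1) == 0
```

Saved in a file such as `test_add.py` (a hypothetical name), these tests run with `pytest test_add.py`. From a notebook cell, calling `pytest.main([...])` is one way to trigger the same discovery.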

Remember that while notebooks are convenient for exploration, separating code into functions and writing tests ensures maintainability and robustness. By following these practices, you can validate the functionality of code blocks without modifying th... [1].
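If the notebook code really cannot be changed, one possible workaround is to execute it as a script from the test and inspect what it leaves behind. The sketch below assumes the notebook has been exported as a plain Python source file; the helper `run_script`, the file name `notebook_export.py`, and the test name are all hypothetical. It uses only the standard library (`runpy`, `contextlib`) and has not been verified against notebooks that rely on Databricks-provided globals such as `spark` or `dbutils`.

```python
import contextlib
import io
import runpy
import tempfile
from pathlib import Path

# The code from the question, unchanged. It is written to a temporary file
# here only so the sketch is self-contained; in practice the path would
# point at the exported notebook source (an assumption).
NOTEBOOK_SOURCE = "a = 4\nb = 5\nprint(a + b)\n"


def run_script(path):
    """Execute a script file and return (its final globals, captured stdout)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        module_globals = runpy.run_path(str(path))
    return module_globals, buffer.getvalue()


def test_notebook_prints_sum():
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "notebook_export.py"  # hypothetical file name
        script.write_text(NOTEBOOK_SOURCE)
        module_globals, output = run_script(script)
    # Check the top-level variables and the printed result.
    assert module_globals["a"] == 4
    assert module_globals["b"] == 5
    assert output.strip() == "9"
```

This treats the whole notebook as a black box, so each test reruns every statement in it. That is acceptable for a three-line snippet but becomes slow and brittle for larger notebooks, which is why splitting code into functions is usually preferred when it is allowed.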

Additionally, if you’re using the Databricks extension for Visual Studio Code, you can run pytest on local code that doesn’t need a connection to a cluster in a remote Databricks workspace. This is especially useful for testing functions that work with PySpark DataFrames in local memory [2][3].

Happy testing! Let me know if you have any further questions or need additional guidance. 😊
