@Gubbanoa wrote:
What is currently the best way of doing unit testing from PyCharm into Databricks? I have previously used Databricks Connect. However, after upgrades, and now that even Unity Catalog has become a requirement, it appears quirky. Is it possible to use the new PyCharm Databricks plugin for running pytest tests inside Databricks?
Hello,
Unit testing in Databricks can indeed be a bit tricky, especially with recent updates. Here are some current best practices and options for integrating unit testing from PyCharm into Databricks:
Databricks Connect:
While Databricks Connect has been a popular choice, recent updates and requirements such as Unity Catalog have introduced some quirks. If you still prefer using Databricks Connect, make sure you are on the latest version, and check the Databricks Connect documentation for any configuration changes.
PyCharm Databricks Plugin:
The new PyCharm Databricks plugin can be used to run pytest tests inside Databricks. This plugin allows you to connect to your Databricks workspace directly from PyCharm, making it easier to manage and run your tests. You can find more details and setup instructions in the PyCharm Databricks plugin documentation.
Unit Testing for Notebooks:
Databricks provides built-in support for unit testing within notebooks. You can organize your functions and their unit tests within the same notebook or in separate notebooks. This approach is detailed in the Databricks documentation.
Using GitHub Repositories:
There are several GitHub repositories that provide sample setups for unit testing in Databricks. For example, the databricks-unit-testing repository offers sample PySpark functions and pytest unit tests. Another useful tool is Nutter, a testing framework specifically designed for Databricks notebooks.
Best Practices for PySpark:
For PySpark-specific unit testing, you can follow best practices such as creating isolated test environments, using mock data, and integrating tests into CI/CD pipelines. More details can be found in the Databricks documentation.
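As a rough illustration of the "isolated environment" and "mock data" points, here is a minimal sketch. It assumes you keep row-level logic in plain Python functions, so those tests can run locally from PyCharm without a cluster or a Spark session. The names normalize_country and clean_rows are hypothetical, not part of any Databricks API, and this is only one possible way to structure things.

```python
from typing import Any, Dict, List, Optional

# Hypothetical row-level logic, kept free of Spark imports so it can be
# tested locally without a cluster.


def normalize_country(code: Optional[str]) -> Optional[str]:
    """Trim and upper-case a country code; map blanks and None to None."""
    if code is None:
        return None
    cleaned = code.strip().upper()
    return cleaned or None


def clean_rows(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return copies of the rows with a normalized 'country' field."""
    return [
        {**row, "country": normalize_country(row.get("country"))}
        for row in rows
    ]


# pytest-style test using small in-memory mock data instead of real tables.
def test_clean_rows_normalizes_country() -> None:
    mock_rows = [
        {"id": 1, "country": " se "},
        {"id": 2, "country": ""},
        {"id": 3, "country": None},
    ]
    assert clean_rows(mock_rows) == [
        {"id": 1, "country": "SE"},
        {"id": 2, "country": None},
        {"id": 3, "country": None},
    ]
```

A function like this could later be applied to a DataFrame (for example through a UDF), while the tests that need a real Spark session stay in a smaller, separate suite that can run in your CI/CD pipeline.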
By leveraging these tools and practices, you can streamline your unit testing process in Databricks and ensure your code remains robust and reliable.
Hope this helps.
Best regards,