Hello,
I am trying to run pytest from a notebook or Python file that was deployed to the workspace by a Databricks Asset Bundle (DAB).
My repository contains a handful of files, and the end goal is to run pytest in that directory to validate my code.
I explain the structure of the repo and the steps to reproduce the issue below, but in essence I am seeing different behavior from the same code when it runs from `/Workspace/Repos/USER_EMAIL/REPO_NAME/NOTEBOOK_FILE` versus `/Workspace/Users/USER_EMAIL/.bundle/BUNDLE_NAME/dev/files/NOTEBOOK_FILE`.
When running in the Repos folder, the NOTEBOOK_FILE runs pytest and the tests pass. When running in the DAB folder, the same NOTEBOOK_FILE runs pytest, but test collection fails with the following error:
________________________ ERROR collecting spark_test.py ________________________
ImportError while importing test module '/Workspace/Users/USER_EMAIL/.bundle/any-name-you-want/dev/files/spark_test.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib/python3.10/importlib/__init__.py:126: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
E ModuleNotFoundError: No module named 'spark_test'
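To make the difference between the two locations easier to compare, a small diagnostic like the one below could be run from the same notebook in both folders. This is only a sketch: `describe_import_environment` is a hypothetical helper name, and it reports what the standard import machinery sees, not what pytest does internally during collection.

```python
import importlib.util
import os
import sys


def describe_import_environment(module_name: str) -> dict:
    """Collect basic facts that affect whether `import module_name` can succeed."""
    cwd = os.getcwd()
    # find_spec returns None when a top-level module cannot be located on sys.path.
    spec = importlib.util.find_spec(module_name)
    return {
        "cwd": cwd,
        # An empty string entry on sys.path also stands for the current directory.
        "cwd_on_sys_path": cwd in sys.path or "" in sys.path,
        "file_in_cwd": os.path.exists(os.path.join(cwd, module_name + ".py")),
        "resolved_origin": spec.origin if spec else None,
    }


if __name__ == "__main__":
    # Print one fact per line so the two runs can be compared side by side.
    for key, value in describe_import_environment("spark_test").items():
        print(f"{key}: {value}")
```

Comparing the output from the Repos folder with the output from the `.bundle` folder should show whether the working directory or `sys.path` differs between the two runs.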
Files and folder Structure
The files and folder structure in the repo:
REPO_NAME/
├── execute_pytest.py
├── execute_pytest_nb.py
├── databricks.yml
└── spark_test.py
execute_pytest.py
import pytest
import os
import sys
# Skip writing pyc files on a readonly filesystem.
sys.dont_write_bytecode = True
# Run pytest.
retcode = pytest.main([".", "-v", "-p", "no:cacheprovider"])
# Fail the cell execution if there are any test failures.
assert retcode == 0, "The pytest invocation failed. See the log for details."
execute_pytest_nb.py
# Databricks notebook source
import pytest
import os
import sys
# Skip writing pyc files on a readonly filesystem.
sys.dont_write_bytecode = True
# Run pytest.
retcode = pytest.main([".", "-v", "-p", "no:cacheprovider"])
# Fail the cell execution if there are any test failures.
assert retcode == 0, "The pytest invocation failed. See the log for details."
spark_test.py
from pyspark.sql import SparkSession
import pytest

@pytest.fixture
def spark() -> SparkSession:
    # Create a SparkSession (the entry point to Spark functionality) on
    # the cluster in the remote Databricks workspace. Unit tests do not
    # have access to this SparkSession by default.
    return SparkSession.builder.getOrCreate()

# COMMAND ----------

def test_scenario_a(spark):
    assert 1 == 1
databricks.yml
bundle:
  name: any-name-you-want

targets:
  # The 'dev' target, used for development purposes.
  # Whenever a developer deploys using 'dev', they get their own copy.
  dev:
    # We use 'mode: development' to make sure everything deployed to this target gets a prefix
    # like '[dev my_user_name]'. Setting this mode also disables any schedules and
    # automatic triggers for jobs and enables the 'development' mode for Delta Live Tables pipelines.
    mode: development
    default: true
    workspace:
      host: https://adb-XXXXXXXXXXXXXXXXX.azuredatabricks.net
Cluster Specs:
DBR: 14.3 LTS ML
Libraries: pytest (installed from PyPI)
Steps to reproduce working pytest in Databricks Repos:
- Add the repo to the workspace at the `/Workspace/Repos/USER_EMAIL/REPO_NAME/` location.
- Open the "execute_pytest.py" file, which should now exist at "/Workspace/Repos/USER_EMAIL/REPO_NAME/execute_pytest.py".
- Attach the cluster and run all.
Steps to reproduce failing pytest in a DAB deployment:
- Clone the repo to your local computer.
- In the root of the repo, open a terminal and run `databricks bundle deploy` (assuming you have the Databricks CLI already installed and configured for the workspace).
- In the workspace, navigate to the notebook "execute_pytest.py", which should now exist at "/Workspace/Users/USER_EMAIL/.bundle/any-name-you-want/dev/files/execute_pytest.py".
- Attach the cluster and run all.
Things that have been tried:
- I have confirmed that the outcome is the same regardless of whether a Python file or a notebook is used. That is why the repo contains both "execute_pytest.py" (Python file) and "execute_pytest_nb.py" (notebook).
- Adding the CWD to `sys.path`, as referenced here. I have also tried this with a pytest.ini file, as referenced here.
- I have tried different file names
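For reference, a minimal sketch of what adding the CWD to `sys.path` can look like is shown below. It is illustrative rather than the exact code used in the attempts above, and `ensure_on_sys_path` is a hypothetical helper name.

```python
import os
import sys


def ensure_on_sys_path(directory: str) -> bool:
    """Put `directory` at the front of sys.path; return True if it was added."""
    directory = os.path.abspath(directory)
    if directory in sys.path:
        # Already importable from this directory, so leave sys.path unchanged.
        return False
    sys.path.insert(0, directory)
    return True


# Hypothetical usage: call this before pytest.main(...) in execute_pytest.py.
ensure_on_sys_path(os.getcwd())
```

With this in place before the `pytest.main` call, the DAB folder still produced the same `ModuleNotFoundError` in my case.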