<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: PyTest working in Repos but not in Databricks Asset Bundles in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/pytest-working-in-repos-but-not-in-databricks-asset-bundles/m-p/76860#M35347</link>
    <description>&lt;P&gt;Hello &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/9"&gt;@Retired_mod&lt;/a&gt;,&lt;BR /&gt;&lt;BR /&gt;Thank you for your response. I am aware of what the error message means, and that is exactly why I am requesting support. What I am trying to understand is why the same code behaves differently when deployed to two different locations in a workspace. Have you tried to replicate the issue? I have supplied all of the code needed to reproduce it.&lt;/P&gt;&lt;P&gt;I assume this will turn out to be a path issue: I can rule out an incorrect directory structure, because the code works when deployed to a Databricks Repo but fails when deployed as a Databricks Asset Bundle.&lt;BR /&gt;I look forward to your response.&lt;/P&gt;</description>
    <pubDate>Fri, 05 Jul 2024 08:17:32 GMT</pubDate>
    <dc:creator>ChrisLawford</dc:creator>
    <dc:date>2024-07-05T08:17:32Z</dc:date>
    <item>
      <title>PyTest working in Repos but not in Databricks Asset Bundles</title>
      <link>https://community.databricks.com/t5/data-engineering/pytest-working-in-repos-but-not-in-databricks-asset-bundles/m-p/76612#M35280</link>
      <description>&lt;P&gt;Hello,&lt;BR /&gt;I am trying to run pytest from a notebook or Python file that was deployed by a Databricks Asset Bundle (DAB).&lt;BR /&gt;My repository contains a handful of files, and the goal is to run pytest in a directory to validate my code.&lt;BR /&gt;I explain the structure of the repo and the steps to reproduce the issue below, but in essence I am seeing different behavior from the same code when it runs at `&lt;SPAN&gt;/Workspace/Repos/USER_EMAIL/REPO_NAME/NOTEBOOK_FILE` and when it runs at `/Workspace/Users/USER_EMAIL/.bundle/BUNDLE_NAME/dev/files/NOTEBOOK_FILE`.&lt;BR /&gt;&lt;/SPAN&gt;In the Repos folder, running the NOTEBOOK_FILE that invokes pytest shows the tests passing. In the DAB folder, running the same NOTEBOOK_FILE fails with this error:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;________________________ ERROR collecting spark_test.py ________________________
ImportError while importing test module '/Workspace/Users/USER_EMAIL/.bundle/any-name-you-want/dev/files/spark_test.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib/python3.10/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
E   ModuleNotFoundError: No module named 'spark_test'&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;H2&gt;Files and folder Structure&lt;/H2&gt;&lt;P&gt;The files and folder structure in the repo:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;REPO_NAME/
├── execute_pytest.py
├── execute_pytest_nb.py
├── databricks.yml
└── spark_test.py&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;execute_pytest.py&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;import pytest
import os
import sys

# Skip writing .pyc files on a read-only filesystem.
sys.dont_write_bytecode = True

# Run pytest on the current working directory, verbosely, with the cache plugin disabled.
retcode = pytest.main([".", "-v", "-p", "no:cacheprovider"])
# Fail the cell execution if there are any test failures.
assert retcode == 0, "The pytest invocation failed. See the log for details."&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;execute_pytest_nb.py&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;# Databricks notebook source
import pytest
import os
import sys

# Skip writing .pyc files on a read-only filesystem.
sys.dont_write_bytecode = True

# Run pytest on the current working directory, verbosely, with the cache plugin disabled.
retcode = pytest.main([".", "-v", "-p", "no:cacheprovider"])
# Fail the cell execution if there are any test failures.
assert retcode == 0, "The pytest invocation failed. See the log for details."&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;spark_test.py&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;from pyspark.sql import SparkSession
import pytest


@pytest.fixture
def spark() -&amp;gt; SparkSession:
    # Create a SparkSession (the entry point to Spark functionality) on
    # the cluster in the remote Databricks workspace. Unit tests do not
    # have access to this SparkSession by default.
    return SparkSession.builder.getOrCreate()


# COMMAND ----------

def test_scenario_a(spark):
    assert 1==1&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;databricks.yml&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;bundle:
  name: any-name-you-want

targets:
  # The 'dev' target, used for development purposes.
  # Whenever a developer deploys using 'dev', they get their own copy.
  dev:
    # We use 'mode: development' to make sure everything deployed to this target gets a prefix
    # like '[dev my_user_name]'. Setting this mode also disables any schedules and
    # automatic triggers for jobs and enables the 'development' mode for Delta Live Tables pipelines.
    mode: development
    default: true
    workspace:
      host: https://adb-XXXXXXXXXXXXXXXXX.azuredatabricks.net&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;H2&gt;Cluster Specs:&lt;/H2&gt;&lt;P&gt;DBR: 14.3 LTS ML&lt;/P&gt;&lt;P&gt;Libraries: pytest (from PyPI)&lt;/P&gt;&lt;H2&gt;Steps to reproduce working pytest in Databricks Repos:&lt;/H2&gt;&lt;OL&gt;&lt;LI&gt;Create the repo in the workspace at the&amp;nbsp;`&lt;SPAN&gt;/Workspace/Repos/USER_EMAIL/REPO_NAME/` location.&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;Open the "execute_pytest.py" file, which should now exist at&amp;nbsp;"/Workspace/Repos/USER_EMAIL/REPO_NAME/execute_pytest.py".&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;Attach the cluster and run all.&lt;/SPAN&gt;&lt;/LI&gt;&lt;/OL&gt;&lt;H2&gt;Steps to reproduce failing pytest in a Databricks Asset Bundle:&lt;/H2&gt;&lt;OL&gt;&lt;LI&gt;Clone the repo to your local computer.&lt;/LI&gt;&lt;LI&gt;In the root of the repo, open a terminal and run `databricks bundle deploy` (assuming you have the Databricks CLI already installed and configured for the workspace).&lt;/LI&gt;&lt;LI&gt;In the workspace, navigate to "execute_pytest.py", which should now exist at "&lt;SPAN&gt;/Workspace/Users/USER_EMAIL/.bundle/any-name-you-want/dev/files/execute_pytest.py".&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;Attach the cluster and run all.&lt;/SPAN&gt;&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;&lt;SPAN&gt;Things that have been tried:&lt;/SPAN&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;SPAN&gt;The outcome is the same whether I use a Python file or a notebook. That is why the repo contains both "execute_pytest.py" (Python file) and&amp;nbsp;"execute_pytest_nb.py" (notebook).&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;Adding the CWD to sys.path,&amp;nbsp;&lt;A href="https://docs.databricks.com/en/files/workspace-modules.html#import-python-and-r-modules" target="_blank" rel="noopener"&gt;as referenced here&lt;/A&gt;&lt;SPAN&gt;. 
I have also tried this with a pytest.ini file,&amp;nbsp;&lt;/SPAN&gt;&lt;A href="https://stackoverflow.com/a/50610630" target="_blank" rel="noopener"&gt;as referenced here&lt;/A&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;Using different file names.&lt;/SPAN&gt;&lt;/LI&gt;&lt;/UL&gt;</description>
      <pubDate>Wed, 03 Jul 2024 11:46:07 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/pytest-working-in-repos-but-not-in-databricks-asset-bundles/m-p/76612#M35280</guid>
      <dc:creator>ChrisLawford</dc:creator>
      <dc:date>2024-07-03T11:46:07Z</dc:date>
    </item>
    <item>
      <title>Re: PyTest working in Repos but not in Databricks Asset Bundles</title>
      <link>https://community.databricks.com/t5/data-engineering/pytest-working-in-repos-but-not-in-databricks-asset-bundles/m-p/76860#M35347</link>
      <description>&lt;P&gt;Hello &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/9"&gt;@Retired_mod&lt;/a&gt;,&lt;BR /&gt;&lt;BR /&gt;Thank you for your response. I am aware of what the error message means, and that is exactly why I am requesting support. What I am trying to understand is why the same code behaves differently when deployed to two different locations in a workspace. Have you tried to replicate the issue? I have supplied all of the code needed to reproduce it.&lt;/P&gt;&lt;P&gt;I assume this will turn out to be a path issue: I can rule out an incorrect directory structure, because the code works when deployed to a Databricks Repo but fails when deployed as a Databricks Asset Bundle.&lt;BR /&gt;I look forward to your response.&lt;/P&gt;</description>
      <pubDate>Fri, 05 Jul 2024 08:17:32 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/pytest-working-in-repos-but-not-in-databricks-asset-bundles/m-p/76860#M35347</guid>
      <dc:creator>ChrisLawford</dc:creator>
      <dc:date>2024-07-05T08:17:32Z</dc:date>
    </item>
    <item>
      <title>Re: PyTest working in Repos but not in Databricks Asset Bundles</title>
      <link>https://community.databricks.com/t5/data-engineering/pytest-working-in-repos-but-not-in-databricks-asset-bundles/m-p/95326#M39081</link>
      <description>&lt;P&gt;Hey, Chris. Did you ever get this working? Same issue here.&lt;/P&gt;</description>
      <pubDate>Mon, 21 Oct 2024 16:21:24 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/pytest-working-in-repos-but-not-in-databricks-asset-bundles/m-p/95326#M39081</guid>
      <dc:creator>538014</dc:creator>
      <dc:date>2024-10-21T16:21:24Z</dc:date>
    </item>
    <item>
      <title>Re: PyTest working in Repos but not in Databricks Asset Bundles</title>
      <link>https://community.databricks.com/t5/data-engineering/pytest-working-in-repos-but-not-in-databricks-asset-bundles/m-p/95392#M39092</link>
      <description>&lt;P&gt;I think you need to package your code as a Python wheel file:&amp;nbsp;&lt;A href="https://docs.databricks.com/en/dev-tools/bundles/python-wheel.html#create-the-bundle-manually" target="_blank"&gt;Develop a Python wheel file using Databricks Asset Bundles | Databricks on AWS&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 21 Oct 2024 23:20:53 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/pytest-working-in-repos-but-not-in-databricks-asset-bundles/m-p/95392#M39092</guid>
      <dc:creator>uzi49</dc:creator>
      <dc:date>2024-10-21T23:20:53Z</dc:date>
    </item>
    <item>
      <title>Re: PyTest working in Repos but not in Databricks Asset Bundles</title>
      <link>https://community.databricks.com/t5/data-engineering/pytest-working-in-repos-but-not-in-databricks-asset-bundles/m-p/113464#M44543</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/110373"&gt;@ChrisLawford&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;You can run pytest through a job:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;databricks bundle run -t dev pytest_job&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I was able to work around the issue this way.&lt;/P&gt;&lt;P&gt;resource/pytest.job.yml&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;resources:
  jobs:
    pytest_job:
      name: pytest_job

      tasks:
        - task_key: pytest_task
          notebook_task:
            notebook_path: src/pytest
&lt;/LI-CODE&gt;&lt;P&gt;src/pytest.ipynb&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;# pytest.main runs our tests directly in the notebook environment, providing
# fidelity for Spark and other configuration variables.
#
# A limitation of this approach is that changes to the test will be
# cached by Python's import caching mechanism.
#
# To iterate on tests during development, we restart the Python process
# and thus clear the import cache to pick up changes.
dbutils.library.restartPython()

import pytest
import os
import sys

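# Change to the bundle root (the parent of the folder that contains this
# notebook) so that the relative test path passed to pytest resolves.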
notebook_path = dbutils.notebook.entry_point.getDbutils().notebook().getContext().notebookPath().get()
repo_root = os.path.dirname(os.path.dirname(notebook_path))
os.chdir(f'/Workspace/{repo_root}')
%pwd

# Skip writing pyc files on a readonly filesystem.
sys.dont_write_bytecode = True

retcode = pytest.main(["./tests/test_sample.py", "-p", "no:cacheprovider"])

# Fail the cell execution if we have any test failures.
assert retcode == 0, 'The pytest invocation failed. See the log above for details.'&lt;/LI-CODE&gt;&lt;P&gt;tests/test_sample.py&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;def test_aa():
    assert True&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 25 Mar 2025 05:47:09 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/pytest-working-in-repos-but-not-in-databricks-asset-bundles/m-p/113464#M44543</guid>
      <dc:creator>cinyoung</dc:creator>
      <dc:date>2025-03-25T05:47:09Z</dc:date>
    </item>
  </channel>
</rss>

