Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Custom Python module not found while using dbx on PyCharm

sasidhar
New Contributor II

I'm new to Databricks and PySpark. I'm building a PySpark application in the PyCharm IDE. I have tested the code locally and want to run it on a Databricks cluster from the IDE itself. Following the dbx documentation, I was able to run a single Python file successfully. However, I have some custom Python modules and call functions from them in the main Python file. In this case I get a "module not found" error. Could anyone please assist me here?

Below is my Python project structure:

Databricks/
    apps/
        test.py
        __init__.py
    utils/
        GenericUtils.py
        __init__.py
    __init__.py

I'm importing GenericUtils into my main Python file, test.py, and below is the error:

#############################################

Running the entrypoint file
[dbx][2022-12-11 21:23:08.580] Execution failed, please follow the given error

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<command--1> in <module>
      1 from pyspark.sql import SparkSession
----> 2 import databricks.utils.GenericUtils as GenUt
      3
      4 spark = SparkSession \
      5   .builder \

/databricks/python_shell/dbruntime/PythonPackageImportsInstrumentation/__init__.py in import_patch(name, globals, locals, fromlist, level)
    165       # Import the desired module. If you're seeing this while debugging a failed import,
    166       # look at preceding stack frames for relevant error information.
--> 167       original_result = python_builtin_import(name, globals, locals, fromlist, level)
    168
    169       is_root_import = thread_local._nest_level == 1

ModuleNotFoundError: No module named 'databricks.utils'

#############################################
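The traceback boils down to 'databricks.utils' not being on the cluster's import path. One workaround sketch (an assumption on my part, not a fix confirmed in this thread) is to put the source root at the front of sys.path before the package import. The snippet below reproduces the layout in a temp directory; greet() is a hypothetical stand-in for a GenericUtils function:

```python
import importlib
import os
import sys
import tempfile

# Hedged workaround sketch (an assumption, not the thread's confirmed fix):
# put the source root at the front of sys.path before the package import so
# 'databricks.utils' resolves. The project layout is recreated in a temp
# directory here; greet() is a hypothetical stand-in.
src = tempfile.mkdtemp()
for pkg in ("databricks", os.path.join("databricks", "utils")):
    d = os.path.join(src, pkg)
    os.makedirs(d)
    open(os.path.join(d, "__init__.py"), "w").close()
with open(os.path.join(src, "databricks", "utils", "GenericUtils.py"), "w") as f:
    f.write("def greet():\n    return 'hello'\n")

sys.path.insert(0, src)  # in test.py this would point at the real src/ root
GenUt = importlib.import_module("databricks.utils.GenericUtils")
print(GenUt.greet())  # hello
```

This only papers over the problem for a single entrypoint file; packaging the project properly (see the deployment discussion below in this thread) is the more durable route.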

Below is the deployment.yaml:

build:
  no_build: true
environments:
  default:
    workflows:
      - name: "dbx-demo-job"
        spark_python_task:
          python_file: "file://src/databricks/apps/test.py"
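For comparison, here is a hedged sketch of the same deployment.yaml with the package build left enabled. The assumption (not confirmed in this thread) is that letting dbx build and install the project wheel on the cluster is what makes the custom `databricks.utils` package importable:

```yaml
# Sketch under the assumption that dbx's default wheel build is wanted:
# the explicit "no_build: true" is removed so dbx packages the project.
environments:
  default:
    workflows:
      - name: "dbx-demo-job"
        spark_python_task:
          python_file: "file://src/databricks/apps/test.py"
```

With the build enabled, the `--no-package` flag on the execute command would also need to be dropped so the built package is actually attached to the run.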

Below is the dbx command used:

dbx execute --cluster-id <cluster_id> dbx-demo-job --no-package --debug
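A wheel build only helps if setuptools actually discovers the packages in the layout shown above. The snippet below is a local sanity check, not a dbx step: it recreates the poster's src/ tree (names taken from the question) in a temp directory and asks setuptools what it would package:

```python
import os
import tempfile
from setuptools import find_packages

# Recreate the question's src/ layout in a temp dir and check what
# setuptools would package (a local sanity check; names from the post).
src = tempfile.mkdtemp()
for pkg in ("databricks", "databricks/apps", "databricks/utils"):
    d = os.path.join(src, *pkg.split("/"))
    os.makedirs(d)
    open(os.path.join(d, "__init__.py"), "w").close()
open(os.path.join(src, "databricks", "apps", "test.py"), "w").close()
open(os.path.join(src, "databricks", "utils", "GenericUtils.py"), "w").close()

pkgs = sorted(find_packages(where=src))
print(pkgs)  # ['databricks', 'databricks.apps', 'databricks.utils']
```

Because every directory carries an `__init__.py`, all three packages are discovered; a missing `__init__.py` at any level would silently drop that subtree from the wheel.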

4 REPLIES

Aviral-Bhardwaj
Esteemed Contributor III

@Sasidhar Reddy, it is working for me.

[image attachment]

If you like my answer or it gives you a hint, please upvote it.

Thanks

Aviral Bhardwaj


sasidhar, were you able to solve the "module not found" error?

Thanks,

Prakasam.

Rajeev_Basu
Contributor III

It works for me too; I didn't face the "module not found" error.

Meghala
Valued Contributor II

I got the error too.
