Databricks Job: Package Name and EntryPoint parameters for the Python Wheel file

sandeep91
New Contributor III

I created a Python wheel with a simple file structure and uploaded it as a cluster library. I can import and run the package from a notebook, but when I create a job with a Python wheel task and provide the package name, the task fails with a "wheel name not found" or "package not found" error.

Can you help with the exact values that need to be passed as Package Name and Entry Point? I followed the docs, but without success.

File structure:

PythonWheel/
|- src/
|  |- __init__.py
|  |- test1.py
|  |- test2.py
|- setup.py

setup.py:

from setuptools import find_packages, setup

setup(
    name="testbronze",
    packages=find_packages(),
    setup_requires=["wheel"],
    description="demo",
    version="0.0.1",
    include_package_data=True,
)

Job Package Name: testbronze (note: I also tried "src")

1 ACCEPTED SOLUTION

sandeep91
New Contributor III

I have resolved the issue.

The problem was in setup.py: the name parameter has to match the actual structure of my package.

Note: if your package structure is src.bronze, you need to name it exactly that.

setup(
    name="src",
    packages=['.src'],
    description="demo",
    version="0.1",
    author="Sandeep",
    long_description=long_description,
    long_description_content_type="text/markdown",
    license='LICENSE.txt',
    include_package_data=True,
)

And from the Databricks UI:

Package Name: src

Entry Point: print_name (a function)
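
To illustrate why the two names have to line up, here is a minimal sketch of how a wheel task could turn Package Name and Entry Point into a callable. It is not the Databricks runner, which is not shown in this thread. It assumes the documented behaviour that a named entry point is looked up in the package metadata first and that, if none is declared, the function is called directly as packageName.entryPoint(). The helper name resolve_entry_point is hypothetical. The sketch tolerates a missing distribution so that it can be tried against a standard-library module. The real runner may treat that as an error, which would be consistent with neither "testbronze" nor "src" working until the distribution name and the import name matched.

```python
import importlib
from importlib import metadata


def resolve_entry_point(package_name: str, entry_point: str):
    """Return the callable for (package_name, entry_point). Illustrative only."""
    # Step 1: look for a declared entry point in the installed distribution's
    # metadata. The distribution name is the `name=` argument in setup.py.
    try:
        dist = metadata.distribution(package_name)
    except metadata.PackageNotFoundError:
        dist = None  # assumption: tolerated here so the sketch runs anywhere

    if dist is not None:
        for ep in dist.entry_points:
            if ep.name == entry_point:
                return ep.load()

    # Step 2: fall back to importing a module called `package_name` and
    # fetching the function from it. This only works when the importable
    # package has the same name as the value typed into Package Name.
    module = importlib.import_module(package_name)
    return getattr(module, entry_point)


if __name__ == "__main__":
    # Stand-in for ("src", "print_name"): a module that is always importable.
    func = resolve_entry_point("json", "dumps")
    print(func({"ok": True}))
```

With the original setup.py, the distribution is called "testbronze" but the only importable package is "src", so under this sketch step 2 would fail with ModuleNotFoundError for "testbronze". Naming the distribution "src" makes both steps agree.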

5 REPLIES

Hubert-Dudek
Esteemed Contributor III

Advancing Analytics explains wheels in this highly recommended video: https://www.youtube.com/watch?v=nN-NPnfJLNY The video also explains how to use files in Repos, which is a better solution than a wheel package (a wheel has to be installed on the server every time, whereas files in Repos can simply stay in your Git repo).

sandeep91
New Contributor III

Thank you for your reply, @Hubert Dudek.

However, given my use case, I cannot use a notebook to run the job.

We need to orchestrate directly through a task using the Python wheel option.

I found many docs that describe running through a notebook, but none that describe triggering a task directly from a wheel file.
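
For anyone looking for the non-notebook route, here is a rough sketch of what a Python wheel task looks like as Jobs API settings, built as a plain Python dict. The field names (python_wheel_task, package_name, entry_point, parameters, libraries, whl) are the ones I understand the Jobs API to use. The job name, task key, cluster ID, and wheel path are placeholders I made up, and build_wheel_task is a hypothetical helper, so please check them against your own workspace.

```python
import json


def build_wheel_task(package_name, entry_point, wheel_path, cluster_id, parameters=None):
    """Build one task entry that runs a function from an installed wheel."""
    return {
        "task_key": "run_wheel",            # placeholder task name
        "existing_cluster_id": cluster_id,  # placeholder cluster ID
        "python_wheel_task": {
            "package_name": package_name,   # the Package Name field in the UI
            "entry_point": entry_point,     # the Entry Point field in the UI
            "parameters": list(parameters or []),
        },
        # The wheel has to be attached to the task as a library.
        "libraries": [{"whl": wheel_path}],
    }


job_settings = {
    "name": "wheel-demo",  # placeholder job name
    "tasks": [
        build_wheel_task(
            package_name="src",
            entry_point="print_name",
            wheel_path="dbfs:/FileStore/wheels/src-0.1-py3-none-any.whl",  # hypothetical path
            cluster_id="<cluster-id>",
        )
    ],
}

print(json.dumps(job_settings, indent=2))
```

The printed JSON is the kind of payload that would be sent when creating the job. The two values that matter for this thread are package_name and entry_point.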

Anonymous
Not applicable

@Sandeep Toopran​ - Thank you for letting us know how you solved the problem!

AndréSalvati
New Contributor III

The repository below contains a complete template project that uses the (new!) Databricks Asset Bundles tool and a Python wheel task. Please follow the deployment instructions in the repository.

https://github.com/andre-salvati/databricks-template
