Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Scheduling a Complete Python Project in Databricks

naineel
New Contributor

Hi everyone,

I have a simple Python project with the following structure:

root/  
│── src/  
│   ├── package_name/  
│   │   ├── __init__.py  
│   │   ├── main.py  
│   │   ├── submodules1/  
│   │   │   ├── __init__.py  
│   │   │   ├── base1.py  
│   │   ├── submodules2/  
│   │   │   ├── __init__.py  
│   │   │   ├── base2.py  
│── pyproject.toml  

My primary concern is that while I can schedule a notebook or a single Python file in Databricks, I am unable to schedule an entire project that depends on main.py. The project functions as a CLI application and requires arguments to run.

I need guidance on how to properly schedule this complete Python project in Databricks. Any suggestions or best practices would be greatly appreciated!

Thanks in advance.

1 REPLY

ashraf1395
Honored Contributor

Hi @naineel, one approach is to build your project into a wheel (.whl) file,
then create a Python wheel task for it in a job and schedule that job.

https://docs.databricks.com/aws/en/jobs/python-wheel
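
To make the wheel approach concrete, here is a minimal sketch of what main.py could look like so that a Python wheel task can call it. The argument names (--input-path, --run-date) and the entry point name (run) are hypothetical placeholders, not taken from the original project. It assumes the entry point is declared in pyproject.toml and that the task's parameters reach the function as command-line arguments, so please check the linked docs for your workspace.

```python
# src/package_name/main.py (illustrative sketch, not the original project's code)
#
# Assumed pyproject.toml entry (standard [project.scripts] table; the name
# "run" is a placeholder):
#
#   [project.scripts]
#   run = "package_name.main:main"
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build the CLI parser. Both arguments are hypothetical examples."""
    parser = argparse.ArgumentParser(prog="package_name")
    parser.add_argument("--input-path", required=True, help="Where to read data from")
    parser.add_argument("--run-date", default=None, help="Optional logical run date")
    return parser


def main(argv=None) -> None:
    """Entry point referenced by the wheel.

    With argv=None, argparse reads sys.argv[1:], which is where the job's
    parameters are assumed to arrive. Passing a list makes local testing easy.
    """
    args = build_parser().parse_args(argv)
    # Replace this with calls into submodules1 / submodules2.
    print(f"input_path={args.input_path} run_date={args.run_date}")
```

With something like this in place, you would build the wheel (for example with python -m build), attach it to the job as a library, and set the task's package name to package_name, its entry point to run, and its parameters to a list such as ["--input-path", "/some/path"]. Keeping main(argv=None) as a plain function also lets you call main(["--input-path", "/tmp/x"]) locally before scheduling anything.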
