Asset Bundles: Adding project_directory in dbt task breaks previous Python task
05-15-2024 01:39 PM
Hi,
I have a job consisting of three tasks:
```yaml
tasks:
  - task_key: Kinesis_to_S3_new
    spark_python_task:
      python_file: ../src/kinesis.py
      parameters: ["${var.stream_region}", "${var.s3_base_path}"]
    job_cluster_key: general_cluster

  # Run delta live view
  - task_key: delta_live_view_file
    pipeline_task:
      pipeline_id: ${resources.pipelines.dlt_file_pipeline.id}
    depends_on:
      - task_key: Kinesis_to_S3_new

  # Run dbt
  - task_key: dbt
    depends_on:
      - task_key: delta_live_view_file
    dbt_task:
      project_directory: ./dbt
      commands:
```
Without the last task (the dbt task), the bundle validates, deploys, and runs fine. After adding the dbt task, deploying the bundle fails with this error:

```
Error: cannot create job: Invalid python file reference: ../src/kinesis.py
```

Does the project_directory element cause this? Did I use it incorrectly, and if so, what is the correct usage?
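For reference, this is roughly what I expect the finished dbt task to look like; the commands and the library entry below are placeholders, not my actual configuration:

```yaml
# Sketch of the intended dbt task; commands and library entry are placeholders.
- task_key: dbt
  depends_on:
    - task_key: delta_live_view_file
  dbt_task:
    project_directory: ./dbt
    commands:
      - "dbt deps"
      - "dbt run"
  job_cluster_key: general_cluster
  libraries:
    - pypi:
        package: dbt-databricks
```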
Best,
Mathias
05-15-2024 07:44 PM - edited 05-15-2024 07:45 PM
The issue is related to the project_directory only. Can you please try to create the same job using the UI, extract its YAML, and check whether any changes are required in your code?
Also, can you confirm whether you are running the code from the Workspace or from Git?
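For example, if the UI-created job pulls the dbt project from Git, the extracted YAML will typically contain a job-level git_source block like the sketch below; the repository URL and branch are placeholders. A job-level git_source changes how relative task paths such as ../src/kinesis.py are resolved, which could explain the error:

```yaml
# Hypothetical excerpt of UI-extracted job YAML; URL and branch are placeholders.
git_source:
  git_url: https://github.com/example-org/example-repo
  git_provider: gitHub
  git_branch: main
```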
05-16-2024 12:33 AM
Hi @Ajay-Pandey ,
Thank you for the hints. I will try to recreate the job via the UI. I ran the tasks in a GitHub workflow. The file locations are mixed: the first two tasks (Python and DLT) live in the databricks/src folder, while the dbt files come from Git.
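If the mixed sources turn out to be the cause, one thing I will try is declaring the source explicitly on the dbt task, so that the other tasks keep resolving their paths against the workspace files. A rough sketch, with placeholder path and command:

```yaml
# Hedged sketch: pin only the dbt task to Git; path and command are placeholders.
- task_key: dbt
  dbt_task:
    source: GIT
    project_directory: dbt   # path inside the Git repo
    commands:
      - "dbt run"
```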

