Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Asset Bundles: Adding project_directory in DBT task breaks previous python task

Mathias_Peters
Contributor

Hi, 

I have a job consisting of three tasks: 

      tasks:
        - task_key: Kinesis_to_S3_new
          spark_python_task:
            python_file: ../src/kinesis.py
            parameters: ["${var.stream_region}", "${var.s3_base_path}"]
          job_cluster_key: general_cluster
        # Run delta live view
        - task_key: delta_live_view_file
          pipeline_task:
            pipeline_id: ${resources.pipelines.dlt_file_pipeline.id}
          depends_on:
            - task_key: Kinesis_to_S3_new
        # Run dbt
        - task_key: dbt
          depends_on:
            - task_key: delta_live_view_file
          dbt_task:
            project_directory: ./dbt
            commands:

Without the last task (the dbt task), the bundle can be validated, deployed and executed fine. After adding the dbt task, I receive an error when deploying the bundle: 

Error: cannot create job: Invalid python file reference: ../src/kinesis.py

Does the project_directory element cause this? Did I use it incorrectly and if so, how is it done correctly?

best

Mathias

2 REPLIES

Ajay-Pandey
Esteemed Contributor III

Hi @Mathias_Peters 

The issue appears to be related to the project directory only. Could you try creating the same job in the UI, extracting its YAML, and checking whether any changes are required in your code?

Also, can you confirm whether you are running the code from the Workspace or from Git?

Mathias_Peters
Contributor

Hi @Ajay-Pandey ,

Thank you for the hints. I will try to recreate the job via the UI. I ran the tasks in a GitHub workflow. The file locations are mixed: the first two tasks (Python and DLT) are located in the databricks/src folder, while the dbt files come from Git.
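
For readers with a similar mixed layout, the following is a minimal sketch of how such a job might be declared. It is not a confirmed fix for the error above. It assumes the job defines a job-level git_source for the dbt project, which the thread does not show, and that relative local paths may not be rewritten to workspace paths once git_source is present; both points should be verified against the bundle documentation. The job name, repository URL, branch, and dbt commands are hypothetical placeholders, and job_clusters plus any compute settings for the dbt task are omitted.

```yaml
# Sketch only: assumes a job-level git_source exists (not shown in the thread).
resources:
  jobs:
    kinesis_dbt_job:                 # hypothetical job name
      name: kinesis_dbt_job
      git_source:                    # hypothetical repository details
        git_url: https://github.com/example-org/example-dbt-repo
        git_provider: gitHub
        git_branch: main
      tasks:
        - task_key: Kinesis_to_S3_new
          spark_python_task:
            # Absolute workspace path; assumes src/ sits at the bundle root.
            python_file: ${workspace.file_path}/src/kinesis.py
            # Read this file from the workspace rather than from Git.
            source: WORKSPACE
            parameters: ["${var.stream_region}", "${var.s3_base_path}"]
          job_cluster_key: general_cluster
        - task_key: delta_live_view_file
          pipeline_task:
            pipeline_id: ${resources.pipelines.dlt_file_pipeline.id}
          depends_on:
            - task_key: Kinesis_to_S3_new
        - task_key: dbt
          depends_on:
            - task_key: delta_live_view_file
          dbt_task:
            # Read the dbt project from the repository in git_source.
            source: GIT
            # Assumed to be relative to the repository root.
            project_directory: dbt
            commands:                # illustrative commands only
              - "dbt deps"
              - "dbt run"
```

Setting source explicitly on each task makes it visible which files are expected in the workspace and which in Git, which may help narrow down where the "Invalid python file reference" error originates.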