Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Databricks -Terraform- (condition_task)

RajaPalukuri
New Contributor II

Hi Team,

I am planning to create an if/else condition task in Databricks using Terraform. My requirement is:

Task A (extract records from the DB and count them) --> Task B (validate the count using condition_task) --> Task C (load the data if Task B finds the count > 0)

I can implement this manually in Databricks, but I am trying to do the same with Terraform. Could you please help with this conditional task in Databricks? How can 'Total_record_counts', which is set in Task A, be referenced in Task B by the condition task?

Here is my sample code:
 
dynamic "task" {
  for_each = var.map_of_tables # map variable of tables
  content {
    task_key = "${var.env}_validate_total_rec_count_${lower(task.key)}"
    run_if   = "ALL_SUCCESS"
    depends_on {
      task_key = "${var.env}_get_total_rec_count_${lower(task.key)}" # task that gets the record counts
    }

    condition_task { # conditional task
      # Facing an issue here: how can I refer to the variable that was set in the predecessor notebook task?
      left  = "{{tasks.${var.env}_get_total_rec_count_${lower(task.key)}.values.total_rec_count}}"
      op    = "GREATER_THAN"
      right = 0
    }
  }
}
   

Any help with sample code would be appreciated.

 

 

2 REPLIES

Hi Kaniz,

Thank you for replying to my request. Yes, I am consolidating the record counts for all tables in a notebook task (Task A) and passing 'Total_rec_counts' to the condition_task (Task B). Task B validates 'Total_rec_counts' and decides whether to run the DLT pipeline.

Code:

 

 

dynamic "task" {
  for_each = var.map_of_tables
  content {
    task_key = "${var.env}_validate_total_rec_count_${lower(task.key)}"
    run_if = "ALL_SUCCESS"
    depends_on{
      task_key = "${var.env}_get_total_rec_count_${lower(task.key)}" # Task to get the record counts
    }

    condition_task {
      left = "{{tasks.${var.env}_get_total_rec_count_${lower(task.key)}.values.total_rec_count}}"
      op = "GREATER_THAN"
      right = 0
    }
  }
}

 

Error: the job update fails because the left operand is not recognized.

Error: cannot update job: The "left" operand of the if/else condition alid reference. Invalid reference: '{{tasks.env-HLQ_UNIT_TEST_get_total_rec_count.values.total_rec_count}}'. '{{tasks.env-HLQ_UNIT_TEST_get_total_rec_count.values.total_rec_count}}' is unknown.

Please suggest how to fix it. My dependency syntax is a little different from yours; the syntax you proposed was giving an error.

 

 

Hi Kaniz,

To give more detail on the request above: in Task A, where we consolidate the record counts for all tables, we use a notebook task that runs a Python program. 'Total_rec_counts' is an output variable set in that Python program. How can we refer to it in the validation task in Terraform?

Is there any way to create an output variable for Task A in Terraform and refer to it in Task B? If so, how do we define that output variable for Task A in Terraform?
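
For reference, here is a minimal sketch of how the three tasks could be wired together, under some assumptions. As far as I know, Terraform has no block for declaring a task output: the value has to be published at run time by the Task A notebook itself, for example with `dbutils.jobs.taskValues.set(key = "total_rec_count", value = total)`. The condition task then reads it through a `{{tasks.<task_key>.values.<key>}}` reference, where `<task_key>` and `<key>` must match exactly what Task A uses. The resource name, job name, notebook path, `var.pipeline_id`, and the simplified single-table task keys below are all hypothetical, and compute settings are omitted.

```hcl
# Hypothetical variable holding the ID of the DLT pipeline to run in Task C.
variable "pipeline_id" {
  type = string
}

resource "databricks_job" "load_if_rows" {
  name = "load-if-rows" # hypothetical job name

  # Task A: the notebook is assumed to publish the count itself with
  #   dbutils.jobs.taskValues.set(key = "total_rec_count", value = total)
  task {
    task_key = "get_total_rec_count"
    notebook_task {
      notebook_path = "/Shared/get_total_rec_count" # hypothetical path
    }
  }

  # Task B: if/else condition on the value published by Task A.
  task {
    task_key = "validate_total_rec_count"
    depends_on {
      task_key = "get_total_rec_count"
    }
    condition_task {
      # The task key and value key must match Task A exactly.
      left  = "{{tasks.get_total_rec_count.values.total_rec_count}}"
      op    = "GREATER_THAN"
      right = "0"
    }
  }

  # Task C: runs only when the condition in Task B evaluates to true.
  task {
    task_key = "load_data"
    depends_on {
      task_key = "validate_total_rec_count"
      outcome  = "true"
    }
    pipeline_task {
      pipeline_id = var.pipeline_id
    }
  }
}
```

Two things may be worth checking against the error above. The task key in the error message (`env-HLQ_UNIT_TEST_get_total_rec_count`) has no table suffix, so it may not match any `task_key` actually defined in the job. It also contains a hyphen, and the Databricks documentation on dynamic value references describes special handling for identifiers with special characters. Whether either applies here is untested.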